Passed
Pull Request — master (#207)
by
unknown
02:09
created

Selector::flattenOptions()   A

Complexity

Conditions 3
Paths 3

Size

Total Lines 10
Code Lines 5

Duplication

Lines 0
Ratio 0 %

Code Coverage

Tests 6
CRAP Score 3

Importance

Changes 0
Metric Value
cc 3
eloc 5
nc 3
nop 1
dl 0
loc 10
ccs 6
cts 6
cp 1
crap 3
rs 10
c 0
b 0
f 0
<?php

declare(strict_types=1);

namespace PHPHtmlParser\Selector;

use PHPHtmlParser\Dom\AbstractNode;
use PHPHtmlParser\Dom\Collection;
use PHPHtmlParser\Dom\InnerNode;
use PHPHtmlParser\Dom\LeafNode;
use PHPHtmlParser\Exceptions\ChildNotFoundException;

/**
 * Class Selector
 *
 * @package PHPHtmlParser
 */
class Selector
{
    /**
     * @var array
     */
    protected $selectors = [];

    /**
     * @var bool
     */
    private $depthFirst = false;

    /**
     * Constructs with the selector string
     * @param string          $selector
     * @param ParserInterface $parser
     */
    public function __construct(string $selector, ParserInterface $parser)
    {
        $this->selectors = $parser->parseSelectorString($selector);
    }

    /**
     * Returns the selectors that were found in __construct
     * @return array
     */
    public function getSelectors()
    {
        return $this->selectors;
    }

    /**
     * @param bool $status
     * @return void
     */
    public function setDepthFirstFind(bool $status): void
    {
        $this->depthFirst = $status;
    }

    /**
     * Attempts to find the selectors starting from the given
     * node object.
     * @param AbstractNode $node
     * @return Collection
     * @throws ChildNotFoundException
     */
    public function find(AbstractNode $node): Collection
    {
        $collection = new Collection();
        foreach ($this->findGenerator($node) as $item) {
            $collection[] = $item;
        }
        return $collection;
    }

    /**
     * Attempts to find the selectors starting from the given
     * node object.
     * @param AbstractNode $node
     * @return \Generator
     * @throws ChildNotFoundException
     */
    public function findGenerator(AbstractNode $node): \Generator
    {
        foreach ($this->selectors as $selector) {
            $nodes = [$node];
            if (count($selector) == 0) {
                continue;
            }

            $options = [];
            for ($i=0; $i<count($selector); $i++) {
Performance, Best Practice
It seems like you are calling the size function count() as part of the test condition. You might want to compute the size beforehand, and not on each iteration.

If the size of the collection does not change during the iteration, it is generally a good practice to compute it beforehand, and not on each iteration:

for ($i=0; $i<count($array); $i++) { // calls count() on each iteration
}

// Better
for ($i=0, $c=count($array); $i<$c; $i++) { // calls count() just once
}
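Applied to a loop that also needs to know when it has reached the last element, as the one flagged here does, the suggestion might look like the sketch below. The function `markLastRule()` and its sample input are made up for illustration and are not part of `Selector`.

```php
<?php

// Hypothetical helper (not part of Selector): walks a list of rules and
// flags the last one, calling count() once instead of on every iteration.
function markLastRule(array $selector): array
{
    $marked = [];
    for ($i = 0, $c = count($selector); $i < $c; $i++) {
        $marked[] = [
            'rule'   => $selector[$i],
            // reuse the cached size for the "last element" test as well
            'isLast' => ($i === $c - 1),
        ];
    }

    return $marked;
}

$result = markLastRule(['div', '>', 'a']);
```

Caching the size in `$c` is only safe because the array is not modified inside the loop body.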
                $rule = $selector[$i];
                $isLast = ($i == count($selector) - 1);

                if ($rule['alterNext']) {
                    $options[] = $this->alterNext($rule);
                    continue;
                }
                $next = [];

                foreach ($this->seek($nodes, $rule, $options) as $foundNode) {
                    if (!$isLast) {
                        $next[] = $foundNode;
                    } else {
                        yield $foundNode;
                    }
                }
                $nodes = $next;
                // clear the options
                $options = [];
            }
        }
    }

    /**
     * Attempts to find all children that match the rule
     * given.
     * @param array $nodes
     * @param array $rule
     * @param array $options
     * @return \Generator
     * @throws ChildNotFoundException
     */
    protected function seek(array $nodes, array $rule, array $options): \Generator
    {
        // XPath index
        if (array_key_exists('tag', $rule) && array_key_exists('key', $rule)
          && is_numeric($rule['key'])
        ) {
            $count = 0;
            /** @var AbstractNode $node */
            foreach ($nodes as $node) {
                if ($rule['tag'] == '*'
                  || $rule['tag'] == $node->getTag()
                    ->name()
                ) {
                    ++$count;
                    if ($count == $rule['key']) {
                        // found the node we wanted
                        yield $node;
                    }
                }
            }

            return;
        }

        $options = $this->flattenOptions($options);

        /** @var InnerNode $node */
        foreach ($nodes as $node) {
            // skip leaf nodes and nodes without children
            if ($node instanceof LeafNode || !$node->hasChildren()
            ) {
                continue;
            }

            $children = [];
            $child = $node->firstChild();
            while (!is_null($child)) {
                // wild card, grab all
                if ($rule['tag'] == '*' && is_null($rule['key'])) {
                    yield $child;
                    $child = $this->getNextChild($node, $child);
                    continue;
                }

                $pass = $this->checkTag($rule, $child);
                if ($pass && !is_null($rule['key'])) {
                    $pass = $this->checkKey($rule, $child);
                }
                if ($pass && !is_null($rule['key']) && !is_null($rule['value'])
                  && $rule['value'] != '*'
                ) {
                    $pass = $this->checkComparison($rule, $child);
                }

                if ($pass) {
                    // it passed all checks
                    yield $child;
                } else {
                    // this child failed to be matched
                    if ($child instanceof InnerNode && $child->hasChildren()
                    ) {
                        if ($this->depthFirst) {
                            if (!isset($options['checkGrandChildren'])
                              || $options['checkGrandChildren']
                            ) {
                                // the child failed to match but is not a leaf,
                                // so search its children right away
                                $matches = $this->seek([$child], $rule, $options);
                                foreach ($matches as $match) {
                                    yield $match;
                                }
                            }
                        } else {
                            // we still want to check its children
                            $children[] = $child;
                        }
                    }
                }

                $child = $this->getNextChild($node, $child);
            }

            if ((!isset($options['checkGrandChildren'])
                || $options['checkGrandChildren'])
              && count($children) > 0
            ) {
                // some children failed to match but are not leaves.
                $matches = $this->seek($children, $rule, $options);
                foreach ($matches as $match) {
                    yield $match;
                }
            }
        }
    }

    /**
     * Attempts to match the given arguments with the given operator.
     * @param string $operator
     * @param string $pattern
     * @param string $value
     * @return bool
     */
    protected function match(
      string $operator,
      string $pattern,
      string $value
    ): bool {
        $value = strtolower($value);
        $pattern = strtolower($pattern);
        switch ($operator) {
            case '=':
                return $value === $pattern;
            case '!=':
                return $value !== $pattern;
            case '^=':
                return preg_match('/^' . preg_quote($pattern, '/') . '/',
                    $value) == 1;
            case '$=':
                return preg_match('/' . preg_quote($pattern, '/') . '$/',
                    $value) == 1;
            case '*=':
                if ($pattern[0] == '/') {
                    return preg_match($pattern, $value) == 1;
                }

                return preg_match("/" . $pattern . "/i", $value) == 1;
        }

        return false;
    }

    /**
     * Attempts to figure out what the alteration will be for
     * the next element.
     * @param array $rule
     * @return array
     */
    protected function alterNext(array $rule): array
    {
        $options = [];
        if ($rule['tag'] == '>') {
            $options['checkGrandChildren'] = false;
        }

        return $options;
    }

    /**
     * Flattens the option array.
     * @param array $optionsArray
     * @return array
     */
    protected function flattenOptions(array $optionsArray)
    {
        $options = [];
        foreach ($optionsArray as $optionArray) {
            foreach ($optionArray as $key => $option) {
                $options[$key] = $option;
            }
        }

        return $options;
    }

    /**
     * Returns the next child or null if no more children.
     * @param AbstractNode $node
     * @param AbstractNode $currentChild
     * @return AbstractNode|null
     */
    protected function getNextChild(
      AbstractNode $node,
      AbstractNode $currentChild
    ) {
        try {
            $child = null;
            if ($node instanceof InnerNode) {
                // get next child
                $child = $node->nextChild($currentChild->id());
            }
        } catch (ChildNotFoundException $e) {
            // no more children
            unset($e);
            $child = null;
        }

        return $child;
    }

    /**
     * Checks tag condition from rules against node.
     * @param array        $rule
     * @param AbstractNode $node
     * @return bool
     */
    protected function checkTag(array $rule, AbstractNode $node): bool
    {
        if (!empty($rule['tag']) && $rule['tag'] != $node->getTag()->name()
          && $rule['tag'] != '*'
        ) {
            return false;
        }

        return true;
    }

    /**
     * Checks key condition from rules against node.
     * @param array        $rule
     * @param AbstractNode $node
     * @return bool
     */
    protected function checkKey(array $rule, AbstractNode $node): bool
    {
        if (!is_array($rule['key'])) {
            if ($rule['noKey']) {
                if (!is_null($node->getAttribute($rule['key']))) {
Bug
Are you sure the usage of $node->getAttribute($rule['key']) is correct? The targeted method PHPHtmlParser\Dom\AbstractNode::getAttribute() seems to always return null.

This check looks for function or method calls that always return null and whose return value is used.

class A
{
    function getObject()
    {
        return null;
    }
}

$a = new A();
if ($a->getObject()) {
}

The method getObject() can return nothing but null, so it makes no sense to use the return value.

The reason is most likely that a function or method is incomplete or has been reduced for debug purposes.

The condition is_null($node->getAttribute($rule['key'])) is always true.
                    return false;
                }
            } else {
                if ($rule['key'] != 'plaintext'
                  && !$node->hasAttribute($rule['key'])
                ) {
                    return false;
                }
            }
        } else {
            if ($rule['noKey']) {
                foreach ($rule['key'] as $key) {
                    if (!is_null($node->getAttribute($key))) {
Bug
Are you sure the usage of $node->getAttribute($key) is correct? The targeted method PHPHtmlParser\Dom\AbstractNode::getAttribute() seems to always return null.

This check looks for function or method calls that always return null and whose return value is used. The reason is most likely that a function or method is incomplete or has been reduced for debug purposes.
                        return false;
                    }
                }
            } else {
                foreach ($rule['key'] as $key) {
                    if ($key != 'plaintext'
                      && !$node->hasAttribute($key)
                    ) {
                        return false;
                    }
                }
            }
        }

        return true;
    }

    /**
     * Checks comparison condition from rules against node.
     * @param array        $rule
     * @param AbstractNode $node
     * @return bool
     */
    public function checkComparison(array $rule, AbstractNode $node): bool
    {
        if ($rule['key'] == 'plaintext') {
            // plaintext search
            $nodeValue = $node->text();
            $result = $this->checkNodeValue($nodeValue, $rule, $node);
        } else {
            // normal search
            if (!is_array($rule['key'])) {
                $nodeValue = $node->getAttribute($rule['key']);
Bug
Are you sure the assignment to $nodeValue is correct? The call $node->getAttribute($rule['key']) targets PHPHtmlParser\Dom\AbstractNode::getAttribute(), which seems to always return null.

This check looks for function or method calls that always return null and whose return value is assigned to a variable.

class A
{
    function getObject()
    {
        return null;
    }
}

$a = new A();
$object = $a->getObject();

The method getObject() can return nothing but null, so it makes no sense to assign that value to a variable.

The reason is most likely that a function or method is incomplete or has been reduced for debug purposes.
                $result = $this->checkNodeValue($nodeValue, $rule, $node);
            } else {
                $result = true;
                foreach ($rule['key'] as $index => $key) {
                    $nodeValue = $node->getAttribute($key);
Bug
Are you sure the assignment to $nodeValue is correct? The call $node->getAttribute($key) targets PHPHtmlParser\Dom\AbstractNode::getAttribute(), which seems to always return null.

This check looks for function or method calls that always return null and whose return value is assigned to a variable. The reason is most likely that a function or method is incomplete or has been reduced for debug purposes.
                    $result = $result &&
                        $this->checkNodeValue($nodeValue, $rule, $node, $index);
                }
            }
        }

        return $result;
    }

    /**
     * @param string|null  $nodeValue
     * @param array        $rule
     * @param AbstractNode $node
     * @param int|null     $index
     * @return bool
     */
    private function checkNodeValue(
        ?string $nodeValue,
        array $rule,
        AbstractNode $node,
        ?int $index = null
    ) : bool {
        $check = false;
        if (
            array_key_exists('value', $rule) && !is_array($rule['value']) &&
            !is_null($nodeValue) &&
            array_key_exists('operator', $rule) && is_string($rule['operator']) &&
            array_key_exists('value', $rule) && is_string($rule['value'])
        ) {
            $check = $this->match($rule['operator'], $rule['value'], $nodeValue);
        }

        // handle multiple classes
        $key = $rule['key'];
        if (
            !$check &&
            $key == 'class' &&
            array_key_exists('value', $rule) && is_array($rule['value'])
        ) {
            $nodeClasses = explode(' ', $node->getAttribute('class') ?? '');
Bug
Are you sure the usage of $node->getAttribute('class') is correct? The targeted method PHPHtmlParser\Dom\AbstractNode::getAttribute() seems to always return null.

This check looks for function or method calls that always return null and whose return value is used. The reason is most likely that a function or method is incomplete or has been reduced for debug purposes.
            foreach ($rule['value'] as $value) {
                foreach ($nodeClasses as $class) {
                    if (
                        !empty($class) &&
                        array_key_exists('operator', $rule) && is_string($rule['operator'])
                    ) {
                        $check = $this->match($rule['operator'], $value, $class);
                    }
                    if ($check) {
                        break;
                    }
                }
                if (!$check) {
                    break;
                }
            }
        } elseif (
            !$check &&
            is_array($key) &&
            !is_null($nodeValue) &&
            array_key_exists('operator', $rule) && is_string($rule['operator']) &&
            array_key_exists('value', $rule) && is_string($rule['value'][$index])
        ) {
            $check = $this->match($rule['operator'], $rule['value'][$index], $nodeValue);
        }

        return $check;
    }
}
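
Since this report is about Selector::flattenOptions(), a small standalone sketch of its behaviour may help. It copies the method's loop into a free function, here called `flattenOptionsSketch()` (a made-up name), and feeds it arrays shaped like the ones alterNext() returns in the listing above. The extra `['checkGrandChildren' => true]` entry is an assumption added only to show that a later key overrides an earlier one.

```php
<?php

// Standalone copy of the flattening loop, outside the Selector class.
// The function name is hypothetical; the logic mirrors flattenOptions().
function flattenOptionsSketch(array $optionsArray): array
{
    $options = [];
    foreach ($optionsArray as $optionArray) {
        foreach ($optionArray as $key => $option) {
            // a later entry with the same key replaces the earlier one
            $options[$key] = $option;
        }
    }

    return $options;
}

// alterNext() yields ['checkGrandChildren' => false] for '>' and [] otherwise.
$flat = flattenOptionsSketch([
    ['checkGrandChildren' => false],
    [],
]);

// Illustrative only: two entries carrying the same key.
$overridden = flattenOptionsSketch([
    ['checkGrandChildren' => false],
    ['checkGrandChildren' => true],
]);
```

An empty input yields an empty array, which is why seek() guards its lookups with isset($options['checkGrandChildren']).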
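
The operator handling in Selector::match() can be tried in isolation with a sketch like the one below. It restates the switch as a free function named `matchSketch()` (a hypothetical name), and the sample attribute values are invented for illustration. Both arguments are lower-cased before comparison, so every operator is effectively case-insensitive.

```php
<?php

// Standalone restatement of the operator rules in Selector::match().
// The function name and the sample inputs below are illustrative only.
function matchSketch(string $operator, string $pattern, string $value): bool
{
    $value = strtolower($value);
    $pattern = strtolower($pattern);
    switch ($operator) {
        case '=':
            return $value === $pattern;
        case '!=':
            return $value !== $pattern;
        case '^=':
            // prefix match; the pattern is escaped, so it is taken literally
            return preg_match('/^' . preg_quote($pattern, '/') . '/', $value) == 1;
        case '$=':
            // suffix match, also taken literally
            return preg_match('/' . preg_quote($pattern, '/') . '$/', $value) == 1;
        case '*=':
            // the pattern is used as a regular expression, not escaped
            if ($pattern[0] == '/') {
                return preg_match($pattern, $value) == 1;
            }

            return preg_match('/' . $pattern . '/i', $value) == 1;
    }

    // unknown operators never match
    return false;
}

$exact    = matchSketch('=', 'Nav', 'nav');
$prefix   = matchSketch('^=', 'btn-', 'btn-primary');
$suffix   = matchSketch('$=', '.png', 'logo.PNG');
$contains = matchSketch('*=', 'item', 'menu-item-3');
$unknown  = matchSketch('~=', 'a', 'a');
```

Note that `*=` reads `$pattern[0]` without checking for an empty string, so an empty pattern would need a guard before reaching that branch.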