Completed
Push — createfragment ( 423399 )
by Christoph
01:35
created

HtmlPageCrawler::createFragmentCrawler()   B

Complexity

Conditions 8
Paths 8

Size

Total Lines 24

Duplication

Lines 0
Ratio 0 %

Code Coverage

Tests 4
CRAP Score 35

Importance

Changes 0
Metric  Value
dl      0
loc     24
ccs     4
cts     16
cp      0.25
rs      8.4444
c       0
b       0
f       0
cc      8
nc      8
nop     1
crap    35
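
The crap value can be cross-checked from the other metrics. This is a sketch under assumptions: the page does not state its formula, so the standard CRAP definition is assumed here, with cc read as the cyclomatic complexity and cp as the covered fraction (ccs / cts = 4 / 16).

```latex
\mathrm{cp} = \frac{\mathrm{ccs}}{\mathrm{cts}} = \frac{4}{16} = 0.25
\qquad
\mathrm{CRAP} = \mathrm{cc}^2 \cdot (1 - \mathrm{cp})^3 + \mathrm{cc}
             = 8^2 \cdot (0.75)^3 + 8
             = 64 \cdot 0.421875 + 8
             = 27 + 8
             = 35
```

Under that assumption the reported score of 35 is consistent with 8 conditions and 25 % coverage; with full coverage the same method would score 8.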
<?php
namespace Wa72\HtmlPageDom;

use Symfony\Component\DomCrawler\Crawler;

/**
 * Extends \Symfony\Component\DomCrawler\Crawler by adding tree manipulation functions
 * for HTML documents inspired by jQuery such as html(), css(), append(), prepend(), before(),
 * addClass(), removeClass()
 *
 * @author Christoph Singer
 * @license MIT
 *
 */
class HtmlPageCrawler extends Crawler
Complexity: This class has 46 public methods and attributes, which exceeds the configured maximum of 45.

The number of this metric differs depending on the chosen design (inheritance vs. composition). For inheritance, the number should generally be a bit lower.

A high number indicates a reusable class. It might also make the class harder to change without breaking other classes though.

Complexity: This class has 1067 lines of code, which exceeds the configured maximum of 1000.

Really long classes often contain too much logic and violate the single responsibility principle.

We suggest taking a look at the “Code” section for options on how to refactor this code.

Complexity: This class has a complexity of 168, which exceeds the configured maximum of 50.

The class complexity is the sum of the complexity of all methods. A very high value is usually an indication that your class does not follow the single responsibility principle and does more than one job.

You can also find more detailed suggestions for refactoring in the “Code” section of your repository.

{
    /**
     * the (internal) root element name used when importing html fragments
     * */
    const FRAGMENT_ROOT_TAGNAME = '_root';

    /**
     * Get an HtmlPageCrawler object from a HTML string, DOMNode, DOMNodeList or HtmlPageCrawler
     *
     * Attention: Changed behavior since 2.0:
     *  - If no content is passed, an empty HTML skeleton document will be created.
     *  - It is no longer possible to create Crawler objects on an HTML fragment this way. Use createFragment() instead.
     *
     * @param string|HtmlPageCrawler|\DOMNode|\DOMNodeList|array|null $content
     * @return HtmlPageCrawler
     * @api
     */
    public static function create($content = null)
    {
        if ($content instanceof HtmlPageCrawler) {
            return $content;
        } else {
            if ($content === null) {
                $content = '<html><body></body></html>';
            }
            return new HtmlPageCrawler($content);
        }
    }

    /**
     * Adds the specified class(es) to each element in the set of matched elements.
     *
     * @param string $name One or more space-separated classes to be added to the class attribute of each matched element.
     * @return HtmlPageCrawler $this for chaining
     * @api
     */
    public function addClass($name)
    {
        foreach ($this as $node) {
            if ($node instanceof \DOMElement) {
                /** @var \DOMElement $node */
                $classes = preg_split('/\s+/s', $node->getAttribute('class'));
                $found = false;
                $count = count($classes);
                for ($i = 0; $i < $count; $i++) {
                    if ($classes[$i] == $name) {
                        $found = true;
                    }
                }
                if (!$found) {
                    $classes[] = $name;
                    $node->setAttribute('class', trim(join(' ', $classes)));
                }
            }
        }
        return $this;
    }

    /**
     * Insert content, specified by the parameter, after each element in the set of matched elements.
     *
     * @param string|HtmlPageCrawler|\DOMNode|\DOMNodeList $content
     * @return HtmlPageCrawler $this for chaining
     * @api
     */
    public function after($content)
    {
        $content = self::createFragmentCrawler($content);
        $newnodes = array();
        foreach ($this as $i => $node) {
            /** @var \DOMNode $node */
            $refnode = $node->nextSibling;
            foreach ($content as $newnode) {
Bug: The expression $content of type object<Wa72\HtmlPageDom\HtmlPageCrawler>|null is not guaranteed to be traversable. How about adding an additional type check?

There are several ways to fix this problem.

  1. If you want to be on the safe side, you can add an additional type-check:

    $collection = json_decode($data, true);
    if ( ! is_array($collection)) {
        throw new \RuntimeException('$collection must be an array.');
    }
    
    foreach ($collection as $item) { /** ... */ }
    
  2. If you are sure that the expression is traversable, you might want to add a doc comment cast to improve IDE auto-completion and static analysis:

    /** @var array $collection */
    $collection = json_decode($data, true);
    
    foreach ($collection as $item) { /** .. */ }
    
  3. Mark the issue as a false-positive: Just hover the remove button, in the top-right corner of this issue for more options.
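
The generic example above uses json_decode(); for a value that may be an object or null, as flagged here, the first option could look like the sketch below. The helper name ensureIterable() is hypothetical and not part of HtmlPageCrawler, and an ArrayIterator stands in for the crawler so the snippet runs on its own.

```php
<?php
// Hypothetical helper for illustration; it is not part of HtmlPageCrawler.
// It rejects values that foreach cannot iterate, such as null.
function ensureIterable($value, string $label): iterable
{
    if (!is_iterable($value)) {
        throw new \InvalidArgumentException($label . ' must be an array or Traversable.');
    }
    return $value;
}

// Stand-in for a crawler: any Traversable passes the guard.
$content = ensureIterable(new \ArrayIterator(['<b>', '<i>']), '$content');
$seen = [];
foreach ($content as $newnode) {
    $seen[] = $newnode;
}

// A null result (the case the inspection warns about) fails early with a clear message.
try {
    ensureIterable(null, '$content');
    $rejected = false;
} catch (\InvalidArgumentException $e) {
    $rejected = true;
}
```

Failing early like this trades a silent no-op loop for an explicit exception; whether that is preferable depends on how callers are expected to handle a missing fragment.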

                /** @var \DOMNode $newnode */
                $newnode = static::importNewnode($newnode, $node, $i);
                if ($refnode === null) {
                    $node->parentNode->appendChild($newnode);
                } else {
                    $node->parentNode->insertBefore($newnode, $refnode);
                }
                $newnodes[] = $newnode;
            }
        }
        $content->clear();
        $content->add($newnodes);
        return $this;
    }

    /**
     * Insert HTML content as child nodes of each element after existing children
     *
     * @param string|HtmlPageCrawler|\DOMNode|\DOMNodeList $content HTML code fragment or DOMNode to append
     * @return HtmlPageCrawler $this for chaining
     * @api
     */
    public function append($content)
    {
        $content = self::createFragmentCrawler($content);
        $newnodes = array();
        foreach ($this as $i => $node) {
            /** @var \DOMNode $node */
            foreach ($content as $newnode) {
Bug: The expression $content of type object<Wa72\HtmlPageDom\HtmlPageCrawler>|null is not guaranteed to be traversable. How about adding an additional type check?
                /** @var \DOMNode $newnode */
                $newnode = static::importNewnode($newnode, $node, $i);
                $node->appendChild($newnode);
                $newnodes[] = $newnode;
            }
        }
        $content->clear();
        $content->add($newnodes);
        return $this;
    }

    /**
     * Insert every element in the set of matched elements to the end of the target.
     *
     * @param string|HtmlPageCrawler|\DOMNode|\DOMNodeList $element
     * @return \Wa72\HtmlPageDom\HtmlPageCrawler A new Crawler object containing all elements appended to the target elements
     * @api
     */
    public function appendTo($element)
    {
        $e = self::createFragmentCrawler($element);
        $newnodes = array();
        foreach ($e as $i => $node) {
Bug: The expression $e of type object<Wa72\HtmlPageDom\HtmlPageCrawler>|null is not guaranteed to be traversable. How about adding an additional type check?
            /** @var \DOMNode $node */
            foreach ($this as $newnode) {
                /** @var \DOMNode $newnode */
                if ($node !== $newnode) {
                    $newnode = static::importNewnode($newnode, $node, $i);
                    $node->appendChild($newnode);
                }
                $newnodes[] = $newnode;
            }
        }
        return self::createFragmentCrawler($newnodes);
    }

    /**
     * Returns the attribute value of the first node of the list, or sets an attribute on each element
     *
     * @see HtmlPageCrawler::getAttribute()
     * @see HtmlPageCrawler::setAttribute()
     *
     * @param string $name
     * @param null|string $value
     * @return null|string|HtmlPageCrawler
     * @api
     */
    public function attr($name, $value = null)
    {
        if ($value === null) {
            return $this->getAttribute($name);
        } else {
            return $this->setAttribute($name, $value);
Best Practice: The return type of return $this->setAttribute($name, $value); (Wa72\HtmlPageDom\HtmlPageCrawler) is incompatible with the return type of the parent method Symfony\Component\DomCrawler\Crawler::attr of type string|null.

If you return a value from a function or method, it should be a sub-type of the type declared by the parent type, for example an interface or abstract method. This is more formally defined by the Liskov substitution principle, and guarantees that classes that depend on the parent type can use any instance of a child type interchangeably. This principle is also one of the SOLID principles for object-oriented design.

Let’s take a look at an example:

class Author {
    private $name;

    public function __construct($name) {
        $this->name = $name;
    }

    public function getName() {
        return $this->name;
    }
}

abstract class Post {
    public function getAuthor() {
        return 'Johannes';
    }
}

class BlogPost extends Post {
    public function getAuthor() {
        return new Author('Johannes');
    }
}

class ForumPost extends Post { /* ... */ }

function my_function(Post $post) {
    echo strtoupper($post->getAuthor());
}

Our function my_function expects a Post object, and outputs the author of the post. The base class Post returns a simple string and outputting a simple string will work just fine. However, the child class BlogPost which is a sub-type of Post instead decided to return an object, and is therefore violating the SOLID principles. If a BlogPost were passed to my_function, PHP would not complain, but ultimately fail when executing the strtoupper call in its body.
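
One way to avoid this kind of mismatch is to leave the inherited getter's return type alone and put the chainable behaviour in a separate setter. The sketch below uses hypothetical stand-in classes (BaseCrawler, ChainableCrawler) and a hypothetical function describe(); they are not the real Symfony or HtmlPageDom classes and only illustrate the idea.

```php
<?php
// Hypothetical stand-ins; these are not the real Symfony or HtmlPageDom classes.
class BaseCrawler
{
    protected $attributes = array('id' => 'main');

    /** @return string|null */
    public function attr($name)
    {
        return isset($this->attributes[$name]) ? $this->attributes[$name] : null;
    }
}

class ChainableCrawler extends BaseCrawler
{
    // attr() is inherited unchanged, so it still returns string|null.

    /** @return ChainableCrawler $this for chaining */
    public function setAttribute($name, $value)
    {
        $this->attributes[$name] = $value;
        return $this;
    }
}

// Code written against the parent type keeps working with the child.
function describe(BaseCrawler $crawler)
{
    return strtoupper((string) $crawler->attr('id'));
}

$crawler = new ChainableCrawler();
$result = describe($crawler->setAttribute('id', 'footer'));
```

The trade-off is that a jQuery-style combined getter/setter is no longer available under the inherited name, so this is a design choice rather than a drop-in fix.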

        }
    }

    /**
     * Sets an attribute on each element
     *
     * @param string $name
     * @param string $value
     * @return HtmlPageCrawler $this for chaining
     */
    public function setAttribute($name, $value)
    {
        foreach ($this as $node) {
            if ($node instanceof \DOMElement) {
                /** @var \DOMElement $node */
                $node->setAttribute($name, $value);
            }
        }
        return $this;
    }

    /**
     * Returns the attribute value of the first node of the list.
     *
     * @param string $name The attribute name
     * @return string|null The attribute value or null if the attribute does not exist
     * @throws \InvalidArgumentException When current node is empty
     *
     */
    public function getAttribute($name)
    {
        if (!count($this)) {
            throw new \InvalidArgumentException('The current node list is empty.');
        }
        $node = $this->getNode(0);
        return $node->hasAttribute($name) ? $node->getAttribute($name) : null;
    }

    /**
     * Insert content, specified by the parameter, before each element in the set of matched elements.
     *
     * @param string|HtmlPageCrawler|\DOMNode|\DOMNodeList $content
     * @return HtmlPageCrawler $this for chaining
     * @api
     */
    public function before($content)
    {
        $content = self::createFragmentCrawler($content);
        $newnodes = array();
        foreach ($this as $i => $node) {
            /** @var \DOMNode $node */
            foreach ($content as $newnode) {
Bug: The expression $content of type object<Wa72\HtmlPageDom\HtmlPageCrawler>|null is not guaranteed to be traversable. How about adding an additional type check?
                /** @var \DOMNode $newnode */
                if ($node !== $newnode) {
                    $newnode = static::importNewnode($newnode, $node, $i);
                    $node->parentNode->insertBefore($newnode, $node);
                    $newnodes[] = $newnode;
                }
            }
        }
        $content->clear();
        $content->add($newnodes);
        return $this;
    }

    /**
     * Create a deep copy of the set of matched elements.
     *
     * Equivalent to clone() in jQuery (clone is not a valid PHP function name)
     *
     * @return HtmlPageCrawler
     * @api
     */
    public function makeClone()
    {
        return clone $this;
    }

    public function __clone()
    {
        $newnodes = array();
        foreach ($this as $node) {
            /** @var \DOMNode $node */
            $newnodes[] = $node->cloneNode(true);
        }
        $this->clear();
        $this->add($newnodes);
    }

    /**
     * Get one CSS style property of the first element or set it for all elements in the list
     *
     * Function is here for compatibility with jQuery; it is the same as getStyle() and setStyle()
     *
     * @see HtmlPageCrawler::getStyle()
     * @see HtmlPageCrawler::setStyle()
     *
     * @param string $key The name of the style property
     * @param null|string $value The CSS value to set, or NULL to get the current value
     * @return HtmlPageCrawler|string If no param is provided, returns the CSS styles of the first element
     * @api
     */
    public function css($key, $value = null)
    {
        if (null === $value) {
            return $this->getStyle($key);
        } else {
            return $this->setStyle($key, $value);
        }
    }

    /**
     * get one CSS style property of the first element
     *
     * @param string $key name of the property
     * @return string|null value of the property
     */
    public function getStyle($key)
    {
        $styles = Helpers::cssStringToArray($this->getAttribute('style'));
        return (isset($styles[$key]) ? $styles[$key] : null);
    }

    /**
     * set one CSS style property for all elements in the list
     *
     * @param string $key name of the property
     * @param string $value value of the property
     * @return HtmlPageCrawler $this for chaining
     */
    public function setStyle($key, $value)
    {
        foreach ($this as $node) {
            if ($node instanceof \DOMElement) {
                /** @var \DOMElement $node */
                $styles = Helpers::cssStringToArray($node->getAttribute('style'));
                if ($value != '') {
                    $styles[$key] = $value;
                } elseif (isset($styles[$key])) {
                    unset($styles[$key]);
                }
                $node->setAttribute('style', Helpers::cssArrayToString($styles));
            }
        }
        return $this;
    }

    /**
     * Removes all child nodes and text from all nodes in set
     *
     * Equivalent to jQuery's empty() function which is not a valid function name in PHP
     * @return HtmlPageCrawler $this
     * @api
     */
    public function makeEmpty()
    {
        foreach ($this as $node) {
            $node->nodeValue = '';
        }
        return $this;
    }

    /**
     * Determine whether any of the matched elements are assigned the given class.
     *
     * @param string $name
     * @return bool
     * @api
     */
    public function hasClass($name)
    {
        foreach ($this as $node) {
            if ($node instanceof \DOMElement && $class = $node->getAttribute('class')) {
                $classes = preg_split('/\s+/s', $class);
                if (in_array($name, $classes)) {
                    return true;
                }
            }
        }
        return false;
    }

    /**
     * Set the HTML contents of each element
     *
     * @param string|HtmlPageCrawler|\DOMNode|\DOMNodeList $content HTML code fragment
     * @return HtmlPageCrawler $this for chaining
     */
    public function setInnerHtml($content)
    {
        $content = self::createFragmentCrawler($content);
        foreach ($this as $node) {
            $node->nodeValue = '';
            foreach ($content as $newnode) {
Bug: The expression $content of type object<Wa72\HtmlPageDom\HtmlPageCrawler>|null is not guaranteed to be traversable. How about adding an additional type check?
                /** @var \DOMNode $node */
                /** @var \DOMNode $newnode */
                $newnode = static::importNewnode($newnode, $node);
                $node->appendChild($newnode);
            }
        }
        return $this;
    }

    /**
     * Insert every element in the set of matched elements after the target.
     *
     * @param string|HtmlPageCrawler|\DOMNode|\DOMNodeList $element
     * @return \Wa72\HtmlPageDom\HtmlPageCrawler A new Crawler object containing all elements inserted after the target elements
     * @api
     */
    public function insertAfter($element)
    {
        $e = self::createFragmentCrawler($element);
        $newnodes = array();
        foreach ($e as $i => $node) {
Bug: The expression $e of type object<Wa72\HtmlPageDom\HtmlPageCrawler>|null is not guaranteed to be traversable. How about adding an additional type check?
            /** @var \DOMNode $node */
            $refnode = $node->nextSibling;
            foreach ($this as $newnode) {
                /** @var \DOMNode $newnode */
                $newnode = static::importNewnode($newnode, $node, $i);
                if ($refnode === null) {
                    $node->parentNode->appendChild($newnode);
                } else {
                    $node->parentNode->insertBefore($newnode, $refnode);
                }
                $newnodes[] = $newnode;
            }
        }
        return self::createFragmentCrawler($newnodes);
    }

    /**
     * Insert every element in the set of matched elements before the target.
     *
     * @param string|HtmlPageCrawler|\DOMNode|\DOMNodeList $element
     * @return \Wa72\HtmlPageDom\HtmlPageCrawler A new Crawler object containing all elements inserted before the target elements
     * @api
     */
    public function insertBefore($element)
    {
        $e = self::createFragmentCrawler($element);
        $newnodes = array();
        foreach ($e as $i => $node) {
Bug: The expression $e of type object<Wa72\HtmlPageDom\HtmlPageCrawler>|null is not guaranteed to be traversable. How about adding an additional type check?
            /** @var \DOMNode $node */
            foreach ($this as $newnode) {
                /** @var \DOMNode $newnode */
                $newnode = static::importNewnode($newnode, $node, $i);
                if ($newnode !== $node) {
                    $node->parentNode->insertBefore($newnode, $node);
                }
                $newnodes[] = $newnode;
            }
        }
        return self::createFragmentCrawler($newnodes);
    }

    /**
     * Insert content, specified by the parameter, to the beginning of each element in the set of matched elements.
     *
     * @param string|HtmlPageCrawler|\DOMNode|\DOMNodeList $content HTML code fragment
     * @return HtmlPageCrawler $this for chaining
     * @api
     */
    public function prepend($content)
    {
        $content = self::createFragmentCrawler($content);
        $newnodes = array();
        foreach ($this as $i => $node) {
            $refnode = $node->firstChild;
            /** @var \DOMNode $node */
            foreach ($content as $newnode) {
Bug: The expression $content of type object<Wa72\HtmlPageDom\HtmlPageCrawler>|null is not guaranteed to be traversable. How about adding an additional type check?
                /** @var \DOMNode $newnode */
                $newnode = static::importNewnode($newnode, $node, $i);
                if ($refnode === null) {
                    $node->appendChild($newnode);
                } else if ($refnode !== $newnode) {
                    $node->insertBefore($newnode, $refnode);
                }
                $newnodes[] = $newnode;
            }
        }
        $content->clear();
        $content->add($newnodes);
        return $this;
    }

    /**
     * Insert every element in the set of matched elements to the beginning of the target.
     *
     * @param string|HtmlPageCrawler|\DOMNode|\DOMNodeList $element
     * @return \Wa72\HtmlPageDom\HtmlPageCrawler A new Crawler object containing all elements prepended to the target elements
     * @api
     */
    public function prependTo($element)
    {
        $e = self::createFragmentCrawler($element);
        $newnodes = array();
        foreach ($e as $i => $node) {
Bug: The expression $e of type object<Wa72\HtmlPageDom\HtmlPageCrawler>|null is not guaranteed to be traversable. How about adding an additional type check?
469 1
            $refnode = $node->firstChild;
470
            /** @var \DOMNode $node */
471 1
            foreach ($this as $newnode) {
472
                /** @var \DOMNode $newnode */
473 1
                $newnode = static::importNewnode($newnode, $node, $i);
474 1
                if ($newnode !== $node) {
475 1
                    if ($refnode === null) {
476 1
                        $node->appendChild($newnode);
477
                    } else {
478
                        $node->insertBefore($newnode, $refnode);
479
                    }
480
                }
481 1
                $newnodes[] = $newnode;
482
            }
483
        }
484 1
        return self::createFragmentCrawler($newnodes);
485
    }
486
487
    /**
488
     * Remove the set of matched elements from the DOM.
489
     *
490
     * (as opposed to Crawler::clear() which detaches the nodes only from Crawler
491
     * but leaves them in the DOM)
492
     *
493
     * @api
494
     */
495 1
    public function remove()
496
    {
497 1
        foreach ($this as $node) {
498
            /**
499
             * @var \DOMNode $node
500
             */
501 1
            if ($node->parentNode instanceof \DOMElement) {
502 1
                $node->parentNode->removeChild($node);
503
            }
504
        }
505 1
        $this->clear();
506 1
    }
507
508
    /**
509
     * Remove an attribute from each element in the set of matched elements.
510
     *
511
     * Alias for removeAttribute for compatibility with jQuery
512
     *
513
     * @param string $name
514
     * @return HtmlPageCrawler
515
     * @api
516
     */
517
    public function removeAttr($name)
518
    {
519
        return $this->removeAttribute($name);
520
    }

    /**
     * Remove an attribute from each element in the set of matched elements.
     *
     * @param string $name
     * @return HtmlPageCrawler
     */
    public function removeAttribute($name)
    {
        foreach ($this as $node) {
            if ($node instanceof \DOMElement) {
                /** @var \DOMElement $node */
                if ($node->hasAttribute($name)) {
                    $node->removeAttribute($name);
                }
            }
        }
        return $this;
    }

    /**
     * Remove a class from each element in the list
     *
     * @param string $name
     * @return HtmlPageCrawler $this for chaining
     * @api
     */
    public function removeClass($name)
    {
        foreach ($this as $node) {
            if ($node instanceof \DOMElement) {
                /** @var \DOMElement $node */
                $classes = preg_split('/\s+/s', $node->getAttribute('class'));
                $count = count($classes);
                for ($i = 0; $i < $count; $i++) {
                    if ($classes[$i] == $name) {
                        unset($classes[$i]);
                    }
                }
                $node->setAttribute('class', trim(join(' ', $classes)));
            }
        }
        return $this;
    }

    /**
     * Replace each target element with the set of matched elements.
     *
     * @param string|HtmlPageCrawler|\DOMNode|\DOMNodeList $element
     * @return \Wa72\HtmlPageDom\HtmlPageCrawler A new Crawler object containing all elements appended to the target elements
     * @api
     */
    public function replaceAll($element)
    {
        $e = self::createFragmentCrawler($element);
        $newnodes = array();
        foreach ($e as $i => $node) {
            /** @var \DOMNode $node */
            $parent = $node->parentNode;
            $refnode  = $node->nextSibling;
            foreach ($this as $j => $newnode) {
                /** @var \DOMNode $newnode */
                $newnode = static::importNewnode($newnode, $node, $i);
                if ($j == 0) {
                    $parent->replaceChild($newnode, $node);
                } else {
                    $parent->insertBefore($newnode, $refnode);
                }
                $newnodes[] = $newnode;
            }
        }
        return self::createFragmentCrawler($newnodes);
    }

    /**
     * Replace each element in the set of matched elements with the provided new content and return the set of elements that was removed.
     *
     * @param string|HtmlPageCrawler|\DOMNode|\DOMNodeList $content
     * @return \Wa72\HtmlPageDom\HtmlPageCrawler $this for chaining
     * @api
     */
    public function replaceWith($content)
    {
        $content = self::createFragmentCrawler($content);
        $newnodes = array();
        foreach ($this as $i => $node) {
            /** @var \DOMNode $node */
            $parent = $node->parentNode;
            $refnode  = $node->nextSibling;
            foreach ($content as $j => $newnode) {
                /** @var \DOMNode $newnode */
                $newnode = static::importNewnode($newnode, $node, $i);
                if ($j == 0) {
                    $parent->replaceChild($newnode, $node);
                } else {
                    $parent->insertBefore($newnode, $refnode);
                }
                $newnodes[] = $newnode;
            }
        }
        $content->clear();
        $content->add($newnodes);
        return $this;
    }

    /**
     * Get the combined text contents of each element in the set of matched elements, including their descendants.
     * This is what the jQuery text() function does, contrary to the Crawler::text() method that returns only
     * the text of the first node.
     *
     * @return string
     * @api
     */
    public function getCombinedText()
    {
        $text = '';
        foreach ($this as $node) {
            /** @var \DOMNode $node */
            $text .= $node->nodeValue;
        }
        return $text;
    }

    /**
     * Set the text contents of the matched elements.
     *
     * @param string $text
     * @return HtmlPageCrawler
     * @api
     */
    public function setText($text)
    {
        $text = htmlspecialchars($text);
        foreach ($this as $node) {
            /** @var \DOMNode $node */
            $node->nodeValue = $text;
        }
        return $this;
    }

    /**
     * Add or remove one or more classes from each element in the set of matched elements, depending on the class’s presence.
     *
     * @param string $classname One or more classnames separated by spaces
     * @return \Wa72\HtmlPageDom\HtmlPageCrawler $this for chaining
     * @api
     */
    public function toggleClass($classname)
    {
        $classes = explode(' ', $classname);
        foreach ($this as $i => $node) {
            /** @var \DOMNode $node */
            $c = self::createFragmentCrawler($node);
            foreach ($classes as $class) {
                if ($c->hasClass($class)) {
                    $c->removeClass($class);
                } else {
                    $c->addClass($class);
                }
            }
        }
        return $this;
    }

    /**
     * Remove the parents of the set of matched elements from the DOM, leaving the matched elements in their place.
     *
     * @return \Wa72\HtmlPageDom\HtmlPageCrawler $this for chaining
     * @api
     */
    public function unwrap()
    {
        $parents = array();
        foreach ($this as $i => $node) {
            $parents[] = $node->parentNode;
        }

        self::createFragmentCrawler($parents)->unwrapInner();
        return $this;
    }

    /**
     * Remove the matched elements, but promote the children to take their place.
     *
     * @return \Wa72\HtmlPageDom\HtmlPageCrawler $this for chaining
     * @api
     */
    public function unwrapInner()
    {
        foreach ($this as $i => $node) {
            if (!$node->parentNode instanceof \DOMElement) {
                throw new \InvalidArgumentException('DOMElement does not have a parent DOMElement node.');
            }

            /** @var \DOMNode[] $children */
            $children = iterator_to_array($node->childNodes);
            foreach ($children as $child) {
                $node->parentNode->insertBefore($child, $node);
            }

            $node->parentNode->removeChild($node);
        }
        // The docblock promises $this for chaining
        return $this;
    }

    /**
     * Wrap an HTML structure around each element in the set of matched elements
     *
     * The HTML structure must contain only one root node, e.g.:
     * Works: <div><div></div></div>
     * Does not work: <div></div><div></div>
     *
     * @param string|HtmlPageCrawler|\DOMNode $wrappingElement
     * @return HtmlPageCrawler $this for chaining
     * @api
     */
    public function wrap($wrappingElement)
    {
        $content = self::createFragmentCrawler($wrappingElement);
        $newnodes = array();
        foreach ($this as $i => $node) {
            /** @var \DOMNode $node */
            $newnode = $content->getNode(0);
            /** @var \DOMNode $newnode */
//            $newnode = static::importNewnode($newnode, $node, $i);
            if ($newnode->ownerDocument !== $node->ownerDocument) {
                $newnode = $node->ownerDocument->importNode($newnode, true);
            } else {
                if ($i > 0) {
                    $newnode = $newnode->cloneNode(true);
                }
            }
            $oldnode = $node->parentNode->replaceChild($newnode, $node);
            while ($newnode->hasChildNodes()) {
                $elementFound = false;
                foreach ($newnode->childNodes as $child) {
                    if ($child instanceof \DOMElement) {
                        $newnode = $child;
                        $elementFound = true;
                        break;
                    }
                }
                if (!$elementFound) {
                    break;
                }
            }
            $newnode->appendChild($oldnode);
            $newnodes[] = $newnode;
        }
        $content->clear();
        $content->add($newnodes);
        return $this;
    }

    /**
     * Wrap an HTML structure around all elements in the set of matched elements.
     *
     * @param string|HtmlPageCrawler|\DOMNode|\DOMNodeList $content
     * @throws \LogicException
     * @return \Wa72\HtmlPageDom\HtmlPageCrawler $this for chaining
     * @api
     */
    public function wrapAll($content)
    {
        $content = self::createFragmentCrawler($content);
        $parent = $this->getNode(0)->parentNode;
        foreach ($this as $i => $node) {
            /** @var \DOMNode $node */
            if ($node->parentNode !== $parent) {
                throw new \LogicException('Nodes to be wrapped with wrapAll() must all have the same parent');
            }
        }

        $newnode = $content->getNode(0);
        /** @var \DOMNode $newnode */
        $newnode = static::importNewnode($newnode, $parent);

        $newnode = $parent->insertBefore($newnode, $this->getNode(0));
        $content->clear();
        $content->add($newnode);

        while ($newnode->hasChildNodes()) {
            $elementFound = false;
            foreach ($newnode->childNodes as $child) {
                if ($child instanceof \DOMElement) {
                    $newnode = $child;
                    $elementFound = true;
                    break;
                }
            }
            if (!$elementFound) {
                break;
            }
        }
        foreach ($this as $i => $node) {
            /** @var \DOMNode $node */
            $newnode->appendChild($node);
        }
        return $this;
    }

    /**
     * Wrap an HTML structure around the content of each element in the set of matched elements.
     *
     * @param string|HtmlPageCrawler|\DOMNode|\DOMNodeList $content
     * @return \Wa72\HtmlPageDom\HtmlPageCrawler $this for chaining
     * @api
     */
    public function wrapInner($content)
    {
        foreach ($this as $i => $node) {
            /** @var \DOMNode $node */
            self::createFragmentCrawler($node->childNodes)->wrapAll($content);
        }
        return $this;
    }

    /**
     * Get the HTML code fragment of all elements and their contents.
     *
     * If the first node contains a complete HTML document return only
     * the full code of this document.
     *
     * @return string HTML code (fragment)
     * @api
     */
    public function saveHTML()
    {
        if ($this->isHtmlDocument()) {
            return $this->getDOMDocument()->saveHTML();
        } else {
            $doc = new \DOMDocument('1.0', 'UTF-8');
            // Use the same constant as the regex below so the wrapper tag is always stripped again
            $root = $doc->appendChild($doc->createElement(self::FRAGMENT_ROOT_TAGNAME));
            foreach ($this as $node) {
                $root->appendChild($doc->importNode($node, true));
            }
            $html = trim($doc->saveHTML());
            return preg_replace('@^<'.self::FRAGMENT_ROOT_TAGNAME.'[^>]*>|</'.self::FRAGMENT_ROOT_TAGNAME.'>$@', '', $html);
        }
    }

    public function __toString()
    {
        return $this->saveHTML();
    }

    /**
     * Checks whether the first node is the root of a complete HTML document
     * (as opposed to a document fragment)
     *
     * @return boolean
     */
    public function isHtmlDocument()
    {
        $node = $this->getNode(0);
        if ($node instanceof \DOMElement
            && $node->ownerDocument instanceof \DOMDocument
            && $node->ownerDocument->documentElement === $node
            && $node->nodeName == 'html'
        ) {
            return true;
        } else {
            return false;
        }
    }

    /**
     * Get the ownerDocument of the first element, or null if the crawler is empty
     *
     * @return \DOMDocument|null
     */
    public function getDOMDocument()
    {
        $node = $this->getNode(0);
        $r = null;
        if ($node instanceof \DOMElement
            && $node->ownerDocument instanceof \DOMDocument
        ) {
            $r = $node->ownerDocument;
        }
        return $r;
    }

    /**
     * Filters the list of nodes with a CSS selector.
     *
     * @param string $selector
     * @return HtmlPageCrawler
     */
    public function filter($selector)
    {
        return parent::filter($selector);
    }

    /**
     * Filters the list of nodes with an XPath expression.
     *
     * @param string $xpath An XPath expression
     *
     * @return HtmlPageCrawler A new instance of Crawler with the filtered list of nodes
     *
     * @api
     */
    public function filterXPath($xpath)
    {
        return parent::filterXPath($xpath);
    }

    /**
     * Adds HTML/XML content to the HtmlPageCrawler object (but not to the DOM of an already attached node).
     *
     * Function overridden from Crawler because HTML fragments are always added as complete documents there
     *
     * @param string      $content A string to parse as HTML/XML
     * @param null|string $type    The content type of the string
     *
     * @return null|void
     */
    public function addContent($content, $type = null)
    {
        if (empty($type)) {
            $type = 0 === strpos($content, '<?xml') ? 'application/xml' : 'text/html';
        }

        // DOM only for HTML/XML content
        if (!preg_match('/(x|ht)ml/i', $type, $xmlMatches)) {
            return;
        }
        if (substr($type, 0, 9) == 'text/html' && !preg_match('/<html\b[^>]*>/i', $content)) {
            // string contains no <html> Tag => no complete document but an HTML fragment!
            throw new \InvalidArgumentException('It is not possible to create a HtmlPageCrawler instance on a HTML fragment. Use createFragment() instead.');
        } else {
            parent::addContent($content, $type);
        }
    }

//    public function addHtmlFragment($content, $charset = 'UTF-8')
//    {
//        $d = new \DOMDocument('1.0', $charset);
//        $d->preserveWhiteSpace = false;
//        $root = $d->appendChild($d->createElement(self::FRAGMENT_ROOT_TAGNAME));
//        $bodynode = Helpers::getBodyNodeFromHtmlFragment($content, $charset);
//        foreach ($bodynode->childNodes as $child) {
//            $inode = $root->appendChild($d->importNode($child, true));
//            if ($inode) {
//                $this->addNode($inode);
//            }
//        }
//    }

    /**
     * Create a new crawler from an HTML fragment, importing its nodes into this crawler's document.
     *
     * @param string $content
     * @param string $charset
     * @return HtmlPageCrawler
     * @throws \Exception
     */
    public function createFragment($content, $charset = 'UTF-8')
    {
        $d = $this->getDOMDocument();
        if (!$d) {
            throw new \Exception('It is not possible to create a HTML fragment on an empty HtmlPageCrawler instance. Add a complete HTML document first.');
        }
        $crawler = new static();
        $bodynode = Helpers::getBodyNodeFromHtmlFragment($content, $charset);
        foreach ($bodynode->childNodes as $child) {
            $inode = $d->importNode($child, true);
            if ($inode) {
                $crawler->addNode($inode);
            }
        }
        return $crawler;
    }
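
    /**
     * Create a crawler for the given content, bound to this crawler's document.
     *
     * @param string|HtmlPageCrawler|\DOMNode|\DOMNodeList|\DOMNode[] $content
     * @return HtmlPageCrawler
     * @throws \InvalidArgumentException if the content belongs to another document or has an unsupported type
     */
    protected function createFragmentCrawler($content)
    {
        if ($content instanceof HtmlPageCrawler) {
            /** @var HtmlPageCrawler $content */
            if ($content->getDOMDocument() !== $this->getDOMDocument()) {
                throw new \InvalidArgumentException('It is not possible to add nodes from another document.');
            }
            return $content;
        } elseif (is_string($content)) {
            return $this->createFragment($content);
        } elseif ($content instanceof \DOMNode) {
            /** @var \DOMNode $content */
            if ($content->ownerDocument !== $this->getDOMDocument()) {
                throw new \InvalidArgumentException('It is not possible to add nodes from another document.');
            }
            return new self($content);
        } elseif ($content instanceof \DOMNodeList) {
            /** @var \DOMNodeList $content */
            // An empty list has no first item whose document could be checked
            if ($content->length > 0 && $content->item(0)->ownerDocument !== $this->getDOMDocument()) {
                throw new \InvalidArgumentException('It is not possible to add nodes from another document.');
            }
            return new self($content);
        } elseif (is_array($content)) {
            // Methods of this class pass plain arrays of nodes (e.g. $newnodes, $parents);
            // the document check is not repeated for them
            return new self($content);
        }
        // Fail loudly instead of implicitly returning null for unsupported types
        throw new \InvalidArgumentException('Unsupported content type: expected a string, HtmlPageCrawler, DOMNode, DOMNodeList or array of nodes.');
    }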

    /**
     * Adds a node to the current list of nodes.
     *
     * This method uses the appropriate specialized add*() method based
     * on the type of the argument.
     *
     * Overridden from parent to allow a Crawler to be added
     *
     * @param null|\DOMNodeList|array|\DOMNode|Crawler $node A node
     *
     * @api
     */
    public function add($node)
    {
        if ($node instanceof Crawler) {
            foreach ($node as $childnode) {
                $this->addNode($childnode);
            }
        } else {
            parent::add($node);
        }
    }

    /**
     * @param \DOMNode $newnode
     * @param \DOMNode $referencenode
     * @param int $clone
     * @return \DOMNode
     */
    protected static function importNewnode(\DOMNode $newnode, \DOMNode $referencenode, $clone = 0)
    {
        if ($newnode->ownerDocument !== $referencenode->ownerDocument) {
            $referencenode->ownerDocument->preserveWhiteSpace = false;
            $newnode = $referencenode->ownerDocument->importNode($newnode, true);
        } else {
            if ($clone > 0) {
                $newnode = $newnode->cloneNode(true);
            }
        }
        return $newnode;
    }

    /**
     * Checks whether the first node in the set is disconnected (has no parent node)
     *
     * @return bool
     */
    public function isDisconnected()
    {
        $parent = $this->getNode(0)->parentNode;
        return ($parent == null || $parent->tagName == self::FRAGMENT_ROOT_TAGNAME);
    }

    public function __get($name)
    {
        switch ($name) {
            case 'count':
            case 'length':
                return count($this);
        }
        throw new \Exception('No such property ' . $name);
    }
}