Completed
Push — master ( d45b97...b40fc2 )
by Lars
01:55
created

SimpleHtmlDom::cleanHtmlWrapper()   D

Complexity

Conditions 18
Paths 52

Size

Total Lines 81

Duplication

Lines 24
Ratio 29.63 %

Code Coverage

Tests 31
CRAP Score 19.3826

Importance

Changes 0
Metric Value
dl 24
loc 81
ccs 31
cts 37
cp 0.8378
rs 4.8666
c 0
b 0
f 0
cc 18
nc 52
nop 2
crap 19.3826
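
The CRAP score in the list above combines cyclomatic complexity (cc) with coverage (cp). As a rough sketch, assuming the commonly cited formula CRAP = cc^2 * (1 - coverage)^3 + cc, the helper below reproduces the reported value from cc = 18 and the rounded cp = 0.8378. The function name `crapScore()` is hypothetical and not part of the analysed project or of the reporting tool.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: CRAP score from cyclomatic complexity and coverage.
 *
 * @param int   $complexity cyclomatic complexity ("cc" in the metric list)
 * @param float $coverage   covered fraction between 0.0 and 1.0 ("cp")
 */
function crapScore(int $complexity, float $coverage): float
{
    $uncovered = 1.0 - $coverage;

    return ($complexity ** 2) * ($uncovered ** 3) + $complexity;
}

// cp is ccs / cts = 31 / 37, which the report rounds to 0.8378.
echo \round(crapScore(18, 0.8378), 4), "\n"; // prints 19.3826
```

With full coverage the second term vanishes and the score equals the complexity itself, so the score can be lowered either by adding tests or by splitting the method.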

How to fix   Long Method    Complexity   

Long Method

Small methods make your code easier to understand, in particular if combined with a good name. Besides, if your method is small, finding a good name is usually much easier.

For example, if you find yourself adding comments to a method's body, this is usually a good sign to extract the commented part to a new method, and use the comment as a starting point when coming up with a good name for this new method.

Commonly applied refactorings include:
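
One refactoring that fits the description above is Extract Method. The sketch below is illustrative only: the `InvoiceFormatter` class and its methods are hypothetical and unrelated to the analysed project. It shows a commented step inside a longer method moved into its own method, with the comment supplying the name.

```php
<?php

declare(strict_types=1);

// Hypothetical example class, not part of the analysed project.
final class InvoiceFormatter
{
    /**
     * @param float[] $amounts
     */
    public function format(array $amounts): string
    {
        // Before the refactoring, a "calculate the total incl. tax" comment
        // sat above an inline calculation here. The comment became the name.
        $total = $this->calculateTotalIncludingTax($amounts, 0.25);

        return 'Total: ' . \number_format($total, 2, '.', '');
    }

    /**
     * @param float[] $amounts
     */
    private function calculateTotalIncludingTax(array $amounts, float $taxRate): float
    {
        return \array_sum($amounts) * (1.0 + $taxRate);
    }
}

echo (new InvoiceFormatter())->format([10.0, 30.0]), "\n"; // prints "Total: 50.00"
```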

<?php

declare(strict_types=1);

namespace voku\helper;

/** @noinspection PhpHierarchyChecksInspection */
class SimpleHtmlDom extends AbstractSimpleHtmlDom implements \IteratorAggregate, SimpleHtmlDomInterface
{
    /**
     * @param \DOMElement|\DOMNode $node
     */
    public function __construct(\DOMNode $node)
    {
        $this->node = $node;
    }

    /**
     * @param string $name
     * @param array  $arguments
     *
     * @throws \BadMethodCallException
     *
     * @return SimpleHtmlDomInterface|string|null
     */
    public function __call($name, $arguments)
    {
        $name = \strtolower($name);

        if (isset(self::$functionAliases[$name])) {
            return \call_user_func_array([$this, self::$functionAliases[$name]], $arguments);
        }

        throw new \BadMethodCallException('Method does not exist');
    }

    /**
     * Find list of nodes with a CSS selector.
     *
     * @param string   $selector
     * @param int|null $idx
     *
     * @return SimpleHtmlDomInterface|SimpleHtmlDomInterface[]|SimpleHtmlDomNodeInterface
     */
    public function find(string $selector, $idx = null)
    {
        return $this->getHtmlDomParser()->find($selector, $idx);
    }

    /**
     * Returns an array of attributes.
     *
     * @return array|null
     */
    public function getAllAttributes()
Duplication:
This method seems to be duplicated in your project.

Duplicated code is one of the most pungent code smells. If you need to duplicate the same code in three or more different places, we strongly encourage you to look into extracting the code into a single class or operation.

You can also find more detailed suggestions in the “Code” section of your repository.
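
As a sketch of what extracting duplicated code into a single operation can look like: several navigation methods in this class (for example firstChild() and lastChild()) repeat the same null check followed by wrapping the node. The class below is a hypothetical stand-in, not the project's code; `NodeWrapper` and `wrapOrNull()` are names invented for illustration.

```php
<?php

declare(strict_types=1);

// Hypothetical stand-in for a DOM wrapper; not the project's class.
class NodeWrapper
{
    /** @var \DOMNode */
    private $node;

    final public function __construct(\DOMNode $node)
    {
        $this->node = $node;
    }

    public function firstChild(): ?self
    {
        return $this->wrapOrNull($this->node->firstChild);
    }

    public function lastChild(): ?self
    {
        return $this->wrapOrNull($this->node->lastChild);
    }

    public function name(): string
    {
        return $this->node->nodeName;
    }

    // The shared null check and wrapping now live in one place.
    private function wrapOrNull(?\DOMNode $node): ?self
    {
        return $node === null ? null : new static($node);
    }
}

$document = new \DOMDocument();
$document->loadXML('<ul><li/><p/></ul>');
$list = new NodeWrapper($document->documentElement);

echo $list->firstChild()->name(), ' ', $list->lastChild()->name(), "\n";
```

Each public method shrinks to a single line, and a change to the wrapping rule only has to be made once.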

    {
        if ($this->node->hasAttributes()) {
            $attributes = [];
            foreach ($this->node->attributes as $attr) {
                $attributes[$attr->name] = HtmlDomParser::putReplacedBackToPreserveHtmlEntities($attr->value);
            }

            return $attributes;
        }

        return null;
    }

    /**
     * Return attribute value.
     *
     * @param string $name
     *
     * @return string
     */
    public function getAttribute(string $name): string
    {
        if ($this->node instanceof \DOMElement) {
            return HtmlDomParser::putReplacedBackToPreserveHtmlEntities(
                $this->node->getAttribute($name)
            );
        }

        return '';
    }

    /**
     * Determine if an attribute exists on the element.
     *
     * @param string $name
     *
     * @return bool
     */
    public function hasAttribute(string $name): bool
    {
        if (!$this->node instanceof \DOMElement) {
            return false;
        }

        return $this->node->hasAttribute($name);
    }

    /**
     * Get dom node's outer html.
     *
     * @param bool $multiDecodeNewHtmlEntity
     *
     * @return string
     */
    public function html(bool $multiDecodeNewHtmlEntity = false): string
    {
        return $this->getHtmlDomParser()->html($multiDecodeNewHtmlEntity);
    }

    /**
     * Get dom node's inner html.
     *
     * @param bool $multiDecodeNewHtmlEntity
     *
     * @return string
     */
    public function innerHtml(bool $multiDecodeNewHtmlEntity = false): string
    {
        return $this->getHtmlDomParser()->innerHtml($multiDecodeNewHtmlEntity);
    }

    /**
     * Remove attribute.
     *
     * @param string $name <p>The name of the html-attribute.</p>
     *
     * @return SimpleHtmlDomInterface
     */
    public function removeAttribute(string $name): SimpleHtmlDomInterface
    {
        if (\method_exists($this->node, 'removeAttribute')) {
            $this->node->removeAttribute($name);
Bug:
The method removeAttribute exists only in DOMElement, but not in DOMNode.

It seems like the method you are trying to call exists only in some of the possible types.

Let’s take a look at an example:

class A
{
    public function foo() { }
}

class B extends A
{
    public function bar() { }
}

/**
 * @param A|B $x
 */
function someFunction($x)
{
    $x->foo(); // This call is fine as the method exists in A and B.
    $x->bar(); // This method only exists in B and might cause an error.
}

Available Fixes

  1. Add an additional type-check:

    /**
     * @param A|B $x
     */
    function someFunction($x)
    {
        $x->foo();
    
        if ($x instanceof B) {
            $x->bar();
        }
    }
    
  2. Only allow a single type to be passed if the variable comes from a parameter:

    function someFunction(B $x) { /** ... */ }
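
Applied to a DOM wrapper, fix 1 could look like the following sketch. It replaces a method_exists() check with an instanceof \DOMElement check, the same pattern the class already uses in hasAttribute(). The function name `removeAttributeIfElement()` is hypothetical and not part of the project.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: removes an attribute only when the node is an element.
 * Returns true if the node was an element, false otherwise.
 */
function removeAttributeIfElement(\DOMNode $node, string $name): bool
{
    // \DOMNode has no removeAttribute(); narrowing to \DOMElement makes the
    // call safe for static analysis and at runtime.
    if (!$node instanceof \DOMElement) {
        return false;
    }

    $node->removeAttribute($name);

    return true;
}

$document = new \DOMDocument();
$document->loadXML('<a href="#" title="x">link</a>');
$anchor = $document->documentElement;

removeAttributeIfElement($anchor, 'title');             // element: attribute removed
removeAttributeIfElement($anchor->firstChild, 'title'); // text node: nothing to do

echo $document->saveXML($anchor), "\n";
```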
    
        }

        return $this;
Bug Best Practice:
The return type of return $this; (voku\helper\SimpleHtmlDom) is incompatible with the return type declared by the interface voku\helper\SimpleHtmlDo...erface::removeAttribute of type self.

If you return a value from a function or method, it should be a sub-type of the type that is given by the parent type, e.g. an interface or abstract method. This is more formally defined by the Liskov substitution principle, and guarantees that classes that depend on the parent type can use any instance of a child type interchangeably. This principle also belongs to the SOLID principles for object-oriented design.

Let’s take a look at an example:

class Author {
    private $name;

    public function __construct($name) {
        $this->name = $name;
    }

    public function getName() {
        return $this->name;
    }
}

abstract class Post {
    public function getAuthor() {
        return 'Johannes';
    }
}

class BlogPost extends Post {
    public function getAuthor() {
        return new Author('Johannes');
    }
}

class ForumPost extends Post { /* ... */ }

function my_function(Post $post) {
    echo strtoupper($post->getAuthor());
}

Our function my_function expects a Post object, and outputs the author of the post. The base class Post returns a simple string and outputting a simple string will work just fine. However, the child class BlogPost which is a sub-type of Post instead decided to return an object, and is therefore violating the SOLID principles. If a BlogPost were passed to my_function, PHP would not complain, but ultimately fail when executing the strtoupper call in its body.
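
One way to repair the example, sketched under the assumption that callers only need the author's name as a string: declare the return type on the parent and have every sub-type honour it. The class names mirror the example above; the type declarations are the addition.

```php
<?php

declare(strict_types=1);

class Author
{
    private $name;

    public function __construct(string $name)
    {
        $this->name = $name;
    }

    public function getName(): string
    {
        return $this->name;
    }
}

abstract class Post
{
    // Declaring the return type lets PHP reject incompatible overrides.
    public function getAuthor(): string
    {
        return 'Johannes';
    }
}

class BlogPost extends Post
{
    // Still uses Author internally, but returns the string the parent promises.
    public function getAuthor(): string
    {
        return (new Author('Johannes'))->getName();
    }
}

function my_function(Post $post): string
{
    return \strtoupper($post->getAuthor());
}

echo my_function(new BlogPost()), "\n"; // prints "JOHANNES"
```

With the declaration in place, an override that declares a different, incompatible return type is rejected when the class is loaded, so the mismatch surfaces before my_function is ever called.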

    }

    /**
     * Replace child node.
     *
     * @param string $string
     *
     * @return SimpleHtmlDomInterface
     */
    protected function replaceChildWithString(string $string): SimpleHtmlDomInterface
Duplication:
This method seems to be duplicated in your project.
    {
        if (!empty($string)) {
            $newDocument = new HtmlDomParser($string);

            $tmpDomString = $this->normalizeStringForComparision($newDocument);
            $tmpStr = $this->normalizeStringForComparision($string);
            if ($tmpDomString !== $tmpStr) {
                throw new \RuntimeException(
                    'Not valid HTML fragment!' . "\n" .
                    $tmpDomString . "\n" .
                    $tmpStr
                );
            }
        }

        if (\count($this->node->childNodes) > 0) {
            foreach ($this->node->childNodes as $node) {
                $this->node->removeChild($node);
            }
        }

        if (!empty($newDocument)) {
            $newDocument = $this->cleanHtmlWrapper($newDocument);
            $ownerDocument = $this->node->ownerDocument;
            if (
                $ownerDocument !== null
                &&
                $newDocument->getDocument()->documentElement !== null
            ) {
                $newNode = $ownerDocument->importNode($newDocument->getDocument()->documentElement, true);
                /** @noinspection UnusedFunctionResultInspection */
                $this->node->appendChild($newNode);
            }
        }

        return $this;
    }

    /**
     * Replace this node.
     *
     * @param string $string
     *
     * @return SimpleHtmlDomInterface
     */
    protected function replaceNodeWithString(string $string): SimpleHtmlDomInterface
    {
        if (empty($string)) {
            $this->node->parentNode->removeChild($this->node);

            return $this;
        }

        $newDocument = new HtmlDomParser($string);

        $tmpDomOuterTextString = $this->normalizeStringForComparision($newDocument);
        $tmpStr = $this->normalizeStringForComparision($string);
        if ($tmpDomOuterTextString !== $tmpStr) {
Duplication:
This code seems to be duplicated across your project.
            throw new \RuntimeException(
                'Not valid HTML fragment!' . "\n"
                . $tmpDomOuterTextString . "\n" .
                $tmpStr
            );
        }

        $newDocument = $this->cleanHtmlWrapper($newDocument, true);
        $ownerDocument = $this->node->ownerDocument;
        if (
            $ownerDocument === null
            ||
            $newDocument->getDocument()->documentElement === null
        ) {
            return $this;
        }

        $newNode = $ownerDocument->importNode($newDocument->getDocument()->documentElement, true);

        $this->node->parentNode->replaceChild($newNode, $this->node);
        $this->node = $newNode;

        // Remove head element, preserving child nodes. (again)
        if (
Duplication:
This code seems to be duplicated across your project.
            $this->node->parentNode instanceof \DOMElement
            &&
            $newDocument->getIsDOMDocumentCreatedWithoutHeadWrapper()
        ) {
            $html = $this->node->parentNode->getElementsByTagName('head')[0];
            if ($this->node->parentNode->ownerDocument !== null) {
                $fragment = $this->node->parentNode->ownerDocument->createDocumentFragment();
                if ($html !== null) {
                    /** @var \DOMNode $html */
                    while ($html->childNodes->length > 0) {
                        $tmpNode = $html->childNodes->item(0);
                        if ($tmpNode !== null) {
                            /** @noinspection UnusedFunctionResultInspection */
                            $fragment->appendChild($tmpNode);
                        }
                    }
                    /** @noinspection UnusedFunctionResultInspection */
                    $html->parentNode->replaceChild($fragment, $html);
                }
            }
        }

        return $this;
    }

    /**
     * Replace this node with text
     *
     * @param string $string
     *
     * @return SimpleHtmlDomInterface
     */
    protected function replaceTextWithString($string): SimpleHtmlDomInterface
Duplication:
This method seems to be duplicated in your project.
    {
        if (empty($string)) {
            $this->node->parentNode->removeChild($this->node);

            return $this;
        }

        $ownerDocument = $this->node->ownerDocument;
        if ($ownerDocument !== null) {
            $newElement = $ownerDocument->createTextNode($string);
            $newNode = $ownerDocument->importNode($newElement, true);
            $this->node->parentNode->replaceChild($newNode, $this->node);
            $this->node = $newNode;
        }

        return $this;
    }

    /**
     * Set attribute value.
     *
     * @param string      $name       <p>The name of the html-attribute.</p>
     * @param string|null $value      <p>Set to NULL or empty string, to remove the attribute.</p>
     * @param bool        $strict     <p>
     *                                $value must be NULL, to remove the attribute,
     *                                so that you can set an empty string as attribute-value e.g. autofocus=""
     *                                </p>
     *
     * @return SimpleHtmlDomInterface
     */
    public function setAttribute(string $name, $value = null, bool $strict = false): SimpleHtmlDomInterface
Duplication:
This method seems to be duplicated in your project.
    {
        if (
            ($strict && $value === null)
            ||
            (!$strict && empty($value))
        ) {
            /** @noinspection UnusedFunctionResultInspection */
            $this->removeAttribute($name);
        } elseif (\method_exists($this->node, 'setAttribute')) {
            /** @noinspection UnusedFunctionResultInspection */
            $this->node->setAttribute($name, $value);
Bug:
The method setAttribute exists only in DOMElement, but not in DOMNode.
        }

        return $this;
Bug Best Practice:
The return type of return $this; (voku\helper\SimpleHtmlDom) is incompatible with the return type declared by the interface voku\helper\SimpleHtmlDomInterface::setAttribute of type self.
    }

    /**
     * Get dom node's plain text.
     *
     * @return string
     */
    public function text(): string
    {
        return $this->getHtmlDomParser()->fixHtmlOutput($this->node->textContent);
    }

    /**
     * Change the name of a tag in a "DOMNode".
     *
     * @param \DOMNode $node
     * @param string   $name
     *
     * @return \DOMElement|false
     *                          <p>DOMElement a new instance of class DOMElement or false
     *                          if an error occurred.</p>
     */
    protected function changeElementName(\DOMNode $node, string $name)
Duplication:
This method seems to be duplicated in your project.
    {
        $ownerDocument = $node->ownerDocument;
        if ($ownerDocument) {
            $newNode = $ownerDocument->createElement($name);
        } else {
            return false;
        }

        foreach ($node->childNodes as $child) {
            $child = $ownerDocument->importNode($child, true);
            /** @noinspection UnusedFunctionResultInspection */
            $newNode->appendChild($child);
        }

        foreach ($node->attributes as $attrName => $attrNode) {
            /** @noinspection UnusedFunctionResultInspection */
            $newNode->setAttribute($attrName, $attrNode);
        }

        /** @noinspection UnusedFunctionResultInspection */
        $newNode->ownerDocument->replaceChild($newNode, $node);

        return $newNode;
    }

    /**
     * Returns children of node.
     *
     * @param int $idx
     *
     * @return SimpleHtmlDomInterface|SimpleHtmlDomInterface[]|SimpleHtmlDomNodeInterface|null
     */
    public function childNodes(int $idx = -1)
Duplication:
This method seems to be duplicated in your project.
    {
        $nodeList = $this->getIterator();

        if ($idx === -1) {
            return $nodeList;
        }

        return $nodeList[$idx] ?? null;
    }

    /**
     * Find nodes with a CSS selector.
     *
     * @param string $selector
     *
     * @return SimpleHtmlDomInterface[]|SimpleHtmlDomNodeInterface
     */
    public function findMulti(string $selector): SimpleHtmlDomNodeInterface
    {
        return $this->find($selector, null);
    }

    /**
     * Find one node with a CSS selector.
     *
     * @param string $selector
     *
     * @return SimpleHtmlDomInterface
     */
    public function findOne(string $selector): SimpleHtmlDomInterface
    {
        return $this->find($selector, 0);
    }

    /**
     * Returns the first child of node.
     *
     * @return SimpleHtmlDomInterface|null
     */
    public function firstChild()
Duplication:
This method seems to be duplicated in your project.
    {
        /** @var \DOMNode|null $node */
        $node = $this->node->firstChild;

        if ($node === null) {
            return null;
        }

        return new static($node);
    }

    /**
     * Return elements by .class.
     *
     * @param string $class
     *
     * @return SimpleHtmlDomInterface[]|SimpleHtmlDomNodeInterface
     */
    public function getElementByClass(string $class): SimpleHtmlDomNodeInterface
    {
        return $this->findMulti(".${class}");
    }

    /**
     * Return element by #id.
     *
     * @param string $id
     *
     * @return SimpleHtmlDomInterface
     */
    public function getElementById(string $id): SimpleHtmlDomInterface
    {
        return $this->findOne("#${id}");
    }

    /**
     * Return element by tag name.
     *
     * @param string $name
     *
     * @return SimpleHtmlDomInterface
     */
    public function getElementByTagName(string $name): SimpleHtmlDomInterface
Duplication:
This method seems to be duplicated in your project.
    {
        if ($this->node instanceof \DOMElement) {
            $node = $this->node->getElementsByTagName($name)->item(0);
        } else {
            $node = null;
        }

        if ($node === null) {
            return new SimpleHtmlDomBlank();
        }

        return new static($node);
    }

    /**
     * Returns elements by #id.
     *
     * @param string   $id
     * @param int|null $idx
     *
     * @return SimpleHtmlDomInterface|SimpleHtmlDomInterface[]|SimpleHtmlDomNodeInterface
     */
    public function getElementsById(string $id, $idx = null)
    {
        return $this->find("#${id}", $idx);
    }

    /**
     * Returns elements by tag name.
     *
     * @param string   $name
     * @param int|null $idx
     *
     * @return SimpleHtmlDomInterface|SimpleHtmlDomInterface[]|SimpleHtmlDomNodeInterface
     */
    public function getElementsByTagName(string $name, $idx = null)
Duplication:
This method seems to be duplicated in your project.
    {
        if ($this->node instanceof \DOMElement) {
            $nodesList = $this->node->getElementsByTagName($name);
        } else {
            $nodesList = [];
        }

        $elements = new SimpleHtmlDomNode();

        foreach ($nodesList as $node) {
            $elements[] = new static($node);
        }

        // return all elements
        if ($idx === null) {
            if (\count($elements) === 0) {
                return new SimpleHtmlDomNodeBlank();
            }

            return $elements;
        }

        // handle negative values
        if ($idx < 0) {
            $idx = \count($elements) + $idx;
        }

        // return one element
        return $elements[$idx] ?? new SimpleHtmlDomBlank();
    }

    /**
     * Create a new "HtmlDomParser"-object from the current context.
     *
     * @return HtmlDomParser
     */
    public function getHtmlDomParser(): HtmlDomParser
    {
        return new HtmlDomParser($this);
    }

    /**
     * @return \DOMNode
     */
    public function getNode(): \DOMNode
    {
        return $this->node;
    }

    /**
     * Nodes can get partially destroyed in which they're still an
     * actual DOM node (such as \DOMElement) but almost their entire
     * body is gone, including the `nodeType` attribute.
     *
     * @return bool true if node has been destroyed
     */
    public function isRemoved(): bool
    {
        return !isset($this->node->nodeType);
    }

    /**
     * Returns the last child of node.
     *
     * @return SimpleHtmlDomInterface|null
     */
    public function lastChild()
Duplication:
This method seems to be duplicated in your project.
    {
        /** @var \DOMNode|null $node */
        $node = $this->node->lastChild;

        if ($node === null) {
            return null;
        }

        return new static($node);
    }

    /**
     * Returns the next sibling of node.
     *
     * @return SimpleHtmlDomInterface|null
     */
    public function nextSibling()
Duplication:
This method seems to be duplicated in your project.
    {
        /** @var \DOMNode|null $node */
        $node = $this->node->nextSibling;

        if ($node === null) {
            return null;
        }

        return new static($node);
    }

    /**
     * Returns the parent of node.
     *
     * @return SimpleHtmlDomInterface
     */
    public function parentNode(): SimpleHtmlDomInterface
    {
        return new static($this->node->parentNode);
0 ignored issues
show
Bug Best Practice introduced by
The return type of return new static($this->node->parentNode); (voku\helper\SimpleHtmlDom) is incompatible with the return type declared by the interface voku\helper\SimpleHtmlDomInterface::parentNode of type self.

If you return a value from a function or method, it should be a sub-type of the type that is given by the parent type f.e. an interface, or abstract method. This is more formally defined by the Lizkov substitution principle, and guarantees that classes that depend on the parent type can use any instance of a child type interchangably. This principle also belongs to the SOLID principles for object oriented design.

Let’s take a look at an example:

class Author {
    private $name;

    public function __construct($name) {
        $this->name = $name;
    }

    public function getName() {
        return $this->name;
    }
}

abstract class Post {
    public function getAuthor() {
        return 'Johannes';
    }
}

class BlogPost extends Post {
    public function getAuthor() {
        return new Author('Johannes');
    }
}

class ForumPost extends Post { /* ... */ }

function my_function(Post $post) {
    echo strtoupper($post->getAuthor());
}

Our function my_function expects a Post object, and outputs the author of the post. The base class Post returns a simple string and outputting a simple string will work just fine. However, the child class BlogPost which is a sub-type of Post instead decided to return an object, and is therefore violating the SOLID principles. If a BlogPost were passed to my_function, PHP would not complain, but ultimately fail when executing the strtoupper call in its body.

589
    }
590
591
    /**
592
     * Returns the previous sibling of node.
593
     *
594
     * @return SimpleHtmlDomInterface|null
595
     */
596 1
    public function previousSibling()
Duplication:
This method seems to be duplicated in your project.

Duplicated code is one of the most pungent code smells. If you need to duplicate the same code in three or more different places, we strongly encourage you to look into extracting the code into a single class or operation.

You can also find more detailed suggestions in the “Code” section of your repository.
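
As a rough illustration of such an extraction, assuming the duplicated methods differ only in which sibling property they read: the shared "null check, then wrap" step can move into one private helper. NodeWrapper, name() and wrapOrNull() are hypothetical names used for this sketch and are not part of the library.

```php
<?php

declare(strict_types=1);

// Hypothetical wrapper class; it only shows how the repeated
// "null check, then wrap" logic can live in one place.
final class NodeWrapper
{
    /** @var \DOMNode */
    private $node;

    public function __construct(\DOMNode $node)
    {
        $this->node = $node;
    }

    public function previousSibling(): ?self
    {
        return self::wrapOrNull($this->node->previousSibling);
    }

    public function nextSibling(): ?self
    {
        return self::wrapOrNull($this->node->nextSibling);
    }

    public function name(): string
    {
        return $this->node->nodeName;
    }

    // Single home for the logic that was previously copied into each method.
    private static function wrapOrNull(?\DOMNode $node): ?self
    {
        return $node === null ? null : new self($node);
    }
}

$document = new \DOMDocument();
$document->loadXML('<r><a/><b/></r>');

$b = new NodeWrapper($document->getElementsByTagName('b')->item(0));
echo $b->previousSibling()->name(), "\n";
var_dump($b->nextSibling());
```

If the duplicates live in separate classes rather than in one class, a trait holding the helper would be the closer equivalent.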

597
    {
598
        /** @var \DOMNode|null $node */
599 1
        $node = $this->node->previousSibling;
600
601 1
        if ($node === null) {
602 1
            return null;
603
        }
604
605 1
        return new static($node);
606
    }
607
608
    /**
609
     * @param string|string[]|null $value <p>
610
     *                                    null === get the current input value
611
     *                                    text === set a new input value
612
     *                                    </p>
613
     *
614
     * @return string|string[]|null
615
     */
616 1
    public function val($value = null)
Duplication:
This method seems to be duplicated in your project.
617
    {
618 1
        if ($value === null) {
619
            if (
620 1
                $this->tag === 'input'
Documentation:
The property tag does not exist on object<voku\helper\SimpleHtmlDom>. Since you implemented __get, maybe consider adding a @property annotation.

Since your code implements the magic getter __get, this method will be called for any read access to an undefined or inaccessible property. You can add the @property annotation to your class or interface to document the existence of this property.

<?php

/**
 * @property int $x
 * @property int $y
 * @property string $text
 */
class MyLabel
{
    private $properties;

    private $allowedProperties = array('x', 'y', 'text');

    public function __get($name)
    {
        // Read from the object's own storage, not an undefined local variable.
        if (isset($this->properties[$name]) && in_array($name, $this->allowedProperties)) {
            return $this->properties[$name];
        } else {
            return null;
        }
    }

    public function __set($name, $value)
    {
        if (in_array($name, $this->allowedProperties)) {
            // Write to the object's own storage so __get can find the value later.
            $this->properties[$name] = $value;
        } else {
            throw new \LogicException("Property $name is not defined.");
        }
    }

}

If the property has read access only, you can use the @property-read annotation instead.

Of course, you may also just have mistyped another name, in which case you should fix the error.

See also the PhpDoc documentation for @property.
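
For the read-only case flagged above, a minimal sketch of the annotation might look as follows. The Element class and its constructor are hypothetical stand-ins; only the @property-read line and the matching __get branch are the point.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical element wrapper; only the annotation pattern matters here.
 *
 * @property-read string $tag the lower-cased tag name
 */
final class Element
{
    private $nodeName;

    public function __construct(string $nodeName)
    {
        $this->nodeName = $nodeName;
    }

    public function __get(string $name)
    {
        // "tag" is the only virtual property documented on the class.
        if ($name === 'tag') {
            return strtolower($this->nodeName);
        }

        throw new \LogicException("Property $name is not defined.");
    }
}

$element = new Element('INPUT');
echo $element->tag, "\n";
```

Static analysers and IDEs that understand PhpDoc can then resolve `$element->tag` as a string without any change to runtime behaviour.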

621
                &&
622
                (
623 1
                    $this->getAttribute('type') === 'text'
624
                    ||
625 1
                    !$this->hasAttribute('type')
626
                )
627
            ) {
628 1
                return $this->getAttribute('value');
629
            }
630
631
            if (
632 1
                $this->hasAttribute('checked')
633
                &&
634 1
                \in_array($this->getAttribute('type'), ['checkbox', 'radio'], true)
635
            ) {
636 1
                return $this->getAttribute('value');
637
            }
638
639 1
            if ($this->node->nodeName === 'select') {
640
                $valuesFromDom = [];
641
                $options = $this->getElementsByTagName('option');
642
                if ($options instanceof SimpleHtmlDomNode) {
643
                    foreach ($options as $option) {
644
                        if ($option->hasAttribute('selected')) {
645
                            /** @noinspection UnnecessaryCastingInspection */
646
                            $valuesFromDom[] = (string) $option->getAttribute('value');
647
                        }
648
                    }
649
                }
650
651
                if (\count($valuesFromDom) === 0) {
652
                    return null;
653
                }
654
655
                return $valuesFromDom;
Bug / Best Practice:
The return type of return $valuesFromDom; (array) is incompatible with the return type declared by the interface voku\helper\SimpleHtmlDomInterface::val of type string|string[]|null.
656
            }
657
658 1
            if ($this->node->nodeName === 'textarea') {
659 1
                return $this->node->nodeValue;
660
            }
661
        } else {
662
            /** @noinspection NestedPositiveIfStatementsInspection */
663 1
            if (\in_array($this->getAttribute('type'), ['checkbox', 'radio'], true)) {
664 1
                if ($value === $this->getAttribute('value')) {
665
                    /** @noinspection UnusedFunctionResultInspection */
666 1
                    $this->setAttribute('checked', 'checked');
667
                } else {
668
                    /** @noinspection UnusedFunctionResultInspection */
669 1
                    $this->removeAttribute('checked');
670
                }
671 1
            } elseif ($this->node instanceof \DOMElement && $this->node->nodeName === 'select') {
672
                foreach ($this->node->getElementsByTagName('option') as $option) {
673
                    /** @var \DOMElement $option */
674
                    if ($value === $option->getAttribute('value')) {
675
                        /** @noinspection UnusedFunctionResultInspection */
676
                        $option->setAttribute('selected', 'selected');
677
                    } else {
678
                        $option->removeAttribute('selected');
679
                    }
680
                }
681 1
            } elseif ($this->node->nodeName === 'input' && \is_string($value)) {
682
                // Set value for input elements
683
                /** @noinspection UnusedFunctionResultInspection */
684 1
                $this->setAttribute('value', $value);
685 1
            } elseif ($this->node->nodeName === 'textarea' && \is_string($value)) {
686 1
                $this->node->nodeValue = $value;
687
            }
688
        }
689
690 1
        return null;
691
    }
692
693
    /**
694
     * @param HtmlDomParser $newDocument
695
     * @param bool          $removeExtraHeadTag
696
     *
697
     * @return HtmlDomParser
698
     */
699 9
    protected function cleanHtmlWrapper(HtmlDomParser $newDocument, $removeExtraHeadTag = false): HtmlDomParser
700
    {
701
        if (
702 9
            $newDocument->getIsDOMDocumentCreatedWithoutHtml()
703
            ||
704 9
            $newDocument->getIsDOMDocumentCreatedWithoutHtmlWrapper()
705
        ) {
706
707
            // Remove doc-type node.
708 9
            if ($newDocument->getDocument()->doctype !== null) {
709
                /** @noinspection UnusedFunctionResultInspection */
710
                $newDocument->getDocument()->doctype->parentNode->removeChild($newDocument->getDocument()->doctype);
711
            }
712
713
            // Remove html element, preserving child nodes.
714 9
            $html = $newDocument->getDocument()->getElementsByTagName('html')->item(0);
715 9
            $fragment = $newDocument->getDocument()->createDocumentFragment();
716 9
            if ($html !== null) {
717 6
                while ($html->childNodes->length > 0) {
718 6
                    $tmpNode = $html->childNodes->item(0);
719 6
                    if ($tmpNode !== null) {
720
                        /** @noinspection UnusedFunctionResultInspection */
721 6
                        $fragment->appendChild($tmpNode);
722
                    }
723
                }
724
                /** @noinspection UnusedFunctionResultInspection */
725 6
                $html->parentNode->replaceChild($fragment, $html);
726
            }
727
728
            // Remove body element, preserving child nodes.
729 9
            $body = $newDocument->getDocument()->getElementsByTagName('body')->item(0);
730 9
            $fragment = $newDocument->getDocument()->createDocumentFragment();
731 9
            if ($body instanceof \DOMElement) {
732 4
                while ($body->childNodes->length > 0) {
733 4
                    $tmpNode = $body->childNodes->item(0);
734 4
                    if ($tmpNode !== null) {
735
                        /** @noinspection UnusedFunctionResultInspection */
736 4
                        $fragment->appendChild($tmpNode);
737
                    }
738
                }
739
                /** @noinspection UnusedFunctionResultInspection */
740 4
                $body->parentNode->replaceChild($fragment, $body);
741
742
                // At this point DOMDocument has still added a "<p>" wrapper around our string,
743
                // so we rename it to "<simpleHtmlDomP>" and remove it again at the end.
744 4
                $item = $newDocument->getDocument()->getElementsByTagName('p')->item(0);
745 4
                if ($item !== null) {
746
                    /** @noinspection UnusedFunctionResultInspection */
747 4
                    $this->changeElementName($item, 'simpleHtmlDomP');
748
                }
749
            }
750
        }
751
752
        // Remove head element, preserving child nodes.
753
        if (
Duplication:
This code seems to be duplicated across your project.
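
The "move all children into a fragment, then replace the element" loop appears three times in this method (for html, body and head). A possible extraction, sketched with the standard DOM extension only; unwrapElement() is a hypothetical helper name and not part of the library:

```php
<?php

declare(strict_types=1);

/**
 * Replace $element with its own children, keeping their order.
 * Hypothetical helper; a sketch of the extraction, not the library's code.
 */
function unwrapElement(\DOMNode $element): void
{
    if ($element->parentNode === null || $element->ownerDocument === null) {
        return;
    }

    $fragment = $element->ownerDocument->createDocumentFragment();
    while ($element->firstChild !== null) {
        // appendChild() moves the node out of $element, so the loop terminates.
        $fragment->appendChild($element->firstChild);
    }

    if ($fragment->hasChildNodes()) {
        $element->parentNode->replaceChild($fragment, $element);
    } else {
        // Nothing to preserve: just drop the empty wrapper.
        $element->parentNode->removeChild($element);
    }
}

$document = new \DOMDocument();
$document->loadXML('<root><wrapper><a/><b/></wrapper></root>');

$wrapper = $document->getElementsByTagName('wrapper')->item(0);
if ($wrapper !== null) {
    unwrapElement($wrapper);
}

echo $document->saveXML($document->documentElement), "\n";
```

Each of the three blocks in cleanHtmlWrapper() could then shrink to a lookup plus one call, which would also reduce the method's length and condition count.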

754 9
            $removeExtraHeadTag
755
            &&
756 9
            $this->node->parentNode instanceof \DOMElement
757
            &&
758 9
            $newDocument->getIsDOMDocumentCreatedWithoutHeadWrapper()
759
        ) {
760 2
            $html = $this->node->parentNode->getElementsByTagName('head')[0];
761 2
            if ($this->node->parentNode->ownerDocument !== null) {
762 2
                $fragment = $this->node->parentNode->ownerDocument->createDocumentFragment();
763 2
                if ($html !== null) {
764
                    /** @var \DOMNode $html */
765
                    while ($html->childNodes->length > 0) {
766
                        $tmpNode = $html->childNodes->item(0);
767
                        if ($tmpNode !== null) {
768
                            /** @noinspection UnusedFunctionResultInspection */
769
                            $fragment->appendChild($tmpNode);
770
                        }
771
                    }
772
                    /** @noinspection UnusedFunctionResultInspection */
773
                    $html->parentNode->replaceChild($fragment, $html);
774
                }
775
            }
776
        }
777
778 9
        return $newDocument;
779
    }
780
781
    /**
782
     * Retrieve an external iterator.
783
     *
784
     * @see  http://php.net/manual/en/iteratoraggregate.getiterator.php
785
     *
786
     * @return SimpleHtmlDomNodeInterface
787
     *                           <p>
788
     *                              An instance of an object implementing <b>Iterator</b> or
789
     *                              <b>Traversable</b>
790
     *                           </p>
791
     */
792 3
    public function getIterator(): SimpleHtmlDomNodeInterface
Duplication:
This method seems to be duplicated in your project.
793
    {
794 3
        $elements = new SimpleHtmlDomNode();
795 3
        if ($this->node->hasChildNodes()) {
796 3
            foreach ($this->node->childNodes as $node) {
797 3
                $elements[] = new static($node);
798
            }
799
        }
800
801 3
        return $elements;
802
    }
803
804
    /**
805
     * Get dom node's inner html.
806
     *
807
     * @param bool $multiDecodeNewHtmlEntity
808
     *
809
     * @return string
810
     */
811
    public function innerXml(bool $multiDecodeNewHtmlEntity = false): string
812
    {
813
        return $this->getHtmlDomParser()->innerXml($multiDecodeNewHtmlEntity);
814
    }
815
816
    /**
817
     * Normalize the given input for comparison.
818
     *
819
     * @param HtmlDomParser|string $input
820
     *
821
     * @return string
822
     */
823 9
    private function normalizeStringForComparision($input): string
824
    {
825 9
        if ($input instanceof HtmlDomParser) {
826 9
            $string = $input->outerText();
827
828 9
            if ($input->getIsDOMDocumentCreatedWithoutHeadWrapper()) {
829
                /** @noinspection HtmlRequiredTitleElement */
830 9
                $string = \str_replace(['<head>', '</head>'], '', $string);
831
            }
832
        } else {
833 9
            $string = (string) $input;
834
        }
835
836
        return
837 9
            \urlencode(
838 9
                \urldecode(
839 9
                    \trim(
840 9
                        \str_replace(
841
                            [
842 9
                                ' ',
843
                                "\n",
844
                                "\r",
845
                                '/>',
846
                            ],
847
                            [
848 9
                                '',
849
                                '',
850
                                '',
851
                                '>',
852
                            ],
853 9
                            \strtolower($string)
854
                        )
855
                    )
856
                )
857
            );
858
    }
859
}
860