Completed
Push — master ( 17e133...01a57c )
by Lars
03:36 queued 01:26
created

SimpleHtmlDom   F

Complexity

Total Complexity 130

Size/Duplication

Total Lines 906
Duplicated Lines 41.5 %

Coupling/Cohesion

Components 1
Dependencies 5

Test Coverage

Coverage 82.26%

Importance

Changes 0
Metric Value
dl 376
loc 906
ccs 218
cts 265
cp 0.8226
rs 1.694
c 0
b 0
f 0
wmc 130
lcom 1
cbo 5

40 Methods

Rating   Name   Duplication   Size   Complexity  
A __construct() 0 4 1
A __call() 0 10 2
A find() 0 4 1
A getAllAttributes() 13 13 3
A hasAttributes() 0 4 1
A getAttribute() 0 10 2
A hasAttribute() 0 8 2
A html() 0 4 1
A innerHtml() 0 4 1
A removeAttribute() 0 8 2
B replaceChildWithString() 45 45 9
A replaceTextWithString() 18 18 3
A setAttribute() 16 16 6
A text() 0 4 1
A changeElementName() 25 25 4
A childNodes() 10 10 2
A findMulti() 0 4 1
A findMultiOrFalse() 0 4 1
A findOne() 0 4 1
A findOneOrFalse() 0 4 1
A firstChild() 11 11 2
A getElementByClass() 0 4 1
A getElementById() 0 4 1
A getElementByTagName() 14 14 3
A getElementsById() 0 4 1
B getElementsByTagName() 31 31 6
A getHtmlDomParser() 0 4 1
A getNode() 0 4 1
A isRemoved() 0 4 1
A lastChild() 11 11 2
A nextSibling() 11 11 2
A nextNonWhitespaceSibling() 16 16 4
A parentNode() 0 4 1
A previousSibling() 11 11 2
C replaceNodeWithString() 30 62 11
D val() 78 78 24
C cleanHtmlWrapper() 25 71 15
A getIterator() 11 11 3
A innerXml() 0 4 1
A normalizeStringForComparision() 0 36 3

How to fix

Duplicated Code

Duplicate code is one of the most pungent code smells. A rule that is often used is to re-structure code once it is duplicated in three or more places.
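
As a minimal sketch of that kind of restructuring (not the project's actual code): when several accessors repeat the same null check before wrapping a node, the repeated step can move into one private helper. `NodeWrapper` and `wrapOrNull()` are hypothetical names chosen only for this illustration.

```php
<?php

declare(strict_types=1);

// Hypothetical wrapper used only for this illustration; it is not the library's class.
final class NodeWrapper
{
    /** @var \DOMNode */
    private $node;

    public function __construct(\DOMNode $node)
    {
        $this->node = $node;
    }

    public function firstChild(): ?self
    {
        return $this->wrapOrNull($this->node->firstChild);
    }

    public function lastChild(): ?self
    {
        return $this->wrapOrNull($this->node->lastChild);
    }

    public function nextSibling(): ?self
    {
        return $this->wrapOrNull($this->node->nextSibling);
    }

    public function name(): string
    {
        return $this->node->nodeName;
    }

    // The null check that each accessor would otherwise repeat lives here once.
    private function wrapOrNull(?\DOMNode $node): ?self
    {
        return $node === null ? null : new self($node);
    }
}

$doc = new \DOMDocument();
$doc->loadXML('<ul><li>a</li><li>b</li></ul>');
$list = new NodeWrapper($doc->documentElement);
```

Each public accessor shrinks to a single line, so a later change to the wrapping rule only has to be made in `wrapOrNull()`.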

Common duplication problems, and corresponding solutions are:

Complex Class

 Tip:   Before tackling complexity, make sure that you eliminate any duplication first. This often can reduce the size of classes significantly.

Complex classes like SimpleHtmlDom often do a lot of different things. To break such a class down, we need to identify a cohesive component within that class. A common approach to finding such a component is to look for fields or methods that share the same prefixes or suffixes. You can also look at the cohesion graph to spot any unconnected or weakly connected components.

Once you have determined the fields that belong together, you can apply the Extract Class refactoring. If the component makes sense as a sub-class, Extract Subclass is also a candidate, and is often faster.

While breaking up the class, it is a good idea to analyze how other classes use SimpleHtmlDom, and based on these observations, apply Extract Interface, too.
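
One way this could look, as a sketch under assumptions rather than the library's design: the methods sharing the `Attribute` suffix (`getAttribute()`, `hasAttribute()`, `setAttribute()`, `removeAttribute()`) form a candidate component. `AttributeAccess` and `NodeAttributes` below are hypothetical names; the real class also restores HTML entities, which is omitted here.

```php
<?php

declare(strict_types=1);

// Hypothetical interface (Extract Interface): the behaviour callers depend on.
interface AttributeAccess
{
    public function get(string $name): string;

    public function has(string $name): bool;

    public function set(string $name, string $value): void;

    public function remove(string $name): void;
}

// Hypothetical extracted class (Extract Class): everything that only touches attributes.
final class NodeAttributes implements AttributeAccess
{
    /** @var \DOMNode */
    private $node;

    public function __construct(\DOMNode $node)
    {
        $this->node = $node;
    }

    public function get(string $name): string
    {
        // Only elements carry attributes; other node types yield an empty string.
        return $this->node instanceof \DOMElement ? $this->node->getAttribute($name) : '';
    }

    public function has(string $name): bool
    {
        return $this->node instanceof \DOMElement && $this->node->hasAttribute($name);
    }

    public function set(string $name, string $value): void
    {
        if ($this->node instanceof \DOMElement) {
            $this->node->setAttribute($name, $value);
        }
    }

    public function remove(string $name): void
    {
        if ($this->node instanceof \DOMElement) {
            $this->node->removeAttribute($name);
        }
    }
}

$doc = new \DOMDocument();
$doc->loadXML('<a href="/home">Home</a>');
$attributes = new NodeAttributes($doc->documentElement);
$attributes->set('rel', 'nofollow');
$attributes->remove('href');
```

The original class could then delegate its attribute methods to such an object, and callers that only read or write attributes could type-hint against the smaller interface.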

<?php

declare(strict_types=1);

namespace voku\helper;

/**
 * @noinspection PhpHierarchyChecksInspection
 *
 * {@inheritdoc}
 *
 * @implements \IteratorAggregate<int, \DOMNode>
 */
class SimpleHtmlDom extends AbstractSimpleHtmlDom implements \IteratorAggregate, SimpleHtmlDomInterface
{
    /**
     * @param \DOMElement|\DOMNode $node
     */
    public function __construct(\DOMNode $node)
    {
        $this->node = $node;
    }

    /**
     * @param string $name
     * @param array  $arguments
     *
     * @throws \BadMethodCallException
     *
     * @return SimpleHtmlDomInterface|string|null
     */
    public function __call($name, $arguments)
    {
        $name = \strtolower($name);

        if (isset(self::$functionAliases[$name])) {
            return \call_user_func_array([$this, self::$functionAliases[$name]], $arguments);
        }

        throw new \BadMethodCallException('Method does not exist');
    }

    /**
     * Find list of nodes with a CSS selector.
     *
     * @param string   $selector
     * @param int|null $idx
     *
     * @return SimpleHtmlDomInterface|SimpleHtmlDomInterface[]|SimpleHtmlDomNodeInterface<SimpleHtmlDomInterface>

Issue (Documentation): The doc-type SimpleHtmlDomInterface|S...SimpleHtmlDomInterface> could not be parsed: Expected "|" or "end of type", but got "<" at position 74.

This check marks PHPDoc comments that could not be parsed by our parser. To see which comment annotations we can parse, please refer to our documentation on supported doc-types.

     */
    public function find(string $selector, $idx = null)
    {
        return $this->getHtmlDomParser()->find($selector, $idx);
    }

    /**
     * Returns an array of attributes.
     *
     * @return string[]|null
     */
    public function getAllAttributes()

Issue (Duplication): This method seems to be duplicated in your project.

Duplicated code is one of the most pungent code smells. If you need to duplicate the same code in three or more different places, we strongly encourage you to look into extracting the code into a single class or operation.

You can also find more detailed suggestions in the “Code” section of your repository.

    {
        if ($this->node->hasAttributes()) {
            $attributes = [];
            foreach ($this->node->attributes as $attr) {
                $attributes[$attr->name] = HtmlDomParser::putReplacedBackToPreserveHtmlEntities($attr->value);
            }

            return $attributes;

Issue (Bug Best Practice): The return type of return $attributes; (array) is incompatible with the return type declared by the interface voku\helper\SimpleHtmlDo...rface::getAllAttributes of type string[]|null.

If you return a value from a function or method, it should be a sub-type of the type that is given by the parent type, e.g. an interface or abstract method. This is more formally defined by the Liskov substitution principle, and guarantees that classes that depend on the parent type can use any instance of a child type interchangeably. This principle also belongs to the SOLID principles for object-oriented design.

Let’s take a look at an example:

class Author {
    private $name;

    public function __construct($name) {
        $this->name = $name;
    }

    public function getName() {
        return $this->name;
    }
}

abstract class Post {
    public function getAuthor() {
        return 'Johannes';
    }
}

class BlogPost extends Post {
    public function getAuthor() {
        return new Author('Johannes');
    }
}

class ForumPost extends Post { /* ... */ }

function my_function(Post $post) {
    echo strtoupper($post->getAuthor());
}

Our function my_function expects a Post object and outputs the author of the post. The base class Post returns a simple string, and outputting a simple string works just fine. However, the child class BlogPost, which is a sub-type of Post, returns an object instead and therefore violates the SOLID principles. If a BlogPost were passed to my_function, PHP would not complain up front, but the call would ultimately fail when executing strtoupper in the function body.

        }

        return null;
    }

    /**
     * @return bool
     */
    public function hasAttributes(): bool
    {
        return $this->node->hasAttributes();
    }

    /**
     * Return attribute value.
     *
     * @param string $name
     *
     * @return string
     */
    public function getAttribute(string $name): string
    {
        if ($this->node instanceof \DOMElement) {
            return HtmlDomParser::putReplacedBackToPreserveHtmlEntities(
                $this->node->getAttribute($name)
            );
        }

        return '';
    }

    /**
     * Determine if an attribute exists on the element.
     *
     * @param string $name
     *
     * @return bool
     */
    public function hasAttribute(string $name): bool
    {
        if (!$this->node instanceof \DOMElement) {
            return false;
        }

        return $this->node->hasAttribute($name);
    }

    /**
     * Get dom node's outer html.
     *
     * @param bool $multiDecodeNewHtmlEntity
     *
     * @return string
     */
    public function html(bool $multiDecodeNewHtmlEntity = false): string
    {
        return $this->getHtmlDomParser()->html($multiDecodeNewHtmlEntity);
    }

    /**
     * Get dom node's inner html.
     *
     * @param bool $multiDecodeNewHtmlEntity
     *
     * @return string
     */
    public function innerHtml(bool $multiDecodeNewHtmlEntity = false): string
    {
        return $this->getHtmlDomParser()->innerHtml($multiDecodeNewHtmlEntity);
    }

    /**
     * Remove attribute.
     *
     * @param string $name <p>The name of the html-attribute.</p>
     *
     * @return SimpleHtmlDomInterface
     */
    public function removeAttribute(string $name): SimpleHtmlDomInterface
    {
        if (\method_exists($this->node, 'removeAttribute')) {
            $this->node->removeAttribute($name);

Issue (Bug): The method removeAttribute exists only in DOMElement, not in DOMNode.

It seems like the method you are trying to call exists only in some of the possible types.

Let’s take a look at an example:

class A
{
    public function foo() { }
}

class B extends A
{
    public function bar() { }
}

/**
 * @param A|B $x
 */
function someFunction($x)
{
    $x->foo(); // This call is fine as the method exists in A and B.
    $x->bar(); // This method only exists in B and might cause an error.
}

Available Fixes

  1. Add an additional type-check:

    /**
     * @param A|B $x
     */
    function someFunction($x)
    {
        $x->foo();

        if ($x instanceof B) {
            $x->bar();
        }
    }

  2. Only allow a single type to be passed if the variable comes from a parameter:

    function someFunction(B $x) { /** ... */ }

        }

        return $this;

Issue (Bug Best Practice): The return type of return $this; (voku\helper\SimpleHtmlDom) is incompatible with the return type declared by the interface voku\helper\SimpleHtmlDo...erface::removeAttribute of type self.

    }

    /**
     * Replace child node.
     *
     * @param string $string
     *
     * @return SimpleHtmlDomInterface
     */
    protected function replaceChildWithString(string $string): SimpleHtmlDomInterface

Issue (Duplication): This method seems to be duplicated in your project.

    {
        if (!empty($string)) {
            $newDocument = new HtmlDomParser($string);

            $tmpDomString = $this->normalizeStringForComparision($newDocument);
            $tmpStr = $this->normalizeStringForComparision($string);
            if ($tmpDomString !== $tmpStr) {
                throw new \RuntimeException(
                    'Not valid HTML fragment!' . "\n" .
                    $tmpDomString . "\n" .
                    $tmpStr
                );
            }
        }

        /** @var \DOMNode[] $remove_nodes */
        $remove_nodes = [];
        if ($this->node->childNodes->length > 0) {
            // INFO: We need to fetch the nodes first, before we can delete them, because of missing references in the dom,
            // if we delete the elements on the fly.
            foreach ($this->node->childNodes as $node) {
                $remove_nodes[] = $node;
            }
        }
        foreach ($remove_nodes as $remove_node) {
            $this->node->removeChild($remove_node);
        }

        if (!empty($newDocument)) {
            $newDocument = $this->cleanHtmlWrapper($newDocument);
            $ownerDocument = $this->node->ownerDocument;
            if (
                $ownerDocument !== null
                &&
                $newDocument->getDocument()->documentElement !== null
            ) {
                $newNode = $ownerDocument->importNode($newDocument->getDocument()->documentElement, true);
                /** @noinspection UnusedFunctionResultInspection */
                $this->node->appendChild($newNode);
            }
        }

        return $this;
    }

    /**
     * Replace this node.
     *
     * @param string $string
     *
     * @return SimpleHtmlDomInterface
     */
    protected function replaceNodeWithString(string $string): SimpleHtmlDomInterface
    {
        if (empty($string)) {
            $this->node->parentNode->removeChild($this->node);

            return $this;
        }

        $newDocument = new HtmlDomParser($string);

        $tmpDomOuterTextString = $this->normalizeStringForComparision($newDocument);
        $tmpStr = $this->normalizeStringForComparision($string);
        if ($tmpDomOuterTextString !== $tmpStr) {

Issue (Duplication): This code seems to be duplicated across your project.

            throw new \RuntimeException(
                'Not valid HTML fragment!' . "\n"
                . $tmpDomOuterTextString . "\n" .
                $tmpStr
            );
        }

        $newDocument = $this->cleanHtmlWrapper($newDocument, true);
        $ownerDocument = $this->node->ownerDocument;
        if (
            $ownerDocument === null
            ||
            $newDocument->getDocument()->documentElement === null
        ) {
            return $this;
        }

        $newNode = $ownerDocument->importNode($newDocument->getDocument()->documentElement, true);

        $this->node->parentNode->replaceChild($newNode, $this->node);
        $this->node = $newNode;

        // Remove head element, preserving child nodes. (again)
        if (

Issue (Duplication): This code seems to be duplicated across your project.

            $this->node->parentNode instanceof \DOMElement
            &&
            $newDocument->getIsDOMDocumentCreatedWithoutHeadWrapper()
        ) {
            $html = $this->node->parentNode->getElementsByTagName('head')[0];

            if ($this->node->parentNode->ownerDocument !== null) {
                $fragment = $this->node->parentNode->ownerDocument->createDocumentFragment();
                if ($html !== null) {
                    /** @var \DOMNode $html */
                    while ($html->childNodes->length > 0) {
                        $tmpNode = $html->childNodes->item(0);
                        if ($tmpNode !== null) {
                            /** @noinspection UnusedFunctionResultInspection */
                            $fragment->appendChild($tmpNode);
                        }
                    }
                    /** @noinspection UnusedFunctionResultInspection */
                    $html->parentNode->replaceChild($fragment, $html);
                }
            }
        }

        return $this;
    }
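
Both replaceChildWithString() and replaceNodeWithString() compare the normalized parser output with the normalized input and throw the same exception when they differ. A hedged sketch of pulling that step into one place follows; `assertValidHtmlFragment()` is a hypothetical helper, not an existing method of the library, and it takes the two already-normalized strings as plain arguments.

```php
<?php

declare(strict_types=1);

// Hypothetical helper: the comparison-and-throw step that both methods repeat.
function assertValidHtmlFragment(string $normalizedDom, string $normalizedInput): void
{
    if ($normalizedDom !== $normalizedInput) {
        throw new \RuntimeException(
            "Not valid HTML fragment!\n" . $normalizedDom . "\n" . $normalizedInput
        );
    }
}

$rejected = false;
try {
    assertValidHtmlFragment('<p>ok</p>', '<p>ok</p>'); // identical strings: no exception
    assertValidHtmlFragment('<p>ok</p>', '<p>ok');     // strings differ: throws
} catch (\RuntimeException $e) {
    $rejected = true;
}
```

With such a helper, each caller would keep only its own normalization calls and a single line for the check.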

    /**
     * Replace this node with text
     *
     * @param string $string
     *
     * @return SimpleHtmlDomInterface
     */
    protected function replaceTextWithString($string): SimpleHtmlDomInterface

Issue (Duplication): This method seems to be duplicated in your project.

    {
        if (empty($string)) {
            $this->node->parentNode->removeChild($this->node);

            return $this;
        }

        $ownerDocument = $this->node->ownerDocument;
        if ($ownerDocument !== null) {
            $newElement = $ownerDocument->createTextNode($string);
            $newNode = $ownerDocument->importNode($newElement, true);
            $this->node->parentNode->replaceChild($newNode, $this->node);
            $this->node = $newNode;
        }

        return $this;
    }

    /**
     * Set attribute value.
     *
     * @param string      $name       <p>The name of the html-attribute.</p>
     * @param string|null $value      <p>Set to NULL or empty string, to remove the attribute.</p>
     * @param bool        $strict     <p>
     *                                $value must be NULL, to remove the attribute,
     *                                so that you can set an empty string as attribute-value e.g. autofocus=""
     *                                </p>
     *
     * @return SimpleHtmlDomInterface
     */
    public function setAttribute(string $name, $value = null, bool $strict = false): SimpleHtmlDomInterface

Issue (Duplication): This method seems to be duplicated in your project.

    {
        if (
            ($strict && $value === null)
            ||
            (!$strict && empty($value))
        ) {
            /** @noinspection UnusedFunctionResultInspection */
            $this->removeAttribute($name);
        } elseif (\method_exists($this->node, 'setAttribute')) {
            /** @noinspection UnusedFunctionResultInspection */
            $this->node->setAttribute($name, $value);

Issue (Bug): The method setAttribute exists only in DOMElement, not in DOMNode.

        }

        return $this;

Issue (Bug Best Practice): The return type of return $this; (voku\helper\SimpleHtmlDom) is incompatible with the return type declared by the interface voku\helper\SimpleHtmlDomInterface::setAttribute of type self.

    }

    /**
     * Get dom node's plain text.
     *
     * @return string
     */
    public function text(): string
    {
        return $this->getHtmlDomParser()->fixHtmlOutput($this->node->textContent);
    }

    /**
     * Change the name of a tag in a "DOMNode".
     *
     * @param \DOMNode $node
     * @param string   $name
     *
     * @return \DOMElement|false
     *                          <p>DOMElement a new instance of class DOMElement or false
     *                          if an error occurred.</p>
     */
    protected function changeElementName(\DOMNode $node, string $name)

Issue (Duplication): This method seems to be duplicated in your project.

    {
        $ownerDocument = $node->ownerDocument;
        if ($ownerDocument) {
            $newNode = $ownerDocument->createElement($name);
        } else {
            return false;
        }

        foreach ($node->childNodes as $child) {
            $child = $ownerDocument->importNode($child, true);
            /** @noinspection UnusedFunctionResultInspection */
            $newNode->appendChild($child);
        }

        foreach ($node->attributes as $attrName => $attrNode) {
            // DOMElement::setAttribute() expects a string value, so pass the
            // attribute's value rather than the DOMAttr node itself.
            /** @noinspection UnusedFunctionResultInspection */
            $newNode->setAttribute($attrName, $attrNode->value);
        }

        /** @noinspection UnusedFunctionResultInspection */
        $newNode->ownerDocument->replaceChild($newNode, $node);

        return $newNode;
    }

    /**
     * Returns children of node.
     *
     * @param int $idx
     *
     * @return SimpleHtmlDomInterface|SimpleHtmlDomInterface[]|SimpleHtmlDomNodeInterface|null
     */
    public function childNodes(int $idx = -1)

Issue (Duplication): This method seems to be duplicated in your project.

    {
        $nodeList = $this->getIterator();

        if ($idx === -1) {
            return $nodeList;
        }

        return $nodeList[$idx] ?? null;
    }

    /**
     * Find nodes with a CSS selector.
     *
     * @param string $selector
     *
     * @return SimpleHtmlDomInterface[]|SimpleHtmlDomNodeInterface<SimpleHtmlDomInterface>

Issue (Documentation): The doc-type SimpleHtmlDomInterface[]...SimpleHtmlDomInterface> could not be parsed: Expected "|" or "end of type", but got "<" at position 51.

     */
    public function findMulti(string $selector): SimpleHtmlDomNodeInterface
    {
        return $this->getHtmlDomParser()->findMulti($selector);
    }

    /**
     * Find nodes with a CSS selector or false, if no element is found.
     *
     * @param string $selector
     *
     * @return false|SimpleHtmlDomInterface[]|SimpleHtmlDomNodeInterface<SimpleHtmlDomInterface>

Issue (Documentation): The doc-type false|SimpleHtmlDomInter...SimpleHtmlDomInterface> could not be parsed: Expected "|" or "end of type", but got "<" at position 57.

     */
    public function findMultiOrFalse(string $selector)
    {
        return $this->getHtmlDomParser()->findMultiOrFalse($selector);
    }

    /**
     * Find one node with a CSS selector.
     *
     * @param string $selector
     *
     * @return SimpleHtmlDomInterface
     */
    public function findOne(string $selector): SimpleHtmlDomInterface
    {
        return $this->getHtmlDomParser()->findOne($selector);
    }

    /**
     * Find one node with a CSS selector or false, if no element is found.
     *
     * @param string $selector
     *
     * @return false|SimpleHtmlDomInterface
     */
    public function findOneOrFalse(string $selector)
    {
        return $this->getHtmlDomParser()->findOneOrFalse($selector);
    }

    /**
     * Returns the first child of node.
     *
     * @return SimpleHtmlDomInterface|null
     */
    public function firstChild()

Issue (Duplication): This method seems to be duplicated in your project.

    {
        /** @var \DOMNode|null $node */
        $node = $this->node->firstChild;

        if ($node === null) {
            return null;
        }

        return new static($node);
    }

    /**
     * Return elements by ".class".
     *
     * @param string $class
     *
     * @return SimpleHtmlDomInterface[]|SimpleHtmlDomNodeInterface<SimpleHtmlDomInterface>

Issue (Documentation): The doc-type SimpleHtmlDomInterface[]...SimpleHtmlDomInterface> could not be parsed: Expected "|" or "end of type", but got "<" at position 51.

     */
    public function getElementByClass(string $class): SimpleHtmlDomNodeInterface
    {
        return $this->findMulti(".{$class}");
    }

    /**
     * Return element by #id.
     *
     * @param string $id
     *
     * @return SimpleHtmlDomInterface
     */
    public function getElementById(string $id): SimpleHtmlDomInterface
    {
        return $this->findOne("#{$id}");
    }

    /**
     * Return element by tag name.
     *
     * @param string $name
     *
     * @return SimpleHtmlDomInterface
     */
    public function getElementByTagName(string $name): SimpleHtmlDomInterface

Issue (Duplication): This method seems to be duplicated in your project.

    {
        if ($this->node instanceof \DOMElement) {
            $node = $this->node->getElementsByTagName($name)->item(0);
        } else {
            $node = null;
        }

        if ($node === null) {
            return new SimpleHtmlDomBlank();
        }

        return new static($node);
    }
509
510
    /**
511
     * Returns elements by "#id".
512
     *
513
     * @param string   $id
514
     * @param int|null $idx
515
     *
516
     * @return SimpleHtmlDomInterface|SimpleHtmlDomInterface[]|SimpleHtmlDomNodeInterface<SimpleHtmlDomInterface>
Documentation issue: The doc-type SimpleHtmlDomInterface|S...SimpleHtmlDomInterface> could not be parsed: Expected "|" or "end of type", but got "<" at position 74.
     */
    public function getElementsById(string $id, $idx = null)
    {
        return $this->find("#${id}", $idx);
    }

    /**
     * Returns elements by tag name.
     *
     * @param string   $name
     * @param int|null $idx
     *
     * @return SimpleHtmlDomInterface|SimpleHtmlDomInterface[]|SimpleHtmlDomNodeInterface<SimpleHtmlDomInterface>
Documentation issue: The doc-type SimpleHtmlDomInterface|S...SimpleHtmlDomInterface> could not be parsed: Expected "|" or "end of type", but got "<" at position 74.
     */
    public function getElementsByTagName(string $name, $idx = null)
Duplication issue: This method seems to be duplicated in your project.
    {
        if ($this->node instanceof \DOMElement) {
            $nodesList = $this->node->getElementsByTagName($name);
        } else {
            $nodesList = [];
        }

        $elements = new SimpleHtmlDomNode();

        foreach ($nodesList as $node) {
            $elements[] = new static($node);
        }

        // return all elements
        if ($idx === null) {
            if (\count($elements) === 0) {
                return new SimpleHtmlDomNodeBlank();
            }

            return $elements;
        }

        // handle negative values
        if ($idx < 0) {
            $idx = \count($elements) + $idx;
        }

        // return one element
        return $elements[$idx] ?? new SimpleHtmlDomBlank();
    }
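
The index handling at the end of getElementsByTagName() can be tried in isolation. The following is a minimal sketch on a plain array; `pickByIndex()` is a hypothetical helper that is not part of the library, and the default value stands in for the blank object the method returns.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper (not part of the library): pick one item from a list,
 * where a negative index counts from the end (-1 is the last item).
 *
 * @param array<int, mixed> $items
 * @param mixed             $default returned when the index is out of range
 *
 * @return mixed
 */
function pickByIndex(array $items, int $idx, $default = null)
{
    // Translate a negative index into an offset from the end of the list.
    if ($idx < 0) {
        $idx = \count($items) + $idx;
    }

    // An index that is still out of range falls back to the default.
    return $items[$idx] ?? $default;
}

$tags = ['a', 'b', 'c'];

echo pickByIndex($tags, 0), "\n";            // first item
echo pickByIndex($tags, -1), "\n";           // last item
echo pickByIndex($tags, 5, '(blank)'), "\n"; // out of range, falls back
```

Returning a fallback instead of throwing mirrors the method above, which hands back a blank object so that chained calls do not fail on a missing element.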

    /**
     * Create a new "HtmlDomParser"-object from the current context.
     *
     * @return HtmlDomParser
     */
    public function getHtmlDomParser(): HtmlDomParser
    {
        return new HtmlDomParser($this);
    }

    /**
     * @return \DOMNode
     */
    public function getNode(): \DOMNode
    {
        return $this->node;
    }

    /**
     * Nodes can get partially destroyed: they are still an actual
     * DOM node (such as \DOMElement), but almost their entire body
     * is gone, including the `nodeType` attribute.
     *
     * @return bool true if node has been destroyed
     */
    public function isRemoved(): bool
    {
        return !isset($this->node->nodeType);
    }

    /**
     * Returns the last child of node.
     *
     * @return SimpleHtmlDomInterface|null
     */
    public function lastChild()
Duplication issue: This method seems to be duplicated in your project.
    {
        /** @var \DOMNode|null $node */
        $node = $this->node->lastChild;

        if ($node === null) {
            return null;
        }

        return new static($node);
    }

    /**
     * Returns the next sibling of node.
     *
     * @return SimpleHtmlDomInterface|null
     */
    public function nextSibling()
Duplication issue: This method seems to be duplicated in your project.
    {
        /** @var \DOMNode|null $node */
        $node = $this->node->nextSibling;

        if ($node === null) {
            return null;
        }

        return new static($node);
    }

    /**
     * Returns the next non-whitespace sibling of node.
     *
     * @return SimpleHtmlDomInterface|null
     */
    public function nextNonWhitespaceSibling()
Duplication issue: This method seems to be duplicated in your project.
    {
        /** @var \DOMNode|null $node */
        $node = $this->node->nextSibling;

        while ($node && !\trim($node->textContent)) {
            /** @var \DOMNode|null $node */
            $node = $node->nextSibling;
        }

        if ($node === null) {
            return null;
        }

        return new static($node);
    }

    /**
     * Returns the parent of node.
     *
     * @return SimpleHtmlDomInterface
     */
    public function parentNode(): SimpleHtmlDomInterface
    {
        return new static($this->node->parentNode);
Bug / best practice issue: The return type of return new static($this->node->parentNode); (voku\helper\SimpleHtmlDom) is incompatible with the return type declared by the interface voku\helper\SimpleHtmlDomInterface::parentNode of type self.

If you return a value from a function or method, it should be a sub-type of the type that is given by the parent type, e.g. an interface or abstract method. This is more formally defined by the Liskov substitution principle, and guarantees that classes that depend on the parent type can use any instance of a child type interchangeably. This principle also belongs to the SOLID principles for object-oriented design.

Let’s take a look at an example:

class Author {
    private $name;

    public function __construct($name) {
        $this->name = $name;
    }

    public function getName() {
        return $this->name;
    }
}

abstract class Post {
    public function getAuthor() {
        return 'Johannes';
    }
}

class BlogPost extends Post {
    public function getAuthor() {
        return new Author('Johannes');
    }
}

class ForumPost extends Post { /* ... */ }

function my_function(Post $post) {
    echo strtoupper($post->getAuthor());
}

Our function my_function expects a Post object and outputs the author of the post. The base class Post returns a simple string, and outputting a simple string works just fine. However, the child class BlogPost, which is a sub-type of Post, instead returns an object and therefore violates the SOLID principles. If a BlogPost were passed to my_function, PHP would not complain up front, but the strtoupper call in its body would ultimately fail.
    }
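
One way to repair the Author/Post example from the annotation above is to declare the return type on the parent and have every subclass honor it. This is a minimal sketch of that idea, not a change the checker prescribes; the class names are reused from the example and the `string` return types are an added assumption.

```php
<?php

declare(strict_types=1);

class Author
{
    /** @var string */
    private $name;

    public function __construct(string $name)
    {
        $this->name = $name;
    }

    public function getName(): string
    {
        return $this->name;
    }
}

abstract class Post
{
    // Declaring the return type lets PHP enforce the contract in every subclass.
    public function getAuthor(): string
    {
        return 'Johannes';
    }
}

class BlogPost extends Post
{
    public function getAuthor(): string
    {
        // Use the richer object internally, but return what the parent promises.
        return (new Author('Johannes'))->getName();
    }
}

function my_function(Post $post): string
{
    return \strtoupper($post->getAuthor());
}

echo my_function(new BlogPost()), "\n"; // JOHANNES
```

With the declared type in place, a subclass that returned an Author object would fail with a TypeError at the return statement rather than later inside strtoupper().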

    /**
     * Returns the previous sibling of node.
     *
     * @return SimpleHtmlDomInterface|null
     */
    public function previousSibling()
Duplication issue: This method seems to be duplicated in your project.
    {
        /** @var \DOMNode|null $node */
        $node = $this->node->previousSibling;

        if ($node === null) {
            return null;
        }

        return new static($node);
    }
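
lastChild(), nextSibling() and previousSibling() differ only in the DOM property they read, which is likely why the duplication check flags them. A possible extraction is sketched below on a stand-in class; `NodeWrapper` and `wrapOrNull()` are hypothetical names, and the real class may need a different shape.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical stand-in for the wrapper class; only the parts needed
 * to show the extraction are included.
 */
class NodeWrapper
{
    /** @var \DOMNode */
    private $node;

    public function __construct(\DOMNode $node)
    {
        $this->node = $node;
    }

    public function getNode(): \DOMNode
    {
        return $this->node;
    }

    /** @return static|null */
    public function lastChild()
    {
        return $this->wrapOrNull($this->node->lastChild);
    }

    /** @return static|null */
    public function nextSibling()
    {
        return $this->wrapOrNull($this->node->nextSibling);
    }

    /** @return static|null */
    public function previousSibling()
    {
        return $this->wrapOrNull($this->node->previousSibling);
    }

    /**
     * The shared null check and wrapping step, written once.
     *
     * @return static|null
     */
    private function wrapOrNull(?\DOMNode $node)
    {
        return $node === null ? null : new static($node);
    }
}

$doc = new \DOMDocument();
$doc->loadXML('<ul><li>one</li><li>two</li></ul>');

$list = new NodeWrapper($doc->documentElement);
$last = $list->lastChild();

echo $last->getNode()->textContent, "\n";                    // two
echo $last->previousSibling()->getNode()->textContent, "\n"; // one
\var_dump($last->nextSibling());                             // NULL
```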

    /**
     * @param string|string[]|null $value <p>
     *                                    null === get the current input value
     *                                    text === set a new input value
     *                                    </p>
     *
     * @return string|string[]|null
     */
    public function val($value = null)
Duplication issue: This method seems to be duplicated in your project.
    {
        if ($value === null) {
            if (
                $this->tag === 'input'
Documentation issue: The property tag does not exist on object<voku\helper\SimpleHtmlDom>. Since you implemented __get, maybe consider adding a @property annotation.

Since your code implements the magic getter __get, this function will be called for any read access on an undefined property. You can add the @property annotation to your class or interface to document the existence of this property.

<?php

/**
 * @property int $x
 * @property int $y
 * @property string $text
 */
class MyLabel
{
    private $properties;

    private $allowedProperties = array('x', 'y', 'text');

    public function __get($name)
    {
        // Read from the instance property, not from an undefined local variable.
        if (isset($this->properties[$name]) && in_array($name, $this->allowedProperties)) {
            return $this->properties[$name];
        } else {
            return null;
        }
    }

    public function __set($name, $value)
    {
        if (in_array($name, $this->allowedProperties)) {
            $this->properties[$name] = $value;
        } else {
            throw new \LogicException("Property $name is not defined.");
        }
    }

}

If the property has read access only, you can use the @property-read annotation instead.

Of course, you may also just have mistyped another name, in which case you should fix the error.

See also the PhpDoc documentation for @property.
                &&
                (
                    $this->getAttribute('type') === 'hidden'
                    ||
                    $this->getAttribute('type') === 'text'
                    ||
                    !$this->hasAttribute('type')
                )
            ) {
                return $this->getAttribute('value');
            }

            if (
                $this->hasAttribute('checked')
                &&
                \in_array($this->getAttribute('type'), ['checkbox', 'radio'], true)
            ) {
                return $this->getAttribute('value');
            }

            if ($this->node->nodeName === 'select') {
                $valuesFromDom = [];
                $options = $this->getElementsByTagName('option');
                if ($options instanceof SimpleHtmlDomNode) {
                    foreach ($options as $option) {
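                        // A <select> marks chosen entries with "selected" on the <option> itself,
                        // matching the setter branch below.
                        if ($option->hasAttribute('selected')) {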
                            /** @noinspection UnnecessaryCastingInspection */
                            $valuesFromDom[] = (string) $option->getAttribute('value');
                        }
                    }
                }

                if (\count($valuesFromDom) === 0) {
                    return null;
                }

                return $valuesFromDom;
Bug / best practice issue: The return type of return $valuesFromDom; (array) is incompatible with the return type declared by the interface voku\helper\SimpleHtmlDomInterface::val of type string|string[]|null.
            }

            if ($this->node->nodeName === 'textarea') {
                return $this->node->nodeValue;
            }
        } else {
            /** @noinspection NestedPositiveIfStatementsInspection */
            if (\in_array($this->getAttribute('type'), ['checkbox', 'radio'], true)) {
                if ($value === $this->getAttribute('value')) {
                    /** @noinspection UnusedFunctionResultInspection */
                    $this->setAttribute('checked', 'checked');
                } else {
                    /** @noinspection UnusedFunctionResultInspection */
                    $this->removeAttribute('checked');
                }
            } elseif ($this->node instanceof \DOMElement && $this->node->nodeName === 'select') {
                foreach ($this->node->getElementsByTagName('option') as $option) {
                    /** @var \DOMElement $option */
                    if ($value === $option->getAttribute('value')) {
                        /** @noinspection UnusedFunctionResultInspection */
                        $option->setAttribute('selected', 'selected');
                    } else {
                        $option->removeAttribute('selected');
                    }
                }
            } elseif ($this->node->nodeName === 'input' && \is_string($value)) {
                // Set value for input elements
                /** @noinspection UnusedFunctionResultInspection */
                $this->setAttribute('value', $value);
            } elseif ($this->node->nodeName === 'textarea' && \is_string($value)) {
                $this->node->nodeValue = $value;
            }
        }

        return null;
    }
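
The annotation on `$this->tag` in val() suggests documenting the magic property. A minimal sketch of what that could look like follows; `TaggedNode` is a hypothetical stand-in, and it assumes the property is resolved in `__get()` from the node name, which the listing above does not show.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical wrapper showing how a magic property can be documented.
 *
 * @property-read string $tag lower-case node name, resolved in __get()
 */
class TaggedNode
{
    /** @var \DOMNode */
    private $node;

    public function __construct(\DOMNode $node)
    {
        $this->node = $node;
    }

    /**
     * @param string $name
     *
     * @return string|null
     */
    public function __get($name)
    {
        // Assumption: "tag" maps to the node name; unknown names yield null.
        if ($name === 'tag') {
            return \strtolower($this->node->nodeName);
        }

        return null;
    }
}

$doc = new \DOMDocument();
$doc->loadXML('<INPUT type="text"/>');

$element = new TaggedNode($doc->documentElement);
echo $element->tag, "\n"; // input
```

The `@property-read` form fits here because the sketch defines no `__set()`; a writable magic property would use `@property` instead.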

    /**
     * @param HtmlDomParser $newDocument
     * @param bool          $removeExtraHeadTag
     *
     * @return HtmlDomParser
     */
    protected function cleanHtmlWrapper(
        HtmlDomParser $newDocument,
        $removeExtraHeadTag = false
    ): HtmlDomParser {
        if (
            $newDocument->getIsDOMDocumentCreatedWithoutHtml()
            ||
            $newDocument->getIsDOMDocumentCreatedWithoutHtmlWrapper()
        ) {

            // Remove doc-type node.
            if ($newDocument->getDocument()->doctype !== null) {
                /** @noinspection UnusedFunctionResultInspection */
                $newDocument->getDocument()->doctype->parentNode->removeChild($newDocument->getDocument()->doctype);
            }

            // Replace html element, preserving child nodes -> but keep the html wrapper, otherwise we get other problems ...
            // so we replace it with "<simpleHtmlDomHtml>" and delete it at the end.
            $item = $newDocument->getDocument()->getElementsByTagName('html')->item(0);
            if ($item !== null) {
                /** @noinspection UnusedFunctionResultInspection */
                $this->changeElementName($item, 'simpleHtmlDomHtml');
            }

            // Remove body element, preserving child nodes.
            $body = $newDocument->getDocument()->getElementsByTagName('body')->item(0);
            if ($body instanceof \DOMElement) {
                $fragment = $newDocument->getDocument()->createDocumentFragment();

                while ($body->childNodes->length > 0) {
                    $tmpNode = $body->childNodes->item(0);
                    if ($tmpNode !== null) {
                        /** @noinspection UnusedFunctionResultInspection */
                        $fragment->appendChild($tmpNode);
                    }
                }

                /** @noinspection UnusedFunctionResultInspection */
                $body->parentNode->replaceChild($fragment, $body);
            }
        }

        // Remove head element, preserving child nodes.
        if (
Duplication issue: This code seems to be duplicated across your project.
            $removeExtraHeadTag
            &&
            $this->node->parentNode instanceof \DOMElement
            &&
            $newDocument->getIsDOMDocumentCreatedWithoutHeadWrapper()
        ) {
            $html = $this->node->parentNode->getElementsByTagName('head')[0] ?? null;

            if ($html !== null && $this->node->parentNode->ownerDocument !== null) {
                $fragment = $this->node->parentNode->ownerDocument->createDocumentFragment();

                /** @var \DOMNode $html */
                while ($html->childNodes->length > 0) {
                    $tmpNode = $html->childNodes->item(0);
                    if ($tmpNode !== null) {
                        /** @noinspection UnusedFunctionResultInspection */
                        $fragment->appendChild($tmpNode);
                    }
                }

                /** @noinspection UnusedFunctionResultInspection */
                $html->parentNode->replaceChild($fragment, $html);
            }
        }

        return $newDocument;
    }
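
cleanHtmlWrapper() removes the body and head elements with the same pattern: move the children into a document fragment, then replace the element with that fragment. The sketch below shows the pattern once as a shared function using only PHP's DOM extension; `unwrapElement()` is a hypothetical name, not a library method.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: remove an element but keep its children in place.
 */
function unwrapElement(\DOMElement $element): void
{
    if ($element->parentNode === null || $element->ownerDocument === null) {
        return;
    }

    $fragment = $element->ownerDocument->createDocumentFragment();

    // appendChild() moves the node, so firstChild changes on every pass.
    while ($element->firstChild !== null) {
        $fragment->appendChild($element->firstChild);
    }

    // Replacing with a fragment inserts the fragment's children, not the fragment.
    $element->parentNode->replaceChild($fragment, $element);
}

$doc = new \DOMDocument();
$doc->loadXML('<html><body><p>one</p><p>two</p></body></html>');

$body = $doc->getElementsByTagName('body')->item(0);
if ($body instanceof \DOMElement) {
    unwrapElement($body);
}

echo $doc->saveXML($doc->documentElement), "\n";
```

Calling one such helper from both branches would probably also address the duplication the checker reports for the head-removal block.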

    /**
     * Retrieve an external iterator.
     *
     * @see  http://php.net/manual/en/iteratoraggregate.getiterator.php
     *
     * @return SimpleHtmlDomNode
     *                           <p>
     *                              An instance of an object implementing <b>Iterator</b> or
     *                              <b>Traversable</b>
     *                           </p>
     */
    public function getIterator(): SimpleHtmlDomNodeInterface
Duplication issue: This method seems to be duplicated in your project.
    {
        $elements = new SimpleHtmlDomNode();
        if ($this->node->hasChildNodes()) {
            foreach ($this->node->childNodes as $node) {
                $elements[] = new static($node);
            }
        }

        return $elements;
    }

    /**
     * Get dom node's inner XML.
     *
     * @param bool $multiDecodeNewHtmlEntity
     *
     * @return string
     */
    public function innerXml(bool $multiDecodeNewHtmlEntity = false): string
    {
        return $this->getHtmlDomParser()->innerXml($multiDecodeNewHtmlEntity);
    }

    /**
     * Normalize the given input for comparison.
     *
     * @param HtmlDomParser|string $input
     *
     * @return string
     */
    private function normalizeStringForComparision($input): string
    {
        if ($input instanceof HtmlDomParser) {
            $string = $input->outerText();

            if ($input->getIsDOMDocumentCreatedWithoutHeadWrapper()) {
                /** @noinspection HtmlRequiredTitleElement */
                $string = \str_replace(['<head>', '</head>'], '', $string);
            }
        } else {
            $string = (string) $input;
        }

        return
            \urlencode(
                \urldecode(
                    \trim(
                        \str_replace(
                            [
                                ' ',
                                "\n",
                                "\r",
                                '/>',
                            ],
                            [
                                '',
                                '',
                                '',
                                '>',
                            ],
                            \strtolower($string)
                        )
                    )
                )
            );
    }
}
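
The string pipeline in normalizeStringForComparision() can be exercised outside the class. This is a minimal sketch of the string branch only; `normalizeForComparison()` is a hypothetical standalone copy, and the HtmlDomParser branch is left out because it depends on the library.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical standalone copy of the string branch of the normalization.
 */
function normalizeForComparison(string $html): string
{
    // Lower-case first, then drop spaces and line breaks and turn "/>" into ">".
    $stripped = \str_replace(
        [' ', "\n", "\r", '/>'],
        ['', '', '', '>'],
        \strtolower($html)
    );

    // Decode then re-encode so differently escaped inputs end up in one form.
    return \urlencode(\urldecode(\trim($stripped)));
}

$a = normalizeForComparison("<P class=\"x\">Hello <br />World</P>\n");
$b = normalizeForComparison('<p class="x">hello<br>world</p>');

\var_dump($a === $b); // bool(true)
```

Because every space is removed, two inputs that differ only in whitespace or letter case compare as equal, which makes this suitable for a loose "did the markup change" check rather than for exact equality.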