Completed
Pull Request — master (#61)
by
unknown
02:29
created

XmlDomParser::reportXmlErrorsAsException()   A

Complexity

Conditions 1
Paths 1

Size

Total Lines 6

Duplication

Lines 0
Ratio 0 %

Code Coverage

Tests 2
CRAP Score 1

Importance

Changes 0
Metric Value
cc 1
nc 1
nop 1
dl 0
loc 6
ccs 2
cts 2
cp 1
crap 1
rs 10
c 0
b 0
f 0
<?php

declare(strict_types=1);

namespace voku\helper;

/**
 * @property-read string $plaintext
 *                                 <p>Get dom node's plain text.</p>
 *
 * @method static XmlDomParser file_get_xml($xml, $libXMLExtraOptions = null)
 *                                 <p>Load XML from file.</p>
 * @method static XmlDomParser str_get_xml($xml, $libXMLExtraOptions = null)
 *                                 <p>Load XML from string.</p>
 */
class XmlDomParser extends AbstractDomParser
{
    /**
     * @var callable|null
     *
     * @phpstan-var null|callable(string $cssSelectorString, string $xPathString, \DOMXPath, \voku\helper\XmlDomParser): string
     */
    private $callbackXPathBeforeQuery;

    /**
     * @var callable|null
     *
     * @phpstan-var null|callable(string $xmlString, \voku\helper\XmlDomParser): string
     */
    private $callbackBeforeCreateDom;

    /**
     * @var bool
     */
    private $autoRemoveXPathNamespaces = false;

    /**
     * @var bool
     */
    private $autoRegisterXPathNamespaces = false;

    /**
     * @var bool
     */
    private $reportXmlErrorsAsException = false;

    /**
     * @var string[]
     *
     * @phpstan-var array<string, string>
     */
    private $xPathNamespaces = [];

    /**
     * @param \DOMNode|SimpleXmlDomInterface|string|null $element XML code or SimpleXmlDomInterface, \DOMNode
     */
    public function __construct($element = null)
Duplication: This method seems to be duplicated in your project.

Duplicated code is one of the most pungent code smells. If you need to duplicate the same code in three or more different places, we strongly encourage you to look into extracting the code into a single class or operation.

You can also find more detailed suggestions in the “Code” section of your repository.
    {
        $this->document = new \DOMDocument('1.0', $this->getEncoding());

        // DOMDocument settings
        $this->document->preserveWhiteSpace = true;
        $this->document->formatOutput = true;

        if ($element instanceof SimpleXmlDomInterface) {
            $element = $element->getNode();
        }

        if ($element instanceof \DOMNode) {
            $domNode = $this->document->importNode($element, true);

            if ($domNode instanceof \DOMNode) {
                /** @noinspection UnusedFunctionResultInspection */
                $this->document->appendChild($domNode);
            }

            return;
        }

        if ($element !== null) {
            $this->loadXml($element);
        }
    }

    /**
     * @param string $name
     * @param array  $arguments
     *
     * @throws \BadMethodCallException
     * @throws \RuntimeException
     *
     * @return XmlDomParser
     */
    public static function __callStatic($name, $arguments)
Duplication: This method seems to be duplicated in your project.
    {
        $arguments0 = $arguments[0] ?? '';

        $arguments1 = $arguments[1] ?? null;

        if ($name === 'str_get_xml') {
            $parser = new static();

            return $parser->loadXml($arguments0, $arguments1);
        }

        if ($name === 'file_get_xml') {
            $parser = new static();

            return $parser->loadXmlFile($arguments0, $arguments1);
Bug Best Practice: The return type of return $parser->loadXmlF...guments0, $arguments1); (self) is incompatible with the return type declared by the abstract method voku\helper\AbstractDomParser::__callStatic of type voku\helper\AbstractDomParser.

If you return a value from a function or method, it should be a sub-type of the type that is given by the parent type, e.g. an interface or abstract method. This is more formally defined by the Liskov substitution principle, and guarantees that classes that depend on the parent type can use any instance of a child type interchangeably. This principle also belongs to the SOLID principles for object-oriented design.

Let’s take a look at an example:

class Author {
    private $name;

    public function __construct($name) {
        $this->name = $name;
    }

    public function getName() {
        return $this->name;
    }
}

abstract class Post {
    public function getAuthor() {
        return 'Johannes';
    }
}

class BlogPost extends Post {
    public function getAuthor() {
        return new Author('Johannes');
    }
}

class ForumPost extends Post { /* ... */ }

function my_function(Post $post) {
    echo strtoupper($post->getAuthor());
}

Our function my_function expects a Post object and outputs the author of the post. The base class Post returns a simple string, and outputting a simple string works just fine. However, the child class BlogPost, which is a sub-type of Post, instead returns an object and therefore violates the SOLID principles. If a BlogPost were passed to my_function, PHP would not complain, but it would ultimately fail when executing the strtoupper call in its body.
        }

        throw new \BadMethodCallException('Method does not exist');
    }

    /** @noinspection MagicMethodsValidityInspection */

    /**
     * @param string $name
     *
     * @return string|null
     */
    public function __get($name)
    {
        $name = \strtolower($name);

        if ($name === 'plaintext') {
            return $this->text();
        }

        return null;
    }

    /**
     * @return string
     */
    public function __toString()
    {
        return $this->xml(false, false, true, 0);
    }

    /**
     * Create DOMDocument from XML.
     *
     * @param string   $xml
     * @param int|null $libXMLExtraOptions
     *
     * @return \DOMDocument
     */
    protected function createDOMDocument(string $xml, $libXMLExtraOptions = null): \DOMDocument
    {
        if ($this->callbackBeforeCreateDom) {
            $xml = \call_user_func($this->callbackBeforeCreateDom, $xml, $this);
        }

        // set error level
        $internalErrors = \libxml_use_internal_errors(true);
        if (\PHP_VERSION_ID < 80000) {
            $disableEntityLoader = \libxml_disable_entity_loader(true);
        }
        \libxml_clear_errors();

        $optionsXml = \LIBXML_DTDLOAD | \LIBXML_DTDATTR | \LIBXML_NONET;

        if (\defined('LIBXML_BIGLINES')) {
            $optionsXml |= \LIBXML_BIGLINES;
        }

        if (\defined('LIBXML_COMPACT')) {
            $optionsXml |= \LIBXML_COMPACT;
        }

        if ($libXMLExtraOptions !== null) {
            $optionsXml |= $libXMLExtraOptions;
        }

        $this->xPathNamespaces = []; // reset
        $matches = [];
        \preg_match_all('#xmlns:(?<namespaceKey>.*)=(["\'])(?<namespaceValue>.*)\\2#Ui', $xml, $matches);
        foreach ($matches['namespaceKey'] ?? [] as $index => $key) {
            if ($key) {
                $this->xPathNamespaces[\trim($key, ':')] = $matches['namespaceValue'][$index];
            }
        }

        if ($this->autoRemoveXPathNamespaces) {
            $xml = $this->removeXPathNamespaces($xml);
        }

        $xml = self::replaceToPreserveHtmlEntities($xml);

        $documentFound = false;
        $sxe = \simplexml_load_string($xml, \SimpleXMLElement::class, $optionsXml);
        $xmlErrors = \libxml_get_errors();
        if ($sxe !== false && \count($xmlErrors) === 0) {
Duplication: This code seems to be duplicated across your project.
            $domElementTmp = \dom_import_simplexml($sxe);
            if (
                $domElementTmp
                &&
                $domElementTmp->ownerDocument instanceof \DOMDocument
            ) {
                $documentFound = true;
                $this->document = $domElementTmp->ownerDocument;
            }
        }

        if ($documentFound === false) {
Duplication: This code seems to be duplicated across your project.
            // UTF-8 hack: http://php.net/manual/en/domdocument.loadhtml.php#95251
            $xmlHackUsed = false;
            // prepend an encoding declaration only if the input does not already start with one
            if (\stripos($xml, '<?xml') !== 0) {
                $xmlHackUsed = true;
                $xml = '<?xml encoding="' . $this->getEncoding() . '" ?>' . $xml;
            }

            $documentFound = $this->document->loadXML($xml, $optionsXml);

            // remove the "xml-encoding" hack
            if ($xmlHackUsed) {
                foreach ($this->document->childNodes as $child) {
                    if ($child->nodeType === \XML_PI_NODE) {
                        /** @noinspection UnusedFunctionResultInspection */
                        $this->document->removeChild($child);

                        break;
                    }
                }
            }
        }

        if (
            $documentFound === false
            &&
            \count($xmlErrors) > 0
        ) {
            $errorStr = 'XML-Errors: ' . \print_r($xmlErrors, true) . ' in ' . \print_r($xml, true);

            if (!$this->reportXmlErrorsAsException) {
                \trigger_error($errorStr, \E_USER_WARNING);
            } else {
                throw new \InvalidArgumentException($errorStr);
            }
        }

        // set encoding
        $this->document->encoding = $this->getEncoding();

        // restore lib-xml settings
        \libxml_clear_errors();
        \libxml_use_internal_errors($internalErrors);
        if (\PHP_VERSION_ID < 80000 && isset($disableEntityLoader)) {
            \libxml_disable_entity_loader($disableEntityLoader);
        }

        return $this->document;
    }
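
The error handling in createDOMDocument() follows a common libxml pattern: switch to internal error collection, parse, read the collected errors, then restore the previous setting. The following is a minimal sketch of that pattern using only PHP's bundled libxml and SimpleXML functions. The function name parseXmlOrReport and its $asException flag are hypothetical and stand in for the reportXmlErrorsAsException property; the sketch does not fall back to DOMDocument::loadXML(), and it restores the libxml state before reporting, which is an assumption about what callers would want rather than what the method above does.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper (not part of XmlDomParser): parse XML and report
 * libxml errors either as a warning or as an exception.
 */
function parseXmlOrReport(string $xml, bool $asException): ?\SimpleXMLElement
{
    // Collect libxml errors internally instead of emitting PHP warnings.
    $previous = \libxml_use_internal_errors(true);
    \libxml_clear_errors();

    $sxe = \simplexml_load_string($xml);
    $errors = \libxml_get_errors();

    // Restore the caller's libxml state before reporting anything.
    \libxml_clear_errors();
    \libxml_use_internal_errors($previous);

    if ($sxe === false || \count($errors) > 0) {
        $message = 'XML-Errors: ' . \count($errors);

        if ($asException) {
            throw new \InvalidArgumentException($message);
        }

        \trigger_error($message, \E_USER_WARNING);

        return null;
    }

    return $sxe;
}
```

With $asException set to true, malformed input such as '<root><a></root>' surfaces as an \InvalidArgumentException that the caller can catch; with false it becomes an E_USER_WARNING and the function returns null.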

    /**
     * Find list of nodes with a CSS or xPath selector.
     *
     * @param string   $selector
     * @param int|null $idx
     *
     * @return SimpleXmlDomInterface|SimpleXmlDomInterface[]|SimpleXmlDomNodeInterface<SimpleXmlDomInterface>
Documentation: The doc-type SimpleXmlDomInterface|Si...<SimpleXmlDomInterface> could not be parsed: Expected "|" or "end of type", but got "<" at position 71.

This check marks PHPDoc comments that could not be parsed by our parser. To see which comment annotations we can parse, please refer to our documentation on supported doc-types.
     */
    public function find(string $selector, $idx = null)
    {
        $xPathQuery = SelectorConverter::toXPath($selector, true);

        $xPath = new \DOMXPath($this->document);

        if ($this->autoRegisterXPathNamespaces) {
            foreach ($this->xPathNamespaces as $key => $value) {
                $xPath->registerNamespace($key, $value);
            }
        }

        if ($this->callbackXPathBeforeQuery) {
            $xPathQuery = \call_user_func($this->callbackXPathBeforeQuery, $selector, $xPathQuery, $xPath, $this);
        }

        $nodesList = $xPath->query($xPathQuery);

        $elements = new SimpleXmlDomNode();

        if ($nodesList) {
            foreach ($nodesList as $node) {
                $elements[] = new SimpleXmlDom($node);
            }
        }

        // return all elements
        if ($idx === null) {
            if (\count($elements) === 0) {
                return new SimpleXmlDomNodeBlank();
            }

            return $elements;
        }

        // handle negative values
        if ($idx < 0) {
            $idx = \count($elements) + $idx;
        }

        // return one element
        return $elements[$idx] ?? new SimpleXmlDomBlank();
    }
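
The index handling at the end of find() can be tried in isolation. The sketch below is a hypothetical stand-in, not library code: a plain array replaces SimpleXmlDomNode, null replaces SimpleXmlDomBlank, and the name pickByIndex is invented for illustration.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical stand-in for the index handling in find().
 *
 * @param string[] $elements
 */
function pickByIndex(array $elements, int $idx): ?string
{
    // A negative index counts from the end: -1 is the last element.
    if ($idx < 0) {
        $idx = \count($elements) + $idx;
    }

    // Out-of-range indexes fall through to the "blank" value.
    return $elements[$idx] ?? null;
}

$nodes = ['first', 'second', 'third'];
$last = pickByIndex($nodes, -1);
$missing = pickByIndex($nodes, -4);
```

An index that is still negative after the adjustment (here -4 on three elements) is simply not set in the list, so it yields the blank value rather than raising an error.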

    /**
     * Find nodes with a CSS or xPath selector.
     *
     * @param string $selector
     *
     * @return SimpleXmlDomInterface[]|SimpleXmlDomNodeInterface<SimpleXmlDomInterface>
Documentation: The doc-type SimpleXmlDomInterface[]|...<SimpleXmlDomInterface> could not be parsed: Expected "|" or "end of type", but got "<" at position 49.
     */
    public function findMulti(string $selector): SimpleXmlDomNodeInterface
    {
        return $this->find($selector, null);
    }

    /**
     * Find nodes with a CSS or xPath selector or false, if no element is found.
     *
     * @param string $selector
     *
     * @return false|SimpleXmlDomInterface[]|SimpleXmlDomNodeInterface<SimpleXmlDomInterface>
Documentation: The doc-type false|SimpleXmlDomInterf...<SimpleXmlDomInterface> could not be parsed: Expected "|" or "end of type", but got "<" at position 55.
     */
    public function findMultiOrFalse(string $selector)
    {
        $return = $this->find($selector, null);

        if ($return instanceof SimpleXmlDomNodeBlank) {
            return false;
        }

        return $return;
    }

    /**
     * Find one node with a CSS or xPath selector.
     *
     * @param string $selector
     *
     * @return SimpleXmlDomInterface
     */
    public function findOne(string $selector): SimpleXmlDomInterface
    {
        return $this->find($selector, 0);
    }

    /**
     * Find one node with a CSS or xPath selector or false, if no element is found.
     *
     * @param string $selector
     *
     * @return false|SimpleXmlDomInterface
     */
    public function findOneOrFalse(string $selector)
    {
        $return = $this->find($selector, 0);

        if ($return instanceof SimpleXmlDomBlank) {
            return false;
        }

        return $return;
    }

    /**
     * @param string $content
     * @param bool   $multiDecodeNewHtmlEntity
     *
     * @return string
     */
    public function fixHtmlOutput(string $content, bool $multiDecodeNewHtmlEntity = false): string
    {
        $content = $this->decodeHtmlEntity($content, $multiDecodeNewHtmlEntity);

        return self::putReplacedBackToPreserveHtmlEntities($content);
    }

    /**
     * Return elements by ".class".
     *
     * @param string $class
     *
     * @return SimpleXmlDomInterface[]|SimpleXmlDomNodeInterface<SimpleXmlDomInterface>
Documentation: The doc-type SimpleXmlDomInterface[]|...<SimpleXmlDomInterface> could not be parsed: Expected "|" or "end of type", but got "<" at position 49.
     */
    public function getElementByClass(string $class): SimpleXmlDomNodeInterface
    {
        return $this->findMulti(".{$class}");
    }

    /**
     * Return element by #id.
     *
     * @param string $id
     *
     * @return SimpleXmlDomInterface
     */
    public function getElementById(string $id): SimpleXmlDomInterface
    {
        return $this->findOne("#{$id}");
    }

    /**
     * Return element by tag name.
     *
     * @param string $name
     *
     * @return SimpleXmlDomInterface
     */
    public function getElementByTagName(string $name): SimpleXmlDomInterface
    {
        $node = $this->document->getElementsByTagName($name)->item(0);

        if ($node === null) {
            return new SimpleXmlDomBlank();
        }

        return new SimpleXmlDom($node);
    }

    /**
     * Returns elements by "#id".
     *
     * @param string   $id
     * @param int|null $idx
     *
     * @return SimpleXmlDomInterface|SimpleXmlDomInterface[]|SimpleXmlDomNodeInterface<SimpleXmlDomInterface>
Documentation: The doc-type SimpleXmlDomInterface|Si...<SimpleXmlDomInterface> could not be parsed: Expected "|" or "end of type", but got "<" at position 71.
     */
    public function getElementsById(string $id, $idx = null)
    {
        return $this->find("#{$id}", $idx);
    }

    /**
     * Returns elements by tag name.
     *
     * @param string   $name
     * @param int|null $idx
     *
     * @return SimpleXmlDomInterface|SimpleXmlDomInterface[]|SimpleXmlDomNodeInterface<SimpleXmlDomInterface>
Documentation: The doc-type SimpleXmlDomInterface|Si...<SimpleXmlDomInterface> could not be parsed: Expected "|" or "end of type", but got "<" at position 71.
     */
    public function getElementsByTagName(string $name, $idx = null)
Duplication: This method seems to be duplicated in your project.
    {
        $nodesList = $this->document->getElementsByTagName($name);

        $elements = new SimpleXmlDomNode();

        foreach ($nodesList as $node) {
            $elements[] = new SimpleXmlDom($node);
        }

        // return all elements
        if ($idx === null) {
            if (\count($elements) === 0) {
                return new SimpleXmlDomNodeBlank();
            }

            return $elements;
        }

        // handle negative values
        if ($idx < 0) {
            $idx = \count($elements) + $idx;
        }

        // return one element
        return $elements[$idx] ?? new SimpleXmlDomNodeBlank();
    }

    /**
     * Get dom node's outer html.
     *
     * @param bool $multiDecodeNewHtmlEntity
     *
     * @return string
     */
    public function html(bool $multiDecodeNewHtmlEntity = false): string
    {
        if (static::$callback !== null) {
            \call_user_func(static::$callback, [$this]);
        }

        $content = $this->document->saveHTML();

        if ($content === false) {
            return '';
        }

        return $this->fixHtmlOutput($content, $multiDecodeNewHtmlEntity);
    }

    /**
     * Load HTML from string.
     *
     * @param string   $html
     * @param int|null $libXMLExtraOptions
     *
     * @return self
     */
    public function loadHtml(string $html, $libXMLExtraOptions = null): DomParserInterface
    {
        $this->document = $this->createDOMDocument($html, $libXMLExtraOptions);

        return $this;
Bug Best Practice: The return type of return $this; (voku\helper\XmlDomParser) is incompatible with the return type declared by the interface voku\helper\DomParserInterface::loadHtml of type self.
    }

    /**
     * Load HTML from file.
     *
     * @param string   $filePath
     * @param int|null $libXMLExtraOptions
     *
     * @throws \RuntimeException
     *
     * @return XmlDomParser
     */
    public function loadHtmlFile(string $filePath, $libXMLExtraOptions = null): DomParserInterface
Duplication: This method seems to be duplicated in your project.
    {
        if (
            !\preg_match("/^https?:\/\//i", $filePath)
            &&
            !\file_exists($filePath)
        ) {
            throw new \RuntimeException("File {$filePath} not found");
        }

        try {
            if (\class_exists('\voku\helper\UTF8')) {
                $html = \voku\helper\UTF8::file_get_contents($filePath);
            } else {
                $html = \file_get_contents($filePath);
            }
        } catch (\Exception $e) {
            throw new \RuntimeException("Could not load file {$filePath}");
        }

        if ($html === false) {
            throw new \RuntimeException("Could not load file {$filePath}");
        }

        return $this->loadHtml($html, $libXMLExtraOptions);
Bug Best Practice: The return type of return $this->loadHtml($..., $libXMLExtraOptions); (voku\helper\XmlDomParser) is incompatible with the return type declared by the interface voku\helper\DomParserInterface::loadHtmlFile of type self.
    }

    /**
     * @param string   $selector
     * @param int|null $idx
     *
     * @return SimpleXmlDomInterface|SimpleXmlDomInterface[]|SimpleXmlDomNodeInterface<SimpleXmlDomInterface>
Documentation: The doc-type SimpleXmlDomInterface|Si...<SimpleXmlDomInterface> could not be parsed: Expected "|" or "end of type", but got "<" at position 71.
     */
    public function __invoke($selector, $idx = null)
    {
        return $this->find($selector, $idx);
    }

    /**
     * @param string $xml
     *
     * @return string
     */
    private function removeXPathNamespaces(string $xml): string
    {
        foreach ($this->xPathNamespaces as $key => $value) {
            $xml = \str_replace($key . ':', '', $xml);
        }

        return (string) \preg_replace('#xmlns:?.*=(["\'])(?:.*)\\1#Ui', '', $xml);
    }
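
To see what the two namespace regular expressions do, they can be lifted out of the class and run on a small document. The sketch below reuses the patterns from createDOMDocument() and removeXPathNamespaces() verbatim; the function names extractNamespaces and stripNamespaces and the sample XML are hypothetical. Note that the prefix removal is a plain str_replace(), so a prefix string that also occurs elsewhere in the document would be removed there too.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical free-function version of the namespace collection step.
 *
 * @return array<string, string> prefix => namespace URI
 */
function extractNamespaces(string $xml): array
{
    $namespaces = [];
    $matches = [];
    \preg_match_all('#xmlns:(?<namespaceKey>.*)=(["\'])(?<namespaceValue>.*)\\2#Ui', $xml, $matches);

    foreach ($matches['namespaceKey'] ?? [] as $index => $key) {
        if ($key) {
            $namespaces[\trim($key, ':')] = $matches['namespaceValue'][$index];
        }
    }

    return $namespaces;
}

/**
 * Hypothetical free-function version of removeXPathNamespaces().
 *
 * @param array<string, string> $namespaces
 */
function stripNamespaces(string $xml, array $namespaces): string
{
    // Drop every "prefix:" occurrence, then the xmlns declarations themselves.
    foreach ($namespaces as $key => $value) {
        $xml = \str_replace($key . ':', '', $xml);
    }

    return (string) \preg_replace('#xmlns:?.*=(["\'])(?:.*)\\1#Ui', '', $xml);
}

$xml = '<feed xmlns:g="http://example.com/ns"><g:id>1</g:id></feed>';
$namespaces = extractNamespaces($xml);
$stripped = stripNamespaces($xml, $namespaces);
```

After stripping, the element is addressable as plain id, which is what lets CSS-style selectors work without registering namespaces on the XPath object.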

    /**
     * Load XML from string.
     *
     * @param string   $xml
     * @param int|null $libXMLExtraOptions
     *
     * @return XmlDomParser
     */
    public function loadXml(string $xml, $libXMLExtraOptions = null): self
    {
        $this->document = $this->createDOMDocument($xml, $libXMLExtraOptions);

        return $this;
    }

    /**
     * Load XML from file.
     *
     * @param string   $filePath
     * @param int|null $libXMLExtraOptions
     *
     * @throws \RuntimeException
     *
     * @return XmlDomParser
     */
    public function loadXmlFile(string $filePath, $libXMLExtraOptions = null): self
Duplication: This method seems to be duplicated in your project.
    {
        if (
            !\preg_match("/^https?:\/\//i", $filePath)
            &&
            !\file_exists($filePath)
        ) {
            throw new \RuntimeException("File {$filePath} not found");
        }

        try {
            if (\class_exists('\voku\helper\UTF8')) {
                $xml = \voku\helper\UTF8::file_get_contents($filePath);
            } else {
                $xml = \file_get_contents($filePath);
            }
        } catch (\Exception $e) {
            throw new \RuntimeException("Could not load file {$filePath}");
        }

        if ($xml === false) {
            throw new \RuntimeException("Could not load file {$filePath}");
        }

        return $this->loadXml($xml, $libXMLExtraOptions);
    }

    /**
     * @param callable      $callback
     * @param \DOMNode|null $domNode
     *
     * @return void
     */
    public function replaceTextWithCallback($callback, \DOMNode $domNode = null)
    {
        if ($domNode === null) {
            $domNode = $this->document;
        }

        if ($domNode->hasChildNodes()) {
            $children = [];

            // since looping through a DOM being modified is a bad idea we prepare an array:
            foreach ($domNode->childNodes as $child) {
                $children[] = $child;
            }

            foreach ($children as $child) {
                if ($child->nodeType === \XML_TEXT_NODE) {
                    /** @noinspection PhpSillyAssignmentInspection */
                    /** @var \DOMText $child */
                    $child = $child;
Bug: Why assign $child to itself?

This check looks for cases where a variable has been assigned to itself.

This assignment can be removed without consequences.
                    $oldText = self::putReplacedBackToPreserveHtmlEntities($child->wholeText);
                    $newText = $callback($oldText);
                    if ($domNode->ownerDocument) {
                        $newTextNode = $domNode->ownerDocument->createTextNode(self::replaceToPreserveHtmlEntities($newText));
                        $domNode->replaceChild($newTextNode, $child);
                    }
                } else {
                    $this->replaceTextWithCallback($callback, $child);
                }
            }
        }
    }
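
replaceTextWithCallback() copies the child nodes into an array before replacing any of them, because childNodes is a live list that changes while the tree is modified. A simplified sketch of the same traversal on a bare \DOMDocument follows. The function name mapTextNodes is hypothetical, the document is passed in explicitly, and the HTML-entity preservation helpers used by the real method are left out.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical, simplified variant of replaceTextWithCallback().
 */
function mapTextNodes(\DOMDocument $document, \DOMNode $node, callable $callback): void
{
    // Snapshot the live childNodes list before touching the tree.
    $children = [];
    foreach ($node->childNodes as $child) {
        $children[] = $child;
    }

    foreach ($children as $child) {
        if ($child->nodeType === \XML_TEXT_NODE) {
            // Swap the text node for a new one holding the callback's result.
            $newTextNode = $document->createTextNode($callback($child->nodeValue));
            $node->replaceChild($newTextNode, $child);
        } else {
            // Recurse into elements; nodes without children are a no-op.
            mapTextNodes($document, $child, $callback);
        }
    }
}

$document = new \DOMDocument();
$document->loadXML('<root><a>foo</a><b>bar</b></root>');
mapTextNodes($document, $document, 'strtoupper');
$result = $document->saveXML($document->documentElement);
```

Here every text node is upper-cased in place while the element structure stays untouched.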

    /**
     * @param bool $autoRemoveXPathNamespaces
     *
     * @return $this
     */
    public function autoRemoveXPathNamespaces(bool $autoRemoveXPathNamespaces = true): self
    {
        $this->autoRemoveXPathNamespaces = $autoRemoveXPathNamespaces;

        return $this;
    }

    /**
     * @param bool $autoRegisterXPathNamespaces
     *
     * @return $this
     */
    public function autoRegisterXPathNamespaces(bool $autoRegisterXPathNamespaces = true): self
    {
        $this->autoRegisterXPathNamespaces = $autoRegisterXPathNamespaces;

        return $this;
    }

    /**
     * @param callable $callbackXPathBeforeQuery
     *
     * @phpstan-param callable(string $cssSelectorString, string $xPathString, \DOMXPath, \voku\helper\XmlDomParser): string $callbackXPathBeforeQuery
     *
     * @return $this
     */
    public function setCallbackXPathBeforeQuery(callable $callbackXPathBeforeQuery): self
    {
        $this->callbackXPathBeforeQuery = $callbackXPathBeforeQuery;

        return $this;
    }

    /**
     * @param callable $callbackBeforeCreateDom
     *
     * @phpstan-param callable(string $xmlString, \voku\helper\XmlDomParser): string $callbackBeforeCreateDom
     *
     * @return $this
     */
    public function setCallbackBeforeCreateDom(callable $callbackBeforeCreateDom): self
    {
        $this->callbackBeforeCreateDom = $callbackBeforeCreateDom;

        return $this;
    }

    /**
     * @param bool $reportXmlErrorsAsException
     *
     * @return $this
     */
    public function reportXmlErrorsAsException(bool $reportXmlErrorsAsException = true): self
    {
        $this->reportXmlErrorsAsException = $reportXmlErrorsAsException;

        return $this;
    }
}