Completed
Push — master ( 3dc188...ddb6ef )
by Lars
01:45
created

XmlDomParser::find()   B

Complexity: Conditions 7, Paths 16
Size: Total Lines 37
Duplication: Lines 37, Ratio 100 %
Code Coverage: Tests 16, CRAP Score 7.0099
Importance: Changes 0

Metric  Value
cc      7
nc      16
nop     2
dl      37
loc     37
ccs     16
cts     17
cp      0.9412
crap    7.0099
rs      8.3946
c       0
b       0
f       0
<?php

declare(strict_types=1);

namespace voku\helper;

/**
 * @property-read string $plaintext
 *                                 <p>Get dom node's plain text.</p>
 *
 * @method static XmlDomParser file_get_xml($xml, $libXMLExtraOptions = null)
 *                                 <p>Load XML from file.</p>
 * @method static XmlDomParser str_get_xml($xml, $libXMLExtraOptions = null)
 *                                 <p>Load XML from string.</p>
 */
class XmlDomParser extends AbstractDomParser
{
    /**
     * @var callable|null
     *
     * @phpstan-var null|callable(string $cssSelectorString, string $xPathString, \DOMXPath, \voku\helper\XmlDomParser): string
     */
    private $callbackXPathBeforeQuery;

    /**
     * @var callable|null
     *
     * @phpstan-var null|callable(string $xmlString, \voku\helper\XmlDomParser): string
     */
    private $callbackBeforeCreateDom;

    /**
     * @var bool
     */
    private $autoRemoveXPathNamespaces = false;

    /**
     * @var bool
     */
    private $reportXmlErrorsAsException = false;

    // Issue (Duplication): This method seems to be duplicated in your project.
    // Duplicated code is one of the most pungent code smells. If you need to duplicate the same
    // code in three or more different places, we strongly encourage you to look into extracting
    // the code into a single class or operation.
    /**
     * @param \DOMNode|SimpleXmlDomInterface|string|null $element XML code, SimpleXmlDomInterface or \DOMNode
     */
    public function __construct($element = null)
    {
        $this->document = new \DOMDocument('1.0', $this->getEncoding());

        // DOMDocument settings
        $this->document->preserveWhiteSpace = true;
        $this->document->formatOutput = true;

        if ($element instanceof SimpleXmlDomInterface) {
            $element = $element->getNode();
        }

        if ($element instanceof \DOMNode) {
            $domNode = $this->document->importNode($element, true);

            if ($domNode instanceof \DOMNode) {
                /** @noinspection UnusedFunctionResultInspection */
                $this->document->appendChild($domNode);
            }

            return;
        }

        if ($element !== null) {
            $this->loadXml($element);
        }
    }

    // Issue (Duplication): This method seems to be duplicated in your project.
    //
    // Issue (Bug, Best Practice): The return type of
    // `return $parser->loadXmlF...guments0, $arguments1);` (self) is incompatible with the return
    // type declared by the abstract method voku\helper\AbstractDomParser::__callStatic of type
    // voku\helper\AbstractDomParser.
    //
    // If you return a value from a function or method, it should be a sub-type of the type that is
    // given by the parent type, e.g. an interface or abstract method. This is more formally defined
    // by the Liskov substitution principle, and guarantees that classes that depend on the parent
    // type can use any instance of a child type interchangeably. This principle also belongs to the
    // SOLID principles for object-oriented design.
    //
    // Let's take a look at an example:
    //
    //     class Author {
    //         private $name;
    //
    //         public function __construct($name) {
    //             $this->name = $name;
    //         }
    //
    //         public function getName() {
    //             return $this->name;
    //         }
    //     }
    //
    //     abstract class Post {
    //         public function getAuthor() {
    //             return 'Johannes';
    //         }
    //     }
    //
    //     class BlogPost extends Post {
    //         public function getAuthor() {
    //             return new Author('Johannes');
    //         }
    //     }
    //
    //     class ForumPost extends Post { /* ... */ }
    //
    //     function my_function(Post $post) {
    //         echo strtoupper($post->getAuthor());
    //     }
    //
    // Our function my_function expects a Post object, and outputs the author of the post. The base
    // class Post returns a simple string and outputting a simple string will work just fine.
    // However, the child class BlogPost, which is a sub-type of Post, instead decided to return an
    // object, and is therefore violating the SOLID principles. If a BlogPost were passed to
    // my_function, PHP would not complain, but ultimately fail when executing the strtoupper call
    // in its body.
    /**
     * @param string $name
     * @param array  $arguments
     *
     * @throws \BadMethodCallException
     * @throws \RuntimeException
     *
     * @return XmlDomParser
     */
    public static function __callStatic($name, $arguments)
    {
        $arguments0 = $arguments[0] ?? '';

        $arguments1 = $arguments[1] ?? null;

        if ($name === 'str_get_xml') {
            $parser = new static();

            return $parser->loadXml($arguments0, $arguments1);
        }

        if ($name === 'file_get_xml') {
            $parser = new static();

            return $parser->loadXmlFile($arguments0, $arguments1);
        }

        throw new \BadMethodCallException('Method does not exist');
    }

    /** @noinspection MagicMethodsValidityInspection */

    /**
     * @param string $name
     *
     * @return string|null
     */
    public function __get($name)
    {
        $name = \strtolower($name);

        if ($name === 'plaintext') {
            return $this->text();
        }

        return null;
    }

    /**
     * @return string
     */
    public function __toString()
    {
        return $this->xml(false, false, true, 0);
    }

    /**
     * Create DOMDocument from XML.
     *
     * @param string   $xml
     * @param int|null $libXMLExtraOptions
     *
     * @return \DOMDocument
     */
    protected function createDOMDocument(string $xml, $libXMLExtraOptions = null): \DOMDocument
    {
        if ($this->callbackBeforeCreateDom) {
            $xml = \call_user_func($this->callbackBeforeCreateDom, $xml, $this);
        }

        // set error level
        $internalErrors = \libxml_use_internal_errors(true);
        if (\PHP_VERSION_ID < 80000) {
            $disableEntityLoader = \libxml_disable_entity_loader(true);
        }
        \libxml_clear_errors();

        $optionsXml = \LIBXML_DTDLOAD | \LIBXML_DTDATTR | \LIBXML_NONET;

        if (\defined('LIBXML_BIGLINES')) {
            $optionsXml |= \LIBXML_BIGLINES;
        }

        if (\defined('LIBXML_COMPACT')) {
            $optionsXml |= \LIBXML_COMPACT;
        }

        if ($libXMLExtraOptions !== null) {
            $optionsXml |= $libXMLExtraOptions;
        }

        if ($this->autoRemoveXPathNamespaces) {
            $xml = $this->removeXPathNamespaces($xml);
        }

        $xml = self::replaceToPreserveHtmlEntities($xml);

        $documentFound = false;
        $sxe = \simplexml_load_string($xml, \SimpleXMLElement::class, $optionsXml);
        $xmlErrors = \libxml_get_errors();
        // Issue (Duplication): This code seems to be duplicated across your project.
        if ($sxe !== false && \count($xmlErrors) === 0) {
            $domElementTmp = \dom_import_simplexml($sxe);
            if (
                $domElementTmp
                &&
                $domElementTmp->ownerDocument instanceof \DOMDocument
            ) {
                $documentFound = true;
                $this->document = $domElementTmp->ownerDocument;
            }
        }

        // Issue (Duplication): This code seems to be duplicated across your project.
        if ($documentFound === false) {
            // UTF-8 hack: http://php.net/manual/en/domdocument.loadhtml.php#95251
            $xmlHackUsed = false;
            // stripos() takes ($haystack, $needle): prepend the declaration only if
            // the XML does not already start with one
            if (\stripos($xml, '<?xml') !== 0) {
                $xmlHackUsed = true;
                $xml = '<?xml encoding="' . $this->getEncoding() . '" ?>' . $xml;
            }

            $documentFound = $this->document->loadXML($xml, $optionsXml);

            // remove the "xml-encoding" hack
            if ($xmlHackUsed) {
                foreach ($this->document->childNodes as $child) {
                    if ($child->nodeType === \XML_PI_NODE) {
                        /** @noinspection UnusedFunctionResultInspection */
                        $this->document->removeChild($child);

                        break;
                    }
                }
            }
        }

        if (
            $documentFound === false
            &&
            \count($xmlErrors) > 0
        ) {
            $errorStr = 'XML-Errors: ' . \print_r($xmlErrors, true) . ' in ' . \print_r($xml, true);

            if (!$this->reportXmlErrorsAsException) {
                \trigger_error($errorStr, \E_USER_WARNING);
            } else {
                throw new \InvalidArgumentException($errorStr);
            }
        }

        // set encoding
        $this->document->encoding = $this->getEncoding();

        // restore lib-xml settings
        \libxml_clear_errors();
        \libxml_use_internal_errors($internalErrors);
        if (\PHP_VERSION_ID < 80000 && isset($disableEntityLoader)) {
            \libxml_disable_entity_loader($disableEntityLoader);
        }

        return $this->document;
    }

    // Issue (Duplication): This method seems to be duplicated in your project.
    //
    // Issue (Documentation): The doc-type SimpleXmlDomInterface|Si...<SimpleXmlDomInterface>
    // could not be parsed: Expected "|" or "end of type", but got "<" at position 71.
    // This check marks PHPDoc comments that could not be parsed by the analyzer's parser.
    /**
     * Find list of nodes with a CSS selector.
     *
     * @param string   $selector
     * @param int|null $idx
     *
     * @return SimpleXmlDomInterface|SimpleXmlDomInterface[]|SimpleXmlDomNodeInterface<SimpleXmlDomInterface>
     */
    public function find(string $selector, $idx = null)
    {
        $xPathQuery = SelectorConverter::toXPath($selector);

        $xPath = new \DOMXPath($this->document);

        if ($this->callbackXPathBeforeQuery) {
            $xPathQuery = \call_user_func($this->callbackXPathBeforeQuery, $selector, $xPathQuery, $xPath, $this);
        }

        $nodesList = $xPath->query($xPathQuery);

        $elements = new SimpleXmlDomNode();

        if ($nodesList) {
            foreach ($nodesList as $node) {
                $elements[] = new SimpleXmlDom($node);
            }
        }

        // return all elements
        if ($idx === null) {
            if (\count($elements) === 0) {
                return new SimpleXmlDomNodeBlank();
            }

            return $elements;
        }

        // handle negative values
        if ($idx < 0) {
            $idx = \count($elements) + $idx;
        }

        // return one element
        return $elements[$idx] ?? new SimpleXmlDomBlank();
    }

    // Issue (Documentation): The doc-type SimpleXmlDomInterface[]|...<SimpleXmlDomInterface>
    // could not be parsed: Expected "|" or "end of type", but got "<" at position 49.
    /**
     * Find nodes with a CSS selector.
     *
     * @param string $selector
     *
     * @return SimpleXmlDomInterface[]|SimpleXmlDomNodeInterface<SimpleXmlDomInterface>
     */
    public function findMulti(string $selector): SimpleXmlDomNodeInterface
    {
        return $this->find($selector, null);
    }

    // Issue (Documentation): The doc-type false|SimpleXmlDomInterf...<SimpleXmlDomInterface>
    // could not be parsed: Expected "|" or "end of type", but got "<" at position 55.
    /**
     * Find nodes with a CSS selector, or return false if no element is found.
     *
     * @param string $selector
     *
     * @return false|SimpleXmlDomInterface[]|SimpleXmlDomNodeInterface<SimpleXmlDomInterface>
     */
    public function findMultiOrFalse(string $selector)
    {
        $return = $this->find($selector, null);

        if ($return instanceof SimpleXmlDomNodeBlank) {
            return false;
        }

        return $return;
    }

    /**
     * Find one node with a CSS selector.
     *
     * @param string $selector
     *
     * @return SimpleXmlDomInterface
     */
    public function findOne(string $selector): SimpleXmlDomInterface
    {
        return $this->find($selector, 0);
    }

    /**
     * Find one node with a CSS selector, or return false if no element is found.
     *
     * @param string $selector
     *
     * @return false|SimpleXmlDomInterface
     */
    public function findOneOrFalse(string $selector)
    {
        $return = $this->find($selector, 0);

        if ($return instanceof SimpleXmlDomBlank) {
            return false;
        }

        return $return;
    }

    /**
     * @param string $content
     * @param bool   $multiDecodeNewHtmlEntity
     *
     * @return string
     */
    public function fixHtmlOutput(string $content, bool $multiDecodeNewHtmlEntity = false): string
    {
        $content = $this->decodeHtmlEntity($content, $multiDecodeNewHtmlEntity);

        return self::putReplacedBackToPreserveHtmlEntities($content);
    }

    // Issue (Documentation): The doc-type SimpleXmlDomInterface[]|...<SimpleXmlDomInterface>
    // could not be parsed: Expected "|" or "end of type", but got "<" at position 49.
    /**
     * Return elements by ".class".
     *
     * @param string $class
     *
     * @return SimpleXmlDomInterface[]|SimpleXmlDomNodeInterface<SimpleXmlDomInterface>
     */
    public function getElementByClass(string $class): SimpleXmlDomNodeInterface
    {
        return $this->findMulti(".{$class}");
    }

    /**
     * Return element by #id.
     *
     * @param string $id
     *
     * @return SimpleXmlDomInterface
     */
    public function getElementById(string $id): SimpleXmlDomInterface
    {
        return $this->findOne("#{$id}");
    }

    /**
     * Return element by tag name.
     *
     * @param string $name
     *
     * @return SimpleXmlDomInterface
     */
    public function getElementByTagName(string $name): SimpleXmlDomInterface
    {
        $node = $this->document->getElementsByTagName($name)->item(0);

        if ($node === null) {
            return new SimpleXmlDomBlank();
        }

        return new SimpleXmlDom($node);
    }

    // Issue (Documentation): The doc-type SimpleXmlDomInterface|Si...<SimpleXmlDomInterface>
    // could not be parsed: Expected "|" or "end of type", but got "<" at position 71.
    /**
     * Returns elements by "#id".
     *
     * @param string   $id
     * @param int|null $idx
     *
     * @return SimpleXmlDomInterface|SimpleXmlDomInterface[]|SimpleXmlDomNodeInterface<SimpleXmlDomInterface>
     */
    public function getElementsById(string $id, $idx = null)
    {
        return $this->find("#{$id}", $idx);
    }

    // Issue (Duplication): This method seems to be duplicated in your project.
    //
    // Issue (Documentation): The doc-type SimpleXmlDomInterface|Si...<SimpleXmlDomInterface>
    // could not be parsed: Expected "|" or "end of type", but got "<" at position 71.
    /**
     * Returns elements by tag name.
     *
     * @param string   $name
     * @param int|null $idx
     *
     * @return SimpleXmlDomInterface|SimpleXmlDomInterface[]|SimpleXmlDomNodeInterface<SimpleXmlDomInterface>
     */
    public function getElementsByTagName(string $name, $idx = null)
    {
        $nodesList = $this->document->getElementsByTagName($name);

        $elements = new SimpleXmlDomNode();

        foreach ($nodesList as $node) {
            $elements[] = new SimpleXmlDom($node);
        }

        // return all elements
        if ($idx === null) {
            if (\count($elements) === 0) {
                return new SimpleXmlDomNodeBlank();
            }

            return $elements;
        }

        // handle negative values
        if ($idx < 0) {
            $idx = \count($elements) + $idx;
        }

        // return one element
        return $elements[$idx] ?? new SimpleXmlDomNodeBlank();
    }

    /**
     * Get dom node's outer html.
     *
     * @param bool $multiDecodeNewHtmlEntity
     *
     * @return string
     */
    public function html(bool $multiDecodeNewHtmlEntity = false): string
    {
        if (static::$callback !== null) {
            \call_user_func(static::$callback, [$this]);
        }

        $content = $this->document->saveHTML();

        if ($content === false) {
            return '';
        }

        return $this->fixHtmlOutput($content, $multiDecodeNewHtmlEntity);
    }

    // Issue (Bug, Best Practice): The return type of `return $this;` (voku\helper\XmlDomParser)
    // is incompatible with the return type declared by the interface
    // voku\helper\DomParserInterface::loadHtml of type self.
    /**
     * Load HTML from string.
     *
     * @param string   $html
     * @param int|null $libXMLExtraOptions
     *
     * @return self
     */
    public function loadHtml(string $html, $libXMLExtraOptions = null): DomParserInterface
    {
        $this->document = $this->createDOMDocument($html, $libXMLExtraOptions);

        return $this;
    }

    // Issue (Duplication): This method seems to be duplicated in your project.
    //
    // Issue (Bug, Best Practice): The return type of
    // `return $this->loadHtml($..., $libXMLExtraOptions);` (voku\helper\XmlDomParser) is
    // incompatible with the return type declared by the interface
    // voku\helper\DomParserInterface::loadHtmlFile of type self.
    /**
     * Load HTML from file.
     *
     * @param string   $filePath
     * @param int|null $libXMLExtraOptions
     *
     * @throws \RuntimeException
     *
     * @return XmlDomParser
     */
    public function loadHtmlFile(string $filePath, $libXMLExtraOptions = null): DomParserInterface
    {
        if (
            !\preg_match("/^https?:\/\//i", $filePath)
            &&
            !\file_exists($filePath)
        ) {
            throw new \RuntimeException("File {$filePath} not found");
        }

        try {
            if (\class_exists('\voku\helper\UTF8')) {
                $html = \voku\helper\UTF8::file_get_contents($filePath);
            } else {
                $html = \file_get_contents($filePath);
            }
        } catch (\Exception $e) {
            throw new \RuntimeException("Could not load file {$filePath}");
        }

        if ($html === false) {
            throw new \RuntimeException("Could not load file {$filePath}");
        }

        return $this->loadHtml($html, $libXMLExtraOptions);
    }

    // Issue (Documentation): The doc-type SimpleXmlDomInterface|Si...<SimpleXmlDomInterface>
    // could not be parsed: Expected "|" or "end of type", but got "<" at position 71.
    /**
     * @param string   $selector
     * @param int|null $idx
     *
     * @return SimpleXmlDomInterface|SimpleXmlDomInterface[]|SimpleXmlDomNodeInterface<SimpleXmlDomInterface>
     */
    public function __invoke($selector, $idx = null)
    {
        return $this->find($selector, $idx);
    }

    /**
     * @param string $xml
     *
     * @return string
     */
    private function removeXPathNamespaces(string $xml): string
    {
        return (string) \preg_replace('#xmlns:?.*="(?:.*)"#Ui', '', $xml);
    }

    /**
     * Load XML from string.
     *
     * @param string   $xml
     * @param int|null $libXMLExtraOptions
     *
     * @return XmlDomParser
     */
    public function loadXml(string $xml, $libXMLExtraOptions = null): self
    {
        $this->document = $this->createDOMDocument($xml, $libXMLExtraOptions);

        return $this;
    }

    // Issue (Duplication): This method seems to be duplicated in your project.
    /**
     * Load XML from file.
     *
     * @param string   $filePath
     * @param int|null $libXMLExtraOptions
     *
     * @throws \RuntimeException
     *
     * @return XmlDomParser
     */
    public function loadXmlFile(string $filePath, $libXMLExtraOptions = null): self
    {
        if (
            !\preg_match("/^https?:\/\//i", $filePath)
            &&
            !\file_exists($filePath)
        ) {
            throw new \RuntimeException("File {$filePath} not found");
        }

        try {
            if (\class_exists('\voku\helper\UTF8')) {
                $xml = \voku\helper\UTF8::file_get_contents($filePath);
            } else {
                $xml = \file_get_contents($filePath);
            }
        } catch (\Exception $e) {
            throw new \RuntimeException("Could not load file {$filePath}");
        }

        if ($xml === false) {
            throw new \RuntimeException("Could not load file {$filePath}");
        }

        return $this->loadXml($xml, $libXMLExtraOptions);
    }

    /**
     * @param callable      $callback
     * @param \DOMNode|null $domNode
     *
     * @return void
     */
    public function replaceTextWithCallback($callback, \DOMNode $domNode = null)
    {
        if ($domNode === null) {
            $domNode = $this->document;
        }

        if ($domNode->hasChildNodes()) {
            $children = [];

            // since looping through a DOM being modified is a bad idea we prepare an array:
            foreach ($domNode->childNodes as $child) {
                $children[] = $child;
            }

            foreach ($children as $child) {
                if ($child->nodeType === \XML_TEXT_NODE) {
                    // Issue (Bug): Why assign $child to itself? This check looks for cases
                    // where a variable has been assigned to itself. This assignment can be
                    // removed without consequences.
                    /** @noinspection PhpSillyAssignmentInspection */
                    /** @var \DOMText $child */
                    $child = $child;

                    $oldText = self::putReplacedBackToPreserveHtmlEntities($child->wholeText);
                    $newText = $callback($oldText);
                    if ($domNode->ownerDocument) {
                        $newTextNode = $domNode->ownerDocument->createTextNode(self::replaceToPreserveHtmlEntities($newText));
                        $domNode->replaceChild($newTextNode, $child);
                    }
                } else {
                    $this->replaceTextWithCallback($callback, $child);
                }
            }
        }
    }

    /**
     * @param bool $autoRemoveXPathNamespaces
     */
    public function autoRemoveXPathNamespaces(bool $autoRemoveXPathNamespaces = true)
    {
        $this->autoRemoveXPathNamespaces = $autoRemoveXPathNamespaces;
    }

    /**
     * @param callable $callbackXPathBeforeQuery
     *
     * @phpstan-param callable(string $cssSelectorString, string $xPathString, \DOMXPath, \voku\helper\XmlDomParser): string $callbackXPathBeforeQuery
     */
    public function setCallbackXPathBeforeQuery(callable $callbackXPathBeforeQuery)
    {
        $this->callbackXPathBeforeQuery = $callbackXPathBeforeQuery;
    }

    /**
     * @param callable $callbackBeforeCreateDom
     *
     * @phpstan-param callable(string $xmlString, \voku\helper\XmlDomParser): string $callbackBeforeCreateDom
     */
    public function setCallbackBeforeCreateDom(callable $callbackBeforeCreateDom)
    {
        $this->callbackBeforeCreateDom = $callbackBeforeCreateDom;
    }

    /**
     * @param bool $reportXmlErrorsAsException
     */
    public function reportXmlErrorsAsException(bool $reportXmlErrorsAsException = true)
    {
        $this->reportXmlErrorsAsException = $reportXmlErrorsAsException;
    }
}