Completed
Pull Request — master (#55)
by Volodymyr
01:53
created

XmlDomParser   F

Complexity

Total Complexity 71

Size/Duplication

Total Lines 549
Duplicated Lines 34.61 %

Coupling/Cohesion

Components 1
Dependencies 7

Test Coverage

Coverage 54.24%

Importance

Changes 0
Metric Value
dl 190
loc 549
ccs 96
cts 177
cp 0.5424
rs 2.7199
c 0
b 0
f 0
wmc 71
lcom 1
cbo 7

23 Methods

Rating   Name                        Duplication   Size   Complexity
A        __construct()               28            28     5
A        __callStatic()              20            20     3
A        __get()                     0             10     2
A        __toString()                0             4      1
C        createDOMDocument()         30            68     12
B        find()                      31            31     6
A        findMulti()                 0             4      1
A        findMultiOrFalse()          0             10     2
A        findOne()                   0             4      1
A        findOneOrFalse()            0             10     2
A        fixHtmlOutput()             0             6      1
A        getElementByClass()         0             4      1
A        getElementById()            0             4      1
A        getElementByTagName()       0             10     2
A        getElementsById()           0             4      1
A        getElementsByTagName()      27            27     5
A        html()                      0             14     3
A        loadHtml()                  0             6      1
B        loadHtmlFile()              27            27     6
A        __invoke()                  0             4      1
A        loadXml()                   0             6      1
B        loadXmlFile()               27            27     6
B        replaceTextWithCallback()   0             32     7

How to fix   Duplicated Code    Complexity   

Duplicated Code

Duplicate code is one of the most pungent code smells. A rule that is often used is to re-structure code once it is duplicated in three or more places.

Common duplication problems, and corresponding solutions are:

Complex Class

 Tip:   Before tackling complexity, make sure that you eliminate any duplication first. This often can reduce the size of classes significantly.

Complex classes like XmlDomParser often do a lot of different things. To break such a class down, we need to identify a cohesive component within that class. A common approach to finding such a component is to look for fields or methods that share the same prefixes or suffixes. You can also look at the cohesion graph to spot any unconnected or weakly connected components.

Once you have determined the fields that belong together, you can apply the Extract Class refactoring. If the component makes sense as a sub-class, Extract Subclass is also a candidate, and is often faster.

While breaking up the class, it is a good idea to analyze how other classes use XmlDomParser, and based on these observations, apply Extract Interface, too.
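
In this class, loadHtmlFile() and loadXmlFile() are both flagged as duplicated and differ only in the loader they call at the end, so the advice above could be applied roughly as follows. This is a minimal sketch, not the library's code: the FileContentLoader class and its load() method are hypothetical names, and the optional UTF8::file_get_contents() branch of the original methods is left out for brevity.

```php
<?php

/**
 * Hypothetical helper (not part of the reviewed library): reads a local file
 * or an http(s) URL and throws if the content cannot be loaded.
 */
final class FileContentLoader
{
    public static function load(string $filePath): string
    {
        // Remote URLs are not checked with file_exists(), as in the original methods.
        $isUrl = \preg_match('/^https?:\/\//i', $filePath) === 1;

        if (!$isUrl && !\file_exists($filePath)) {
            throw new \RuntimeException("File {$filePath} not found");
        }

        // Suppress the warning; the false return value is handled below.
        $content = @\file_get_contents($filePath);

        if ($content === false) {
            throw new \RuntimeException("Could not load file {$filePath}");
        }

        return $content;
    }
}
```

With such a helper, the body of loadXmlFile() could shrink to a single call such as `return $this->loadXml(FileContentLoader::load($filePath), $libXMLExtraOptions);`, and loadHtmlFile() could do the same with loadHtml().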

<?php

declare(strict_types=1);

namespace voku\helper;

/**
 * @property-read string $plaintext
 *                                 <p>Get dom node's plain text.</p>
 *
 * @method static XmlDomParser file_get_xml($xml, $libXMLExtraOptions = null)
 *                                 <p>Load XML from file.</p>
 * @method static XmlDomParser str_get_xml($xml, $libXMLExtraOptions = null)
 *                                 <p>Load XML from string.</p>
 */
class XmlDomParser extends AbstractDomParser
{
    /**
     * @param \DOMNode|SimpleXmlDomInterface|string $element HTML code or SimpleXmlDomInterface, \DOMNode
     */
    public function __construct($element = null)
Duplication: This method seems to be duplicated in your project.

Duplicated code is one of the most pungent code smells. If you need to duplicate the same code in three or more different places, we strongly encourage you to look into extracting the code into a single class or operation.

You can also find more detailed suggestions in the “Code” section of your repository.

    {
        $this->document = new \DOMDocument('1.0', $this->getEncoding());

        // DOMDocument settings
        $this->document->preserveWhiteSpace = true;
        $this->document->formatOutput = true;

        if ($element instanceof SimpleXmlDomInterface) {
            $element = $element->getNode();
        }

        if ($element instanceof \DOMNode) {
            $domNode = $this->document->importNode($element, true);

            if ($domNode instanceof \DOMNode) {
                /** @noinspection UnusedFunctionResultInspection */
                $this->document->appendChild($domNode);
            }

            return;
        }

        if ($element !== null) {
            /** @noinspection UnusedFunctionResultInspection */
            $this->loadXml($element);
        }
    }

    /**
     * @param string $name
     * @param array  $arguments
     *
     * @throws \BadMethodCallException
     * @throws \RuntimeException
     *
     * @return XmlDomParser
     */
    public static function __callStatic($name, $arguments)
Duplication: This method seems to be duplicated in your project.

    {
        $arguments0 = $arguments[0] ?? '';

        $arguments1 = $arguments[1] ?? null;

        if ($name === 'str_get_xml') {
            $parser = new static();

            return $parser->loadXml($arguments0, $arguments1);
        }

        if ($name === 'file_get_xml') {
            $parser = new static();

            return $parser->loadXmlFile($arguments0, $arguments1);
Bug / Best Practice: The return type of return $parser->loadXmlF...guments0, $arguments1); (self) is incompatible with the return type declared by the abstract method voku\helper\AbstractDomParser::__callStatic of type voku\helper\AbstractDomParser.

If you return a value from a function or method, it should be a sub-type of the type that is given by the parent type, e.g. an interface or abstract method. This is more formally defined by the Liskov substitution principle, and guarantees that classes that depend on the parent type can use any instance of a child type interchangeably. This principle also belongs to the SOLID principles for object-oriented design.

Let’s take a look at an example:

class Author {
    private $name;

    public function __construct($name) {
        $this->name = $name;
    }

    public function getName() {
        return $this->name;
    }
}

abstract class Post {
    public function getAuthor() {
        return 'Johannes';
    }
}

class BlogPost extends Post {
    public function getAuthor() {
        return new Author('Johannes');
    }
}

class ForumPost extends Post { /* ... */ }

function my_function(Post $post) {
    echo strtoupper($post->getAuthor());
}

Our function my_function expects a Post object and outputs the author of the post. The base class Post returns a simple string, and outputting a simple string works just fine. However, the child class BlogPost, which is a sub-type of Post, instead returns an object and therefore violates the Liskov substitution principle. If a BlogPost were passed to my_function, PHP would not complain up front, but would ultimately fail when executing the strtoupper call in its body.
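
One possible way to surface this kind of mismatch earlier is to declare the return type on the parent method, so that every child has to honour it. The following is only a sketch under assumptions: it reuses the Author, Post and BlogPost names from the example above, turns getAuthor() into an abstract method, adds `string` declarations that the example does not have, and uses a hypothetical shout_author() function in place of my_function. It assumes PHP 7.0 or later.

```php
<?php

class Author
{
    private $name;

    public function __construct(string $name)
    {
        $this->name = $name;
    }

    public function getName(): string
    {
        return $this->name;
    }
}

abstract class Post
{
    // The declared return type is the contract every child has to honour.
    abstract public function getAuthor(): string;
}

class BlogPost extends Post
{
    private $author;

    public function __construct(Author $author)
    {
        $this->author = $author;
    }

    // Returns the author's name, so callers still receive a string.
    public function getAuthor(): string
    {
        return $this->author->getName();
    }
}

// Hypothetical counterpart of my_function that returns instead of echoing.
function shout_author(Post $post): string
{
    return strtoupper($post->getAuthor());
}
```

With the declaration in place, a BlogPost::getAuthor() that tried to return the Author object itself would fail with a TypeError at its own return statement, rather than later inside the strtoupper call of an unrelated caller.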

        }

        throw new \BadMethodCallException('Method does not exist');
    }

    /** @noinspection MagicMethodsValidityInspection */

    /**
     * @param string $name
     *
     * @return string|null
     */
    public function __get($name)
    {
        $name = \strtolower($name);

        if ($name === 'plaintext') {
            return $this->text();
        }

        return null;
    }

    /**
     * @return string
     */
    public function __toString()
    {
        return $this->xml(false, false, true, 0);
    }

    /**
     * Create DOMDocument from XML.
     *
     * @param string   $xml
     * @param int|null $libXMLExtraOptions
     *
     * @return \DOMDocument
     */
    protected function createDOMDocument(string $xml, $libXMLExtraOptions = null): \DOMDocument
    {
        // set error level
        $internalErrors = \libxml_use_internal_errors(true);
        $disableEntityLoader = \libxml_disable_entity_loader(true);
        \libxml_clear_errors();

        $optionsXml = \LIBXML_DTDLOAD | \LIBXML_DTDATTR | \LIBXML_NONET;

        if (\defined('LIBXML_BIGLINES')) {
            $optionsXml |= \LIBXML_BIGLINES;
        }

        if (\defined('LIBXML_COMPACT')) {
            $optionsXml |= \LIBXML_COMPACT;
        }

        if ($libXMLExtraOptions !== null) {
            $optionsXml |= $libXMLExtraOptions;
        }

        $xml = self::replaceToPreserveHtmlEntities($xml);

        $documentFound = false;
        $sxe = \simplexml_load_string($xml, \SimpleXMLElement::class, $optionsXml);
        if ($sxe !== false && \count(\libxml_get_errors()) === 0) {
Duplication: This code seems to be duplicated across your project.

            $domElementTmp = \dom_import_simplexml($sxe);
            if ($domElementTmp) {
                $documentFound = true;
                $this->document = $domElementTmp->ownerDocument;
            }
        }

        if ($documentFound === false) {
Duplication: This code seems to be duplicated across your project.

            // UTF-8 hack: http://php.net/manual/en/domdocument.loadhtml.php#95251
            $xmlHackUsed = false;
            // stripos() takes the haystack first and the needle second:
            // only add the declaration if the input does not already start with one
            if (\stripos($xml, '<?xml') !== 0) {
                $xmlHackUsed = true;
                $xml = '<?xml encoding="' . $this->getEncoding() . '" ?>' . $xml;
            }

            $this->document->loadXML($xml, $optionsXml);

            // remove the "xml-encoding" hack
            if ($xmlHackUsed) {
                foreach ($this->document->childNodes as $child) {
                    if ($child->nodeType === \XML_PI_NODE) {
                        /** @noinspection UnusedFunctionResultInspection */
                        $this->document->removeChild($child);

                        break;
                    }
                }
            }
        }

        // set encoding
        $this->document->encoding = $this->getEncoding();

        // restore lib-xml settings
        \libxml_clear_errors();
        \libxml_use_internal_errors($internalErrors);
        \libxml_disable_entity_loader($disableEntityLoader);

        return $this->document;
    }

    /**
     * Find list of nodes with a CSS selector.
     *
     * @param string   $selector
     * @param int|null $idx
     *
     * @return SimpleXmlDomInterface|SimpleXmlDomInterface[]|SimpleXmlDomNodeInterface<SimpleXmlDomInterface>
Documentation: The doc-type SimpleXmlDomInterface|Si...<SimpleXmlDomInterface> could not be parsed: Expected "|" or "end of type", but got "<" at position 71.

This check marks PHPDoc comments that could not be parsed by our parser. To see which comment annotations we can parse, please refer to our documentation on supported doc-types.

     */
    public function find(string $selector, $idx = null)
Duplication: This method seems to be duplicated in your project.

    {
        $xPathQuery = SelectorConverter::toXPath($selector);

        $xPath = new \DOMXPath($this->document);
        $nodesList = $xPath->query($xPathQuery);
        $elements = new SimpleXmlDomNode();

        if ($nodesList) {
            foreach ($nodesList as $node) {
                $elements[] = new SimpleXmlDom($node);
            }
        }

        // return all elements
        if ($idx === null) {
            if (\count($elements) === 0) {
                return new SimpleXmlDomNodeBlank();
            }

            return $elements;
        }

        // handle negative values
        if ($idx < 0) {
            $idx = \count($elements) + $idx;
        }

        // return one element
        return $elements[$idx] ?? new SimpleXmlDomBlank();
    }

    /**
     * Find nodes with a CSS selector.
     *
     * @param string $selector
     *
     * @return SimpleXmlDomInterface[]|SimpleXmlDomNodeInterface<SimpleXmlDomInterface>
Documentation: The doc-type SimpleXmlDomInterface[]|...<SimpleXmlDomInterface> could not be parsed: Expected "|" or "end of type", but got "<" at position 49.

     */
    public function findMulti(string $selector): SimpleXmlDomNodeInterface
    {
        return $this->find($selector, null);
    }

    /**
     * Find nodes with a CSS selector or false, if no element is found.
     *
     * @param string $selector
     *
     * @return false|SimpleXmlDomInterface[]|SimpleXmlDomNodeInterface<SimpleXmlDomInterface>
Documentation: The doc-type false|SimpleXmlDomInterf...<SimpleXmlDomInterface> could not be parsed: Expected "|" or "end of type", but got "<" at position 55.

     */
    public function findMultiOrFalse(string $selector)
    {
        $return = $this->find($selector, null);

        if ($return instanceof SimpleXmlDomNodeBlank) {
            return false;
        }

        return $return;
    }

    /**
     * Find one node with a CSS selector.
     *
     * @param string $selector
     *
     * @return SimpleXmlDomInterface
     */
    public function findOne(string $selector): SimpleXmlDomInterface
    {
        return $this->find($selector, 0);
    }

    /**
     * Find one node with a CSS selector or false, if no element is found.
     *
     * @param string $selector
     *
     * @return false|SimpleXmlDomInterface
     */
    public function findOneOrFalse(string $selector)
    {
        $return = $this->find($selector, 0);

        if ($return instanceof SimpleXmlDomBlank) {
            return false;
        }

        return $return;
    }

    /**
     * @param string $content
     * @param bool   $multiDecodeNewHtmlEntity
     *
     * @return string
     */
    public function fixHtmlOutput(string $content, bool $multiDecodeNewHtmlEntity = false): string
    {
        $content = $this->decodeHtmlEntity($content, $multiDecodeNewHtmlEntity);

        return self::putReplacedBackToPreserveHtmlEntities($content);
    }

    /**
     * Return elements by ".class".
     *
     * @param string $class
     *
     * @return SimpleXmlDomInterface[]|SimpleXmlDomNodeInterface<SimpleXmlDomInterface>
Documentation: The doc-type SimpleXmlDomInterface[]|...<SimpleXmlDomInterface> could not be parsed: Expected "|" or "end of type", but got "<" at position 49.

     */
    public function getElementByClass(string $class): SimpleXmlDomNodeInterface
    {
        return $this->findMulti(".{$class}");
    }

    /**
     * Return element by #id.
     *
     * @param string $id
     *
     * @return SimpleXmlDomInterface
     */
    public function getElementById(string $id): SimpleXmlDomInterface
    {
        return $this->findOne("#{$id}");
    }

    /**
     * Return element by tag name.
     *
     * @param string $name
     *
     * @return SimpleXmlDomInterface
     */
    public function getElementByTagName(string $name): SimpleXmlDomInterface
    {
        $node = $this->document->getElementsByTagName($name)->item(0);

        if ($node === null) {
            return new SimpleXmlDomBlank();
        }

        return new SimpleXmlDom($node);
    }

    /**
     * Returns elements by "#id".
     *
     * @param string   $id
     * @param int|null $idx
     *
     * @return SimpleXmlDomInterface|SimpleXmlDomInterface[]|SimpleXmlDomNodeInterface<SimpleXmlDomInterface>
Documentation: The doc-type SimpleXmlDomInterface|Si...<SimpleXmlDomInterface> could not be parsed: Expected "|" or "end of type", but got "<" at position 71.

     */
    public function getElementsById(string $id, $idx = null)
    {
        return $this->find("#{$id}", $idx);
    }

    /**
     * Returns elements by tag name.
     *
     * @param string   $name
     * @param int|null $idx
     *
     * @return SimpleXmlDomInterface|SimpleXmlDomInterface[]|SimpleXmlDomNodeInterface<SimpleXmlDomInterface>
Documentation: The doc-type SimpleXmlDomInterface|Si...<SimpleXmlDomInterface> could not be parsed: Expected "|" or "end of type", but got "<" at position 71.

     */
    public function getElementsByTagName(string $name, $idx = null)
Duplication: This method seems to be duplicated in your project.

    {
        $nodesList = $this->document->getElementsByTagName($name);

        $elements = new SimpleXmlDomNode();

        foreach ($nodesList as $node) {
            $elements[] = new SimpleXmlDom($node);
        }

        // return all elements
        if ($idx === null) {
            if (\count($elements) === 0) {
                return new SimpleXmlDomNodeBlank();
            }

            return $elements;
        }

        // handle negative values
        if ($idx < 0) {
            $idx = \count($elements) + $idx;
        }

        // return one element
        return $elements[$idx] ?? new SimpleXmlDomNodeBlank();
    }

    /**
     * Get dom node's outer html.
     *
     * @param bool $multiDecodeNewHtmlEntity
     *
     * @return string
     */
    public function html(bool $multiDecodeNewHtmlEntity = false): string
    {
        if (static::$callback !== null) {
            \call_user_func(static::$callback, [$this]);
        }

        $content = $this->document->saveHTML();

        if ($content === false) {
            return '';
        }

        return $this->fixHtmlOutput($content, $multiDecodeNewHtmlEntity);
    }

    /**
     * Load HTML from string.
     *
     * @param string   $html
     * @param int|null $libXMLExtraOptions
     *
     * @return self
     */
    public function loadHtml(string $html, $libXMLExtraOptions = null): DomParserInterface
    {
        $this->document = $this->createDOMDocument($html, $libXMLExtraOptions);

        return $this;
Bug / Best Practice: The return type of return $this; (voku\helper\XmlDomParser) is incompatible with the return type declared by the interface voku\helper\DomParserInterface::loadHtml of type self.

    }

    /**
     * Load HTML from file.
     *
     * @param string   $filePath
     * @param int|null $libXMLExtraOptions
     *
     * @throws \RuntimeException
     *
     * @return XmlDomParser
     */
    public function loadHtmlFile(string $filePath, $libXMLExtraOptions = null): DomParserInterface
Duplication: This method seems to be duplicated in your project.

    {
        if (
            !\preg_match("/^https?:\/\//i", $filePath)
            &&
            !\file_exists($filePath)
        ) {
            throw new \RuntimeException("File {$filePath} not found");
        }

        try {
            if (\class_exists('\voku\helper\UTF8')) {
                /** @noinspection PhpUndefinedClassInspection */
                $html = UTF8::file_get_contents($filePath);
            } else {
                $html = \file_get_contents($filePath);
            }
        } catch (\Exception $e) {
            throw new \RuntimeException("Could not load file {$filePath}");
        }

        if ($html === false) {
            throw new \RuntimeException("Could not load file {$filePath}");
        }

        return $this->loadHtml($html, $libXMLExtraOptions);
Bug / Best Practice: The return type of return $this->loadHtml($..., $libXMLExtraOptions); (voku\helper\XmlDomParser) is incompatible with the return type declared by the interface voku\helper\DomParserInterface::loadHtmlFile of type self.

    }

    /**
     * @param string   $selector
     * @param int|null $idx
     *
     * @return SimpleXmlDomInterface|SimpleXmlDomInterface[]|SimpleXmlDomNodeInterface<SimpleXmlDomInterface>
Documentation: The doc-type SimpleXmlDomInterface|Si...<SimpleXmlDomInterface> could not be parsed: Expected "|" or "end of type", but got "<" at position 71.

     */
    public function __invoke($selector, $idx = null)
    {
        return $this->find($selector, $idx);
    }

    /**
     * Load XML from string.
     *
     * @param string   $xml
     * @param int|null $libXMLExtraOptions
     *
     * @return XmlDomParser
     */
    public function loadXml(string $xml, $libXMLExtraOptions = null): self
    {
        $this->document = $this->createDOMDocument($xml, $libXMLExtraOptions);

        return $this;
    }

    /**
     * Load XML from file.
     *
     * @param string   $filePath
     * @param int|null $libXMLExtraOptions
     *
     * @throws \RuntimeException
     *
     * @return XmlDomParser
     */
    public function loadXmlFile(string $filePath, $libXMLExtraOptions = null): self
Duplication: This method seems to be duplicated in your project.

    {
        if (
            !\preg_match("/^https?:\/\//i", $filePath)
            &&
            !\file_exists($filePath)
        ) {
            throw new \RuntimeException("File {$filePath} not found");
        }

        try {
            if (\class_exists('\voku\helper\UTF8')) {
                /** @noinspection PhpUndefinedClassInspection */
                $xml = UTF8::file_get_contents($filePath);
            } else {
                $xml = \file_get_contents($filePath);
            }
        } catch (\Exception $e) {
            throw new \RuntimeException("Could not load file {$filePath}");
        }

        if ($xml === false) {
            throw new \RuntimeException("Could not load file {$filePath}");
        }

        return $this->loadXml($xml, $libXMLExtraOptions);
    }

    /**
     * @param callable      $callback
     * @param \DOMNode|null $domNode
     *
     * @return void
     */
    public function replaceTextWithCallback($callback, \DOMNode $domNode = null)
    {
        if ($domNode === null) {
            $domNode = $this->document;
        }

        if ($domNode->hasChildNodes()) {
            $children = [];

            // since looping through a DOM being modified is a bad idea we prepare an array:
            foreach ($domNode->childNodes as $child) {
                $children[] = $child;
            }

            foreach ($children as $child) {
                if ($child->nodeType === \XML_TEXT_NODE) {
                    /** @noinspection PhpSillyAssignmentInspection */
                    /** @var \DOMText $child */
                    $child = $child;
Bug: Why assign $child to itself?

This check looks for cases where a variable has been assigned to itself.

This assignment can be removed without consequences.
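
The self-assignment appears to exist only to attach the `@var \DOMText` annotation for static analysis. One alternative is to narrow the type with `instanceof` instead. The sketch below is a hypothetical stand-alone function, replace_text_nodes(), not the library's method: it leaves out the HTML-entity helper calls of the original and assumes PHP 7.1 or later with the DOM extension.

```php
<?php

/**
 * Hypothetical stand-alone variant of the loop above (not the library's code):
 * applies $callback to the text of every text node below $node.
 */
function replace_text_nodes(callable $callback, \DOMNode $node): void
{
    if (!$node->hasChildNodes()) {
        return;
    }

    // Copy the children first; the live childNodes list changes while nodes are replaced.
    $children = \iterator_to_array($node->childNodes);

    foreach ($children as $child) {
        // instanceof narrows $child to \DOMText without a self-assignment;
        // the nodeType check keeps CDATA sections out, as in the original.
        if ($child instanceof \DOMText && $child->nodeType === \XML_TEXT_NODE) {
            if ($node->ownerDocument !== null) {
                $newTextNode = $node->ownerDocument->createTextNode($callback($child->wholeText));
                $node->replaceChild($newTextNode, $child);
            }
        } else {
            replace_text_nodes($callback, $child);
        }
    }
}

$document = new \DOMDocument();
$document->loadXML('<root><item>foo</item><item>bar</item></root>');
replace_text_nodes('strtoupper', $document);
echo $document->saveXML($document->documentElement);
// prints <root><item>FOO</item><item>BAR</item></root>
```

Because the condition itself tells both PHP and the analyser that `$child` is a `\DOMText`, neither the `@var` docblock nor the suppressed inspection is needed.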

                    $oldText = self::putReplacedBackToPreserveHtmlEntities($child->wholeText);
                    $newText = $callback($oldText);
                    if ($domNode->ownerDocument) {
                        $newTextNode = $domNode->ownerDocument->createTextNode(self::replaceToPreserveHtmlEntities($newText));
                        $domNode->replaceChild($newTextNode, $child);
                    }
                } else {
                    $this->replaceTextWithCallback($callback, $child);
                }
            }
        }
    }
}