Passed
Push — master ( d10009...12b94f )
by Gilles
02:00
created

Dom   F

Complexity

Total Complexity 82

Size/Duplication

Total Lines 814
Duplicated Lines 0 %

Test Coverage

Coverage 92.25%

Importance

Changes 7
Bugs 0 Features 0
Metric Value
eloc 280
dl 0
loc 814
ccs 250
cts 271
cp 0.9225
rs 2
c 7
b 0
f 0
wmc 82

28 Methods

Rating  Name                    Duplication  Size  Complexity
A       loadFromFile()          0            3     1
A       __toString()            0            3     1
A       __get()                 0            3     1
A       loadStr()               0            18    1
A       load()                  0            13    4
A       setOptions()            0            5     1
A       loadFromUrl()           0            9     2
A       addSelfClosingTag()     0            10    3
A       find()                  0            12    2
A       getChildren()           0            5     1
A       removeNoSlashTag()      0            8     2
B       clean()                 0            55    8
A       removeSelfClosingTag()  0            8     2
D       parseTag()              0            148   21
A       getElementsByClass()    0            5     1
A       getElementById()        0            5     1
A       isLoaded()              0            4     2
A       addNoSlashTag()         0            10    3
A       clearNoSlashTags()      0            5     1
A       lastChild()             0            5     1
A       clearSelfClosingTags()  0            5     1
A       countChildren()         0            5     1
A       firstChild()            0            5     1
A       detectCharset()         0            42    5
A       getElementsByTag()      0            5     1
A       findById()              0            5     1
C       parse()                 0            54    12
A       hasChildren()           0            5     1

How to fix: Complexity

Complex Class

Complex classes like Dom often do a lot of different things. To break such a class down, we need to identify a cohesive component within it. A common way to find such a component is to look for fields and methods that share a prefix or suffix.
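One way to make the shared-name heuristic concrete is to bucket the method names from the table above by the keyword they contain. The sketch below is only an illustration: groupByKeyword() is a hypothetical helper, not part of PHPHtmlParser, and the keyword list is an assumption chosen by reading the method names.

```php
<?php declare(strict_types=1);

// A subset of Dom's method names, copied from the metrics table.
$methods = [
    'addSelfClosingTag', 'removeSelfClosingTag', 'clearSelfClosingTags',
    'addNoSlashTag', 'removeNoSlashTag', 'clearNoSlashTags',
    'load', 'loadFromFile', 'loadFromUrl', 'loadStr',
    'find', 'parse',
];

/**
 * Hypothetical helper (not part of PHPHtmlParser): buckets each name under
 * the first keyword it contains; names matching no keyword are skipped.
 */
function groupByKeyword(array $names, array $keywords): array
{
    $groups = [];
    foreach ($names as $name) {
        foreach ($keywords as $keyword) {
            if (strpos($name, $keyword) !== false) {
                $groups[$keyword][] = $name;
                break;
            }
        }
    }

    return $groups;
}

$groups = groupByKeyword($methods, ['SelfClosingTag', 'NoSlashTag', 'load']);

foreach ($groups as $keyword => $names) {
    echo $keyword, ': ', implode(', ', $names), PHP_EOL;
}
```

Each resulting bucket is a candidate component: the two tag-list buckets touch only $selfClosing and $noSlash, which suggests they could move out of Dom together.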

Once you have determined the fields that belong together, you can apply the Extract Class refactoring. If the component makes sense as a sub-class, Extract Subclass is also a candidate, and is often faster.
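As a rough sketch of what Extract Class could look like here, the self-closing and no-slash tag lists might move into their own small class. TagRegistry and its method names are hypothetical and are not part of PHPHtmlParser; the default tag list is shortened for the example.

```php
<?php declare(strict_types=1);

/**
 * Hypothetical class extracted from Dom (not part of PHPHtmlParser):
 * owns the self-closing and no-slash tag lists.
 */
final class TagRegistry
{
    /** @var string[] */
    private $selfClosing;

    /** @var string[] */
    private $noSlash = [];

    public function __construct(array $selfClosing = ['br', 'hr', 'img'])
    {
        $this->selfClosing = $selfClosing;
    }

    /** @param string|string[] $tag */
    public function addSelfClosing($tag): self
    {
        // (array) wraps a single tag name and leaves an array untouched
        $merged = array_merge($this->selfClosing, (array) $tag);
        $this->selfClosing = array_values(array_unique($merged));

        return $this;
    }

    /** @param string|string[] $tag */
    public function removeSelfClosing($tag): self
    {
        $this->selfClosing = array_values(array_diff($this->selfClosing, (array) $tag));

        return $this;
    }

    /** @param string|string[] $tag */
    public function addNoSlash($tag): self
    {
        $merged = array_merge($this->noSlash, (array) $tag);
        $this->noSlash = array_values(array_unique($merged));

        return $this;
    }

    public function isSelfClosing(string $tag): bool
    {
        return in_array(strtolower($tag), $this->selfClosing, true);
    }

    public function isNoSlash(string $tag): bool
    {
        return in_array(strtolower($tag), $this->noSlash, true);
    }
}

$registry = (new TagRegistry())
    ->addSelfClosing(['meta', 'br'])
    ->addNoSlash('br');
```

With a class like this, Dom would keep thin delegating methods (or expose the registry directly), and parseTag() would ask the registry instead of calling in_array() on its own properties.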

While breaking up the class, it is a good idea to analyze how other classes use Dom, and based on these observations, apply Extract Interface, too.
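A minimal sketch of Extract Interface, assuming callers mostly use the read-only query methods. QueryableDocument, ArrayDocument, and countMatches() are hypothetical names used only for illustration; the stand-in implementation matches tag names by equality and does not parse CSS selectors.

```php
<?php declare(strict_types=1);

/**
 * Hypothetical interface (not part of PHPHtmlParser) covering the
 * query methods that callers of Dom are assumed to rely on.
 */
interface QueryableDocument
{
    /** @return mixed */
    public function find(string $selector, ?int $nth = null);

    public function countChildren(): int;

    public function hasChildren(): bool;
}

/**
 * Stand-in implementation over a flat list of tag names, used here only
 * to show that callers can depend on the interface alone.
 */
final class ArrayDocument implements QueryableDocument
{
    /** @var string[] */
    private $tags;

    public function __construct(array $tags)
    {
        $this->tags = $tags;
    }

    public function find(string $selector, ?int $nth = null)
    {
        $matches = array_values(array_filter(
            $this->tags,
            function (string $tag) use ($selector): bool {
                return $tag === $selector;
            }
        ));

        // mimic "all matches" vs. "nth match or null"
        return $nth === null ? $matches : ($matches[$nth] ?? null);
    }

    public function countChildren(): int
    {
        return count($this->tags);
    }

    public function hasChildren(): bool
    {
        return $this->tags !== [];
    }
}

// Caller code type-hints the interface, not a concrete class.
function countMatches(QueryableDocument $doc, string $selector): int
{
    return count($doc->find($selector));
}

$doc = new ArrayDocument(['p', 'a', 'p']);
```

Code written against such an interface can be tested with a light stand-in like ArrayDocument, and Dom itself could later implement the same interface.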

<?php declare(strict_types=1);
namespace PHPHtmlParser;

use PHPHtmlParser\Dom\AbstractNode;
use PHPHtmlParser\Dom\Collection;
use PHPHtmlParser\Dom\HtmlNode;
use PHPHtmlParser\Dom\TextNode;
use PHPHtmlParser\Exceptions\ChildNotFoundException;
use PHPHtmlParser\Exceptions\CircularException;
use PHPHtmlParser\Exceptions\CurlException;
use PHPHtmlParser\Exceptions\NotLoadedException;
use PHPHtmlParser\Exceptions\ParentNotFoundException;
use PHPHtmlParser\Exceptions\StrictException;
use PHPHtmlParser\Exceptions\UnknownChildTypeException;
use stringEncode\Encode;

/**
 * Class Dom
 *
 * @package PHPHtmlParser
 */
class Dom
{

    /**
     * The charset we would like the output to be in.
     *
     * @var string
     */
    protected $defaultCharset = 'UTF-8';

    /**
     * Contains the root node of this dom tree.
     *
     * @var HtmlNode
     */
    public $root;

    /**
     * The raw version of the document string.
     *
     * @var string
     */
    protected $raw;

    /**
     * The document string.
     *
     * @var Content
     */
    protected $content = null;

    /**
     * The original file size of the document.
     *
     * @var int
     */
    protected $rawSize;

    /**
     * The size of the document after it is cleaned.
     *
     * @var int
     */
    protected $size;

    /**
     * A global options array to be used by all load calls.
     *
     * @var array
     */
    protected $globalOptions = [];

    /**
     * A persistent option object to be used for all options in the
     * parsing of the file.
     *
     * @var Options
     */
    protected $options;

    /**
     * A list of tags which will always be self closing
     *
     * @var array
     */
    protected $selfClosing = [
        'area',
        'base',
        'basefont',
        'br',
        'col',
        'embed',
        'hr',
        'img',
        'input',
        'keygen',
        'link',
        'meta',
        'param',
        'source',
        'spacer',
        'track',
        'wbr'
    ];

    /**
     * A list of tags where there should be no /> at the end (html5 style)
     *
     * @var array
     */
    protected $noSlash = [];

    /**
     * Returns the inner html of the root node.
     *
     * @return string
     * @throws ChildNotFoundException
     * @throws UnknownChildTypeException
     */
    public function __toString(): string
    {
        return $this->root->innerHtml();
    }

    /**
     * A simple wrapper around the root node.
     *
     * @param string $name
     * @return mixed
     */
    public function __get($name)
    {
        return $this->root->$name;
    }

    /**
     * Attempts to load the dom from any resource, string, file, or URL.
     * @param string $str
     * @param array  $options
     * @return Dom
     * @throws ChildNotFoundException
     * @throws CircularException
     * @throws CurlException
     * @throws StrictException
     */
    public function load(string $str, array $options = []): Dom
    {
        AbstractNode::resetCount();
        // check if it's a file
        if (strpos($str, "\n") === false && is_file($str)) {
            return $this->loadFromFile($str, $options);
        }
        // check if it's a url
        if (preg_match("/^https?:\/\//i", $str)) {
            return $this->loadFromUrl($str, $options);
        }

        return $this->loadStr($str, $options);
    }

    /**
     * Loads the dom from a document file/url
     * @param string $file
     * @param array  $options
     * @return Dom
     * @throws ChildNotFoundException
     * @throws CircularException
     * @throws StrictException
     */
    public function loadFromFile(string $file, array $options = []): Dom
    {
        return $this->loadStr(file_get_contents($file), $options);
    }

    /**
     * Use a curl interface implementation to attempt to load
     * the content from a url.
     * @param string             $url
     * @param array              $options
     * @param CurlInterface|null $curl
     * @return Dom
     * @throws ChildNotFoundException
     * @throws CircularException
     * @throws CurlException
     * @throws StrictException
     */
    public function loadFromUrl(string $url, array $options = [], CurlInterface $curl = null): Dom
    {
        if (is_null($curl)) {
            // use the default curl interface
            $curl = new Curl;
        }
        $content = $curl->get($url, $options);

        return $this->loadStr($content, $options);
    }

    /**
     * Parses the html of the given string. Used for load(), loadFromFile(),
     * and loadFromUrl().
     * @param string $str
     * @param array  $option
     * @return Dom
     * @throws ChildNotFoundException
     * @throws CircularException
     * @throws StrictException
     */
    public function loadStr(string $str, array $option = []): Dom
    {
        $this->options = new Options;
        $this->options->setOptions($this->globalOptions)
                      ->setOptions($option);

        $this->rawSize = strlen($str);
        $this->raw     = $str;

        $html = $this->clean($str);

        // measure the cleaned document, not the raw input
        $this->size    = strlen($html);
        $this->content = new Content($html);

        $this->parse();
        $this->detectCharset();

        return $this;
    }

    /**
     * Sets a global options array to be used by all load calls.
     *
     * @param array $options
     * @return Dom
     * @chainable
     */
    public function setOptions(array $options): Dom
    {
        $this->globalOptions = $options;

        return $this;
    }

    /**
     * Find elements by css selector on the root node.
     * @param string   $selector
     * @param int|null $nth
     * @return mixed|Collection|null
     * @throws ChildNotFoundException
     * @throws NotLoadedException
     */
    public function find(string $selector, int $nth = null)
    {
        $this->isLoaded();

        $depthFirstSearch = $this->options->get('depthFirstSearch');
        if (is_bool($depthFirstSearch)) {
            $result = $this->root->find($selector, $nth, $depthFirstSearch);
        } else {
            $result = $this->root->find($selector, $nth);
        }

        return $result;
    }

    /**
     * Find element by Id on the root node
     * @param int $id
     * @return bool|AbstractNode
     * @throws ChildNotFoundException
     * @throws NotLoadedException
     * @throws ParentNotFoundException
     */
    public function findById(int $id)
    {
        $this->isLoaded();

        return $this->root->findById($id);
    }

    /**
     * Adds the tag (or tags in an array) to the list of tags that will always
     * be self closing.
     *
     * @param string|array $tag
     * @return Dom
     * @chainable
     */
    public function addSelfClosingTag($tag): Dom
    {
        if ( ! is_array($tag)) {
            $tag = [$tag];
        }
        foreach ($tag as $value) {
            $this->selfClosing[] = $value;
        }

        return $this;
    }

    /**
     * Removes the tag (or tags in an array) from the list of tags that will
     * always be self closing.
     *
     * @param string|array $tag
     * @return Dom
     * @chainable
     */
    public function removeSelfClosingTag($tag): Dom
    {
        if ( ! is_array($tag)) {
            $tag = [$tag];
        }
        $this->selfClosing = array_diff($this->selfClosing, $tag);

        return $this;
    }

    /**
     * Sets the list of self closing tags to empty.
     *
     * @return Dom
     * @chainable
     */
    public function clearSelfClosingTags(): Dom
    {
        $this->selfClosing = [];

        return $this;
    }

    /**
     * Adds a tag to the list of self closing tags that should not have a trailing slash
     *
     * @param $tag
     * @return Dom
     * @chainable
     */
    public function addNoSlashTag($tag): Dom
    {
        if ( ! is_array($tag)) {
            $tag = [$tag];
        }
        foreach ($tag as $value) {
            $this->noSlash[] = $value;
        }

        return $this;
    }

    /**
     * Removes a tag from the list of no-slash tags.
     *
     * @param $tag
     * @return Dom
     * @chainable
     */
    public function removeNoSlashTag($tag): Dom
    {
        if ( ! is_array($tag)) {
            $tag = [$tag];
        }
        $this->noSlash = array_diff($this->noSlash, $tag);

        return $this;
    }

    /**
     * Empties the list of no-slash tags.
     *
     * @return Dom
     * @chainable
     */
    public function clearNoSlashTags(): Dom
    {
        $this->noSlash = [];

        return $this;
    }

    /**
     * Simple wrapper function that returns the first child.
     * @return AbstractNode
     * @throws ChildNotFoundException
     * @throws NotLoadedException
     */
    public function firstChild(): AbstractNode
    {
        $this->isLoaded();

        return $this->root->firstChild();
    }

    /**
     * Simple wrapper function that returns the last child.
     * @return AbstractNode
     * @throws ChildNotFoundException
     * @throws NotLoadedException
     */
    public function lastChild(): AbstractNode
    {
        $this->isLoaded();

        return $this->root->lastChild();
    }

    /**
     * Simple wrapper function that returns the count of child elements.
     *
     * @return int
     * @throws NotLoadedException
     */
    public function countChildren(): int
    {
        $this->isLoaded();

        return $this->root->countChildren();
    }

    /**
     * Gets the array of children.
     *
     * @return array
     * @throws NotLoadedException
     */
    public function getChildren(): array
    {
        $this->isLoaded();

        return $this->root->getChildren();
    }

    /**
     * Checks whether the root node has child nodes.
     *
     * @return bool
     * @throws NotLoadedException
     */
    public function hasChildren(): bool
    {
        $this->isLoaded();

        return $this->root->hasChildren();
    }

    /**
     * Simple wrapper function that returns an element by the
     * id.
     * @param $id
     * @return mixed|Collection|null
     * @throws ChildNotFoundException
     * @throws NotLoadedException
     */
    public function getElementById($id)
    {
        $this->isLoaded();

        return $this->find('#'.$id, 0);
    }

    /**
     * Simple wrapper function that returns all elements by
     * tag name.
     * @param string $name
     * @return mixed|Collection|null
     * @throws ChildNotFoundException
     * @throws NotLoadedException
     */
    public function getElementsByTag(string $name)
    {
        $this->isLoaded();

        return $this->find($name);
    }

    /**
     * Simple wrapper function that returns all elements by
     * class name.
     * @param string $class
     * @return mixed|Collection|null
     * @throws ChildNotFoundException
     * @throws NotLoadedException
     */
    public function getElementsByClass(string $class)
    {
        $this->isLoaded();

        return $this->find('.'.$class);
    }

    /**
     * Checks if the load methods have been called.
     *
     * @throws NotLoadedException
     */
    protected function isLoaded(): void
    {
        if (is_null($this->content)) {
            throw new NotLoadedException('Content is not loaded!');
        }
    }

    /**
     * Cleans the html of any non-html information.
     *
     * @param string $str
     * @return string
     */
    protected function clean(string $str): string
    {
        if ($this->options->get('cleanupInput') != true) {
            // skip entire cleanup step
            return $str;
        }

        $is_gzip = 0 === mb_strpos($str, "\x1f" . "\x8b" . "\x08", 0, "US-ASCII");
        if ($is_gzip) {
            $str = gzdecode($str);
        }

        // remove white space before closing tags
        $str = mb_eregi_replace("'\s+>", "'>", $str);
        $str = mb_eregi_replace('"\s+>', '">', $str);

        // clean out the \n\r
        $replace = ' ';
        if ($this->options->get('preserveLineBreaks')) {
            $replace = '&#10;';
        }
        $str = str_replace(["\r\n", "\r", "\n"], $replace, $str);

        // strip the doctype
        $str = mb_eregi_replace("<!doctype(.*?)>", '', $str);

        // strip out comments
        $str = mb_eregi_replace("<!--(.*?)-->", '', $str);

        // strip out cdata
        $str = mb_eregi_replace("<!\[CDATA\[(.*?)\]\]>", '', $str);

        // strip out <script> tags
        if ($this->options->get('removeScripts')) {
            $str = mb_eregi_replace("<\s*script[^>]*[^/]>(.*?)<\s*/\s*script\s*>", '', $str);
            $str = mb_eregi_replace("<\s*script\s*>(.*?)<\s*/\s*script\s*>", '', $str);
        }

        // strip out <style> tags
        if ($this->options->get('removeStyles')) {
            $str = mb_eregi_replace("<\s*style[^>]*[^/]>(.*?)<\s*/\s*style\s*>", '', $str);
            $str = mb_eregi_replace("<\s*style\s*>(.*?)<\s*/\s*style\s*>", '', $str);
        }

        // strip out server side scripts
        if ($this->options->get('serverSideScripts')) {
            $str = mb_eregi_replace("(<\?)(.*?)(\?>)", '', $str);
        }

        // strip smarty scripts
        if ($this->options->get('removeSmartyScripts')) {
            $str = mb_eregi_replace("(\{\w)(.*?)(\})", '', $str);
        }

        return $str;
    }

    /**
     * Attempts to parse the html in content.
     *
     * @return void
     * @throws ChildNotFoundException
     * @throws CircularException
     * @throws StrictException
     */
    protected function parse(): void
    {
        // add the root node
        $this->root = new HtmlNode('root');
        $this->root->setHtmlSpecialCharsDecode($this->options->htmlSpecialCharsDecode);
        $activeNode = $this->root;
        while ( ! is_null($activeNode)) {
            $str = $this->content->copyUntil('<');
            if ($str == '') {
                $info = $this->parseTag();
                if ( ! $info['status']) {
                    // we are done here
                    $activeNode = null;
                    continue;
                }

                // check if it was a closing tag
                if ($info['closing']) {
                    $foundOpeningTag  = true;
                    $originalNode     = $activeNode;
                    while ($activeNode->getTag()->name() != $info['tag']) {
                        $activeNode = $activeNode->getParent();
                        if (is_null($activeNode)) {
                            // we could not find opening tag
                            $activeNode = $originalNode;
                            $foundOpeningTag = false;
                            break;
                        }
                    }
                    if ($foundOpeningTag) {
                        $activeNode = $activeNode->getParent();
                    }
                    continue;
                }

                if ( ! isset($info['node'])) {
                    continue;
                }

                /** @var AbstractNode $node */
                $node = $info['node'];
                $activeNode->addChild($node);

                // check if node is self closing
                if ( ! $node->getTag()->isSelfClosing()) {
                    $activeNode = $node;
                }
            } else if ($this->options->whitespaceTextNode ||
                trim($str) != ''
            ) {
                // we found text we care about
                $textNode = new TextNode($str, $this->options->removeDoubleSpace);
                $textNode->setHtmlSpecialCharsDecode($this->options->htmlSpecialCharsDecode);
                $activeNode->addChild($textNode);
            }
        }
    }

    /**
     * Attempt to parse a tag out of the content.
     *
     * @return array
     * @throws StrictException
     */
    protected function parseTag(): array
    {
        $return = [
            'status'  => false,
            'closing' => false,
            'node'    => null,
        ];
        if ($this->content->char() != '<') {
            // we are not at the beginning of a tag
            return $return;
        }

        // check if this is a closing tag
        if ($this->content->fastForward(1)->char() == '/') {
            // end tag
            $tag = $this->content->fastForward(1)
                                 ->copyByToken('slash', true);
            // move to end of tag
            $this->content->copyUntil('>');
            $this->content->fastForward(1);

            // check if this closing tag counts
            $tag = strtolower($tag);
            if (in_array($tag, $this->selfClosing)) {
                $return['status'] = true;

                return $return;
            } else {
                $return['status']  = true;
                $return['closing'] = true;
                $return['tag']     = strtolower($tag);
            }

            return $return;
        }

        $tag  = strtolower($this->content->copyByToken('slash', true));
        if (trim($tag) == '')
        {
            // no tag found, invalid < found
            return $return;
        }
        $node = new HtmlNode($tag);
        $node->setHtmlSpecialCharsDecode($this->options->htmlSpecialCharsDecode);

        // attributes
        while ($this->content->char() != '>' &&
            $this->content->char() != '/') {
            $space = $this->content->skipByToken('blank', true);
            if (empty($space)) {
                $this->content->fastForward(1);
                continue;
            }

            $name = $this->content->copyByToken('equal', true);
            if ($name == '/') {
                break;
            }

            if (empty($name)) {
                $this->content->skipByToken('blank');
                continue;
            }

            $this->content->skipByToken('blank');
            if ($this->content->char() == '=') {
                $attr = [];
                $this->content->fastForward(1)
                              ->skipByToken('blank');
                switch ($this->content->char()) {
                    case '"':
                        $attr['doubleQuote'] = true;
                        $this->content->fastForward(1);
                        $string = $this->content->copyUntil('"', true, true);
                        do {
                            $moreString = $this->content->copyUntilUnless('"', '=>');
                            $string .= $moreString;
                        } while ( ! empty($moreString));
                        $attr['value'] = $string;
                        $this->content->fastForward(1);
                        $node->getTag()->$name = $attr;
                        break;
                    case "'":
                        $attr['doubleQuote'] = false;
                        $this->content->fastForward(1);
                        $string = $this->content->copyUntil("'", true, true);
                        do {
                            $moreString = $this->content->copyUntilUnless("'", '=>');
                            $string .= $moreString;
                        } while ( ! empty($moreString));
                        $attr['value'] = $string;
                        $this->content->fastForward(1);
                        $node->getTag()->$name = $attr;
                        break;
                    default:
                        $attr['doubleQuote']   = true;
                        $attr['value']         = $this->content->copyByToken('attr', true);
                        $node->getTag()->$name = $attr;
                        break;
                }
            } else {
                // no value attribute
                if ($this->options->strict) {
                    // can't have this in strict html
                    $character = $this->content->getPosition();
                    throw new StrictException("Tag '$tag' has an attribute '$name' with out a value! (character #$character)");
                }
                $node->getTag()->$name = [
                    'value'       => null,
                    'doubleQuote' => true,
                ];
                if ($this->content->char() != '>') {
                    $this->content->rewind(1);
                }
            }
        }

        $this->content->skipByToken('blank');
        $tag = strtolower($tag);
        if ($this->content->char() == '/') {
            // self closing tag
            $node->getTag()->selfClosing();
            $this->content->fastForward(1);
        } elseif (in_array($tag, $this->selfClosing)) {

            // Should be a self closing tag, check if we are strict
            if ($this->options->strict) {
                $character = $this->content->getPosition();
                throw new StrictException("Tag '$tag' is not self closing! (character #$character)");
            }

            // We force self closing on this tag.
            $node->getTag()->selfClosing();

            // Should this tag use a trailing slash?
            if(in_array($tag, $this->noSlash))
            {
                $node->getTag()->noTrailingSlash();
            }

        }

        $this->content->fastForward(1);

        $return['status'] = true;
        $return['node']   = $node;

        return $return;
    }

    /**
     * Attempts to detect the charset that the html was sent in.
     *
     * @return bool
     * @throws ChildNotFoundException
     */
    protected function detectCharset(): bool
    {
        // set the default
        $encode = new Encode;
        $encode->from($this->defaultCharset);
        $encode->to($this->defaultCharset);

        if ( ! is_null($this->options->enforceEncoding)) {
            //  they want to enforce the given encoding
            $encode->from($this->options->enforceEncoding);
            $encode->to($this->options->enforceEncoding);

            return false;
        }

        /** @var AbstractNode $meta */
        $meta = $this->root->find('meta[http-equiv=Content-Type]', 0);
        if (is_null($meta)) {
            // could not find meta tag
            $this->root->propagateEncoding($encode);

            return false;
        }
        $content = $meta->getAttribute('content');
Are you sure the assignment to $content is correct as $meta->getAttribute('content') targeting PHPHtmlParser\Dom\AbstractNode::getAttribute() seems to always return null.

This check looks for function or method calls that always return null and whose return value is assigned to a variable.

class A
{
    function getObject()
    {
        return null;
    }

}

$a = new A();
$object = $a->getObject();

The method getObject() can return nothing but null, so it makes no sense to assign that value to a variable.

The reason is most likely that a function or method is incomplete or has been reduced for debug purposes.

        if (is_null($content)) {
The condition is_null($content) is always true.
            // could not find content
            $this->root->propagateEncoding($encode);

            return false;
        }
        $matches = [];
        if (preg_match('/charset=(.+)/', $content, $matches)) {
            $encode->from(trim($matches[1]));
            $this->root->propagateEncoding($encode);

            return true;
        }

        // no charset found
        $this->root->propagateEncoding($encode);

        return false;
    }
}