Completed
Pull Request — master (#164)
by Asmir
03:33
created

DOMTreeBuilder   F

Complexity

Total Complexity 108

Size/Duplication

Total Lines 671
Duplicated Lines 0 %

Coupling/Cohesion

Dependencies 3

Test Coverage

Coverage 85.04%

Importance

Changes 0
Metric  Value
wmc     108
cbo     3
dl      0
loc     671
ccs     233
cts     274
cp      0.8504
rs      1.929
c       0
b       0
f       0

19 Methods

Rating  Name                       Duplication  Size  Complexity
B       __construct()              0            39    5
A       document()                 0            4     1
A       fragment()                 0            4     1
A       setInstructionProcessor()  0            4     1
A       doctype()                  0            14    2
F       startTag()                 0            202   62
C       endTag()                   0            68    13
A       comment()                  0            6     1
A       text()                     0            20    3
A       eof()                      0            4     1
A       parseError()               0            4     1
A       getErrors()                0            4     1
A       cdata()                    0            5     1
A       processingInstruction()    0            22    5
A       normalizeTagName()         0            7     1
A       quirksTreeResolver()       0            4     1
A       autoclose()                0            16    4
A       isAncestor()               0            12    3
A       isParent()                 0            4     1

How to fix   Complexity   

Complex Class

Complex classes like DOMTreeBuilder often do a lot of different things. To break such a class down, we need to identify a cohesive component within that class. A common way to find such a component is to look for fields and methods that share a prefix or suffix.

Once you have determined the fields that belong together, you can apply the Extract Class refactoring. If the component makes sense as a sub-class, Extract Subclass is also a candidate, and is often faster.
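
As a rough illustration, the namespace-related fields in this class ($nsRoots, $nsStack, $implicitNamespaces, $pushes) look like one such component. The sketch below shows what an extracted class could look like. The NamespaceStack name and its push(), pop() and resolve() methods are hypothetical and not part of the library; only the prepend-and-inherit behaviour mirrors what the listing does with array_unshift().

```php
<?php

// Hypothetical helper, not part of Masterminds\HTML5. It groups the scope
// handling that DOMTreeBuilder currently performs inline on $nsStack.
final class NamespaceStack
{
    /** @var array[] Scopes, innermost first; each maps prefix => namespace URI. */
    private $scopes;

    public function __construct(array $initial)
    {
        $this->scopes = array($initial);
    }

    /** Open a scope that overrides one prefix and inherits the rest. */
    public function push($prefix, $uri)
    {
        array_unshift($this->scopes, array($prefix => $uri) + $this->scopes[0]);
    }

    /** Close the $count innermost scopes, never dropping the initial one. */
    public function pop($count = 1)
    {
        while ($count-- > 0 && count($this->scopes) > 1) {
            array_shift($this->scopes);
        }
    }

    /** Resolve a prefix ('' is the default namespace), or null if unbound. */
    public function resolve($prefix)
    {
        return isset($this->scopes[0][$prefix]) ? $this->scopes[0][$prefix] : null;
    }
}

// Usage: entering <svg> switches the default namespace, leaving it restores HTML.
$ns = new NamespaceStack(array('' => 'http://www.w3.org/1999/xhtml'));
$ns->push('', 'http://www.w3.org/2000/svg');
$inner = $ns->resolve('');
$ns->pop();
$outer = $ns->resolve('');
```

With something along these lines, startTag() and endTag() would call push() and pop() instead of manipulating the array directly, which is where much of their complexity score comes from.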

While breaking up the class, it is a good idea to analyze how other classes use DOMTreeBuilder, and based on these observations, apply Extract Interface, too.
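
A minimal sketch of Extract Interface, assuming some callers only read the finished document and the collected errors. The TreeResult interface, the StaticTreeResult stand-in and the summarize() function are all hypothetical names used for illustration; document() and getErrors() are the only method names taken from the listing.

```php
<?php

// Hypothetical interface: callers that only read the result depend on this
// rather than on the full tree builder class.
interface TreeResult
{
    /** @return \DOMDocument */
    public function document();

    /** @return string[] */
    public function getErrors();
}

// Stand-in implementation for illustration; in practice the builder itself
// would declare "implements TreeResult".
final class StaticTreeResult implements TreeResult
{
    private $doc;
    private $errors;

    public function __construct(\DOMDocument $doc, array $errors)
    {
        $this->doc = $doc;
        $this->errors = $errors;
    }

    public function document()
    {
        return $this->doc;
    }

    public function getErrors()
    {
        return $this->errors;
    }
}

// A consumer written against the narrow interface only.
function summarize(TreeResult $result)
{
    return sprintf(
        '%d node(s), %d error(s)',
        $result->document()->childNodes->length,
        count($result->getErrors())
    );
}

$doc = new \DOMDocument();
$doc->appendChild($doc->createElement('html'));
$summary = summarize(new StaticTreeResult($doc, array('Line 0, Col 0: No DOCTYPE specified.')));
```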

<?php

namespace Masterminds\HTML5\Parser;

use Masterminds\HTML5\Elements;
use Masterminds\HTML5\InstructionProcessor;

/**
 * Create an HTML5 DOM tree from events.
 *
 * This attempts to create a DOM from events emitted by a parser. This
 * attempts (but does not guarantee) to up-convert older HTML documents
 * to HTML5. It does this by applying HTML5's rules, but it will not
 * change the architecture of the document itself.
 *
 * Many of the error correction and quirks features suggested in the specification
 * are implemented herein; however, not all of them are. Since we do not
 * assume a graphical user agent, no presentation-specific logic is conducted
 * during tree building.
 *
 * FIXME: The present tree builder does not exactly follow the state machine rules
 * for insert modes as outlined in the HTML5 spec. The processor needs to be
 * re-written to accommodate this. See, for example, the Go language HTML5
 * parser.
 */
class DOMTreeBuilder implements EventHandler
{
    /**
     * Defined in http://www.w3.org/TR/html51/infrastructure.html#html-namespace-0.
     */
    const NAMESPACE_HTML = 'http://www.w3.org/1999/xhtml';

    const NAMESPACE_MATHML = 'http://www.w3.org/1998/Math/MathML';

    const NAMESPACE_SVG = 'http://www.w3.org/2000/svg';

    const NAMESPACE_XLINK = 'http://www.w3.org/1999/xlink';

    const NAMESPACE_XML = 'http://www.w3.org/XML/1998/namespace';

    const NAMESPACE_XMLNS = 'http://www.w3.org/2000/xmlns/';

    const OPT_DISABLE_HTML_NS = 'disable_html_ns';

    const OPT_TARGET_DOC = 'target_document';

    const OPT_IMPLICIT_NS = 'implicit_namespaces';

    /**
     * Holds the HTML5 element names that cause a namespace switch.
     *
     * @var array
     */
    protected $nsRoots = array(
        'html' => self::NAMESPACE_HTML,
        'svg' => self::NAMESPACE_SVG,
        'math' => self::NAMESPACE_MATHML,
    );

    /**
     * Holds the always available namespaces (which do not require the XMLNS declaration).
     *
     * @var array
     */
    protected $implicitNamespaces = array(
        'xml' => self::NAMESPACE_XML,
        'xmlns' => self::NAMESPACE_XMLNS,
        'xlink' => self::NAMESPACE_XLINK,
    );

    /**
     * Holds a stack of currently active namespaces.
     *
     * @var array
     */
    protected $nsStack = array();

    /**
     * Holds the number of namespaces declared by a node.
     *
     * @var array
     */
    protected $pushes = array();

    /**
     * Defined in 8.2.5.
     */
    const IM_INITIAL = 0;

    const IM_BEFORE_HTML = 1;

    const IM_BEFORE_HEAD = 2;

    const IM_IN_HEAD = 3;

    const IM_IN_HEAD_NOSCRIPT = 4;

    const IM_AFTER_HEAD = 5;

    const IM_IN_BODY = 6;

    const IM_TEXT = 7;

    const IM_IN_TABLE = 8;

    const IM_IN_TABLE_TEXT = 9;

    const IM_IN_CAPTION = 10;

    const IM_IN_COLUMN_GROUP = 11;

    const IM_IN_TABLE_BODY = 12;

    const IM_IN_ROW = 13;

    const IM_IN_CELL = 14;

    const IM_IN_SELECT = 15;

    const IM_IN_SELECT_IN_TABLE = 16;

    const IM_AFTER_BODY = 17;

    const IM_IN_FRAMESET = 18;

    const IM_AFTER_FRAMESET = 19;

    const IM_AFTER_AFTER_BODY = 20;

    const IM_AFTER_AFTER_FRAMESET = 21;

    const IM_IN_SVG = 22;

    const IM_IN_MATHML = 23;

    protected $options = array();

    protected $stack = array();

    protected $current; // Pointer in the tag hierarchy.
    protected $rules;
    protected $doc;

    protected $frag;

    protected $processor;

    protected $insertMode = 0;

    /**
     * Track if we are in an element that allows only inline child nodes.
     *
     * @var string|null
     */
    protected $onlyInline;

    /**
     * Quirks mode is enabled by default.
     * Any document that is missing the DT will be considered to be in quirks mode.
     */
    protected $quirks = true;

    protected $errors = array();

    public function __construct($isFragment = false, array $options = array())
    {
        $this->options = $options;

        if (isset($options[self::OPT_TARGET_DOC])) {
            $this->doc = $options[self::OPT_TARGET_DOC];
        } else {
            $impl = new \DOMImplementation();
            // XXX:
            // Create the doctype. For now, we are always creating HTML5
            // documents, and attempting to up-convert any older DTDs to HTML5.
            $dt = $impl->createDocumentType('html');
            // $this->doc = \DOMImplementation::createDocument(NULL, 'html', $dt);
            $this->doc = $impl->createDocument(null, null, $dt);
        }

        $this->errors = array();

        $this->current = $this->doc; // ->documentElement;

        // Create a rules engine for tags.
        $this->rules = new TreeBuildingRules();

        $implicitNS = array();
        if (isset($this->options[self::OPT_IMPLICIT_NS])) {
            $implicitNS = $this->options[self::OPT_IMPLICIT_NS];
        } elseif (isset($this->options['implicitNamespaces'])) {
            $implicitNS = $this->options['implicitNamespaces'];
        }

        // Fill $nsStack with the default HTML5 namespaces, plus the "implicitNamespaces" array taken from $options.
        array_unshift($this->nsStack, $implicitNS + array('' => self::NAMESPACE_HTML) + $this->implicitNamespaces);

        if ($isFragment) {
            $this->insertMode = static::IM_IN_BODY;
            $this->frag = $this->doc->createDocumentFragment();
            $this->current = $this->frag;
        }
    }

    /**
     * Get the document.
     */
    public function document()
    {
        return $this->doc;
    }

    /**
     * Get the DOM fragment for the body.
     *
     * This returns a DOMDocumentFragment because a fragment may have zero or more
     * DOMNodes at its root.
     *
     * @see http://www.w3.org/TR/2012/CR-html5-20121217/syntax.html#concept-frag-parse-context
     *
     * @return \DOMDocumentFragment
     */
    public function fragment()
    {
        return $this->frag;
    }

    /**
     * Provide an instruction processor.
     *
     * This is used for handling Processing Instructions as they are
     * inserted. If omitted, PIs are inserted directly into the DOM tree.
     *
     * @param InstructionProcessor $proc
     */
    public function setInstructionProcessor(InstructionProcessor $proc)
    {
        $this->processor = $proc;
    }

    public function doctype($name, $idType = 0, $id = null, $quirks = false)
    {
        // This is used solely for setting quirks mode. Currently we don't
        // try to preserve the inbound DT. We convert it to HTML5.
        $this->quirks = $quirks;

        if ($this->insertMode > static::IM_INITIAL) {
            $this->parseError('Illegal placement of DOCTYPE tag. Ignoring: ' . $name);

            return;
        }

        $this->insertMode = static::IM_BEFORE_HTML;
    }

    /**
     * Process the start tag.
     *
     * @todo - XMLNS namespace handling (we need to parse, even if it's not valid)
     *       - XLink, MathML and SVG namespace handling
     *       - Omission rules: 8.1.2.4 Optional tags
     *
     * @param string $name
     * @param array  $attributes
     * @param bool   $selfClosing
     *
     * @return int
     */
    public function startTag($name, $attributes = array(), $selfClosing = false)
    {
        $lname = $this->normalizeTagName($name);

        // Make sure we have an html element.
        if (!$this->doc->documentElement && 'html' !== $name && !$this->frag) {
            $this->startTag('html');
        }

        // Set quirks mode if we're at IM_INITIAL with no doctype.
        if ($this->insertMode === static::IM_INITIAL) {
            $this->quirks = true;
            $this->parseError('No DOCTYPE specified.');
        }

        // SPECIAL TAG HANDLING:
        // Spec says do this, and "don't ask."
        // find the spec where this is defined... looks problematic
        if ('image' === $name && !($this->insertMode === static::IM_IN_SVG || $this->insertMode === static::IM_IN_MATHML)) {
            $name = 'img';
        }

        // Autoclose p tags where appropriate.
        if ($this->insertMode >= static::IM_IN_BODY && Elements::isA($name, Elements::AUTOCLOSE_P)) {
            $this->autoclose('p');
        }

        // Set insert mode:
        switch ($name) {
            case 'html':
                $this->insertMode = static::IM_BEFORE_HEAD;
                break;
            case 'head':
                if ($this->insertMode > static::IM_BEFORE_HEAD) {
                    $this->parseError('Unexpected head tag outside of head context.');
                } else {
                    $this->insertMode = static::IM_IN_HEAD;
                }
                break;
            case 'body':
                $this->insertMode = static::IM_IN_BODY;
                break;
            case 'svg':
                $this->insertMode = static::IM_IN_SVG;
                break;
            case 'math':
                $this->insertMode = static::IM_IN_MATHML;
                break;
            case 'noscript':
                if ($this->insertMode === static::IM_IN_HEAD) {
                    $this->insertMode = static::IM_IN_HEAD_NOSCRIPT;
                }
                break;
        }

        // Special case handling for SVG.
        if ($this->insertMode === static::IM_IN_SVG) {
            $lname = Elements::normalizeSvgElement($lname);
        }

        $pushes = 0;
        // When we find a tag that appears in $nsRoots, we have to switch the default namespace.
        if (isset($this->nsRoots[$lname]) && $this->nsStack[0][''] !== $this->nsRoots[$lname]) {
            array_unshift($this->nsStack, array(
                '' => $this->nsRoots[$lname],
            ) + $this->nsStack[0]);
            ++$pushes;
        }
        $needsWorkaround = false;
        if (isset($this->options['xmlNamespaces']) && $this->options['xmlNamespaces']) {
            // When xmlNamespaces is true and we find an 'xmlns' or 'xmlns:*' attribute, we should add a new item to $nsStack.
            foreach ($attributes as $aName => $aVal) {
                if ('xmlns' === $aName) {
                    $needsWorkaround = $aVal;
                    array_unshift($this->nsStack, array(
                        '' => $aVal,
                    ) + $this->nsStack[0]);
                    ++$pushes;
                } elseif ('xmlns' === (($pos = strpos($aName, ':')) ? substr($aName, 0, $pos) : '')) {
                    array_unshift($this->nsStack, array(
                        substr($aName, $pos + 1) => $aVal,
                    ) + $this->nsStack[0]);
                    ++$pushes;
                }
            }
        }

        if ($this->onlyInline && Elements::isA($lname, Elements::BLOCK_TAG)) {
            $this->autoclose($this->onlyInline);
            $this->onlyInline = null;
        }

        try {
            $prefix = ($pos = strpos($lname, ':')) ? substr($lname, 0, $pos) : '';

            if (false !== $needsWorkaround) {
                $xml = "<$lname xmlns=\"$needsWorkaround\" " . (strlen($prefix) && isset($this->nsStack[0][$prefix]) ? ("xmlns:$prefix=\"" . $this->nsStack[0][$prefix] . '"') : '') . '/>';

                $frag = new \DOMDocument('1.0', 'UTF-8');
                $frag->loadXML($xml);

                $ele = $this->doc->importNode($frag->documentElement, true);
            } else {
                if (!isset($this->nsStack[0][$prefix]) || ('' === $prefix && isset($this->options[self::OPT_DISABLE_HTML_NS]) && $this->options[self::OPT_DISABLE_HTML_NS])) {
                    $ele = $this->doc->createElement($lname);
                } else {
                    $ele = $this->doc->createElementNS($this->nsStack[0][$prefix], $lname);
                }
            }
        } catch (\DOMException $e) {
            $this->parseError("Illegal tag name: <$lname>. Replaced with <invalid>.");
            $ele = $this->doc->createElement('invalid');
        }

        if (Elements::isA($lname, Elements::BLOCK_ONLY_INLINE)) {
            $this->onlyInline = $lname;
        }

        // When we add namespaces, we have to track them. Later, when endTag() is invoked, we have to remove them.
        // When we are on a void tag, we do not need to care about namespace nesting.
        if ($pushes > 0 && !Elements::isA($name, Elements::VOID_TAG)) {
            // PHP tends to free the memory used by DOM nodes. To avoid spl_object_hash collisions
            // we have to prevent garbage collection of $ele by storing it in $pushes.
            // see https://bugs.php.net/bug.php?id=67459
            $this->pushes[spl_object_hash($ele)] = array($pushes, $ele);
        }

        foreach ($attributes as $aName => $aVal) {
            // xmlns attributes can't be set
            if ('xmlns' === $aName) {
                continue;
            }

            if ($this->insertMode === static::IM_IN_SVG) {
                $aName = Elements::normalizeSvgAttribute($aName);
            } elseif ($this->insertMode === static::IM_IN_MATHML) {
                $aName = Elements::normalizeMathMlAttribute($aName);
            }

            try {
                $prefix = ($pos = strpos($aName, ':')) ? substr($aName, 0, $pos) : false;

                if ('xmlns' === $prefix) {
                    $ele->setAttributeNS(self::NAMESPACE_XMLNS, $aName, $aVal);
                } elseif (false !== $prefix && isset($this->nsStack[0][$prefix])) {
                    $ele->setAttributeNS($this->nsStack[0][$prefix], $aName, $aVal);
                } else {
                    $ele->setAttribute($aName, $aVal);
                }
            } catch (\DOMException $e) {
                $this->parseError("Illegal attribute name for tag $name. Ignoring: $aName");
                continue;
            }

            // This is necessary on a non-DTD schema, like HTML5.
            if ('id' === $aName) {
                $ele->setIdAttribute('id', true);
            }
        }

        if ($this->frag !== $this->current && $this->rules->hasRules($name)) {
            // Some elements have special processing rules. Handle those separately.
            $this->current = $this->rules->evaluate($ele, $this->current);
        } else {
            // Otherwise, it's a standard element.
            $this->current->appendChild($ele);

            if (!Elements::isA($name, Elements::VOID_TAG)) {
                $this->current = $ele;
            }

            // Self-closing tags should only be respected on foreign elements
            // (and are implied on void elements)
            // See: https://www.w3.org/TR/html5/syntax.html#start-tags
            if (Elements::isHtml5Element($name)) {
                $selfClosing = false;
            }
        }

        // This is sort of a last-ditch attempt to correct for cases where no head/body
        // elements are provided.
        if ($this->insertMode <= static::IM_BEFORE_HEAD && 'head' !== $name && 'html' !== $name) {
            $this->insertMode = static::IM_IN_BODY;
        }

        // When we are on a void tag, we do not need to care about namespace nesting,
        // but we have to remove the namespaces pushed to $nsStack.
        if ($pushes > 0 && Elements::isA($name, Elements::VOID_TAG)) {
            // Remove the namespaces defined by the current node.
            for ($i = 0; $i < $pushes; ++$i) {
                array_shift($this->nsStack);
            }
        }

        if ($selfClosing) {
            $this->endTag($name);
        }

        // Return the element mask, which the tokenizer can then use to set
        // various processing rules.
        return Elements::element($name);
    }

    public function endTag($name)
    {
        $lname = $this->normalizeTagName($name);

        // Ignore closing tags for unary elements.
        if (Elements::isA($name, Elements::VOID_TAG)) {
            return;
        }

        if ($this->insertMode <= static::IM_BEFORE_HTML) {
            // 8.2.5.4.2
            if (in_array($name, array(
                'html',
                'br',
                'head',
                'title',
            ))) {
                $this->startTag('html');
                $this->endTag($name);
                $this->insertMode = static::IM_BEFORE_HEAD;

                return;
            }

            // Ignore the tag.
            $this->parseError('Illegal closing tag at global scope.');

            return;
        }

        // Special case handling for SVG.
        if ($this->insertMode === static::IM_IN_SVG) {
            $lname = Elements::normalizeSvgElement($lname);
        }

        $cid = spl_object_hash($this->current);

        // XXX: HTML has no parent. What do we do, though,
        // if this element appears in the wrong place?
        if ('html' === $lname) {
            return;
        }

        // Remove the namespaces defined by the current node.
        if (isset($this->pushes[$cid])) {
            for ($i = 0; $i < $this->pushes[$cid][0]; ++$i) {
                array_shift($this->nsStack);
            }
            unset($this->pushes[$cid]);
        }

        if (!$this->autoclose($lname)) {
            $this->parseError('Could not find closing tag for ' . $lname);
        }

        switch ($lname) {
            case 'head':
                $this->insertMode = static::IM_AFTER_HEAD;
                break;
            case 'body':
                $this->insertMode = static::IM_AFTER_BODY;
                break;
            case 'svg':
            // startTag() enters IM_IN_MATHML on 'math', so leave it on 'math' as well.
            case 'math':
                $this->insertMode = static::IM_IN_BODY;
                break;
        }
    }

    public function comment($cdata)
    {
        // TODO: Need to handle case where comment appears outside of the HTML tag.
        $node = $this->doc->createComment($cdata);
        $this->current->appendChild($node);
    }

    public function text($data)
    {
        // XXX: Hmmm.... should we really be this strict?
        if ($this->insertMode < static::IM_IN_HEAD) {
            // Per '8.2.5.4.3 The "before head" insertion mode' the characters
            // " \t\n\r\f" should be ignored but no mention of a parse error. This is
            // practical as most documents contain these characters. Other text is not
            // expected here so recording a parse error is necessary.
            $dataTmp = trim($data, " \t\n\r\f");
            if (!empty($dataTmp)) {
                // fprintf(STDOUT, "Unexpected insert mode: %d", $this->insertMode);
                $this->parseError('Unexpected text. Ignoring: ' . $dataTmp);
            }

            return;
        }
        // fprintf(STDOUT, "Appending text %s.", $data);
        $node = $this->doc->createTextNode($data);
        $this->current->appendChild($node);
    }

    public function eof()
    {
        // If the $current isn't the $root, do we need to do anything?
    }

    public function parseError($msg, $line = 0, $col = 0)
    {
        $this->errors[] = sprintf('Line %d, Col %d: %s', $line, $col, $msg);
    }

    public function getErrors()
    {
        return $this->errors;
    }

    public function cdata($data)
    {
        $node = $this->doc->createCDATASection($data);
        $this->current->appendChild($node);
    }

    public function processingInstruction($name, $data = null)
    {
        // XXX: Ignore initial XML declaration, per the spec.
        if ($this->insertMode === static::IM_INITIAL && 'xml' === strtolower($name)) {
            return;
        }

        // Important: The processor may modify the current DOM tree however it sees fit.
        if ($this->processor instanceof InstructionProcessor) {
            $res = $this->processor->process($this->current, $name, $data);
Bug: It seems like $this->current can also be of type object<DOMNode>; however, Masterminds\HTML5\InstructionProcessor::process() only seems to accept object<DOMElement>. Maybe add an additional type check?

If a method or function can return values of several different types, and you cannot be sure that only one of them is possible in this context, we recommend adding a type check:

/**
 * @return array|string
 */
function returnsDifferentValues($x) {
    if ($x) {
        return 'foo';
    }

    return array();
}

$x = returnsDifferentValues($y);
if (is_array($x)) {
    // $x is an array.
}

If this is a common case that PHP Analyzer should handle natively, please let us know by opening an issue.
            if (!empty($res)) {
                $this->current = $res;
            }

            return;
        }

        // Otherwise, this is just a dumb PI element.
        $node = $this->doc->createProcessingInstruction($name, $data);

        $this->current->appendChild($node);
    }

    // ==========================================================================
    // UTILITIES
    // ==========================================================================

    /**
     * Apply normalization rules to a tag name.
     * See sections 2.9 and 8.1.2.
     *
     * @param string $tagName
     *
     * @return string The normalized tag name.
     */
    protected function normalizeTagName($tagName)
    {
        /*
         * Section 2.9 suggests that we should not do this.
         *
         * if (strpos($name, ':') !== false) {
         *     // We know from the grammar that there must be at least one other
         *     // char besides :, since : is not a legal tag start.
         *     $parts = explode(':', $name);
         *     return array_pop($parts);
         * }
         */
        return $tagName;
    }

    protected function quirksTreeResolver($name)
    {
        throw new \Exception('Not implemented.');
    }

    /**
     * Automatically climb the tree and close the closest node with the matching $tag.
     *
     * @param string $tagName
     *
     * @return bool
     */
    protected function autoclose($tagName)
    {
        $working = $this->current;
        do {
            if (XML_ELEMENT_NODE !== $working->nodeType) {
                return false;
            }
            if ($working->tagName === $tagName) {
                $this->current = $working->parentNode;

                return true;
            }
        } while ($working = $working->parentNode);

        return false;
    }

    /**
     * Checks if the given tagname is an ancestor of the present candidate.
     *
     * If $this->current or anything above $this->current matches the given tag
     * name, this returns true.
     *
     * @param string $tagName
     *
     * @return bool
     */
    protected function isAncestor($tagName)
    {
        $candidate = $this->current;
        while (XML_ELEMENT_NODE === $candidate->nodeType) {
            if ($candidate->tagName === $tagName) {
                return true;
            }
            $candidate = $candidate->parentNode;
        }

        return false;
    }

    /**
     * Returns true if the immediate parent element is of the given tagname.
     *
     * @param string $tagName
     *
     * @return bool
     */
    protected function isParent($tagName)
    {
        return $this->current->tagName === $tagName;
    }
}
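
The analyzer's note on processingInstruction() concerns $this->current, which the constructor initialises to the document (or to the fragment) and which only later points at elements. A minimal sketch of the suggested type check follows. The canProcessInstructionAt() helper is hypothetical, and what the builder should do when the check fails (for example, falling back to inserting the PI node directly) is an assumption rather than something the source specifies.

```php
<?php

// Hypothetical guard, not part of the library: decide whether the current
// node may be handed to a processor that only accepts \DOMElement.
function canProcessInstructionAt(\DOMNode $current)
{
    return $current instanceof \DOMElement;
}

$doc = new \DOMDocument();
$root = $doc->appendChild($doc->createElement('html'));

// The document and a fragment are DOMNodes but not DOMElements.
$atDocument = canProcessInstructionAt($doc);
$atElement = canProcessInstructionAt($root);
$atFragment = canProcessInstructionAt($doc->createDocumentFragment());
```

Inside the builder, the same instanceof test could be added to the existing condition that checks for an InstructionProcessor, so that the processor is only invoked when the current node is an element.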