TagNode   F
last analyzed

Complexity

Total Complexity 68

Size/Duplication

Total Lines 494
Duplicated Lines 0 %

Test Coverage

Coverage 100%

Importance

Changes 0
Metric Value
eloc 144
dl 0
loc 494
ccs 165
cts 165
cp 1
rs 2.96
c 0
b 0
f 0
wmc 68

27 Methods

Rating   Name   Duplication   Size   Complexity  
B getMinimalDeletedSet() 0 29 7
A removeChild() 0 6 3
A copyTree() 0 13 2
B expandWhiteSpace() 0 25 7
A getLeftMostChild() 0 9 2
A getRightMostChild() 0 9 2
A setRoot() 0 6 2
A __construct() 0 5 1
A isBlockLevel() 0 3 1
A equals() 0 17 4
A getIterator() 0 3 1
A isPre() 0 3 1
A getChild() 0 7 2
A isSimilarTag() 0 11 3
A getMatchRatio() 0 6 1
A getAttributes() 0 3 1
A getEndTag() 0 3 1
A isSameTag() 0 7 2
A addChild() 0 10 3
A __toString() 0 3 1
C splitUntil() 0 62 12
A isInline() 0 3 1
A hasSameAttributes() 0 5 1
A getNumChildren() 0 3 1
A getIndexOf() 0 9 3
A getQName() 0 3 1
A getOpeningTag() 0 11 2

How to fix   Complexity   

Complex Class

Complex classes like TagNode often do a lot of different things. To break such a class down, we need to identify a cohesive component within that class. A common approach to find such a component is to look for fields/methods that share the same prefixes, or suffixes.

Once you have determined the fields that belong together, you can apply the Extract Class refactoring. If the component makes sense as a sub-class, Extract Subclass is also a candidate, and is often faster.

While breaking up the class, it is a good idea to analyze how other classes use TagNode, and based on these observations, apply Extract Interface, too.

1
<?php
2
/**
3
 * (c) Steve Nebes <[email protected]>
4
 *
5
 * For the full copyright and license information, please view the LICENSE
6
 * file that was distributed with this source code.
7
 */
8
9
declare(strict_types=1);
10
11
namespace SN\DaisyDiff\Html\Dom;
12
13
use ArrayIterator;
14
use SN\DaisyDiff\Html\Ancestor\TextOnlyComparator;
15
use IteratorAggregate;
16
17
/**
18
 * Node that can contain other nodes. Represents an HTML tag.
19
 */
20
class TagNode extends Node implements IteratorAggregate
21
{
22
    /** @var Node[] */
23
    private $children = [];
24
25
    /** @var string */
26
    private $qName;
27
28
    /** @var array */
29
    private $attributes = [];
30
31
    /**
32
     * @param TagNode|null $parent
33
     * @param string       $qName
34
     * @param array        $attributes
35
     */
36 184
    public function __construct(?TagNode $parent, string $qName, array $attributes = [])
37
    {
38 184
        parent::__construct($parent);
39 184
        $this->qName = \mb_strtolower($qName);
40 184
        $this->attributes = $attributes;
41 184
    }
42
43
    /**
44
     * Appends the provided node ot the collection of children if $this node is set as teh parameter's parent. This
45
     * method is used in Node's constructor.
46
     *
47
     * @param Node $node
48
     * @param int  $index
49
     *
50
     * @throws \InvalidArgumentException
51
     */
52 139
    public function addChild(Node $node, int $index = null): void
53
    {
54 139
        if ($node->getParent() !== $this) {
55 2
            throw new \InvalidArgumentException('The new child must have this node as a parent.');
56
        }
57
58 139
        if (null !== $index) {
59 50
            \array_splice($this->children, $index, 0, [$node]);
60
        } else {
61 139
            $this->children[] = $node;
62
        }
63 139
    }
64
65
    /**
66
     * Update the root nodes of all child nodes.
67
     *
68
     * @param TagNode $root
69
     */
70 13
    protected function setRoot(TagNode $root): void
71
    {
72 13
        parent::setRoot($root);
73
74 13
        foreach ($this->getIterator() as $child) {
75 10
            $child->setRoot($root);
76
        }
77 13
    }
78
79
    /**
80
     * If the provided parameter is in the same tree with $this object then this method fetches index of the parameter
81
     * object in the children collection. If the parameter is from a different tree, then this method attempts to return
82
     * the index of first semantically equivalent node to the parameter.
83
     *
84
     * @param Node $child Tag we need an index for.
85
     * @return int Index of first semantically equivalent child or -1 if couldn't find one.
86
     */
87 22
    public function getIndexOf(Node $child): int
88
    {
89 22
        $key = \array_search($child, $this->children, true);
90
91 22
        if (false !== $key && \is_int($key)) {
92 22
            return $key;
93
        }
94
95 2
        return -1;
96
    }
97
98
    /**
99
     * @param int $index
100
     * @return Node
101
     *
102
     * @throws \OutOfBoundsException
103
     */
104 59
    public function getChild(int $index): Node
105
    {
106 59
        if (isset($this->children[$index])) {
107 58
            return $this->children[$index];
108
        }
109
110 1
        throw new \OutOfBoundsException(\sprintf('Index: %d, Size: %d', $index, \count($this->children)));
111
    }
112
113
    /**
114
     * IteratorAggregate::getIterator()
115
     *
116
     * @return ArrayIterator
117
     */
118 81
    public function getIterator(): ArrayIterator
119
    {
120 81
        return new ArrayIterator($this->children);
121
    }
122
123
    /**
124
     * @return int
125
     */
126 60
    public function getNumChildren(): int
127
    {
128 60
        return \count($this->children);
129
    }
130
131
    /**
132
     * @return string
133
     */
134 102
    public function getQName(): string
135
    {
136 102
        return $this->qName;
137
    }
138
139
    /**
140
     * @return array
141
     */
142 89
    public function getAttributes(): array
143
    {
144 89
        return $this->attributes;
145
    }
146
147
    /**
148
     * Checks tags for being semantically equivalent if it's from a different tree and for being the same object if it's
149
     * from the same tree as $this tag.
150
     *
151
     * @param TagNode|null $other
152
     * @return bool
153
     */
154 49
    public function isSameTag(?TagNode $other): bool
155
    {
156 49
        if (null === $other) {
157 1
            return false;
158
        }
159
160 49
        return $this->equals($other);
161
    }
162
163
    /**
164
     * Considers tags from different trees equal if they have same name and equivalent attributes. No attention paid to
165
     * the content (children) of the tag. Considers tags from the same tree equal if it is the same object.
166
     *
167
     * @param TagNode $tagNode
168
     * @return bool
169
     */
170 50
    public function equals(TagNode $tagNode): bool
171
    {
172 50
        if ($tagNode === $this) {
173 5
            return true;
174
        }
175
176 49
        if ($this->getRoot() === $tagNode->getRoot()) {
177
            // Not the same and in the same tree, so not equal.
178 1
            return false;
179
        }
180
181
        // Still a chance for being equal if we are in the different tree, we should use semantic equivalence instead.
182 48
        if ($this->isSimilarTag($tagNode)) {
183 43
            return true;
184
        }
185
186 18
        return false;
187
    }
188
189
    /**
190
     * @param array $otherAttributes
191
     * @return bool
192
     */
193 45
    private function hasSameAttributes(array $otherAttributes): bool
194
    {
195
        // http://php.net/manual/en/language.operators.array.php
196
        // $a == $b is TRUE if $a and $b have the same key/value pairs, order does not matter.
197 45
        return $this->getAttributes() == $otherAttributes;
198
    }
199
200
    /**
201
     * Returns true if this tag is similar to the given other tag. The tags may be from different trees. If the tag name
202
     * and attributes are the same, the result will be true.
203
     *
204
     * @param Node $other
205
     * @return bool
206
     */
207 49
    protected function isSimilarTag(Node $other): bool
208
    {
209 49
        $result = false;
210
211 49
        if ($other instanceof TagNode) {
212 49
            if ($this->getQName() === $other->getQName()) {
213 44
                $result = $this->hasSameAttributes($other->getAttributes());
214
            }
215
        }
216
217 49
        return $result;
218
    }
219
220
    /**
221
     * Produces string for the opening HTML tag for this node. Includes the attributes. This probably doesn't work for
222
     * image tag.
223
     *
224
     * @return string
225
     */
226 36
    public function getOpeningTag(): string
227
    {
228 36
        $s = '<' . $this->getQName();
229
230 36
        foreach ($this->getAttributes() as $name => $value) {
231 16
            $s .= \sprintf(' %s="%s"', $name, $value);
232
        }
233
234 36
        $s .= '>';
235
236 36
        return $s;
237
    }
238
239
    /**
240
     * Return the closing HTML tag that corresponds to the current node. Probably doesn't work for image tag.
241
     *
242
     * @return string
243
     */
244 23
    public function getEndTag(): string
245
    {
246 23
        return \sprintf('</%s>', $this->getQName());
247
    }
248
249
    /**
250
     * This recursive method considers a descendant deleted if all its children had TextNodes that now are marked as
251
     * removed with the provided id. If all children of a descendant is considered deleted, only that descendant is kept
252
     * in the collection of the deleted nodes, and its children are removed from the collection of the deleted nodes.
253
     *
254
     * The HTML tag nodes that never had any text content are never considered removed.
255
     *
256
     * It actually might have nothing to do with being really deleted, because the element might be kept after its text
257
     * content was deleted.
258
     *
259
     * Example:
260
     *      table cells can be kept after its text content was deleted
261
     *      horizontal rule has never had text content, but can be deleted
262
     *
263
     * @param int $id
264
     * @return Node[]
265
     */
266 23
    public function getMinimalDeletedSet(int $id): array
267
    {
268 23
        $nodes = [];
269
270
        // No-content tags are never included in the set.
271 23
        if (0 === \count($this->children)) {
272 4
            return $nodes;
273
        }
274
275
        // By default, we think that all children are in the deleted set until we prove otherwise.
276 23
        $hasNotDeletedDescendant = false;
277
278 23
        foreach ($this->getIterator() as $child) {
279 23
            $childrenChildren = $child->getMinimalDeletedSet($id);
280 23
            $nodes = \array_merge($nodes, $childrenChildren);
281
282 23
            if (!$hasNotDeletedDescendant &&
283 23
                !(1 === \count($childrenChildren) && \in_array($child, $childrenChildren, true))) {
284
                // This child is not entirely deleted.
285 23
                $hasNotDeletedDescendant = true;
286
            }
287
        }
288
289
        // If all children are in the deleted set, remove them and put $this instead.
290 23
        if (!$hasNotDeletedDescendant) {
291 8
            $nodes = [$this];
292
        }
293
294 23
        return $nodes;
295
    }
296
297
    /**
298
     * @return string
299
     */
300 14
    public function __toString(): string
301
    {
302 14
        return $this->getOpeningTag();
303
    }
304
305
    /**
306
     * Attempts to create 2 TagNodes with the same name and attributes as the original $this node. All children
307
     * preceding split parameter are placed into the left part, all children following the split parameter are placed
308
     * into the right part. Placement of the split node is determined by $includeLeft flag parameter. The newly created
309
     * nodes are only added to the parent of $this node if they have some children. The original $this node is removed
310
     * afterwards. The process proceeds recursively hiking up the tree until the "parent" node is reached. "Parent" node
311
     * will not be touched. This method is used when the parent tags of a deleted TextNode can no longer be found in the
312
     * new doc. (means they either has been deleted or changed arguments). The "parent" parameter in that case is the
313
     * deepest common parent between the deleted node and its surrounding remaining siblings.
314
     *
315
     * @param TagNode $parent
316
     * @param Node    $split
317
     * @param bool    $includeLeft
318
     * @return bool
319
     */
320 4
    public function splitUntil(TagNode $parent, Node $split, bool $includeLeft): bool
321
    {
322 4
        $splitOccurred = false;
323
324 4
        if ($parent !== $this) {
325 4
            $part1 = new TagNode(null, $this->getQName(), $this->getAttributes());
326 4
            $part2 = new TagNode(null, $this->getQName(), $this->getAttributes());
327 4
            $part1->setParent($this->getParent());
328 4
            $part2->setParent($this->getParent());
329
330 4
            $i = 0;
331 4
            $iMax = \count($this->children);
332
333 4
            while ($i < $iMax && $this->children[$i] !== $split) {
334 2
                $this->children[$i]->setParent($part1);
335 2
                $part1->addChild($this->children[$i]);
336 2
                $i++;
337
            }
338
339 4
            if ($i < $iMax) {
340
                // We've found a split node.
341 4
                if ($includeLeft) {
342 2
                    $this->children[$i]->setParent($part1);
343 2
                    $part1->addChild($this->children[$i]);
344
                } else {
345 3
                    $this->children[$i]->setParent($part2);
346 3
                    $part2->addChild($this->children[$i]);
347
                }
348
349 4
                $i++;
350
            }
351
352 4
            while ($i < $iMax) {
353 3
                $this->children[$i]->setParent($part2);
354 3
                $part2->addChild($this->children[$i]);
355 3
                $i++;
356
            }
357
358 4
            if ($part1->getNumChildren() > 0) {
359 2
                $this->getParent()->addChild($part1, $this->getParent()->getIndexOf($this));
360
            }
361
362 4
            if ($part2->getNumChildren() > 0) {
363 3
                $this->getParent()->addChild($part2, $this->getParent()->getIndexOf($this));
364
            }
365
366 4
            if ($part1->getNumChildren() > 0 && $part2->getNumChildren() > 0) {
367 1
                $splitOccurred = true;
368
            }
369
370
            // Since split isn't meant for no-children tags, we won't have a case where we removed $this and did not
371
            // substitute it with anything.
372 4
            $this->getParent()->removeChild($this);
373
374 4
            if ($includeLeft) {
375 2
                $this->getParent()->splitUntil($parent, $part1, $includeLeft);
376
            } else {
377 3
                $this->getParent()->splitUntil($parent, $part2, $includeLeft);
378
            }
379
        }
380
381 4
        return $splitOccurred;
382
    }
383
384
    /**
385
     * @param Node $node
386
     */
387 4
    public function removeChild(Node $node): void
388
    {
389 4
        $key = \array_search($node, $this->children, true);
390
391 4
        if (false !== $key && \is_int($key)) {
392 4
            \array_splice($this->children, $key, 1);
393
        }
394 4
    }
395
396
    /** @var string[] */
397
    private static $blocks = [
398
        'html', 'body', 'p', 'blockquote', 'h1', 'h2', 'h3', 'h4', 'h5', 'pre', 'div', 'ul', 'ol', 'li',
399
        'table', 'tbody', 'tr', 'td', 'th', 'br', 'thead', 'tfoot',
400
    ];
401
402
    /**
403
     * @return bool
404
     */
405 62
    public function isBlockLevel(): bool
406
    {
407 62
        return \in_array($this->getQName(), self::$blocks, true);
408
    }
409
410
    /**
411
     * @return bool
412
     */
413 57
    public function isInline(): bool
414
    {
415 57
        return !$this->isBlockLevel();
416
    }
417
418
    /**
419
     * {@inheritdoc}
420
     */
421 9
    public function copyTree(): Node
422
    {
423 9
        $newThis = new TagNode(null, $this->getQName(), $this->getAttributes());
424 9
        $newThis->setWhiteBefore($this->isWhiteBefore());
425 9
        $newThis->setWhiteAfter($this->isWhiteAfter());
426
427 9
        foreach ($this->getIterator() as $child) {
428 9
            $newChild = $child->copyTree();
429 9
            $newChild->setParent($newThis);
430 9
            $newThis->addChild($newChild);
431
        }
432
433 9
        return $newThis;
434
    }
435
436
    /**
437
     * @param TagNode $other
438
     * @return float
439
     */
440 2
    public function getMatchRatio(TagNode $other): float
441
    {
442 2
        $thisComp = new TextOnlyComparator($this);
443 2
        $otherComp = new TextOnlyComparator($other);
444
445 2
        return $otherComp->getMatchRatio($thisComp);
446
    }
447
448
    /**
449
     * @return void
450
     */
451 50
    public function expandWhiteSpace(): void
452
    {
453 50
        $shift = 0;
454 50
        $spaceAdded = false;
455
456 50
        for ($i = 0, $iMax = $this->getNumChildren(); $i < $iMax; $i++) {
457 49
            $child = $this->getChild($i + $shift);
458
459 49
            if ($child instanceof TagNode && !$child->isPre()) {
460 49
                $child->expandWhiteSpace();
461
            }
462
463 49
            if (!$spaceAdded && $child->isWhiteBefore()) {
464 26
                $ws = new WhiteSpaceNode(null, ' ', $child->getLeftMostChild());
465 26
                $ws->setParent($this);
466 26
                $this->addChild($ws, $i + ($shift++));
467
            }
468
469 49
            if ($child->isWhiteAfter()) {
470 45
                $ws = new WhiteSpaceNode(null, ' ', $child->getRightMostChild());
471 45
                $ws->setParent($this);
472 45
                $this->addChild($ws, $i + 1 + ($shift++));
473 45
                $spaceAdded = true;
474
            } else {
475 49
                $spaceAdded = false;
476
            }
477
        }
478 50
    }
479
480
    /**
481
     * @return Node
482
     */
483 5
    public function getLeftMostChild(): Node
484
    {
485 5
        if ($this->getNumChildren() < 1) {
486 1
            return $this;
487
        }
488
489 5
        $child = $this->getChild(0);
490
491 5
        return $child->getLeftMostChild();
492
    }
493
494
    /**
495
     * @return Node
496
     */
497 33
    public function getRightMostChild(): Node
498
    {
499 33
        if ($this->getNumChildren() < 1) {
500 2
            return $this;
501
        }
502
503 33
        $child = $this->getChild($this->getNumChildren() - 1);
504
505 33
        return $child->getRightMostChild();
506
    }
507
508
    /**
509
     * @return bool
510
     */
511 62
    public function isPre(): bool
512
    {
513 62
        return 'pre' === $this->getQName();
514
    }
515
}
516