Completed
Push — master ( fd8f2b...760841 )
by Lars
01:28
created

HtmlMin   F

Complexity

Total Complexity 208

Size/Duplication

Total Lines 1689
Duplicated Lines 0.71 %

Coupling/Cohesion

Components 14
Dependencies 3

Test Coverage

Coverage 92.74%

Importance

Changes 0
Metric Value
wmc 208
lcom 14
cbo 3
dl 12
loc 1689
ccs 409
cts 441
cp 0.9274
rs 0.8
c 0
b 0
f 0

63 Methods

Rating   Name   Duplication   Size   Complexity  
A __construct() 0 6 1
A attachObserverToTheDomLoop() 0 4 1
A doOptimizeAttributes() 0 6 1
A doOptimizeViaHtmlDomParser() 0 6 1
A doRemoveComments() 0 6 1
A doRemoveDefaultAttributes() 0 6 1
A doRemoveDeprecatedAnchorName() 0 6 1
A doRemoveDeprecatedScriptCharsetAttribute() 0 6 1
A doRemoveDeprecatedTypeFromScriptTag() 0 6 1
A doRemoveDeprecatedTypeFromStylesheetLink() 0 6 1
A doRemoveEmptyAttributes() 0 6 1
A doRemoveHttpPrefixFromAttributes() 0 6 1
A doRemoveHttpsPrefixFromAttributes() 0 6 1
A keepPrefixOnExternalAttributes() 0 6 1
A doMakeSameDomainLinksRelative() 0 8 1
A getLocalDomain() 0 4 1
A doRemoveOmittedHtmlTags() 0 6 1
A doRemoveOmittedQuotes() 0 6 1
A doRemoveSpacesBetweenTags() 0 6 1
A doRemoveValueFromEmptyInput() 0 6 1
A doRemoveWhitespaceAroundTags() 0 6 1
A doSortCssClassNames() 0 6 1
A doSortHtmlAttributes() 0 6 1
A doSumUpWhitespace() 0 6 1
C domNodeAttributesToString() 0 61 16
F domNodeClosingTagOptional() 0 270 51
F domNodeToString() 0 112 31
A getDomainsToRemoveHttpPrefixFromAttributes() 0 4 1
A isDoOptimizeAttributes() 0 4 1
A isDoOptimizeViaHtmlDomParser() 0 4 1
A isDoRemoveComments() 0 4 1
A isDoRemoveDefaultAttributes() 0 4 1
A isDoRemoveDeprecatedAnchorName() 0 4 1
A isDoRemoveDeprecatedScriptCharsetAttribute() 0 4 1
A isDoRemoveDeprecatedTypeFromScriptTag() 0 4 1
A isDoRemoveDeprecatedTypeFromStylesheetLink() 0 4 1
A isDoRemoveEmptyAttributes() 0 4 1
A isDoRemoveHttpPrefixFromAttributes() 0 4 1
A isDoRemoveHttpsPrefixFromAttributes() 0 4 1
A isKeepPrefixOnExternalAttributes() 0 4 1
A isDoMakeSameDomainLinksRelative() 0 4 1
A isDoRemoveOmittedHtmlTags() 0 4 1
A isDoRemoveOmittedQuotes() 0 4 1
A isDoRemoveSpacesBetweenTags() 0 4 1
A isDoRemoveValueFromEmptyInput() 0 4 1
A isDoRemoveWhitespaceAroundTags() 0 4 1
A isDoSortCssClassNames() 0 4 1
A isDoSortHtmlAttributes() 0 4 1
A isDoSumUpWhitespace() 0 4 1
C minify() 0 125 11
A getNextSiblingOfTypeDOMElement() 0 9 3
A isConditionalComment() 12 18 5
B minifyHtmlDom() 0 78 6
A notifyObserversAboutDomElementAfterMinification() 0 6 2
A notifyObserversAboutDomElementBeforeMinification() 0 6 2
A protectTagHelper() 0 18 4
B protectTags() 0 50 10
A removeComments() 0 17 4
B removeWhitespaceAroundTags() 0 29 7
A restoreProtectedHtml() 0 6 1
A setDomainsToRemoveHttpPrefixFromAttributes() 0 6 1
B sumUpWhitespace() 0 33 7
A useKeepBrokenHtml() 0 6 1


Duplicated Code

Duplicate code is one of the most pungent code smells. A common rule of thumb is to restructure code once it is duplicated in three or more places.

Common duplication problems, and corresponding solutions are:

Complex Class

Tip: Before tackling complexity, eliminate any duplication first. This can often reduce the size of classes significantly.

Complex classes like HtmlMin often do a lot of different things. To break such a class down, we need to identify a cohesive component within it. A common approach is to look for fields or methods that share the same prefix or suffix. You can also look at the cohesion graph to spot any unconnected or weakly connected components.

Once you have determined the fields that belong together, you can apply the Extract Class refactoring. If the component makes sense as a sub-class, Extract Subclass is also a candidate, and is often faster.
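As a minimal sketch of Extract Class, consider the sibling-based end-tag rules in domNodeClosingTagOptional() (rated F in the table above). The class name OptionalEndTagRules, its constant, and its method are hypothetical and not part of the library. The sketch covers only the rules that depend solely on the next sibling element (li, optgroup, rp, tr, td, th, option). The p, dd/dt and source rules carry extra conditions and are left out.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical class (an assumption for illustration, not library code):
 * keeps a subset of the end-tag omission rules as data instead of one
 * long boolean expression.
 */
final class OptionalEndTagRules
{
    /**
     * Tag name => names of following sibling elements that allow the
     * end tag to be omitted.
     */
    private const FOLLOWED_BY = [
        'li'       => ['li'],
        'optgroup' => ['optgroup'],
        'rp'       => ['rp', 'rt'],
        'tr'       => ['tr'],
        'td'       => ['td', 'th'],
        'th'       => ['td', 'th'],
        'option'   => ['option', 'optgroup'],
    ];

    /**
     * @param string      $tagName        name of the element being closed
     * @param string|null $nextSiblingTag name of the next sibling element, or null if there is none
     */
    public function isOptional(string $tagName, ?string $nextSiblingTag): bool
    {
        $allowedSiblings = self::FOLLOWED_BY[$tagName] ?? null;

        // Tags without a table entry are not handled by this sketch.
        if ($allowedSiblings === null) {
            return false;
        }

        // No further sibling element: every tag in the table may drop its end tag.
        if ($nextSiblingTag === null) {
            return true;
        }

        return \in_array($nextSiblingTag, $allowedSiblings, true);
    }
}
```

With the rules held in a lookup table, adding a tag becomes a one-line change, and the extracted class can be unit-tested without building a DOM.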

While breaking up the class, it is a good idea to analyze how other classes use HtmlMin, and based on these observations, apply Extract Interface, too.
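The source already declares "implements HtmlMinInterface", but that interface is not shown here, so the following is only a sketch of the Extract Interface idea under stated assumptions. The names MinifiesHtml, WhitespaceCollapsingMinifier and renderPage are hypothetical, and the one-argument minify() signature is assumed for illustration rather than taken from the library.

```php
<?php

declare(strict_types=1);

/** Hypothetical narrow interface: only what callers are assumed to need. */
interface MinifiesHtml
{
    public function minify(string $html): string;
}

/** Stand-in implementation for illustration; it only collapses whitespace runs. */
final class WhitespaceCollapsingMinifier implements MinifiesHtml
{
    public function minify(string $html): string
    {
        // Replace each run of whitespace with a single space, then trim the ends.
        return \trim((string) \preg_replace('/\s+/', ' ', $html));
    }
}

/** A caller that depends on the interface rather than on a concrete class. */
function renderPage(MinifiesHtml $minifier, string $html): string
{
    return $minifier->minify($html);
}
```

Because callers depend only on the interface, a large class can be split up or swapped out behind it without touching them.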

<?php

declare(strict_types=1);

namespace voku\helper;

/**
 * Class HtmlMin
 *
 * Inspired by:
 * - JS: https://github.com/kangax/html-minifier/blob/gh-pages/src/htmlminifier.js
 * - PHP: https://github.com/searchturbine/phpwee-php-minifier
 * - PHP: https://github.com/WyriHaximus/HtmlCompress
 * - PHP: https://github.com/zaininnari/html-minifier
 * - PHP: https://github.com/ampaze/PHP-HTML-Minifier
 * - Java: https://code.google.com/archive/p/htmlcompressor/
 *
 * Ideas:
 * - http://perfectionkills.com/optimizing-html/
 */
class HtmlMin implements HtmlMinInterface
{
    /**
     * @var string
     */
    private static $regExSpace = "/[[:space:]]{2,}|[\r\n]/u";

    /**
     * @var string[]
     *
     * @psalm-var list<string>
     */
    private static $optional_end_tags = [
        'html',
        'head',
        'body',
    ];

    /**
     * @var string[]
     *
     * @psalm-var list<string>
     */
    private static $selfClosingTags = [
        'area',
        'base',
        'basefont',
        'br',
        'col',
        'command',
        'embed',
        'frame',
        'hr',
        'img',
        'input',
        'isindex',
        'keygen',
        'link',
        'meta',
        'param',
        'source',
        'track',
        'wbr',
    ];

    /**
     * @var string[]
     *
     * @psalm-var array<string, string>
     */
    private static $trimWhitespaceFromTags = [
        'article' => '',
        'br'      => '',
        'div'     => '',
        'footer'  => '',
        'hr'      => '',
        'nav'     => '',
        'p'       => '',
        'script'  => '',
    ];

    /**
     * @var array
     */
    private static $booleanAttributes = [
        'allowfullscreen' => '',
        'async'           => '',
        'autofocus'       => '',
        'autoplay'        => '',
        'checked'         => '',
        'compact'         => '',
        'controls'        => '',
        'declare'         => '',
        'default'         => '',
        'defaultchecked'  => '',
        'defaultmuted'    => '',
        'defaultselected' => '',
        'defer'           => '',
        'disabled'        => '',
        'enabled'         => '',
        'formnovalidate'  => '',
        'hidden'          => '',
        'indeterminate'   => '',
        'inert'           => '',
        'ismap'           => '',
        'itemscope'       => '',
        'loop'            => '',
        'multiple'        => '',
        'muted'           => '',
        'nohref'          => '',
        'noresize'        => '',
        'noshade'         => '',
        'novalidate'      => '',
        'nowrap'          => '',
        'open'            => '',
        'pauseonexit'     => '',
        'readonly'        => '',
        'required'        => '',
        'reversed'        => '',
        'scoped'          => '',
        'seamless'        => '',
        'selected'        => '',
        'sortable'        => '',
        'truespeed'       => '',
        'typemustmatch'   => '',
        'visible'         => '',
    ];

    /**
     * @var array
     */
    private static $skipTagsForRemoveWhitespace = [
        'code',
        'pre',
        'script',
        'style',
        'textarea',
    ];

    /**
     * @var array
     */
    private $protectedChildNodes = [];

    /**
     * @var string
     */
    private $protectedChildNodesHelper = 'html-min--voku--saved-content';

    /**
     * @var bool
     */
    private $doOptimizeViaHtmlDomParser = true;

    /**
     * @var bool
     */
    private $doOptimizeAttributes = true;

    /**
     * @var bool
     */
    private $doRemoveComments = true;

    /**
     * @var bool
     */
    private $doRemoveWhitespaceAroundTags = false;

    /**
     * @var bool
     */
    private $doRemoveOmittedQuotes = true;

    /**
     * @var bool
     */
    private $doRemoveOmittedHtmlTags = true;

    /**
     * @var bool
     */
    private $doRemoveHttpPrefixFromAttributes = false;

    /**
     * @var bool
     */
    private $doRemoveHttpsPrefixFromAttributes = false;

    /**
     * @var bool
     */
    private $keepPrefixOnExternalAttributes = false;

    /**
     * @var bool
     */
    private $doMakeSameDomainLinksRelative = false;

    /**
     * @var string
     */
    private $localDomain = '';

    /**
     * @var array
     */
    private $domainsToRemoveHttpPrefixFromAttributes = [
        'google.com',
        'google.de',
    ];

    /**
     * @var bool
     */
    private $doSortCssClassNames = true;

    /**
     * @var bool
     */
    private $doSortHtmlAttributes = true;

    /**
     * @var bool
     */
    private $doRemoveDeprecatedScriptCharsetAttribute = true;

    /**
     * @var bool
     */
    private $doRemoveDefaultAttributes = false;

    /**
     * @var bool
     */
    private $doRemoveDeprecatedAnchorName = true;

    /**
     * @var bool
     */
    private $doRemoveDeprecatedTypeFromStylesheetLink = true;

    /**
     * @var bool
     */
    private $doRemoveDeprecatedTypeFromScriptTag = true;

    /**
     * @var bool
     */
    private $doRemoveValueFromEmptyInput = true;

    /**
     * @var bool
     */
    private $doRemoveEmptyAttributes = true;

    /**
     * @var bool
     */
    private $doSumUpWhitespace = true;

    /**
     * @var bool
     */
    private $doRemoveSpacesBetweenTags = false;

    /**
     * @var bool
     */
    private $keepBrokenHtml = false;

    /**
     * @var bool
     */
    private $withDocType = false;

    /**
     * @var HtmlMinDomObserverInterface[]|\SplObjectStorage
     */
    private $domLoopObservers;

    /**
     * @var int
     */
    private $protected_tags_counter = 0;

    /**
     * HtmlMin constructor.
     */
    public function __construct()
    {
        $this->domLoopObservers = new \SplObjectStorage();

        $this->attachObserverToTheDomLoop(new HtmlMinDomObserverOptimizeAttributes());
    }

    /**
     * @param HtmlMinDomObserverInterface $observer
     *
     * @return void
     */
    public function attachObserverToTheDomLoop(HtmlMinDomObserverInterface $observer)
    {
        $this->domLoopObservers->attach($observer);
    }

    /**
     * @param bool $doOptimizeAttributes
     *
     * @return $this
     */
    public function doOptimizeAttributes(bool $doOptimizeAttributes = true): self
    {
        $this->doOptimizeAttributes = $doOptimizeAttributes;

        return $this;
    }

    /**
     * @param bool $doOptimizeViaHtmlDomParser
     *
     * @return $this
     */
    public function doOptimizeViaHtmlDomParser(bool $doOptimizeViaHtmlDomParser = true): self
    {
        $this->doOptimizeViaHtmlDomParser = $doOptimizeViaHtmlDomParser;

        return $this;
    }

    /**
     * @param bool $doRemoveComments
     *
     * @return $this
     */
    public function doRemoveComments(bool $doRemoveComments = true): self
    {
        $this->doRemoveComments = $doRemoveComments;

        return $this;
    }

    /**
     * @param bool $doRemoveDefaultAttributes
     *
     * @return $this
     */
    public function doRemoveDefaultAttributes(bool $doRemoveDefaultAttributes = true): self
    {
        $this->doRemoveDefaultAttributes = $doRemoveDefaultAttributes;

        return $this;
    }

    /**
     * @param bool $doRemoveDeprecatedAnchorName
     *
     * @return $this
     */
    public function doRemoveDeprecatedAnchorName(bool $doRemoveDeprecatedAnchorName = true): self
    {
        $this->doRemoveDeprecatedAnchorName = $doRemoveDeprecatedAnchorName;

        return $this;
    }

    /**
     * @param bool $doRemoveDeprecatedScriptCharsetAttribute
     *
     * @return $this
     */
    public function doRemoveDeprecatedScriptCharsetAttribute(bool $doRemoveDeprecatedScriptCharsetAttribute = true): self
    {
        $this->doRemoveDeprecatedScriptCharsetAttribute = $doRemoveDeprecatedScriptCharsetAttribute;

        return $this;
    }

    /**
     * @param bool $doRemoveDeprecatedTypeFromScriptTag
     *
     * @return $this
     */
    public function doRemoveDeprecatedTypeFromScriptTag(bool $doRemoveDeprecatedTypeFromScriptTag = true): self
    {
        $this->doRemoveDeprecatedTypeFromScriptTag = $doRemoveDeprecatedTypeFromScriptTag;

        return $this;
    }

    /**
     * @param bool $doRemoveDeprecatedTypeFromStylesheetLink
     *
     * @return $this
     */
    public function doRemoveDeprecatedTypeFromStylesheetLink(bool $doRemoveDeprecatedTypeFromStylesheetLink = true): self
    {
        $this->doRemoveDeprecatedTypeFromStylesheetLink = $doRemoveDeprecatedTypeFromStylesheetLink;

        return $this;
    }

    /**
     * @param bool $doRemoveEmptyAttributes
     *
     * @return $this
     */
    public function doRemoveEmptyAttributes(bool $doRemoveEmptyAttributes = true): self
    {
        $this->doRemoveEmptyAttributes = $doRemoveEmptyAttributes;

        return $this;
    }

    /**
     * @param bool $doRemoveHttpPrefixFromAttributes
     *
     * @return $this
     */
    public function doRemoveHttpPrefixFromAttributes(bool $doRemoveHttpPrefixFromAttributes = true): self
    {
        $this->doRemoveHttpPrefixFromAttributes = $doRemoveHttpPrefixFromAttributes;

        return $this;
    }

    /**
     * @param bool $doRemoveHttpsPrefixFromAttributes
     *
     * @return $this
     */
    public function doRemoveHttpsPrefixFromAttributes(bool $doRemoveHttpsPrefixFromAttributes = true): self
    {
        $this->doRemoveHttpsPrefixFromAttributes = $doRemoveHttpsPrefixFromAttributes;

        return $this;
    }

    /**
     * @param bool $keepPrefixOnExternalAttributes
     *
     * @return $this
     */
    public function keepPrefixOnExternalAttributes(bool $keepPrefixOnExternalAttributes = true): self
    {
        $this->keepPrefixOnExternalAttributes = $keepPrefixOnExternalAttributes;

        return $this;
    }

    /**
     * @param string $localDomain
     *
     * @return $this
     */
    public function doMakeSameDomainLinksRelative(string $localDomain = ''): self
    {
        $this->localDomain = \rtrim((string) \preg_replace('/(?:https?:)?\/\//i', '', $localDomain), '/');

        $this->doMakeSameDomainLinksRelative = ($this->localDomain !== '');

        return $this;
    }

    /**
     * @return string
     */
    public function getLocalDomain(): string
    {
        return $this->localDomain;
    }

    /**
     * @param bool $doRemoveOmittedHtmlTags
     *
     * @return $this
     */
    public function doRemoveOmittedHtmlTags(bool $doRemoveOmittedHtmlTags = true): self
    {
        $this->doRemoveOmittedHtmlTags = $doRemoveOmittedHtmlTags;

        return $this;
    }

    /**
     * @param bool $doRemoveOmittedQuotes
     *
     * @return $this
     */
    public function doRemoveOmittedQuotes(bool $doRemoveOmittedQuotes = true): self
    {
        $this->doRemoveOmittedQuotes = $doRemoveOmittedQuotes;

        return $this;
    }

    /**
     * @param bool $doRemoveSpacesBetweenTags
     *
     * @return $this
     */
    public function doRemoveSpacesBetweenTags(bool $doRemoveSpacesBetweenTags = true): self
    {
        $this->doRemoveSpacesBetweenTags = $doRemoveSpacesBetweenTags;

        return $this;
    }

    /**
     * @param bool $doRemoveValueFromEmptyInput
     *
     * @return $this
     */
    public function doRemoveValueFromEmptyInput(bool $doRemoveValueFromEmptyInput = true): self
    {
        $this->doRemoveValueFromEmptyInput = $doRemoveValueFromEmptyInput;

        return $this;
    }

    /**
     * @param bool $doRemoveWhitespaceAroundTags
     *
     * @return $this
     */
    public function doRemoveWhitespaceAroundTags(bool $doRemoveWhitespaceAroundTags = true): self
    {
        $this->doRemoveWhitespaceAroundTags = $doRemoveWhitespaceAroundTags;

        return $this;
    }

    /**
     * @param bool $doSortCssClassNames
     *
     * @return $this
     */
    public function doSortCssClassNames(bool $doSortCssClassNames = true): self
    {
        $this->doSortCssClassNames = $doSortCssClassNames;

        return $this;
    }

    /**
     * @param bool $doSortHtmlAttributes
     *
     * @return $this
     */
    public function doSortHtmlAttributes(bool $doSortHtmlAttributes = true): self
    {
        $this->doSortHtmlAttributes = $doSortHtmlAttributes;

        return $this;
    }

    /**
     * @param bool $doSumUpWhitespace
     *
     * @return $this
     */
    public function doSumUpWhitespace(bool $doSumUpWhitespace = true): self
    {
        $this->doSumUpWhitespace = $doSumUpWhitespace;

        return $this;
    }

    private function domNodeAttributesToString(\DOMNode $node): string
    {
        // Remove quotes around attribute values, when allowed (<p class="foo"> → <p class=foo>)
        $attr_str = '';
        if ($node->attributes !== null) {
            foreach ($node->attributes as $attribute) {
                $attr_str .= $attribute->name;

                if (
                    $this->doOptimizeAttributes
                    &&
                    isset(self::$booleanAttributes[$attribute->name])
                ) {
                    $attr_str .= ' ';

                    continue;
                }

                $attr_str .= '=';

                // http://www.whatwg.org/specs/web-apps/current-work/multipage/syntax.html#attributes-0
                $omit_quotes = $this->doRemoveOmittedQuotes
                               &&
                               $attribute->value !== ''
                               &&
                               \strpos($attribute->name, '____SIMPLE_HTML_DOM__VOKU') !== 0
                               &&
                               \strpos($attribute->name, ' ') === false
                               &&
                               \preg_match('/["\'=<>` \t\r\n\f]/', $attribute->value) === 0;

                $quoteTmp = '"';
                if (
                    !$omit_quotes
                    &&
                    \strpos($attribute->value, '"') !== false
                ) {
                    $quoteTmp = "'";
                }

                if (
                    $this->doOptimizeAttributes
                    &&
                    (
                        $attribute->name === 'srcset'
                        ||
                        $attribute->name === 'sizes'
                    )
                ) {
                    $attr_val = \preg_replace(self::$regExSpace, ' ', $attribute->value);
                } else {
                    $attr_val = $attribute->value;
                }

                $attr_str .= ($omit_quotes ? '' : $quoteTmp) . $attr_val . ($omit_quotes ? '' : $quoteTmp);
                $attr_str .= ' ';
            }
        }

        return \trim($attr_str);
    }

    /**
     * @param \DOMNode $node
     *
     * @return bool
     */
    private function domNodeClosingTagOptional(\DOMNode $node): bool
    {
        $tag_name = $node->nodeName;

        /** @var \DOMNode|null $parent_node - false-positive error from phpstan */
        $parent_node = $node->parentNode;

        if ($parent_node) {
            $parent_tag_name = $parent_node->nodeName;
        } else {
            $parent_tag_name = null;
        }

        $nextSibling = $this->getNextSiblingOfTypeDOMElement($node);

        // https://html.spec.whatwg.org/multipage/syntax.html#syntax-tag-omission

        // Implemented:
        //
        // A <p> element's end tag may be omitted if the p element is immediately followed by an address, article, aside, blockquote, details, div, dl, fieldset, figcaption, figure, footer, form, h1, h2, h3, h4, h5, h6, header, hgroup, hr, main, menu, nav, ol, p, pre, section, table, or ul element, or if there is no more content in the parent element and the parent element is an HTML element that is not an a, audio, del, ins, map, noscript, or video element, or an autonomous custom element.
        // An <li> element's end tag may be omitted if the li element is immediately followed by another li element or if there is no more content in the parent element.
        // A <td> element's end tag may be omitted if the td element is immediately followed by a td or th element, or if there is no more content in the parent element.
        // An <option> element's end tag may be omitted if the option element is immediately followed by another option element, or if it is immediately followed by an optgroup element, or if there is no more content in the parent element.
        // A <tr> element's end tag may be omitted if the tr element is immediately followed by another tr element, or if there is no more content in the parent element.
        // A <th> element's end tag may be omitted if the th element is immediately followed by a td or th element, or if there is no more content in the parent element.
        // A <dt> element's end tag may be omitted if the dt element is immediately followed by another dt element or a dd element.
        // A <dd> element's end tag may be omitted if the dd element is immediately followed by another dd element or a dt element, or if there is no more content in the parent element.
        // An <rp> element's end tag may be omitted if the rp element is immediately followed by an rt or rp element, or if there is no more content in the parent element.
        // An <optgroup> element's end tag may be omitted if the optgroup element is immediately followed by another optgroup element, or if there is no more content in the parent element.

        /**
         * @noinspection TodoComment
         *
         * TODO: Not Implemented
         */
        //
        // <html> may be omitted if first thing inside is not comment
        // <head> may be omitted if first thing inside is an element
        // <body> may be omitted if first thing inside is not space, comment, <meta>, <link>, <script>, <style> or <template>
        // <colgroup> may be omitted if first thing inside is <col>
        // <tbody> may be omitted if first thing inside is <tr>
        // A <colgroup> element's start tag may be omitted if the first thing inside the colgroup element is a col element, and if the element is not immediately preceded by another colgroup element whose end tag has been omitted. (It can't be omitted if the element is empty.)
        // A <colgroup> element's end tag may be omitted if the colgroup element is not immediately followed by ASCII whitespace or a comment.
        // A <caption> element's end tag may be omitted if the caption element is not immediately followed by ASCII whitespace or a comment.
        // A <thead> element's end tag may be omitted if the thead element is immediately followed by a tbody or tfoot element.
        // A <tbody> element's start tag may be omitted if the first thing inside the tbody element is a tr element, and if the element is not immediately preceded by a tbody, thead, or tfoot element whose end tag has been omitted. (It can't be omitted if the element is empty.)
        // A <tbody> element's end tag may be omitted if the tbody element is immediately followed by a tbody or tfoot element, or if there is no more content in the parent element.
        // A <tfoot> element's end tag may be omitted if there is no more content in the parent element.
        //
        // <-- However, a start tag must never be omitted if it has any attributes.

        /** @noinspection InArrayCanBeUsedInspection */
        return \in_array($tag_name, self::$optional_end_tags, true)
               ||
               (
                   $tag_name === 'li'
                   &&
                   (
                       $nextSibling === null
                       ||
                       (
                           $nextSibling instanceof \DOMElement
                           &&
                           $nextSibling->tagName === 'li'
                       )
                   )
               )
               ||
               (
                   $tag_name === 'optgroup'
                   &&
                   (
                       $nextSibling === null
                       ||
                       (
                           $nextSibling instanceof \DOMElement
                           &&
                           $nextSibling->tagName === 'optgroup'
                       )
                   )
               )
               ||
               (
                   $tag_name === 'rp'
                   &&
                   (
                       $nextSibling === null
                       ||
                       (
                           $nextSibling instanceof \DOMElement
                           &&
                           (
                               $nextSibling->tagName === 'rp'
                               ||
                               $nextSibling->tagName === 'rt'
                           )
                       )
                   )
               )
               ||
               (
                   $tag_name === 'tr'
                   &&
                   (
                       $nextSibling === null
                       ||
                       (
                           $nextSibling instanceof \DOMElement
                           &&
                           $nextSibling->tagName === 'tr'
                       )
                   )
               )
               ||
               (
                   $tag_name === 'source'
                   &&
                   (
                       $parent_tag_name === 'audio'
                       ||
                       $parent_tag_name === 'video'
                       ||
                       $parent_tag_name === 'picture'
                       ||
                       $parent_tag_name === 'source'
                   )
                   &&
                   (
                       $nextSibling === null
                       ||
                       (
                           $nextSibling instanceof \DOMElement
                           &&
                           $nextSibling->tagName === 'source'
                       )
                   )
               )
               ||
               (
                   (
                       $tag_name === 'td'
                       ||
                       $tag_name === 'th'
                   )
                   &&
                   (
                       $nextSibling === null
                       ||
                       (
                           $nextSibling instanceof \DOMElement
                           &&
                           (
                               $nextSibling->tagName === 'td'
                               ||
                               $nextSibling->tagName === 'th'
                           )
                       )
                   )
               )
               ||
               (
                   (
                       $tag_name === 'dd'
                       ||
                       $tag_name === 'dt'
                   )
                   &&
                   (
                       (
                           $nextSibling === null
                           &&
                           $tag_name === 'dd'
                       )
                       ||
                       (
                           $nextSibling instanceof \DOMElement
                           &&
                           (
                               $nextSibling->tagName === 'dd'
                               ||
                               $nextSibling->tagName === 'dt'
                           )
                       )
                   )
               )
               ||
               (
                   $tag_name === 'option'
                   &&
                   (
                       $nextSibling === null
                       ||
                       (
                           $nextSibling instanceof \DOMElement
                           &&
                           (
                               $nextSibling->tagName === 'option'
                               ||
                               $nextSibling->tagName === 'optgroup'
                           )
                       )
                   )
               )
               ||
               (
                   $tag_name === 'p'
                   &&
                   (
                       (
                           $nextSibling === null
                           &&
                           (
                               $node->parentNode !== null
                               &&
                               !\in_array(
                                   $node->parentNode->nodeName,
                                   [
                                       'a',
                                       'audio',
                                       'del',
                                       'ins',
                                       'map',
                                       'noscript',
                                       'video',
                                   ],
                                   true
                               )
                           )
                       )
                       ||
                       (
                           $nextSibling instanceof \DOMElement
                           &&
                           \in_array(
                               $nextSibling->tagName,
                               [
                                   'address',
                                   'article',
                                   'aside',
                                   'blockquote',
                                   'dir',
                                   'div',
                                   'dl',
                                   'fieldset',
                                   'footer',
                                   'form',
                                   'h1',
                                   'h2',
                                   'h3',
                                   'h4',
                                   'h5',
                                   'h6',
                                   'header',
                                   'hgroup',
891
                                   'hr',
892
                                   'menu',
893
                                   'nav',
894
                                   'ol',
895
                                   'p',
896
                                   'pre',
897
                                   'section',
898
                                   'table',
899
                                   'ul',
900
                               ],
901 50
                               true
902
                           )
903
                       )
904
                   )
905
               );
906
    }
907
908 51
    protected function domNodeToString(\DOMNode $node): string
909
    {
910
        // init
911 51
        $html = '';
912 51
        $emptyStringTmp = '';
913
914 51
        foreach ($node->childNodes as $child) {
915 51
            if ($emptyStringTmp === 'is_empty') {
916 28
                $emptyStringTmp = 'last_was_empty';
917
            } else {
918 51
                $emptyStringTmp = '';
919
            }
920
921 51
            if ($child instanceof \DOMDocumentType) {
922
                // add the doc-type only if it wasn't generated by DomDocument
923 12
                if (!$this->withDocType) {
924
                    continue;
925
                }
926
927 12
                if ($child->name) {
928 12
                    if (!$child->publicId && $child->systemId) {
929
                        $tmpTypeSystem = 'SYSTEM';
930
                        $tmpTypePublic = '';
931
                    } else {
932 12
                        $tmpTypeSystem = '';
933 12
                        $tmpTypePublic = 'PUBLIC';
934
                    }
935
936 12
                    $html .= '<!DOCTYPE ' . $child->name . ''
937 12
                             . ($child->publicId ? ' ' . $tmpTypePublic . ' "' . $child->publicId . '"' : '')
938 12
                             . ($child->systemId ? ' ' . $tmpTypeSystem . ' "' . $child->systemId . '"' : '')
939 12
                             . '>';
940
                }
941 51
            } elseif ($child instanceof \DOMElement) {
942 51
                $html .= \rtrim('<' . $child->tagName . ' ' . $this->domNodeAttributesToString($child));
943 51
                $html .= '>' . $this->domNodeToString($child);
944
945
                if (
946 51
                    !$this->doRemoveOmittedHtmlTags
947
                    ||
948 51
                    !$this->domNodeClosingTagOptional($child)
949
                ) {
950 45
                    $html .= '</' . $child->tagName . '>';
951
                }
952
953 51
                if (!$this->doRemoveWhitespaceAroundTags) {
954
                    /** @noinspection NestedPositiveIfStatementsInspection */
955
                    if (
956 50
                        $child->nextSibling instanceof \DOMText
957
                        &&
958 50
                        $child->nextSibling->wholeText === ' '
959
                    ) {
960
                        if (
961 27
                            $emptyStringTmp !== 'last_was_empty'
962
                            &&
963 27
                            \substr($html, -1) !== ' '
964
                        ) {
965 27
                            $html = \rtrim($html);
966
967
                            if (
968 27
                                $child->parentNode
969
                                &&
970 27
                                $child->parentNode->nodeName !== 'head'
971
                            ) {
972 27
                                $html .= ' ';
973
                            }
974
                        }
975 51
                        $emptyStringTmp = 'is_empty';
976
                    }
977
                }
978 47
            } elseif ($child instanceof \DOMText) {
979 47
                if ($child->isElementContentWhitespace()) {
980
                    if (
981 31
                        $child->previousSibling !== null
982
                        &&
983 31
                        $child->nextSibling !== null
984
                    ) {
985
                        if (
986
                            (
987 22
                                $child->wholeText
988
                                &&
989 22
                                \strpos($child->wholeText, ' ') !== false
990
                            )
991
                            ||
992
                            (
993
                                $emptyStringTmp !== 'last_was_empty'
994
                                &&
995 22
                                \substr($html, -1) !== ' '
996
                            )
997
                        ) {
998 22
                            $html = \rtrim($html);
999
1000
                            if (
1001 22
                                $child->parentNode
1002
                                &&
1003 22
                                $child->parentNode->nodeName !== 'head'
1004
                            ) {
1005 22
                                $html .= ' ';
1006
                            }
1007
                        }
1008 31
                        $emptyStringTmp = 'is_empty';
1009
                    }
1010
                } else {
1011 47
                    $html .= $child->wholeText;
1012
                }
1013 1
            } elseif ($child instanceof \DOMComment) {
1014 51
                $html .= '<!--' . $child->textContent . '-->';
1015
            }
1016
        }
1017
1018 51
        return $html;
1019
    }
1020
1021
    /**
1022
     * @return array
1023
     */
1024
    public function getDomainsToRemoveHttpPrefixFromAttributes(): array
1025
    {
1026
        return $this->domainsToRemoveHttpPrefixFromAttributes;
1027
    }
1028
1029
    /**
1030
     * @return bool
1031
     */
1032
    public function isDoOptimizeAttributes(): bool
1033
    {
1034
        return $this->doOptimizeAttributes;
1035
    }
1036
1037
    /**
1038
     * @return bool
1039
     */
1040
    public function isDoOptimizeViaHtmlDomParser(): bool
1041
    {
1042
        return $this->doOptimizeViaHtmlDomParser;
1043
    }
1044
1045
    /**
1046
     * @return bool
1047
     */
1048
    public function isDoRemoveComments(): bool
1049
    {
1050
        return $this->doRemoveComments;
1051
    }
1052
1053
    /**
1054
     * @return bool
1055
     */
1056 34
    public function isDoRemoveDefaultAttributes(): bool
1057
    {
1058 34
        return $this->doRemoveDefaultAttributes;
1059
    }
1060
1061
    /**
1062
     * @return bool
1063
     */
1064 34
    public function isDoRemoveDeprecatedAnchorName(): bool
1065
    {
1066 34
        return $this->doRemoveDeprecatedAnchorName;
1067
    }
1068
1069
    /**
1070
     * @return bool
1071
     */
1072 34
    public function isDoRemoveDeprecatedScriptCharsetAttribute(): bool
1073
    {
1074 34
        return $this->doRemoveDeprecatedScriptCharsetAttribute;
1075
    }
1076
1077
    /**
1078
     * @return bool
1079
     */
1080 34
    public function isDoRemoveDeprecatedTypeFromScriptTag(): bool
1081
    {
1082 34
        return $this->doRemoveDeprecatedTypeFromScriptTag;
1083
    }
1084
1085
    /**
1086
     * @return bool
1087
     */
1088 34
    public function isDoRemoveDeprecatedTypeFromStylesheetLink(): bool
1089
    {
1090 34
        return $this->doRemoveDeprecatedTypeFromStylesheetLink;
1091
    }
1092
1093
    /**
1094
     * @return bool
1095
     */
1096 34
    public function isDoRemoveEmptyAttributes(): bool
1097
    {
1098 34
        return $this->doRemoveEmptyAttributes;
1099
    }
1100
1101
    /**
1102
     * @return bool
1103
     */
1104 34
    public function isDoRemoveHttpPrefixFromAttributes(): bool
1105
    {
1106 34
        return $this->doRemoveHttpPrefixFromAttributes;
1107
    }
1108
1109
    /**
1110
     * @return bool
1111
     */
1112 34
    public function isDoRemoveHttpsPrefixFromAttributes(): bool
1113
    {
1114 34
        return $this->doRemoveHttpsPrefixFromAttributes;
1115
    }
1116
1117
    /**
1118
     * @return bool
1119
     */
1120 4
    public function isKeepPrefixOnExternalAttributes(): bool
1121
    {
1122 4
        return $this->keepPrefixOnExternalAttributes;
1123
    }
1124
1125
    /**
1126
     * @return bool
1127
     */
1128 34
    public function isDoMakeSameDomainLinksRelative(): bool
1129
    {
1130 34
        return $this->doMakeSameDomainLinksRelative;
1131
    }
1132
1133
    /**
1134
     * @return bool
1135
     */
1136
    public function isDoRemoveOmittedHtmlTags(): bool
1137
    {
1138
        return $this->doRemoveOmittedHtmlTags;
1139
    }
1140
1141
    /**
1142
     * @return bool
1143
     */
1144
    public function isDoRemoveOmittedQuotes(): bool
1145
    {
1146
        return $this->doRemoveOmittedQuotes;
1147
    }
1148
1149
    /**
1150
     * @return bool
1151
     */
1152
    public function isDoRemoveSpacesBetweenTags(): bool
1153
    {
1154
        return $this->doRemoveSpacesBetweenTags;
1155
    }
1156
1157
    /**
1158
     * @return bool
1159
     */
1160 34
    public function isDoRemoveValueFromEmptyInput(): bool
1161
    {
1162 34
        return $this->doRemoveValueFromEmptyInput;
1163
    }
1164
1165
    /**
1166
     * @return bool
1167
     */
1168
    public function isDoRemoveWhitespaceAroundTags(): bool
1169
    {
1170
        return $this->doRemoveWhitespaceAroundTags;
1171
    }
1172
1173
    /**
1174
     * @return bool
1175
     */
1176 34
    public function isDoSortCssClassNames(): bool
1177
    {
1178 34
        return $this->doSortCssClassNames;
1179
    }
1180
1181
    /**
1182
     * @return bool
1183
     */
1184 34
    public function isDoSortHtmlAttributes(): bool
1185
    {
1186 34
        return $this->doSortHtmlAttributes;
1187
    }
1188
1189
    /**
1190
     * @return bool
1191
     */
1192
    public function isDoSumUpWhitespace(): bool
1193
    {
1194
        return $this->doSumUpWhitespace;
1195
    }
1196
1197
    /**
1198
     * @param string $html
1199
     * @param bool   $multiDecodeNewHtmlEntity
1200
     *
1201
     * @return string
1202
     */
1203 55
    public function minify($html, $multiDecodeNewHtmlEntity = false): string
1204
    {
1205 55
        $html = (string) $html;
1206 55
        if (!isset($html[0])) {
1207 1
            return '';
1208
        }
1209
1210 55
        $html = \trim($html);
1211 55
        if ($html === '') {
1212 3
            return '';
1213
        }
1214
1215
        // reset
1216 52
        $this->protectedChildNodes = [];
1217
1218
        // save old content
1219 52
        $origHtml = $html;
1220 52
        $origHtmlLength = \strlen($html);
1221
1222
        // -------------------------------------------------------------------------
1223
        // Minify the HTML via "HtmlDomParser"
1224
        // -------------------------------------------------------------------------
1225
1226 52
        if ($this->doOptimizeViaHtmlDomParser) {
1227 51
            $html = $this->minifyHtmlDom($html, $multiDecodeNewHtmlEntity);
1228
        }
1229
1230
        // -------------------------------------------------------------------------
1231
        // Trim whitespace from html-string. [protected html is still protected]
1232
        // -------------------------------------------------------------------------
1233
1234
        // Remove extra white-space(s) between HTML attribute(s)
1235 52
        if (\strpos($html, ' ') !== false) {
1236 46
            $html = (string) \preg_replace_callback(
1237 46
                '#<([^/\s<>!]+)(?:\s+([^<>]*?)\s*|\s*)(/?)>#',
1238
                static function ($matches) {
1239 46
                    return '<' . $matches[1] . \preg_replace('#([^\s=]+)(=([\'"]?)(.*?)\3)?(\s+|$)#su', ' $1$2', $matches[2]) . $matches[3] . '>';
1240 46
                },
1241 46
                $html
1242
            );
1243
        }
1244
1245 52
        if ($this->doRemoveSpacesBetweenTags) {
1246
            /** @noinspection NestedPositiveIfStatementsInspection */
1247 1
            if (\strpos($html, ' ') !== false) {
1248
                // Remove spaces that are between > and <
1249 1
                $html = (string) \preg_replace('#(>)\s(<)#', '>$2', $html);
1250
            }
1251
        }
1252
1253
        // -------------------------------------------------------------------------
1254
        // Restore protected HTML-code.
1255
        // -------------------------------------------------------------------------
1256
1257 52
        if (\strpos($html, $this->protectedChildNodesHelper) !== false) {
1258 9
            $html = (string) \preg_replace_callback(
1259 9
                '/<(?<element>' . $this->protectedChildNodesHelper . ')(?<attributes> [^>]*)?>(?<value>.*?)<\/' . $this->protectedChildNodesHelper . '>/',
1260 9
                [$this, 'restoreProtectedHtml'],
1261 9
                $html
1262
            );
1263
        }
1264
1265
        // -------------------------------------------------------------------------
1266
        // Restore protected HTML-entities.
1267
        // -------------------------------------------------------------------------
1268
1269 52
        if ($this->doOptimizeViaHtmlDomParser) {
1270 51
            $html = HtmlDomParser::putReplacedBackToPreserveHtmlEntities($html);
1271
        }
1272
1273
        // ------------------------------------
1274
        // Final clean-up
1275
        // ------------------------------------
1276
1277 52
        $html = \str_replace(
1278
            [
1279 52
                'html>' . "\n",
1280
                "\n" . '<html',
1281
                'html/>' . "\n",
1282
                "\n" . '</html',
1283
                'head>' . "\n",
1284
                "\n" . '<head',
1285
                'head/>' . "\n",
1286
                "\n" . '</head',
1287
            ],
1288
            [
1289 52
                'html>',
1290
                '<html',
1291
                'html/>',
1292
                '</html',
1293
                'head>',
1294
                '<head',
1295
                'head/>',
1296
                '</head',
1297
            ],
1298 52
            $html
1299
        );
1300
1301
        // self-closing tags don't need a trailing slash
1302 52
        $replace = [];
1303 52
        $replacement = [];
1304 52
        foreach (self::$selfClosingTags as $selfClosingTag) {
1305 52
            $replace[] = '<' . $selfClosingTag . '/>';
1306 52
            $replacement[] = '<' . $selfClosingTag . '>';
1307 52
            $replace[] = '<' . $selfClosingTag . ' />';
1308 52
            $replacement[] = '<' . $selfClosingTag . '>';
1309 52
            $replace[] = '></' . $selfClosingTag . '>';
1310 52
            $replacement[] = '>';
1311
        }
1312 52
        $html = \str_replace(
1313 52
            $replace,
1314 52
            $replacement,
1315 52
            $html
1316
        );
1317
1318
        // ------------------------------------
1319
        // check if compression worked
1320
        // ------------------------------------
1321
1322 52
        if ($origHtmlLength < \strlen($html)) {
1323
            $html = $origHtml;
1324
        }
1325
1326 52
        return $html;
1327
    }
1328
1329
    /**
1330
     * @param \DOMNode $node
1331
     *
1332
     * @return \DOMElement|null
1333
     */
1334 50
    protected function getNextSiblingOfTypeDOMElement(\DOMNode $node)
1335
    {
1336
        do {
1337
            /** @var \DOMNode|null $node - false-positive error from phpstan */
1338 50
            $node = $node->nextSibling;
1339 50
        } while (!($node === null || $node instanceof \DOMElement));
1340
1341 50
        return $node;
1342
    }
1343
1344
    /**
1345
     * Check if the current string is a conditional comment.
1346
     *
1347
     * INFO: conditional comments are no longer supported as of IE 10.
1348
     *
1349
     * <!--[if expression]> HTML <![endif]-->
1350
     * <![if expression]> HTML <![endif]>
1351
     *
1352
     * @param string $comment
1353
     *
1354
     * @return bool
1355
     */
1356 4
    private function isConditionalComment($comment): bool
1357
    {
1358 4
        if (\strpos($comment, '[if ') !== false) {
Duplication: This code seems to be duplicated across your project.

Duplicated code is one of the most pungent code smells. If you need to duplicate the same code in three or more different places, we strongly encourage you to look into extracting the code into a single class or operation.
1359
            /** @noinspection RegExpRedundantEscape */
1360 2
            if (\preg_match('/^\[if [^\]]+\]/', $comment)) {
1361 2
                return true;
1362
            }
1363
        }
1364
1365 4
        if (\strpos($comment, '[endif]') !== false) {
Duplication: This code seems to be duplicated across your project.
1366
            /** @noinspection RegExpRedundantEscape */
1367 1
            if (\preg_match('/\[endif\]$/', $comment)) {
1368 1
                return true;
1369
            }
1370
        }
1371
1372 4
        return false;
1373
    }
1374
1375
    /**
1376
     * @param string $html
1377
     * @param bool   $multiDecodeNewHtmlEntity
1378
     *
1379
     * @return string
1380
     */
1381 51
    private function minifyHtmlDom($html, $multiDecodeNewHtmlEntity): string
1382
    {
1383
        // init dom
1384 51
        $dom = new HtmlDomParser();
1385
        /** @noinspection UnusedFunctionResultInspection */
1386 51
        $dom->useKeepBrokenHtml($this->keepBrokenHtml);
1387
1388 51
        $dom->getDocument()->preserveWhiteSpace = false; // remove redundant white space
1389 51
        $dom->getDocument()->formatOutput = false; // do not format output with indentation
1390
1391
        // load dom
1392
        /** @noinspection UnusedFunctionResultInspection */
1393 51
        $dom->loadHtml($html);
1394
1395 51
        $this->withDocType = (\stripos(\ltrim($html), '<!DOCTYPE') === 0);
1396
1397
        // -------------------------------------------------------------------------
1398
        // Protect <nocompress> HTML tags first.
1399
        // -------------------------------------------------------------------------
1400
1401 51
        $dom = $this->protectTagHelper($dom, 'nocompress');
1402
1403
        // -------------------------------------------------------------------------
1404
        // Notify the Observer before the minification.
1405
        // -------------------------------------------------------------------------
1406
1407 51
        foreach ($dom->find('*') as $element) {
1408 51
            $this->notifyObserversAboutDomElementBeforeMinification($element);
1409
        }
1410
1411
        // -------------------------------------------------------------------------
1412
        // Protect HTML tags and conditional comments.
1413
        // -------------------------------------------------------------------------
1414
1415 51
        $dom = $this->protectTags($dom);
1416
1417
        // -------------------------------------------------------------------------
1418
        // Remove default HTML comments. [protected html is still protected]
1419
        // -------------------------------------------------------------------------
1420
1421 51
        if ($this->doRemoveComments) {
1422 49
            $dom = $this->removeComments($dom);
1423
        }
1424
1425
        // -------------------------------------------------------------------------
1426
        // Sum-Up extra whitespace from the Dom. [protected html is still protected]
1427
        // -------------------------------------------------------------------------
1428
1429 51
        if ($this->doSumUpWhitespace) {
1430 50
            $dom = $this->sumUpWhitespace($dom);
1431
        }
1432
1433 51
        foreach ($dom->find('*') as $element) {
1434
1435
            // -------------------------------------------------------------------------
1436
            // Remove whitespace around tags. [protected html is still protected]
1437
            // -------------------------------------------------------------------------
1438
1439 51
            if ($this->doRemoveWhitespaceAroundTags) {
1440 3
                $this->removeWhitespaceAroundTags($element);
1441
            }
1442
1443
            // -------------------------------------------------------------------------
1444
            // Notify the Observer after the minification.
1445
            // -------------------------------------------------------------------------
1446
1447 51
            $this->notifyObserversAboutDomElementAfterMinification($element);
1448
        }
1449
1450
        // -------------------------------------------------------------------------
1451
        // Convert the Dom into a string.
1452
        // -------------------------------------------------------------------------
1453
1454 51
        return $dom->fixHtmlOutput(
1455 51
            $this->domNodeToString($dom->getDocument()),
1456 51
            $multiDecodeNewHtmlEntity
1457
        );
1458
    }
1459
1460
    /**
1461
     * @param SimpleHtmlDomInterface $domElement
1462
     *
1463
     * @return void
1464
     */
1465 51
    private function notifyObserversAboutDomElementAfterMinification(SimpleHtmlDomInterface $domElement)
1466
    {
1467 51
        foreach ($this->domLoopObservers as $observer) {
1468 51
            $observer->domElementAfterMinification($domElement, $this);
1469
        }
1470 51
    }
1471
1472
    /**
1473
     * @param SimpleHtmlDomInterface $domElement
1474
     *
1475
     * @return void
1476
     */
1477 51
    private function notifyObserversAboutDomElementBeforeMinification(SimpleHtmlDomInterface $domElement)
1478
    {
1479 51
        foreach ($this->domLoopObservers as $observer) {
1480 51
            $observer->domElementBeforeMinification($domElement, $this);
1481
        }
1482 51
    }
1483
1484
    /**
1485
     * @param HtmlDomParser $dom
1486
     * @param string        $selector
1487
     *
1488
     * @return HtmlDomParser
1489
     */
1490 51
    private function protectTagHelper(HtmlDomParser $dom, string $selector): HtmlDomParser
1491
    {
1492 51
        foreach ($dom->find($selector) as $element) {
1493 5
            if ($element->isRemoved()) {
1494 1
                continue;
1495
            }
1496
1497 5
            $this->protectedChildNodes[$this->protected_tags_counter] = $element->parentNode()->innerHtml();
1498 5
            $parentNode = $element->getNode()->parentNode;
1499 5
            if ($parentNode !== null) {
1500 5
                $parentNode->nodeValue = '<' . $this->protectedChildNodesHelper . ' data-' . $this->protectedChildNodesHelper . '="' . $this->protected_tags_counter . '"></' . $this->protectedChildNodesHelper . '>';
1501
            }
1502
1503 5
            ++$this->protected_tags_counter;
1504
        }
1505
1506 51
        return $dom;
1507
    }
1508
1509
    /**
1510
     * Prevent changes of inline "styles" and "scripts".
1511
     *
1512
     * @param HtmlDomParser $dom
1513
     *
1514
     * @return HtmlDomParser
1515
     */
1516 51
    private function protectTags(HtmlDomParser $dom): HtmlDomParser
1517
    {
1518 51
        $this->protectTagHelper($dom, 'code');
1519
1520 51
        foreach ($dom->find('script, style') as $element) {
1521 7
            if ($element->isRemoved()) {
1522
                continue;
1523
            }
1524
1525 7
            if ($element->tag === 'script' || $element->tag === 'style') {
1526 7
                $attributes = $element->getAllAttributes();
1527
                // skip external links
1528 7
                if (isset($attributes['src'])) {
1529 4
                    continue;
1530
                }
1531
            }
1532
1533 5
            $this->protectedChildNodes[$this->protected_tags_counter] = $element->innerhtml;
1534 5
            $element->getNode()->nodeValue = '<' . $this->protectedChildNodesHelper . ' data-' . $this->protectedChildNodesHelper . '="' . $this->protected_tags_counter . '"></' . $this->protectedChildNodesHelper . '>';
1535
1536 5
            ++$this->protected_tags_counter;
1537
        }
1538
1539 51
        foreach ($dom->find('//comment()') as $element) {
1540 4
            if ($element->isRemoved()) {
1541
                continue;
1542
            }
1543
1544 4
            $text = $element->text();
1545
1546
            // skip normal comments
1547 4
            if (!$this->isConditionalComment($text)) {
1548 4
                continue;
1549
            }
1550
1551 2
            $this->protectedChildNodes[$this->protected_tags_counter] = '<!--' . $text . '-->';
1552
1553
            /* @var $node \DOMComment */
1554 2
            $node = $element->getNode();
1555 2
            $child = new \DOMText('<' . $this->protectedChildNodesHelper . ' data-' . $this->protectedChildNodesHelper . '="' . $this->protected_tags_counter . '"></' . $this->protectedChildNodesHelper . '>');
1556 2
            $parentNode = $element->getNode()->parentNode;
1557 2
            if ($parentNode !== null) {
1558 2
                $parentNode->replaceChild($child, $node);
1559
            }
1560
1561 2
            ++$this->protected_tags_counter;
1562
        }
1563
1564 51
        return $dom;
1565
    }
1566
1567
    /**
1568
     * Remove comments in the dom.
1569
     *
1570
     * @param HtmlDomParser $dom
1571
     *
1572
     * @return HtmlDomParser
1573
     */
1574 49
    private function removeComments(HtmlDomParser $dom): HtmlDomParser
1575
    {
1576 49
        foreach ($dom->find('//comment()') as $commentWrapper) {
1577 3
            $comment = $commentWrapper->getNode();
1578 3
            $val = $comment->nodeValue;
1579 3
            if (\strpos($val, '[') === false) {
1580 3
                $parentNode = $comment->parentNode;
1581 3
                if ($parentNode !== null) {
1582 3
                    $parentNode->removeChild($comment);
1583
                }
1584
            }
1585
        }
1586
1587 49
        $dom->getDocument()->normalizeDocument();
1588
1589 49
        return $dom;
1590
    }
1591
1592
    /**
1593
     * Trim tags in the dom.
1594
     *
1595
     * @param SimpleHtmlDomInterface $element
1596
     *
1597
     * @return void
1598
     */
1599 3
    private function removeWhitespaceAroundTags(SimpleHtmlDomInterface $element)
1600
    {
1601 3
        if (isset(self::$trimWhitespaceFromTags[$element->tag])) {
Bug: Accessing the tag property on the interface voku\helper\SimpleHtmlDomInterface suggests that you are coding against a concrete implementation. Consider adding an instanceof check.

If you access a property on an interface, you are most likely coding against a concrete implementation of the interface.

Available fixes

  1. Add an additional type check:

    interface SomeInterface { }
    class SomeClass implements SomeInterface {
        public $a;
    }

    function someFunction(SomeInterface $object) {
        if ($object instanceof SomeClass) {
            $a = $object->a;
        }
    }

  2. Change the type hint:

    interface SomeInterface { }
    class SomeClass implements SomeInterface {
        public $a;
    }

    function someFunction(SomeClass $object) {
        $a = $object->a;
    }
1602 1
            $node = $element->getNode();
1603
1604
            /** @var \DOMNode[] $candidates */
1605 1
            $candidates = [];
1606 1
            if ($node->childNodes->length > 0) {
1607 1
                $candidates[] = $node->firstChild;
1608 1
                $candidates[] = $node->lastChild;
1609 1
                $candidates[] = $node->previousSibling;
1610 1
                $candidates[] = $node->nextSibling;
1611
            }
1612
1613
            /** @var mixed $candidate - false-positive error from phpstan */
1614 1
            foreach ($candidates as &$candidate) {
1615 1
                if ($candidate === null) {
1616
                    continue;
1617
                }
1618
1619 1
                if ($candidate->nodeType === \XML_TEXT_NODE) {
1620 1
                    $nodeValueTmp = \preg_replace(self::$regExSpace, ' ', $candidate->nodeValue);
1621 1
                    if ($nodeValueTmp !== null) {
1622 1
                        $candidate->nodeValue = $nodeValueTmp;
1623
                    }
1624
                }
1625
            }
1626
        }
1627 3
    }
1628
1629
    /**
1630
     * Callback function for preg_replace_callback use.
1631
     *
1632
     * @param array $matches PREG matches
1633
     *
1634
     * @return string
1635
     */
1636 9
    private function restoreProtectedHtml($matches): string
1637
    {
1638 9
        \preg_match('/.*"(?<id>\d*)"/', $matches['attributes'], $matchesInner);
1639
1640 9
        return $this->protectedChildNodes[$matchesInner['id']] ?? '';
1641
    }
1642
1643
    /**
1644
     * @param array $domainsToRemoveHttpPrefixFromAttributes
1645
     *
1646
     * @return $this
1647
     */
1648 2
    public function setDomainsToRemoveHttpPrefixFromAttributes($domainsToRemoveHttpPrefixFromAttributes): self
1649
    {
1650 2
        $this->domainsToRemoveHttpPrefixFromAttributes = $domainsToRemoveHttpPrefixFromAttributes;
1651
1652 2
        return $this;
1653
    }
1654
1655
    /**
1656
     * Sum-up extra whitespace from dom-nodes.
1657
     *
1658
     * @param HtmlDomParser $dom
1659
     *
1660
     * @return HtmlDomParser
1661
     */
1662 50
    private function sumUpWhitespace(HtmlDomParser $dom): HtmlDomParser
1663
    {
1664 50
        $text_nodes = $dom->find('//text()');
1665 50
        foreach ($text_nodes as $text_node_wrapper) {
1666
            /* @var $text_node \DOMNode */
1667 46
            $text_node = $text_node_wrapper->getNode();
1668 46
            $xp = $text_node->getNodePath();
1669 46
            if ($xp === null) {
1670
                continue;
1671
            }
1672
1673 46
            $doSkip = false;
1674 46
            foreach (self::$skipTagsForRemoveWhitespace as $pattern) {
1675 46
                if (\strpos($xp, '/' . $pattern) !== false) {
1676 8
                    $doSkip = true;
1677
1678 46
                    break;
1679
                }
1680
            }
1681 46
            if ($doSkip) {
1682 8
                continue;
1683
            }
1684
1685 43
            $nodeValueTmp = \preg_replace(self::$regExSpace, ' ', $text_node->nodeValue);
1686 43
            if ($nodeValueTmp !== null) {
1687 43
                $text_node->nodeValue = $nodeValueTmp;
1688
            }
1689
        }
1690
1691 50
        $dom->getDocument()->normalizeDocument();
1692
1693 50
        return $dom;
1694
    }
1695
1696
    /**
1697
     * WARNING: this may hurt performance.
1698
     *
1699
     * @param bool $keepBrokenHtml
1700
     *
1701
     * @return HtmlMin
1702
     */
1703 2
    public function useKeepBrokenHtml(bool $keepBrokenHtml): self
1704
    {
1705 2
        $this->keepBrokenHtml = $keepBrokenHtml;
1706
1707 2
        return $this;
1708
    }
1709
}
1710
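
The protect-and-restore flow that minify(), protectTags() and restoreProtectedHtml() implement can be illustrated with a small stand-alone sketch. This is not the class's own code: HtmlMin works on a parsed DOM through HtmlDomParser, while the version below uses only regular expressions so that it runs without any dependency. The function name sketchMinifyWithProtection, the marker name sketch-protected and the data-id attribute are assumptions made up for this example.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical stand-alone sketch (not part of HtmlMin): swap <pre>, <script>
 * and <style> elements for numbered placeholders, collapse whitespace in the
 * remaining markup, then put the protected elements back unchanged.
 */
function sketchMinifyWithProtection(string $html): string
{
    $placeholder = 'sketch-protected'; // assumed marker name, chosen for this example
    $protected = [];

    // 1. Protect: remember each matched element verbatim and leave a marker behind.
    $html = (string) \preg_replace_callback(
        '#<(pre|script|style)\b[^>]*>.*?</\1>#is',
        static function (array $m) use (&$protected, $placeholder): string {
            $id = \count($protected);
            $protected[$id] = $m[0];

            return '<' . $placeholder . ' data-id="' . $id . '"></' . $placeholder . '>';
        },
        $html
    );

    // 2. Minify: collapse whitespace runs, then drop whitespace between tags.
    $html = (string) \preg_replace('#\s+#', ' ', $html);
    $html = (string) \preg_replace('#>\s+<#', '><', $html);

    // 3. Restore: look up every marker by its numeric id.
    return (string) \preg_replace_callback(
        '#<' . $placeholder . ' data-id="(\d+)"></' . $placeholder . '>#',
        static function (array $m) use ($protected): string {
            return $protected[(int) $m[1]] ?? '';
        },
        \trim($html)
    );
}

$input = "<div>\n  <p>Hello   world</p>\n  <pre>  keep\n  this  </pre>\n</div>";
echo sketchMinifyWithProtection($input), "\n";
```

The whitespace inside the pre element survives because the minify step only ever sees the placeholder. Dropping all whitespace between tags is a simplification that can change inline rendering; in the class above that behaviour is applied only when doRemoveSpacesBetweenTags is enabled.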