Completed
Push — master ( c17796...052b15 )
by Damian
01:29
created

Diff::compareHTML()   F

Complexity

Conditions 26
Paths 1386

Size

Total Lines 100
Code Lines 67

Duplication

Lines 0
Ratio 0 %

Importance

Changes 0
Metric  Value
cc      26
eloc    67
nc      1386
nop     3
dl      0
loc     100
rs      2
c       0
b       0
f       0

How to fix

Long Method

Small methods make your code easier to understand, in particular if combined with a good name. Besides, if your method is small, finding a good name is usually much easier.

For example, if you find yourself adding comments to a method's body, this is usually a good sign to extract the commented part to a new method, and use the comment as a starting point when coming up with a good name for this new method.

Commonly applied refactorings include:
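
Extract Method is one refactoring commonly applied to long methods. The following is a minimal sketch, assuming the switch over `$edit->type` in `compareHTML()` is the part being pulled out. The helper name `chunksForEdit()` and the stand-in edit object are hypothetical and not part of the reviewed class. Unlike the original switch, which silently skips unknown types, this version throws.

```php
<?php
// Hypothetical helper, not part of the reviewed class: maps one diff edit to
// [lookForTag, chunks for list 1, chunks for list 2].
function chunksForEdit(object $edit): array
{
    switch ($edit->type) {
        case 'copy':
            return [false, $edit->orig, $edit->orig];
        case 'change':
            return [true, $edit->orig, $edit->final];
        case 'add':
            return [true, null, $edit->final];
        case 'delete':
            return [true, $edit->orig, null];
    }
    // Design choice of this sketch: fail loudly on an unexpected type.
    throw new InvalidArgumentException("Unknown edit type: {$edit->type}");
}

// Usage with a stand-in edit object (real edits would come from the diff library).
$edit = (object) ['type' => 'add', 'orig' => false, 'final' => ['<p>', 'new']];
[$lookForTag, $from, $to] = chunksForEdit($edit);
echo json_encode([$lookForTag, $from, $to]), PHP_EOL;
```

With the mapping in its own function, the loop body in the caller shrinks to the tag-building logic, and the extracted piece can be tested in isolation.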

  1  <?php

Coding Style Compatibility:
For compatibility and reusability of your code, PSR1 recommends that a file should introduce either new symbols (like classes, functions, etc.) or have side-effects (like outputting something, or including other files), but not both at the same time. The first symbol is defined on line 14 and the first side effect is on line 9.

The PSR-1: Basic Coding Standard recommends that a file should either introduce new symbols, that is classes, functions, constants or similar, or have side effects. Side effects are anything that executes logic, like for example printing output, changing ini settings or writing to a file.

The idea behind this recommendation is that merely auto-loading a class should not change the state of an application. It also promotes a cleaner style of programming and makes your code less prone to errors, because the logic is not spread out all over the place.

To learn more about the PSR-1, please see the PHP-FIG site on the PSR-1.
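
One way to move a side effect such as a `require_once` out of a class file is to load dependencies from a separate bootstrap file. The following is only a sketch under stated assumptions: the file name `bootstrap.php` and the one-class-per-file layout under `lib/` are hypothetical and not taken from the reviewed project.

```php
<?php
// bootstrap.php (hypothetical file name): side effects only, no declarations.
// Registering an autoloader changes application state, so under PSR-1 it
// belongs here rather than next to a class definition.
spl_autoload_register(static function (string $class): void {
    // Assumed layout for this sketch: one class per file under ./lib,
    // with namespace separators mapped to directory separators.
    $file = __DIR__ . '/lib/' . str_replace('\\', '/', $class) . '.php';
    if (is_file($file)) {
        require_once $file;
    }
});

// A class file would then contain only its namespace, use statements and
// the class declaration, with no require_once of its own.
```

The closure keeps the bootstrap file free of named symbols, so the file has side effects only.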

  2
  3  namespace SilverStripe\View\Parsers;
  4
  5  use InvalidArgumentException;
  6  use SilverStripe\Core\Convert;
  7  use SilverStripe\Core\Injector\Injector;
  8
  9  require_once 'difflib/difflib.php';
 10
 11  /**
 12   * Class representing a 'diff' between two sequences of strings.
 13   */
 14  class Diff extends \Diff

Bug:
The type Diff was not found. Maybe you did not declare it correctly or list all dependencies?

The issue could also be caused by a filter entry in the build configuration. If the path has been excluded in your configuration, e.g. excluded_paths: ["lib/*"], you can move it to the dependency path list as follows:

filter:
    dependency_paths: ["lib/*"]

For further information see https://scrutinizer-ci.com/docs/tools/php/php-scrutinizer/#list-dependency-paths

 15  {
 16      public static $html_cleaner_class = null;
 17
 18      /**
 19       *  Attempt to clean invalid HTML, which messes up diffs.
 20       *  This cleans code if possible, using an instance of HTMLCleaner
 21       *
 22       *  NB: By default, only extremely simple tidying is performed,
 23       *  by passing through DomDocument::loadHTML and saveXML
 24       *
 25       * @param string $content HTML content
 26       * @param HTMLCleaner $cleaner Optional instance of a HTMLCleaner class to
 27       *    use, overriding self::$html_cleaner_class
 28       * @return mixed|string
 29       */
 30      public static function cleanHTML($content, $cleaner = null)
 31      {
 32          if (!$cleaner) {
 33              if (self::$html_cleaner_class && class_exists(self::$html_cleaner_class)) {
 34                  $cleaner = Injector::inst()->create(self::$html_cleaner_class);
 35              } else {
 36                  //load cleaner if the dependent class is available
 37                  $cleaner = HTMLCleaner::inst();
 38              }
 39          }
 40
 41          if ($cleaner) {
 42              $content = $cleaner->cleanHTML($content);
 43          } else {
 44              // At most basic level of cleaning, use DOMDocument to save valid XML.
 45              $doc = HTMLValue::create($content);

Bug:
$content of type string is incompatible with the type array expected by parameter $args of SilverStripe\View\ViewableData::create().

If this is a false positive, you can ignore this issue in your code via the ignore-type annotation:

 45              $doc = HTMLValue::create(/** @scrutinizer ignore-type */ $content);

 46              $content = $doc->getContent();
 47          }
 48
 49          // Remove empty <ins /> and <del /> tags because browsers hate them
 50          $content = preg_replace('/<(ins|del)[^>]*\/>/', '', $content);
 51
 52          return $content;
 53      }
 54
 55      /**
 56       * @param string $from
 57       * @param string $to
 58       * @param bool $escape
 59       * @return string
 60       */
 61      public static function compareHTML($from, $to, $escape = false)
 62      {
 63          // First split up the content into words and tags
 64          $set1 = self::getHTMLChunks($from);
 65          $set2 = self::getHTMLChunks($to);
 66
 67          // Diff that
 68          $diff = new Diff($set1, $set2);
 69
 70          $tagStack[1] = $tagStack[2] = 0;

Comprehensibility Best Practice:
$tagStack was never initialized. Although not strictly required by PHP, it is generally a good practice to add $tagStack = array(); before regardless.

 71          $rechunked[1] = $rechunked[2] = array();

Comprehensibility Best Practice:
$rechunked was never initialized. Although not strictly required by PHP, it is generally a good practice to add $rechunked = array(); before regardless.

 72
 73          // Go through everything, converting edited tags (and their content) into single chunks.  Otherwise
 74          // the generated HTML gets crusty
 75          foreach ($diff->edits as $edit) {
 76              $lookForTag = false;
 77              $stuffFor = [];
 78              switch ($edit->type) {
 79                  case 'copy':
 80                      $lookForTag = false;
 81                      $stuffFor[1] = $edit->orig;
 82                      $stuffFor[2] = $edit->orig;
 83                      break;
 84
 85                  case 'change':
 86                      $lookForTag = true;
 87                      $stuffFor[1] = $edit->orig;
 88                      $stuffFor[2] = $edit->final;
 89                      break;
 90
 91                  case 'add':
 92                      $lookForTag = true;
 93                      $stuffFor[1] = null;
 94                      $stuffFor[2] = $edit->final;
 95                      break;
 96
 97                  case 'delete':
 98                      $lookForTag = true;
 99                      $stuffFor[1] = $edit->orig;
100                      $stuffFor[2] = null;
101                      break;
102              }
103
104              foreach ($stuffFor as $listName => $chunks) {
105                  if ($chunks) {
106                      foreach ($chunks as $item) {
107                          // $tagStack > 0 indicates that we should be tag-building
108                          if ($tagStack[$listName]) {
109                              $rechunked[$listName][sizeof($rechunked[$listName])-1] .= ' ' . $item;

Bug:
The call to sizeof() has too few arguments starting with mode.

If this is a false positive, you can ignore this issue in your code via the ignore-call annotation:

109                              $rechunked[$listName][/** @scrutinizer ignore-call */ sizeof($rechunked[$listName])-1] .= ' ' . $item;

This check compares calls to functions or methods with their respective definitions. If a call has fewer arguments than are defined, it raises an issue.

If a function is defined several times with a different number of parameters, the check may pick up the wrong definition and report false positives. One codebase where this has been known to happen is WordPress. Please note the @ignore annotation hint above.

110                          } else {
111                              $rechunked[$listName][] = $item;
112                          }
113
114                          if ($lookForTag
115                              && !$tagStack[$listName]
116                              && isset($item[0])
117                              && $item[0] == "<"
118                              && substr($item, 0, 2) != "</"
119                          ) {
120                              $tagStack[$listName] = 1;
121                          } elseif ($tagStack[$listName]) {
122                              if (substr($item, 0, 2) == "</") {
123                                  $tagStack[$listName]--;
124                              } elseif (isset($item[0]) && $item[0] == "<") {
125                                  $tagStack[$listName]++;
126                              }
127                          }
128                      }
129                  }
130              }
131          }
132
133          // Diff the re-chunked data, turning it into marked-up HTML
134          $diff = new Diff($rechunked[1], $rechunked[2]);
135          $content = '';
136          foreach ($diff->edits as $edit) {
137              $orig = ($escape) ? Convert::raw2xml($edit->orig) : $edit->orig;
138              $final = ($escape) ? Convert::raw2xml($edit->final) : $edit->final;
139
140              switch ($edit->type) {
141                  case 'copy':
142                      $content .= " " . implode(" ", $orig) . " ";

Bug:
It seems like $orig can also be of type string; however, parameter $pieces of implode() only seems to accept array. Maybe add an additional type check?

If this is a false positive, you can ignore this issue in your code via the ignore-type annotation:

142                      $content .= " " . implode(" ", /** @scrutinizer ignore-type */ $orig) . " ";

143                      break;
144
145                  case 'change':
146                      $content .= " <ins>" . implode(" ", $final) . "</ins> ";
147                      $content .= " <del>" . implode(" ", $orig) . "</del> ";
148                      break;
149
150                  case 'add':
151                      $content .= " <ins>" . implode(" ", $final) . "</ins> ";
152                      break;
153
154                  case 'delete':
155                      $content .= " <del>" . implode(" ", $orig) . "</del> ";
156                      break;
157              }
158          }
159
160          return self::cleanHTML($content);
161      }
162
163      /**
164       * @param string|bool|array $content If passed as an array, values will be concatenated with a comma.
165       * @return array
166       */
167      public static function getHTMLChunks($content)
168      {
169          if ($content && !is_string($content) && !is_array($content) && !is_numeric($content) && !is_bool($content)) {
170              throw new InvalidArgumentException('$content parameter needs to be a string or array');
171          }
172          if (is_bool($content)) {
173              // Convert boolean to strings
174              $content = $content ? "true" : "false";
175          }
176          if (is_array($content)) {
177              // Convert array to CSV
178              $content = implode(',', $content);
179          }
180
181          $content = str_replace(array("&nbsp;", "<", ">"), array(" "," <", "> "), $content);
182          $candidateChunks = preg_split("/[\t\r\n ]+/", $content);
183          $chunks = [];
184          for ($i = 0; $i < count($candidateChunks); $i++) {

Performance Best Practice:
It seems like you are calling the size function count() as part of the test condition. You might want to compute the size beforehand, and not on each iteration.

If the size of the collection does not change during the iteration, it is generally a good practice to compute it beforehand, and not on each iteration:

for ($i=0; $i<count($array); $i++) { // calls count() on each iteration
}

// Better
for ($i=0, $c=count($array); $i<$c; $i++) { // calls count() just once
}

Bug:
It seems like $candidateChunks can also be of type false; however, parameter $var of count() only seems to accept Countable|array. Maybe add an additional type check?

If this is a false positive, you can ignore this issue in your code via the ignore-type annotation:

184          for ($i = 0; $i < count(/** @scrutinizer ignore-type */ $candidateChunks); $i++) {
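
An alternative to suppressing the warning is an explicit check of the `preg_split()` result, combined with reading the count once before the loop. This is a minimal sketch, not the project's fix: the helper name `splitOnWhitespace()` and the choice to throw a `RuntimeException` are assumptions made for illustration.

```php
<?php
// Hypothetical helper: splits on whitespace and fails loudly instead of
// passing false on to count().
function splitOnWhitespace(string $content): array
{
    $parts = preg_split('/[\t\r\n ]+/', $content);
    if ($parts === false) {
        throw new RuntimeException('preg_split() failed, error code ' . preg_last_error());
    }
    return $parts;
}

$candidateChunks = splitOnWhitespace("<p> Hello world </p>");

// The chunk count is read once, before the loop starts.
for ($i = 0, $c = count($candidateChunks); $i < $c; $i++) {
    echo $i, ': ', $candidateChunks[$i], PHP_EOL;
}
```

Because the helper always returns an array, the loop no longer needs an ignore-type annotation.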

185              $item = $candidateChunks[$i];
186              if (isset($item[0]) && $item[0] == "<") {
187                  $newChunk = $item;
188                  while ($item[strlen($item)-1] != ">") {
189                      if (++$i >= count($candidateChunks)) {
190                          break;
191                      }
192                      $item = $candidateChunks[$i];
193                      $newChunk .= ' ' . $item;
194                  }
195                  $chunks[] = $newChunk;
196              } else {
197                  $chunks[] = $item;
198              }
199          }
200          return $chunks;
201      }
202  }
203