Passed
Push — master ( d7540c...231250 )
by Josh
05:57
created

ListDiffLines   B

Complexity

Total Complexity 47

Size/Duplication

Total Lines 343
Duplicated Lines 13.99 %

Coupling/Cohesion

Components 1
Dependencies 7

Test Coverage

Coverage 97.4%

Importance

Changes 1
Bugs 0 Features 0
Metric Value
dl 48
loc 343
ccs 150
cts 154
cp 0.974
rs 8.439
c 1
b 0
f 0
wmc 47
lcom 1
cbo 7

11 Methods

Rating   Name   Duplication   Size   Complexity  
A build() 0 15 3
A listByLines() 0 14 1
A findListNode() 0 4 1
A deleteListItem() 7 7 1
A addListItem() 7 7 2
A addClassToNode() 0 4 1
A create() 10 10 2
A getListTextArray() 0 9 2
C getListItemOperations() 0 59 9
B getRelevantNodeText() 0 18 5
F processOperations() 24 97 20

How to fix

Duplicated Code

Duplicate code is one of the most pungent code smells. A common rule of thumb is to restructure code once it is duplicated in three or more places.
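
As a minimal sketch of the kind of restructuring this rule points to: the loops flagged as duplicated in processOperations() all walk a 1-based, inclusive range of list items and append the rendered result. Assuming plain arrays of strings stand in for the DOM nodes, that loop could be pulled into one helper. The name renderRange() and the $markDeleted callback are hypothetical and not part of Caxy\HtmlDiff.

```php
<?php

/**
 * Render the 1-based, inclusive range [$start, $end] of $items with $render.
 * Hypothetical helper for illustration; not part of Caxy\HtmlDiff.
 */
function renderRange(array $items, int $start, int $end, callable $render): string
{
    $output = '';
    for ($i = $start; $i <= $end; $i++) {
        // Operation positions are 1-based, PHP arrays are 0-based.
        $output .= $render($items[$i - 1]);
    }

    return $output;
}

// Plain strings stand in for list item nodes in this sketch.
$old = ['<li>a</li>', '<li>b</li>', '<li>c</li>'];
$markDeleted = function (string $li): string {
    return '<del>' . $li . '</del>';
};

echo renderRange($old, 2, 3, $markDeleted), PHP_EOL;
// prints "<del><li>b</li></del><del><li>c</li></del>"
```

With a helper of this shape, each of the flagged loops would shrink to a single call that differs only in the range and the callback passed in.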

Common duplication problems, and corresponding solutions are:

Complex Class

Tip: Before tackling complexity, eliminate any duplication first. This can often reduce the size of classes significantly.

Complex classes like ListDiffLines often do a lot of different things. To break such a class down, we need to identify a cohesive component within that class. A common way to find such a component is to look for fields and methods that share the same prefixes or suffixes. You can also look at the cohesion graph to spot any unconnected or weakly connected components.

Once you have determined the fields that belong together, you can apply the Extract Class refactoring. If the component makes sense as a subclass, Extract Subclass is also a candidate, and is often faster.

While breaking up the class, it is a good idea to analyze how other classes use ListDiffLines and, based on these observations, to apply Extract Interface as well.
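
A hedged sketch of what Extract Class and Extract Interface might look like here: the CLASS_LIST_ITEM_* constants and the addClassToNode(), addListItem() and deleteListItem() methods share a "list item" theme, so they are one plausible cohesive component. The names ListItemMarker and ListItemMarkerInterface are hypothetical, and the sketch works on plain strings rather than \simple_html_dom_node objects so that it runs on its own.

```php
<?php

/** Hypothetical interface, named for illustration only. */
interface ListItemMarkerInterface
{
    public function appendClass(string $existing, string $class): string;

    public function wrapAdded(string $innerHtml): string;

    public function wrapDeleted(string $innerHtml): string;
}

/**
 * Hypothetical class extracted from ListDiffLines: it owns the CSS class
 * names and the <ins>/<del> wrapping, and knows nothing about LCS matching.
 */
final class ListItemMarker implements ListItemMarkerInterface
{
    const CLASS_ADDED = 'normal new';
    const CLASS_DELETED = 'removed';
    const CLASS_CHANGED = 'replacement';
    const CLASS_NONE = 'normal';

    public function appendClass(string $existing, string $class): string
    {
        // trim() drops the leading space when there was no class before.
        return trim($existing . ' ' . $class);
    }

    public function wrapAdded(string $innerHtml): string
    {
        return sprintf('<ins>%s</ins>', $innerHtml);
    }

    public function wrapDeleted(string $innerHtml): string
    {
        return sprintf('<del>%s</del>', $innerHtml);
    }
}

$marker = new ListItemMarker();
echo $marker->appendClass('', ListItemMarker::CLASS_DELETED), PHP_EOL;   // prints "removed"
echo $marker->appendClass('item', ListItemMarker::CLASS_ADDED), PHP_EOL; // prints "item normal new"
echo $marker->wrapDeleted('old text'), PHP_EOL;                          // prints "<del>old text</del>"
```

Callers would then depend on the interface rather than on ListDiffLines itself, which is the point of applying Extract Interface after the split.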

<?php

namespace Caxy\HtmlDiff;

use Caxy\HtmlDiff\Strategy\ListItemMatchStrategy;
use Sunra\PhpSimple\HtmlDomParser;

class ListDiffLines extends AbstractDiff
{
    const CLASS_LIST_ITEM_ADDED = 'normal new';
    const CLASS_LIST_ITEM_DELETED = 'removed';
    const CLASS_LIST_ITEM_CHANGED = 'replacement';
    const CLASS_LIST_ITEM_NONE = 'normal';

    protected static $listTypes = array('ul', 'ol', 'dl');

    /**
     * List of tags that should be included when retrieving
     * text from a single list item that will be used in
     * matching logic (and only in matching logic).
     *
     * @see getRelevantNodeText()
     *
     * @var array
     */
    protected static $listContentTags = array(
        'h1','h2','h3','h4','h5','pre','div','br','hr','code',
        'input','form','img','span','a','i','b','strong','em',
        'font','big','del','tt','sub','sup','strike',
    );

    /**
     * @var LcsService
     */
    protected $lcsService;

    /**
     * @param string              $oldText
     * @param string              $newText
     * @param HtmlDiffConfig|null $config
     *
     * @return ListDiffLines
     */
    public static function create($oldText, $newText, HtmlDiffConfig $config = null)
Duplication: This method seems to be duplicated in your project.

Duplicated code is one of the most pungent code smells. If you need to duplicate the same code in three or more different places, we strongly encourage you to look into extracting the code into a single class or operation.

You can also find more detailed suggestions in the “Code” section of your repository.
    {
        $diff = new self($oldText, $newText);

        if (null !== $config) {
            $diff->setConfig($config);
        }

        return $diff;
    }

    /**
     * {@inheritDoc}
     */
    public function build()
    {
        $this->prepare();

        if ($this->hasDiffCache() && $this->getDiffCache()->contains($this->oldText, $this->newText)) {
            $this->content = $this->getDiffCache()->fetch($this->oldText, $this->newText);

            return $this->content;
        }

        $matchStrategy = new ListItemMatchStrategy($this->config->getMatchThreshold());
        $this->lcsService = new LcsService($matchStrategy);

        return $this->listByLines($this->oldText, $this->newText);
    }

    /**
     * @param string $old
     * @param string $new
     *
     * @return string
     */
    protected function listByLines($old, $new)
    {
        /* @var $newDom \simple_html_dom */
        $newDom = HtmlDomParser::str_get_html($new);
        /* @var $oldDom \simple_html_dom */
        $oldDom = HtmlDomParser::str_get_html($old);

        $newListNode = $this->findListNode($newDom);
        $oldListNode = $this->findListNode($oldDom);

        $operations = $this->getListItemOperations($oldListNode, $newListNode);
Bug: It seems like $oldListNode, defined by $this->findListNode($oldDom), can also be of type array<integer,object<simple_html_dom_node>> or null; however, Caxy\HtmlDiff\ListDiffLi...getListItemOperations() seems to accept only object<simple_html_dom_node>. Maybe add an additional type check?

If a method or function can return multiple different values, and unless you are sure that you can only receive a single value in this context, we recommend adding an additional type check:

/**
 * @return array|string
 */
function returnsDifferentValues($x) {
    if ($x) {
        return 'foo';
    }

    return array();
}

$x = returnsDifferentValues($y);
if (is_array($x)) {
    // $x is an array.
}

If this is a common case that PHP Analyzer should handle natively, please let us know by opening an issue.

Bug: It seems like $newListNode, defined by $this->findListNode($newDom), can also be of type array<integer,object<simple_html_dom_node>> or null; however, Caxy\HtmlDiff\ListDiffLi...getListItemOperations() seems to accept only object<simple_html_dom_node>. Maybe add an additional type check?

        return $this->processOperations($operations, $oldListNode, $newListNode);
Bug: It seems like $oldListNode, defined by $this->findListNode($oldDom), can also be of type array<integer,object<simple_html_dom_node>> or null; however, Caxy\HtmlDiff\ListDiffLines::processOperations() seems to accept only object<simple_html_dom_node>. Maybe add an additional type check?

Bug: It seems like $newListNode, defined by $this->findListNode($newDom), can also be of type array<integer,object<simple_html_dom_node>> or null; however, Caxy\HtmlDiff\ListDiffLines::processOperations() seems to accept only object<simple_html_dom_node>. Maybe add an additional type check?
    }

    /**
     * @param \simple_html_dom|\simple_html_dom_node $dom
     *
     * @return \simple_html_dom_node[]|\simple_html_dom_node|null
     */
    protected function findListNode($dom)
    {
        return $dom->find(implode(', ', static::$listTypes), 0);
    }

    /**
     * @param \simple_html_dom_node $oldListNode
     * @param \simple_html_dom_node $newListNode
     *
     * @return array|Operation[]
     */
    protected function getListItemOperations($oldListNode, $newListNode)
    {
        // Prepare arrays of list item content to use in LCS algorithm
        $oldListText = $this->getListTextArray($oldListNode);
        $newListText = $this->getListTextArray($newListNode);

        $lcsMatches = $this->lcsService->longestCommonSubsequence($oldListText, $newListText);

        $oldLength = count($oldListText);
        $newLength = count($newListText);

        $operations = array();
        $currentLineInOld = 0;
        $currentLineInNew = 0;
        $lcsMatches[$oldLength + 1] = $newLength + 1;
        foreach ($lcsMatches as $matchInOld => $matchInNew) {
            // No matching line in new list
            if ($matchInNew === 0) {
                continue;
            }

            $nextLineInOld = $currentLineInOld + 1;
            $nextLineInNew = $currentLineInNew + 1;

            if ($matchInNew > $nextLineInNew && $matchInOld > $nextLineInOld) {
                // Change
                $operations[] = new Operation(
                    Operation::CHANGED,
                    $nextLineInOld,
                    $matchInOld - 1,
                    $nextLineInNew,
                    $matchInNew - 1
                );
            } elseif ($matchInNew > $nextLineInNew && $matchInOld === $nextLineInOld) {
                // Add items before this
                $operations[] = new Operation(
                    Operation::ADDED,
                    $currentLineInOld,
                    $currentLineInOld,
                    $nextLineInNew,
                    $matchInNew - 1
                );
            } elseif ($matchInNew === $nextLineInNew && $matchInOld > $nextLineInOld) {
                // Delete items before this
                $operations[] = new Operation(
                    Operation::DELETED,
                    $nextLineInOld,
                    $matchInOld - 1,
                    $currentLineInNew,
                    $currentLineInNew
                );
            }

            $currentLineInNew = $matchInNew;
            $currentLineInOld = $matchInOld;
        }

        return $operations;
    }

    /**
     * @param \simple_html_dom_node $listNode
     *
     * @return array
     */
    protected function getListTextArray($listNode)
    {
        $output = array();
        foreach ($listNode->children() as $listItem) {
            $output[] = $this->getRelevantNodeText($listItem);
        }

        return $output;
    }

    /**
     * @param \simple_html_dom_node $node
     *
     * @return string
     */
    protected function getRelevantNodeText($node)
    {
        if (!$node->hasChildNodes()) {
            return $node->innertext();
        }

        $output = '';
        foreach ($node->nodes as $child) {
            /* @var $child \simple_html_dom_node */
            if (!$child->hasChildNodes()) {
                $output .= $child->outertext();
            } elseif (in_array($child->nodeName(), static::$listContentTags, true)) {
                $output .= sprintf('<%1$s>%2$s</%1$s>', $child->nodeName(), $this->getRelevantNodeText($child));
            }
        }

        return $output;
    }

    /**
     * @param \simple_html_dom_node $li
     *
     * @return string
     */
    protected function deleteListItem($li)
Duplication: This method seems to be duplicated in your project.
    {
        $this->addClassToNode($li, self::CLASS_LIST_ITEM_DELETED);
        $li->innertext = sprintf('<del>%s</del>', $li->innertext);

        return $li->outertext;
    }

    /**
     * @param \simple_html_dom_node $li
     * @param bool                  $replacement
     *
     * @return string
     */
    protected function addListItem($li, $replacement = false)
Duplication: This method seems to be duplicated in your project.
    {
        $this->addClassToNode($li, $replacement ? self::CLASS_LIST_ITEM_CHANGED : self::CLASS_LIST_ITEM_ADDED);
        $li->innertext = sprintf('<ins>%s</ins>', $li->innertext);

        return $li->outertext;
    }

    /**
     * @param Operation[]|array     $operations
     * @param \simple_html_dom_node $oldListNode
     * @param \simple_html_dom_node $newListNode
     *
     * @return string
     */
    protected function processOperations($operations, $oldListNode, $newListNode)
    {
        $output = '';

        $indexInOld = 0;
        $indexInNew = 0;
        $lastOperation = null;

        foreach ($operations as $operation) {
            $replaced = false;
            while ($operation->startInOld > ($operation->action === Operation::ADDED ? $indexInOld : $indexInOld + 1)) {
                $li = $oldListNode->children($indexInOld);
                $matchingLi = null;
                if ($operation->startInNew > ($operation->action === Operation::DELETED ? $indexInNew
                        : $indexInNew + 1)
                ) {
                    $matchingLi = $newListNode->children($indexInNew);
                }
                if (null !== $matchingLi) {
Duplication: This code seems to be duplicated across your project.
                    $htmlDiff = HtmlDiff::create($li->innertext, $matchingLi->innertext, $this->config);
                    $li->innertext = $htmlDiff->build();
                    $indexInNew++;
                }
                $class = self::CLASS_LIST_ITEM_NONE;

                if ($lastOperation === Operation::DELETED && !$replaced) {
                    $class = self::CLASS_LIST_ITEM_CHANGED;
                    $replaced = true;
                }
                $li->setAttribute('class', trim($li->getAttribute('class').' '.$class));

                $output .= $li->outertext;
                $indexInOld++;
            }

            switch ($operation->action) {
                case Operation::ADDED:
                    for ($i = $operation->startInNew; $i <= $operation->endInNew; $i++) {
Duplication: This code seems to be duplicated across your project.
                        $output .= $this->addListItem($newListNode->children($i - 1));
                    }
                    $indexInNew = $operation->endInNew;
                    break;

                case Operation::DELETED:
                    for ($i = $operation->startInOld; $i <= $operation->endInOld; $i++) {
Duplication: This code seems to be duplicated across your project.
                        $output .= $this->deleteListItem($oldListNode->children($i - 1));
                    }
                    $indexInOld = $operation->endInOld;
                    break;

                case Operation::CHANGED:
                    $changeDelta = 0;
                    for ($i = $operation->startInOld; $i <= $operation->endInOld; $i++) {
Duplication: This code seems to be duplicated across your project.
                        $output .= $this->deleteListItem($oldListNode->children($i - 1));
                        $changeDelta--;
                    }
                    for ($i = $operation->startInNew; $i <= $operation->endInNew; $i++) {
Duplication: This code seems to be duplicated across your project.
                        $output .= $this->addListItem($newListNode->children($i - 1), $changeDelta < 0);
                        $changeDelta++;
                    }
                    $indexInOld = $operation->endInOld;
                    $indexInNew = $operation->endInNew;
                    break;
            }

            $lastOperation = $operation->action;
        }

        $oldCount = count($oldListNode->children());
        $newCount = count($newListNode->children());
        while ($indexInOld < $oldCount) {
            $li = $oldListNode->children($indexInOld);
            $matchingLi = null;
            if ($indexInNew < $newCount) {
                $matchingLi = $newListNode->children($indexInNew);
            }
            if (null !== $matchingLi) {
Duplication: This code seems to be duplicated across your project.
                $htmlDiff = HtmlDiff::create($li->innertext(), $matchingLi->innertext(), $this->config);
                $li->innertext = $htmlDiff->build();
                $indexInNew++;
            }
            $class = self::CLASS_LIST_ITEM_NONE;

            if ($lastOperation === Operation::DELETED) {
                $class = self::CLASS_LIST_ITEM_CHANGED;
            }
            $li->setAttribute('class', trim($li->getAttribute('class').' '.$class));

            $output .= $li->outertext;
            $indexInOld++;
        }

        $newListNode->innertext = $output;
        $newListNode->setAttribute('class', trim($newListNode->getAttribute('class').' diff-list'));

        return $newListNode->outertext;
    }

    /**
     * @param \simple_html_dom_node $node
     * @param string                $class
     */
    protected function addClassToNode($node, $class)
    {
        $node->setAttribute('class', trim(sprintf('%s %s', $node->getAttribute('class'), $class)));
    }
}
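
The analyzer's warnings on listByLines() note that findListNode() may hand back null or an array where a single node is expected. One way to act on that, sketched under assumptions: requireSingleNode() is a hypothetical helper, stdClass stands in for \simple_html_dom_node so the example runs without the parser library, and throwing an exception is only one possible policy (returning the input text unchanged would be another).

```php
<?php

/**
 * Hypothetical guard: return $node unchanged when it is a single node object,
 * otherwise fail loudly instead of letting null or an array flow onward.
 */
function requireSingleNode($node, string $label)
{
    if (!is_object($node)) {
        throw new InvalidArgumentException(
            sprintf('No list element (ul, ol, dl) found in the %s text.', $label)
        );
    }

    return $node;
}

// stdClass stands in for \simple_html_dom_node in this sketch.
$found = requireSingleNode(new stdClass(), 'old');

try {
    requireSingleNode(null, 'new');
} catch (InvalidArgumentException $e) {
    echo $e->getMessage(), PHP_EOL;
    // prints "No list element (ul, ol, dl) found in the new text."
}
```

In listByLines(), such a guard would wrap each findListNode() call before the results reach getListItemOperations() and processOperations().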