Completed
Push — authenticator-refactor ( 16f104...61b037 )
by Simon
06:52
created

Diff   B

Complexity

Total Complexity 45

Size/Duplication

Total Lines 190
Duplicated Lines 5.26 %

Coupling/Cohesion

Components 1
Dependencies 3

Importance

Changes 0
Metric Value
dl 10
loc 190
rs 8.3673
c 0
b 0
f 0
wmc 45
lcom 1
cbo 3

3 Methods

Rating   Name   Duplication   Size   Complexity  
B cleanHTML() 0 24 5
F compareHTML() 10 101 26
C getHTMLChunks() 0 36 14

Duplicated Code

Duplicate code is one of the most pungent code smells. A rule that is often used is to re-structure code once it is duplicated in three or more places.

Common duplication problems, and corresponding solutions are:

Complex Class

Tip: Before tackling complexity, make sure that you eliminate any duplication first. This can often reduce the size of classes significantly.

Complex classes like Diff often do a lot of different things. To break such a class down, we need to identify a cohesive component within that class. A common approach to finding such a component is to look for fields/methods that share the same prefixes or suffixes. You can also have a look at the cohesion graph to spot any unconnected or weakly connected components.

Once you have determined the fields that belong together, you can apply the Extract Class refactoring. If the component makes sense as a sub-class, Extract Subclass is also a candidate, and is often faster.

While breaking up the class, it is a good idea to analyze how other classes use Diff, and based on these observations, apply Extract Interface, too.
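
As a rough illustration of Extract Class, the sketch below moves a chunking responsibility into its own collaborator and lets the original entry point delegate to it. The names HtmlTokenizer and DiffFacade are hypothetical and do not exist in the codebase, and the tokenizer is a simplified stand-in rather than the logic of getHTMLChunks().

```php
<?php
// Hypothetical names throughout; this is a sketch, not the project's design.
final class HtmlTokenizer
{
    /** Split markup into whitespace-separated chunks (simplified stand-in). */
    public function chunks(string $content): array
    {
        // Pad angle brackets so tags separate from the surrounding words.
        $content = str_replace(['<', '>'], [' <', '> '], $content);

        return preg_split('/\s+/', trim($content), -1, PREG_SPLIT_NO_EMPTY);
    }
}

final class DiffFacade
{
    private $tokenizer;

    public function __construct(HtmlTokenizer $tokenizer)
    {
        $this->tokenizer = $tokenizer;
    }

    /** The public entry point stays in place and delegates to the extracted class. */
    public function getHTMLChunks(string $content): array
    {
        return $this->tokenizer->chunks($content);
    }
}

$facade = new DiffFacade(new HtmlTokenizer());
$chunks = $facade->getHTMLChunks('<b>hi</b>');
```

Callers keep using the same method name, while the extracted class can be tested and measured on its own.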

<?php

namespace SilverStripe\View\Parsers;

use InvalidArgumentException;
use SilverStripe\Core\Convert;
use SilverStripe\Core\Injector\Injector;

require_once 'difflib/difflib.php';

/**
 * Class representing a 'diff' between two sequences of strings.
 */
class Diff extends \Diff
{
    public static $html_cleaner_class = null;

    /**
     *  Attempt to clean invalid HTML, which messes up diffs.
     *  This cleans code if possible, using an instance of HTMLCleaner
     *
     *  NB: By default, only extremely simple tidying is performed,
     *  by passing through DomDocument::loadHTML and saveXML
     *
     * @param string $content HTML content
     * @param HTMLCleaner $cleaner Optional instance of a HTMLCleaner class to
     *    use, overriding self::$html_cleaner_class
     * @return mixed|string
     */
    public static function cleanHTML($content, $cleaner = null)
    {
        if (!$cleaner) {
            if (self::$html_cleaner_class && class_exists(self::$html_cleaner_class)) {
                $cleaner = Injector::inst()->create(self::$html_cleaner_class);
Bug: It seems like you are coding against a concrete implementation rather than the interface Psr\Container\ContainerInterface, as the method create() exists only in the following implementation of that interface: SilverStripe\Core\Injector\Injector.

Let’s take a look at an example:

interface User
{
    /** @return string */
    public function getPassword();
}

class MyUser implements User
{
    public function getPassword()
    {
        // return something
    }

    public function getDisplayName()
    {
        // return some name.
    }
}

class AuthSystem
{
    public function authenticate(User $user)
    {
        $this->logger->info(sprintf('Authenticating %s.', $user->getDisplayName()));
        // do something.
    }
}

In the above example, the authenticate() method works fine as long as you just pass instances of MyUser. However, if you now also want to pass a different implementation of User which does not have a getDisplayName() method, the code will break.

Available Fixes

  1. Change the type-hint for the parameter:

    class AuthSystem
    {
        public function authenticate(MyUser $user) { /* ... */ }
    }
    
  2. Add an additional type-check:

    class AuthSystem
    {
        public function authenticate(User $user)
        {
            if ($user instanceof MyUser) {
                $this->logger->info(/** ... */);
            }
    
            // or alternatively
            if ( ! $user instanceof MyUser) {
                throw new \LogicException(
                    '$user must be an instance of MyUser, '
                   .'other instances are not supported.'
                );
            }
    
        }
    }
    
Note: PHP Analyzer uses reverse abstract interpretation to narrow down the types inside the if block in such a case.

  3. Add the method to the interface:

    interface User
    {
        /** @return string */
        public function getPassword();
    
        /** @return string */
        public function getDisplayName();
    }
    
            } else {
                //load cleaner if the dependent class is available
                $cleaner = HTMLCleaner::inst();
            }
        }

        if ($cleaner) {
            $content = $cleaner->cleanHTML($content);
        } else {
            // At most basic level of cleaning, use DOMDocument to save valid XML.
            $doc = Injector::inst()->create('HTMLValue', $content);
Bug: It seems like you are coding against a concrete implementation rather than the interface Psr\Container\ContainerInterface, as the method create() exists only in the following implementation of that interface: SilverStripe\Core\Injector\Injector.

            $content = $doc->getContent();
        }

        // Remove empty <ins /> and <del /> tags because browsers hate them
        $content = preg_replace('/<(ins|del)[^>]*\/>/', '', $content);

        return $content;
    }

    /**
     * @param string $from
     * @param string $to
     * @param bool $escape
     * @return string
     */
    public static function compareHTML($from, $to, $escape = false)
    {
        // First split up the content into words and tags
        $set1 = self::getHTMLChunks($from);
        $set2 = self::getHTMLChunks($to);

        // Diff that
        $diff = new Diff($set1, $set2);

        $tagStack[1] = $tagStack[2] = 0;
Coding Style / Comprehensibility: $tagStack was never initialized. Although not strictly required by PHP, it is generally good practice to add $tagStack = array(); beforehand.

Adding an explicit array definition is generally preferable to implicit array definition as it guarantees a stable state of the code.

Let’s take a look at an example:

foreach ($collection as $item) {
    $myArray['foo'] = $item->getFoo();

    if ($item->hasBar()) {
        $myArray['bar'] = $item->getBar();
    }

    // do something with $myArray
}

As you can see in this example, the array $myArray is initialized the first time when the foreach loop is entered. You can also see that the value of the bar key is only written conditionally; thus, its value might result from a previous iteration.

This might or might not be intended. To make your intention clear, make your code more readable, and avoid accidental bugs, we recommend adding an explicit initialization $myArray = array() either outside or inside the foreach loop.
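
Applied to the two variables flagged in compareHTML(), an explicit initialization could look like the sketch below. The $lists input is a hypothetical stand-in for the two chunk lists; only the two initialization lines are the point of the example.

```php
<?php
// Hypothetical stand-in for the "from" (1) and "to" (2) chunk lists.
$lists = [
    1 => ['<p>', 'old', '</p>'],
    2 => ['<p>', 'new', '</p>'],
];

// Explicit initialization: both arrays exist, with known keys, before any write.
$tagStack  = [1 => 0, 2 => 0];
$rechunked = [1 => [], 2 => []];

foreach ($lists as $listName => $chunks) {
    foreach ($chunks as $item) {
        // Appending is now safe: the target list is guaranteed to be an array.
        $rechunked[$listName][] = $item;
    }
}
```

Declaring both keys up front also documents that exactly two lists are tracked.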

        $rechunked[1] = $rechunked[2] = array();
Coding Style / Comprehensibility: $rechunked was never initialized. Although not strictly required by PHP, it is generally good practice to add $rechunked = array(); beforehand.

        // Go through everything, converting edited tags (and their content) into single chunks.  Otherwise
        // the generated HTML gets crusty
        foreach ($diff->edits as $edit) {
            $lookForTag = false;
            $stuffFor = [];
            switch ($edit->type) {
                case 'copy':
                    $lookForTag = false;
                    $stuffFor[1] = $edit->orig;
                    $stuffFor[2] = $edit->orig;
                    break;

                case 'change':
Duplication: This code seems to be duplicated across your project.

Duplicated code is one of the most pungent code smells. If you need to duplicate the same code in three or more different places, we strongly encourage you to look into extracting the code into a single class or operation.

You can also find more detailed suggestions in the “Code” section of your repository.
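
The report flags duplication across the project; within this switch alone, the four cases also share one shape (set a flag, pick a source for each list). One possible way to fold them into a single operation is a lookup table, sketched below. The helper name stuffForEdit() is hypothetical, and the edit objects are assumed to expose type, orig and final as in the listing.

```php
<?php
/**
 * Hypothetical helper: returns [$lookForTag, $stuffFor] for one edit.
 * Assumes $edit has ->type, ->orig and ->final properties.
 */
function stuffForEdit(object $edit): array
{
    // edit type => [look for tags?, property for list 1, property for list 2]
    $table = [
        'copy'   => [false, 'orig', 'orig'],
        'change' => [true,  'orig', 'final'],
        'add'    => [true,  null,   'final'],
        'delete' => [true,  'orig', null],
    ];

    if (!isset($table[$edit->type])) {
        // Unknown type: same result as falling through the switch.
        return [false, []];
    }

    [$lookForTag, $from, $to] = $table[$edit->type];

    return [$lookForTag, [
        1 => $from === null ? null : $edit->$from,
        2 => $to === null ? null : $edit->$to,
    ]];
}

// Usage with a stub edit object.
$change = (object) ['type' => 'change', 'orig' => ['a'], 'final' => ['b']];
[$lookForTag, $stuffFor] = stuffForEdit($change);
```

Whether this reads better than the switch is a judgment call; the table mainly helps if the same mapping is needed in more than one place.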

                    $lookForTag = true;
                    $stuffFor[1] = $edit->orig;
                    $stuffFor[2] = $edit->final;
                    break;

                case 'add':
Duplication: This code seems to be duplicated across your project.

                    $lookForTag = true;
                    $stuffFor[1] = null;
                    $stuffFor[2] = $edit->final;
                    break;

                case 'delete':
                    $lookForTag = true;
                    $stuffFor[1] = $edit->orig;
                    $stuffFor[2] = null;
                    break;
            }

            foreach ($stuffFor as $listName => $chunks) {
                if ($chunks) {
                    foreach ($chunks as $item) {
                        // $tagStack > 0 indicates that we should be tag-building
                        if ($tagStack[$listName]) {
                            $rechunked[$listName][sizeof($rechunked[$listName])-1] .= ' ' . $item;
                        } else {
                            $rechunked[$listName][] = $item;
                        }

                        if ($lookForTag
                            && !$tagStack[$listName]
                            && isset($item[0])
                            && $item[0] == "<"
                            && substr($item, 0, 2) != "</"
                        ) {
                            $tagStack[$listName] = 1;
                        } elseif ($tagStack[$listName]) {
                            if (substr($item, 0, 2) == "</") {
                                $tagStack[$listName]--;
                            } elseif (isset($item[0]) && $item[0] == "<") {
                                $tagStack[$listName]++;
                            }
                        }
                    }
                }
            }
        }

        // Diff the re-chunked data, turning it into marked-up HTML
        $diff = new Diff($rechunked[1], $rechunked[2]);
        $content = '';
        foreach ($diff->edits as $edit) {
            $orig = ($escape) ? Convert::raw2xml($edit->orig) : $edit->orig;
            $final = ($escape) ? Convert::raw2xml($edit->final) : $edit->final;

            switch ($edit->type) {
                case 'copy':
                    $content .= " " . implode(" ", $orig) . " ";
                    break;

                case 'change':
                    $content .= " <ins>" . implode(" ", $final) . "</ins> ";
                    $content .= " <del>" . implode(" ", $orig) . "</del> ";
                    break;

                case 'add':
                    $content .= " <ins>" . implode(" ", $final) . "</ins> ";
                    break;

                case 'delete':
                    $content .= " <del>" . implode(" ", $orig) . "</del> ";
                    break;
            }
        }

        return self::cleanHTML($content);
    }

    /**
     * @param string|bool|array $content If passed as an array, values will be concatenated with a comma.
     * @return array
     */
    public static function getHTMLChunks($content)
    {
        if ($content && !is_string($content) && !is_array($content) && !is_numeric($content) && !is_bool($content)) {
            throw new InvalidArgumentException('$content parameter needs to be a string or array');
        }
        if (is_bool($content)) {
            // Convert boolean to strings
            $content = $content ? "true" : "false";
        }
        if (is_array($content)) {
            // Convert array to CSV
            $content = implode(',', $content);
        }

        $content = str_replace(array("&nbsp;", "<", ">"), array(" "," <", "> "), $content);
        $candidateChunks = preg_split("/[\t\r\n ]+/", $content);
        $chunks = [];
        while ($chunk = each($candidateChunks)) {
            $item = $chunk['value'];
            if (isset($item[0]) && $item[0] == "<") {
                $newChunk = $item;
                while ($item[strlen($item)-1] != ">") {
                    $chunk = each($candidateChunks);
                    if ($chunk === false) {
                        break;
                    }
                    $item = $chunk['value'];
                    $newChunk .= ' ' . $item;
                }
                $chunks[] = $newChunk;
            } else {
                $chunks[] = $item;
            }
        }
        return $chunks;
    }
}
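
One point the report does not flag: getHTMLChunks() relies on each(), which is deprecated as of PHP 7.2 and removed in PHP 8.0. The sketch below shows one way the same chunking loop might be written with an index instead. The function name htmlChunks() is hypothetical, and the sketch covers string input only (the bool and array conversions above are left out).

```php
<?php
/**
 * Sketch of the chunking loop without each(). String input only;
 * intended to mirror the loop in getHTMLChunks() above.
 */
function htmlChunks(string $content): array
{
    // Same normalization as the listing: pad tags so they split from words.
    $content = str_replace(['&nbsp;', '<', '>'], [' ', ' <', '> '], $content);
    $candidates = preg_split("/[\t\r\n ]+/", $content);
    $count = count($candidates);
    $chunks = [];

    for ($i = 0; $i < $count; $i++) {
        $item = $candidates[$i];
        if (isset($item[0]) && $item[0] === '<') {
            // Glue following words onto the tag until one ends with ">".
            $newChunk = $item;
            while (substr($item, -1) !== '>' && $i + 1 < $count) {
                $item = $candidates[++$i];
                $newChunk .= ' ' . $item;
            }
            $chunks[] = $newChunk;
        } else {
            $chunks[] = $item;
        }
    }

    return $chunks;
}

$chunks = htmlChunks('<p class="x">Hello world</p>');
```

As in the listing, the empty strings that preg_split() produces at the start and end of the padded input are kept as chunks.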