Total Complexity | 45 |
Total Lines | 187 |
Duplicated Lines | 0 % |
Changes | 0 |
Complex classes like Diff often do many different things. To break such a class down, first identify a cohesive component within it. A common way to find one is to look for fields and methods that share a prefix or suffix.
Once you have determined which fields and methods belong together, you can apply the Extract Class refactoring. If the component makes sense as a subclass, Extract Subclass is also a candidate, and is often faster.
While breaking up the class, it is also worth analyzing how other classes use Diff and, based on those observations, applying Extract Interface.
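As a rough illustration of Extract Class and Extract Interface, the markup step that turns edits into `<ins>`/`<del>` HTML could move into its own small class. The sketch below is hypothetical: `SimpleEdit`, `EditRenderer`, and `HtmlEditRenderer` are names invented for this example and are not part of the library. Only the `type`, `orig`, and `final` properties are modelled on the edit objects used in the listing.

```php
<?php

// Hypothetical stand-in for the edit objects produced by the parent Diff class.
final class SimpleEdit
{
    public $type;
    public $orig;
    public $final;

    public function __construct(string $type, ?array $orig, ?array $final)
    {
        $this->type = $type;
        $this->orig = $orig;
        $this->final = $final;
    }
}

// Extract Interface: callers depend on this contract rather than on Diff itself.
interface EditRenderer
{
    public function render(array $edits): string;
}

// Extract Class: the <ins>/<del> markup logic lives in one cohesive place.
final class HtmlEditRenderer implements EditRenderer
{
    public function render(array $edits): string
    {
        $content = '';
        foreach ($edits as $edit) {
            // A missing side (null) is treated as an empty list of chunks.
            $orig = implode(' ', $edit->orig ?? []);
            $final = implode(' ', $edit->final ?? []);

            switch ($edit->type) {
                case 'copy':
                    $content .= ' ' . $orig . ' ';
                    break;
                case 'change':
                    $content .= ' <ins>' . $final . '</ins> ';
                    $content .= ' <del>' . $orig . '</del> ';
                    break;
                case 'add':
                    $content .= ' <ins>' . $final . '</ins> ';
                    break;
                case 'delete':
                    $content .= ' <del>' . $orig . '</del> ';
                    break;
            }
        }
        return $content;
    }
}

// Usage: render a copied chunk followed by an added chunk.
$renderer = new HtmlEditRenderer();
$html = $renderer->render([
    new SimpleEdit('copy', ['a'], ['a']),
    new SimpleEdit('add', null, ['b']),
]);
```

With this split, the diffing class would only need to hand its edits to an `EditRenderer`, so the markup format could be changed or tested without touching the chunking logic.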
<?php

// ... (lines not shown)

class Diff extends \Diff
{
    public static $html_cleaner_class = null;

    /**
     * Attempt to clean invalid HTML, which messes up diffs.
     * This cleans code if possible, using an instance of HTMLCleaner
     *
     * NB: By default, only extremely simple tidying is performed,
     * by passing through DomDocument::loadHTML and saveXML
     *
     * @param string $content HTML content
     * @param HTMLCleaner $cleaner Optional instance of a HTMLCleaner class to
     *    use, overriding self::$html_cleaner_class
     * @return mixed|string
     */
    public static function cleanHTML($content, $cleaner = null)
    // ... (body not shown)
    }

    /**
     * @param string $from
     * @param string $to
     * @param bool $escape
     * @return string
     */
    public static function compareHTML($from, $to, $escape = false)
    {
        // First split up the content into words and tags
        $set1 = self::getHTMLChunks($from);
        $set2 = self::getHTMLChunks($to);

        // Diff that
        $diff = new Diff($set1, $set2);

        $tagStack[1] = $tagStack[2] = 0;
        $rechunked[1] = $rechunked[2] = array();

        // Go through everything, converting edited tags (and their content) into single chunks. Otherwise
        // the generated HTML gets crusty
        foreach ($diff->edits as $edit) {
            $lookForTag = false;
            $stuffFor = [];
            switch ($edit->type) {
                case 'copy':
                    $lookForTag = false;
                    $stuffFor[1] = $edit->orig;
                    $stuffFor[2] = $edit->orig;
                    break;

                case 'change':
                    $lookForTag = true;
                    $stuffFor[1] = $edit->orig;
                    $stuffFor[2] = $edit->final;
                    break;

                case 'add':
                    $lookForTag = true;
                    $stuffFor[1] = null;
                    $stuffFor[2] = $edit->final;
                    break;

                case 'delete':
                    $lookForTag = true;
                    $stuffFor[1] = $edit->orig;
                    $stuffFor[2] = null;
                    break;
            }

            foreach ($stuffFor as $listName => $chunks) {
                if ($chunks) {
                    foreach ($chunks as $item) {
                        // $tagStack > 0 indicates that we should be tag-building
                        if ($tagStack[$listName]) {
                            $rechunked[$listName][sizeof($rechunked[$listName])-1] .= ' ' . $item;
                        } else {
                            $rechunked[$listName][] = $item;
                        }

                        if ($lookForTag
                            && !$tagStack[$listName]
                            && isset($item[0])
                            && $item[0] == "<"
                            && substr($item, 0, 2) != "</"
                        ) {
                            $tagStack[$listName] = 1;
                        } elseif ($tagStack[$listName]) {
                            if (substr($item, 0, 2) == "</") {
                                $tagStack[$listName]--;
                            } elseif (isset($item[0]) && $item[0] == "<") {
                                $tagStack[$listName]++;
                            }
                        }
                    }
                }
            }
        }

        // Diff the re-chunked data, turning it into marked up HTML
        $diff = new Diff($rechunked[1], $rechunked[2]);
        $content = '';
        foreach ($diff->edits as $edit) {
            $orig = ($escape) ? Convert::raw2xml($edit->orig) : $edit->orig;
            $final = ($escape) ? Convert::raw2xml($edit->final) : $edit->final;

            switch ($edit->type) {
                case 'copy':
                    $content .= " " . implode(" ", $orig) . " ";
                    break;

                case 'change':
                    $content .= " <ins>" . implode(" ", $final) . "</ins> ";
                    $content .= " <del>" . implode(" ", $orig) . "</del> ";
                    break;

                case 'add':
                    $content .= " <ins>" . implode(" ", $final) . "</ins> ";
                    break;

                case 'delete':
                    $content .= " <del>" . implode(" ", $orig) . "</del> ";
                    break;
            }
        }

        return self::cleanHTML($content);
    }

    /**
     * @param string|bool|array $content If passed as an array, values will be concatenated with a comma.
     * @return array
     */
    public static function getHTMLChunks($content)
    // ... (body not shown)
    }
}
The PSR-1: Basic Coding Standard recommends that a file should either declare new symbols (classes, functions, constants, and so on) or execute logic with side effects, but not both. Side effects are anything that executes logic not directly related to declaring symbols, for example printing output, changing ini settings, or writing to a file.
The idea behind this recommendation is that merely auto-loading a class should not change the state of an application. It also promotes a cleaner style of programming and makes your code less prone to errors, because the logic is not spread out all over the place.
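A minimal sketch of that separation follows. It is shown as a single listing only for brevity, so the listing itself combines both roles; in a real project the two parts would live in separate files. The `Greeter` class and both file names are hypothetical.

```php
<?php

// --- Part 1: would live in its own file, e.g. src/Greeter.php (hypothetical) ---
// Declares a symbol only. Loading this part changes no state and prints nothing.
final class Greeter
{
    public function greet(string $name): string
    {
        return 'Hello, ' . $name;
    }
}

// --- Part 2: would live in an entry script, e.g. bin/greet.php (hypothetical) ---
// Side effects only: changes an ini setting and prints output.
ini_set('display_errors', '1');
echo (new Greeter())->greet('world'), PHP_EOL;
```

Keeping the declaration separate means an autoloader can load `Greeter` at any point without triggering the `ini_set` call or the output.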
To learn more about PSR-1, see the PHP-FIG site.