Diff::compare()   C

Complexity

Conditions 12
Paths 64

Size

Total Lines 51
Code Lines 30

Duplication

Lines 0
Ratio 0 %

Importance

Changes 1
Bugs 0
Features 0

Metric  Value
eloc    30
dl      0
loc     51
c       1
b       0
f       0
rs      6.9666
cc      12
nc      64
nop     3

How to fix

Long Method

Small methods make your code easier to understand, in particular if combined with a good name. Besides, if your method is small, finding a good name is usually much easier.

For example, if you find yourself adding comments to a method's body, this is usually a good sign to extract the commented part to a new method, and use the comment as a starting point when coming up with a good name for this new method.
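
As a rough sketch of that advice applied to the method analyzed here, the blocks commented "skip any common prefix" and "skip any common suffix" in Diff::compare() could each become a small function named after its comment. The names skipCommonPrefix and skipCommonSuffix are hypothetical and are not part of the library; the loop bodies are taken from the listing on this page.

```php
<?php

// Hypothetical helper extracted from the "skip any common prefix" block.
// Returns the first index at which the two sequences differ.
function skipCommonPrefix($sequence1, $sequence2, $end1, $end2){
  $start = 0;
  while ($start <= $end1 && $start <= $end2
      && $sequence1[$start] == $sequence2[$start]){
    $start ++;
  }
  return $start;
}

// Hypothetical helper extracted from the "skip any common suffix" block.
// Returns the adjusted end indices as array($end1, $end2).
function skipCommonSuffix($sequence1, $sequence2, $start, $end1, $end2){
  while ($end1 >= $start && $end2 >= $start
      && $sequence1[$end1] == $sequence2[$end2]){
    $end1 --;
    $end2 --;
  }
  return array($end1, $end2);
}

// Usage: the sequences share one leading and one trailing item.
$a = array('one', 'two', 'three', 'four');
$b = array('one', 'TWO', 'four');
$start = skipCommonPrefix($a, $b, count($a) - 1, count($b) - 1);
list($end1, $end2) =
    skipCommonSuffix($a, $b, $start, count($a) - 1, count($b) - 1);
```

With helpers like these, the body of compare() would shrink to a sequence of named steps, which is the effect the advice above is aiming for.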

Commonly applied refactorings include:

<?php

/*

class.Diff.php

A class containing a diff implementation

Created by Kate Morley - http://iamkate.com/ - and released under the terms of
the CC0 1.0 Universal legal code:

http://creativecommons.org/publicdomain/zero/1.0/legalcode

*/

// A class containing functions for computing diffs and formatting the output.
/**
 * @codeCoverageIgnore
 * Third party library
 */
class Diff{

  // define the constants
  const UNMODIFIED = 0;
  const DELETED    = 1;
  const INSERTED   = 2;

  /* Returns the diff for two strings. The return value is an array, each of
   * whose values is an array containing two values: a line (or character, if
   * $compareCharacters is true), and one of the constants DIFF::UNMODIFIED (the
   * line or character is in both strings), DIFF::DELETED (the line or character
   * is only in the first string), and DIFF::INSERTED (the line or character is
   * only in the second string). The parameters are:
   *
   * $string1           - the first string
   * $string2           - the second string
   * $compareCharacters - true to compare characters, and false to compare
   *                      lines; this optional parameter defaults to false
   */
  public static function compare(
      $string1, $string2, $compareCharacters = false){

    // initialise the sequences and comparison start and end positions
    $start = 0;
    if ($compareCharacters){
      $sequence1 = $string1;
      $sequence2 = $string2;
      $end1 = strlen($string1) - 1;
      $end2 = strlen($string2) - 1;
    }else{
      $sequence1 = preg_split('/\R/', $string1);
      $sequence2 = preg_split('/\R/', $string2);
      $end1 = count($sequence1) - 1;
      // Issue (Bug): It seems like $sequence1 can also be of type false;
      // however, parameter $var of count() only seems to accept
      // Countable|array. Maybe add an additional type check? If this is a
      // false positive, the issue can be ignored with the ignore-type
      // annotation:
      //   $end1 = count(/** @scrutinizer ignore-type */ $sequence1) - 1;
      $end2 = count($sequence2) - 1;
    }

    // skip any common prefix
    while ($start <= $end1 && $start <= $end2
        && $sequence1[$start] == $sequence2[$start]){
      $start ++;
    }

    // skip any common suffix
    while ($end1 >= $start && $end2 >= $start
        && $sequence1[$end1] == $sequence2[$end2]){
      $end1 --;
      $end2 --;
    }

    // compute the table of longest common subsequence lengths
    $table = self::computeTable($sequence1, $sequence2, $start, $end1, $end2);

    // generate the partial diff
    $partialDiff =
        self::generatePartialDiff($table, $sequence1, $sequence2, $start);

    // generate the full diff
    $diff = array();
    for ($index = 0; $index < $start; $index ++){
      $diff[] = array($sequence1[$index], self::UNMODIFIED);
    }
    while (count($partialDiff) > 0) $diff[] = array_pop($partialDiff);
    for ($index = $end1 + 1;
        $index < ($compareCharacters ? strlen($sequence1) : count($sequence1));
        $index ++){
      $diff[] = array($sequence1[$index], self::UNMODIFIED);
    }

    // return the diff
    return $diff;

  }

  /* Returns the diff for two files. The parameters are:
   *
   * $file1             - the path to the first file
   * $file2             - the path to the second file
   * $compareCharacters - true to compare characters, and false to compare
   *                      lines; this optional parameter defaults to false
   */
  public static function compareFiles(
      $file1, $file2, $compareCharacters = false){

    // return the diff of the files
    return self::compare(
        file_get_contents($file1),
        file_get_contents($file2),
        $compareCharacters);

  }

  /* Returns the table of longest common subsequence lengths for the specified
   * sequences. The parameters are:
   *
   * $sequence1 - the first sequence
   * $sequence2 - the second sequence
   * $start     - the starting index
   * $end1      - the ending index for the first sequence
   * $end2      - the ending index for the second sequence
   */
  private static function computeTable(
      $sequence1, $sequence2, $start, $end1, $end2){

    // determine the lengths to be compared
    $length1 = $end1 - $start + 1;
    $length2 = $end2 - $start + 1;

    // initialise the table
    $table = array(array_fill(0, $length2 + 1, 0));

    // loop over the rows
    for ($index1 = 1; $index1 <= $length1; $index1 ++){

      // create the new row
      $table[$index1] = array(0);

      // loop over the columns
      for ($index2 = 1; $index2 <= $length2; $index2 ++){

        // store the longest common subsequence length
        if ($sequence1[$index1 + $start - 1]
            == $sequence2[$index2 + $start - 1]){
          $table[$index1][$index2] = $table[$index1 - 1][$index2 - 1] + 1;
        }else{
          $table[$index1][$index2] =
              max($table[$index1 - 1][$index2], $table[$index1][$index2 - 1]);
        }

      }
    }

    // return the table
    return $table;

  }

  /* Returns the partial diff for the specified sequences, in reverse order.
   * The parameters are:
   *
   * $table     - the table returned by the computeTable function
   * $sequence1 - the first sequence
   * $sequence2 - the second sequence
   * $start     - the starting index
   */
  private static function generatePartialDiff(
      $table, $sequence1, $sequence2, $start){

    // initialise the diff
    $diff = array();

    // initialise the indices
    $index1 = count($table) - 1;
    $index2 = count($table[0]) - 1;

    // loop until there are no items remaining in either sequence
    while ($index1 > 0 || $index2 > 0){

      // check what has happened to the items at these indices
      if ($index1 > 0 && $index2 > 0
          && $sequence1[$index1 + $start - 1]
              == $sequence2[$index2 + $start - 1]){

        // update the diff and the indices
        $diff[] = array($sequence1[$index1 + $start - 1], self::UNMODIFIED);
        $index1 --;
        $index2 --;

      }elseif ($index2 > 0
          && $table[$index1][$index2] == $table[$index1][$index2 - 1]){

        // update the diff and the indices
        $diff[] = array($sequence2[$index2 + $start - 1], self::INSERTED);
        $index2 --;

      }else{

        // update the diff and the indices
        $diff[] = array($sequence1[$index1 + $start - 1], self::DELETED);
        $index1 --;

      }

    }

    // return the diff
    return $diff;

  }

  /* Returns a diff as a string, where unmodified lines are prefixed by '  ',
   * deletions are prefixed by '- ', and insertions are prefixed by '+ '. The
   * parameters are:
   *
   * $diff      - the diff array
   * $separator - the separator between lines; this optional parameter defaults
   *              to "\n"
   */
  public static function toString($diff, $separator = "\n"){

    // initialise the string
    $string = '';

    // loop over the lines in the diff
    foreach ($diff as $line){

      // extend the string with the line
      switch ($line[1]){
        case self::UNMODIFIED : $string .= '  ' . $line[0];break;
        case self::DELETED    : $string .= '- ' . $line[0];break;
        case self::INSERTED   : $string .= '+ ' . $line[0];break;
      }

      // extend the string with the separator
      $string .= $separator;

    }

    // return the string
    return $string;

  }

  /* Returns a diff as an HTML string, where unmodified lines are contained
   * within 'span' elements, deletions are contained within 'del' elements, and
   * insertions are contained within 'ins' elements. The parameters are:
   *
   * $diff      - the diff array
   * $separator - the separator between lines; this optional parameter defaults
   *              to '<br>'
   */
  public static function toHTML($diff, $separator = '<br>'){

    // initialise the HTML
    $html = '';

    // loop over the lines in the diff
    foreach ($diff as $line){

      // extend the HTML with the line
      switch ($line[1]){
        case self::UNMODIFIED : $element = 'span'; break;
        case self::DELETED    : $element = 'del';  break;
        case self::INSERTED   : $element = 'ins';  break;
      }
      // Issue (Comprehensibility, Best Practice): The variable $element does
      // not seem to be defined for all execution paths leading up to this
      // point.
      $html .=
          '<' . $element . '>'
          . htmlspecialchars($line[0])
          . '</' . $element . '>';

      // extend the HTML with the separator
      $html .= $separator;

    }

    // return the HTML
    return $html;

  }

  /* Returns a diff as an HTML table. The parameters are:
   *
   * $diff        - the diff array
   * $indentation - indentation to add to every line of the generated HTML; this
   *                optional parameter defaults to ''
   * $separator   - the separator between lines; this optional parameter
   *                defaults to '<br>'
   */
  public static function toTable($diff, $indentation = '', $separator = '<br>'){

    // initialise the HTML
    $html = $indentation . "<table class=\"diff\">\n";

    // loop over the lines in the diff
    $index = 0;
    while ($index < count($diff)){

      // determine the line type
      switch ($diff[$index][1]){

        // display the content on the left and right
        case self::UNMODIFIED:
          $leftCell =
              self::getCellContent(
                  $diff, $indentation, $separator, $index, self::UNMODIFIED);
          $rightCell = $leftCell;
          break;

        // display the deleted on the left and inserted content on the right
        case self::DELETED:
          $leftCell =
              self::getCellContent(
                  $diff, $indentation, $separator, $index, self::DELETED);
          $rightCell =
              self::getCellContent(
                  $diff, $indentation, $separator, $index, self::INSERTED);
          break;

        // display the inserted content on the right
        case self::INSERTED:
          $leftCell = '';
          $rightCell =
              self::getCellContent(
                  $diff, $indentation, $separator, $index, self::INSERTED);
          break;

      }

      // extend the HTML with the new row
      // Issue (Comprehensibility, Best Practice): The variable $leftCell does
      // not seem to be defined for all execution paths leading up to this
      // point.
      // Issue (Comprehensibility, Best Practice): The variable $rightCell does
      // not seem to be defined for all execution paths leading up to this
      // point.
      $html .=
          $indentation
          . "  <tr>\n"
          . $indentation
          . '    <td class="diff'
          . ($leftCell == $rightCell
              ? 'Unmodified'
              : ($leftCell == '' ? 'Blank' : 'Deleted'))
          . '">'
          . $leftCell
          . "</td>\n"
          . $indentation
          . '    <td class="diff'
          . ($leftCell == $rightCell
              ? 'Unmodified'
              : ($rightCell == '' ? 'Blank' : 'Inserted'))
          . '">'
          . $rightCell
          . "</td>\n"
          . $indentation
          . "  </tr>\n";

    }

    // return the HTML
    return $html . $indentation . "</table>\n";

  }

  /* Returns the content of the cell, for use in the toTable function. The
   * parameters are:
   *
   * $diff        - the diff array
   * $indentation - indentation to add to every line of the generated HTML
   * $separator   - the separator between lines
   * $index       - the current index, passed by reference
   * $type        - the type of line
   */
  private static function getCellContent(
      $diff, $indentation, $separator, &$index, $type){
    // Issue (Unused Code): The parameter $indentation is not used and could be
    // removed. This check looks for parameters that have been defined for a
    // function or method, but which are not used in the method body. If this
    // is a false positive, the issue can be ignored with the ignore-unused
    // annotation:
    //   $diff, /** @scrutinizer ignore-unused */ $indentation, $separator, &$index, $type){

    // initialise the HTML
    $html = '';

    // loop over the matching lines, adding them to the HTML
    while ($index < count($diff) && $diff[$index][1] == $type){
      $html .=
          '<span>'
          . htmlspecialchars($diff[$index][0])
          . '</span>'
          . $separator;
      $index ++;
    }

    // return the HTML
    return $html;

  }

}
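
The count() issue reported for compare() stems from preg_split() returning false on failure. One possible way to add the suggested type check is sketched below; the helper name splitLines is hypothetical and does not exist in the library, and throwing an exception is only one of several reasonable ways to handle the failure.

```php
<?php

// Hypothetical helper: splits a string into lines and guards against the
// false return value of preg_split() before the result reaches count().
function splitLines($string){
  $lines = preg_split('/\R/', $string);
  if ($lines === false){
    // preg_last_error() returns the PCRE error code of the failed call
    throw new RuntimeException('preg_split failed: ' . preg_last_error());
  }
  return $lines;
}

// Usage: \R treats "\r\n" as a single line break.
$lines = splitLines("a\r\nb\nc");
$end = count($lines) - 1;
```

Because the helper either returns an array or throws, count() always receives an array and the ignore-type annotation would not be needed.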
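
The "not defined for all execution paths" issues reported for toHTML() and toTable() arise because the switch statements have no default branch. A minimal sketch of one way to close that gap follows. The functions elementForType and lineToHTML are hypothetical names introduced for illustration, the top-level constants stand in for the class constants in the listing, and rejecting unknown types with an exception is an assumption about the desired behaviour.

```php
<?php

// Stand-ins for Diff::UNMODIFIED, Diff::DELETED and Diff::INSERTED.
const UNMODIFIED = 0;
const DELETED    = 1;
const INSERTED   = 2;

// Hypothetical helper: maps a line type to its HTML element name. Every
// path either returns a value or throws, so callers never see an
// undefined variable.
function elementForType($type){
  switch ($type){
    case UNMODIFIED : return 'span';
    case DELETED    : return 'del';
    case INSERTED   : return 'ins';
    default:
      throw new InvalidArgumentException('Unknown diff line type: ' . $type);
  }
}

// Hypothetical helper: renders one diff entry the way toHTML() does.
function lineToHTML($line){
  $element = elementForType($line[1]);
  return '<' . $element . '>'
      . htmlspecialchars($line[0])
      . '</' . $element . '>';
}

echo lineToHTML(array('a < b', DELETED)); // prints <del>a &lt; b</del>
```

The same approach could be applied to the switch in toTable() by adding a default branch that throws before $leftCell and $rightCell are read.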