Passed
Pull Request — master (#35)
by David
03:43
created

TableDiff::getRowMatches()   A

Complexity

Conditions 1
Paths 1

Size

Total Lines 13
Code Lines 8

Duplication

Lines 0
Ratio 0 %

Code Coverage

Tests 0
CRAP Score 2

Importance

Changes 3
Bugs 0 Features 1
Metric Value
c 3
b 0
f 1
dl 0
loc 13
ccs 0
cts 8
cp 0
rs 9.4285
cc 1
eloc 8
nc 1
nop 2
crap 2
1
<?php
2
3
namespace Caxy\HtmlDiff\Table;
4
5
use Caxy\HtmlDiff\AbstractDiff;
6
use Caxy\HtmlDiff\HtmlDiff;
7
use Caxy\HtmlDiff\HtmlDiffConfig;
8
use Caxy\HtmlDiff\Operation;
9
10
/**
11
 * Class TableDiff
12
 * @package Caxy\HtmlDiff\Table
13
 */
14
class TableDiff extends AbstractDiff
15
{
16
    /**
17
     * @var null|Table
18
     */
19
    protected $oldTable = null;
20
21
    /**
22
     * @var null|Table
23
     */
24
    protected $newTable = null;
25
26
    /**
27
     * @var null|\DOMElement
28
     */
29
    protected $diffTable = null;
30
31
    /**
32
     * @var null|\DOMDocument
33
     */
34
    protected $diffDom = null;
35
36
    /**
37
     * @var int
38
     */
39
    protected $newRowOffsets = 0;
40
41
    /**
42
     * @var int
43
     */
44
    protected $oldRowOffsets = 0;
45
46
    /**
47
     * @var array
48
     */
49
    protected $cellValues = array();
50
51
    /**
52
     * @var \HTMLPurifier
53
     */
54
    protected $purifier;
55
56
    /**
57
     * @param string              $oldText
58
     * @param string              $newText
59
     * @param HtmlDiffConfig|null $config
60
     *
61
     * @return self
62
     */
63
    public static function create($oldText, $newText, HtmlDiffConfig $config = null)
Duplication introduced by
This method seems to be duplicated in your project.

Duplicated code is one of the most pungent code smells. If you need to duplicate the same code in three or more different places, we strongly encourage you to look into extracting the code into a single class or operation.

You can also find more detailed suggestions in the “Code” section of your repository.
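
One possible way to act on this, sketched under assumptions: a trait holds the single copy of the factory and each diff class pulls it in. The names `CreatesDiff`, `ExampleAbstractDiff`, `ExampleHtmlDiff` and `ExampleTableDiff` are hypothetical stand-ins, not classes from this project, and the config is simplified to a plain array.

```php
<?php

// Hypothetical trait holding the one shared copy of the factory.
trait CreatesDiff
{
    public static function create($oldText, $newText, $config = null)
    {
        // `new static` builds whichever class create() was called on.
        $diff = new static($oldText, $newText);

        if (null !== $config) {
            $diff->setConfig($config);
        }

        return $diff;
    }
}

// Hypothetical base class standing in for the project's abstract diff class.
abstract class ExampleAbstractDiff
{
    protected $oldText;
    protected $newText;
    protected $config = array();

    public function __construct($oldText, $newText)
    {
        $this->oldText = $oldText;
        $this->newText = $newText;
    }

    public function setConfig(array $config)
    {
        $this->config = $config;
    }

    public function getConfig()
    {
        return $this->config;
    }
}

class ExampleHtmlDiff extends ExampleAbstractDiff
{
    use CreatesDiff;
}

class ExampleTableDiff extends ExampleAbstractDiff
{
    use CreatesDiff;
}

$tableDiff = ExampleTableDiff::create('<table></table>', '<table></table>', array('encoding' => 'UTF-8'));
echo get_class($tableDiff), PHP_EOL;
```

Whether a trait, a shared base-class method, or a separate factory class fits best depends on how much the two copies really have in common.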

64
    {
65
        $diff = new self($oldText, $newText);
66
67
        if (null !== $config) {
68
            $diff->setConfig($config);
69
70
            $this->initPurifier($config->getPurifierCacheLocation());
Bug introduced by
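The variable $this does not exist here: create() is a static method, so there is no object context. The call probably belongs on $diff, the instance created on line 65.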

This check marks access to variables or properties that have not been declared yet. While PHP has no explicit notion of declaring a variable, accessing it before a value is assigned to it is most likely a bug.
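
A minimal sketch of the pattern, assuming the intent was to initialize the freshly created instance. `ExampleDiff`, `initCache()` and `getCachePath()` are hypothetical names used only for illustration.

```php
<?php

// Hypothetical stand-in for the class under review; only the factory pattern matters.
class ExampleDiff
{
    /** @var string|null */
    protected $cachePath;

    public static function create($cachePath = null)
    {
        $diff = new self();

        if (null !== $cachePath) {
            // A static method has no $this; call the method on the new instance.
            $diff->initCache($cachePath);
        }

        return $diff;
    }

    protected function initCache($cachePath)
    {
        $this->cachePath = $cachePath;
    }

    public function getCachePath()
    {
        return $this->cachePath;
    }
}

$diff = ExampleDiff::create('/tmp/purifier-cache');
echo $diff->getCachePath(), PHP_EOL;
```

Calling a protected method on `$diff` is allowed here because the call happens inside the same class.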

71
        }
72
73
        return $diff;
74
    }
75
76
    /**
77
     * TableDiff constructor.
78
     *
79
     * @param string     $oldText
80
     * @param string     $newText
81
     * @param string     $encoding
82
     * @param array|null $specialCaseTags
83
     * @param bool|null  $groupDiffs
84
     */
85
    public function __construct(
86
        $oldText,
87
        $newText,
88
        $encoding = 'UTF-8',
89
        $specialCaseTags = null,
90
        $groupDiffs = null,
91
        $defaultPurifierSerializerCache = null
92
    )
93
    {
94
        parent::__construct($oldText, $newText, $encoding, $specialCaseTags, $groupDiffs);
95
96
        $this->initPurifier($defaultPurifierSerializerCache);
97
    }
98
99
    /**
100
     * Initializes HTMLPurifier with cache location
101
     * @param  null|string $defaultPurifierSerializerCache
102
     * @return void
103
     */
104
    protected function initPurifier($defaultPurifierSerializerCache)
105
    {
106
        $HTMLPurifierConfig = \HTMLPurifier_Config::createDefault();
107
        // Cache for the serializer defaults to inside the HTMLPurifier library
108
        // under the DefinitionCache/Serializer folder.
109
        $HTMLPurifierConfig->set('Cache.SerializerPath', $defaultPurifierSerializerCache);
110
        $this->purifier = new \HTMLPurifier($HTMLPurifierConfig);
111
    }
112
113
    /**
114
     * @return string
115
     */
116
    public function build()
Duplication introduced by
This method seems to be duplicated in your project.

117
    {
118
        if ($this->hasDiffCache() && $this->getDiffCache()->contains($this->oldText, $this->newText)) {
119
            $this->content = $this->getDiffCache()->fetch($this->oldText, $this->newText);
120
121
            return $this->content;
122
        }
123
124
        $this->buildTableDoms();
125
126
        $this->diffDom = new \DOMDocument();
127
128
        $this->indexCellValues($this->newTable);
Bug introduced by
It seems like $this->newTable can be null; however, indexCellValues() does not accept null, maybe add an additional type check?

Unless you are absolutely sure that the expression can never be null because of other conditions, we strongly recommend adding an additional type check to your code:

/** @return stdClass|null */
function mayReturnNull() { }

function doesNotAcceptNull(stdClass $x) { }

// With potential error.
function withoutCheck() {
    $x = mayReturnNull();
    doesNotAcceptNull($x); // Potential error here.
}

// Safe - Alternative 1
function withCheck1() {
    $x = mayReturnNull();
    if ( ! $x instanceof stdClass) {
        throw new \LogicException('$x must be defined.');
    }
    doesNotAcceptNull($x);
}

// Safe - Alternative 2
function withCheck2() {
    $x = mayReturnNull();
    if ($x instanceof stdClass) {
        doesNotAcceptNull($x);
    }
}
129
130
        $this->diffTableContent();
131
132
        if ($this->hasDiffCache()) {
133
            $this->getDiffCache()->save($this->oldText, $this->newText, $this->content);
134
        }
135
136
        return $this->content;
137
    }
138
139
    protected function diffTableContent()
140
    {
141
        $this->diffDom = new \DOMDocument();
142
        $this->diffTable = $this->newTable->cloneNode($this->diffDom);
143
        $this->diffDom->appendChild($this->diffTable);
144
145
        $oldRows = $this->oldTable->getRows();
146
        $newRows = $this->newTable->getRows();
147
148
        $oldMatchData = array();
149
        $newMatchData = array();
150
151
        /* @var $oldRow TableRow */
152
        foreach ($oldRows as $oldIndex => $oldRow) {
153
            $oldMatchData[$oldIndex] = array();
154
155
            // Get match percentages
156
            /* @var $newRow TableRow */
157
            foreach ($newRows as $newIndex => $newRow) {
158
                if (!array_key_exists($newIndex, $newMatchData)) {
159
                    $newMatchData[$newIndex] = array();
160
                }
161
162
                // similar_text
163
                $percentage = $this->getMatchPercentage($oldRow, $newRow, $oldIndex, $newIndex);
164
165
                $oldMatchData[$oldIndex][$newIndex] = $percentage;
166
                $newMatchData[$newIndex][$oldIndex] = $percentage;
167
            }
168
        }
169
170
        $matches = $this->getRowMatches($oldMatchData, $newMatchData);
171
        $this->diffTableRowsWithMatches($oldRows, $newRows, $matches);
172
173
        $this->content = $this->htmlFromNode($this->diffTable);
174
    }
175
176
    /**
177
     * @param TableRow[] $oldRows
178
     * @param TableRow[] $newRows
179
     * @param RowMatch[] $matches
180
     */
181
    protected function diffTableRowsWithMatches($oldRows, $newRows, $matches)
182
    {
183
        $operations = array();
184
185
        $indexInOld = 0;
186
        $indexInNew = 0;
187
188
        $oldRowCount = count($oldRows);
189
        $newRowCount = count($newRows);
190
191
        $matches[] = new RowMatch($newRowCount, $oldRowCount, $newRowCount, $oldRowCount);
192
193
        // build operations
194
        foreach ($matches as $match) {
195
            $matchAtIndexInOld = $indexInOld === $match->getStartInOld();
196
            $matchAtIndexInNew = $indexInNew === $match->getStartInNew();
197
198
            $action = 'equal';
199
200
            if (!$matchAtIndexInOld && !$matchAtIndexInNew) {
201
                $action = 'replace';
202
            } elseif ($matchAtIndexInOld && !$matchAtIndexInNew) {
203
                $action = 'insert';
204
            } elseif (!$matchAtIndexInOld && $matchAtIndexInNew) {
205
                $action = 'delete';
206
            }
207
208
            if ($action !== 'equal') {
209
                $operations[] = new Operation(
210
                    $action,
211
                    $indexInOld,
212
                    $match->getStartInOld(),
213
                    $indexInNew,
214
                    $match->getStartInNew()
215
                );
216
            }
217
218
            $operations[] = new Operation(
219
                'equal',
220
                $match->getStartInOld(),
221
                $match->getEndInOld(),
222
                $match->getStartInNew(),
223
                $match->getEndInNew()
224
            );
225
226
            $indexInOld = $match->getEndInOld();
227
            $indexInNew = $match->getEndInNew();
228
        }
229
230
        $appliedRowSpans = array();
231
232
        // process operations
233
        foreach ($operations as $operation) {
234
            switch ($operation->action) {
235
                case 'equal':
236
                    $this->processEqualOperation($operation, $oldRows, $newRows, $appliedRowSpans);
237
                    break;
238
239
                case 'delete':
240
                    $this->processDeleteOperation($operation, $oldRows, $appliedRowSpans);
241
                    break;
242
243
                case 'insert':
244
                    $this->processInsertOperation($operation, $newRows, $appliedRowSpans);
245
                    break;
246
247
                case 'replace':
248
                    $this->processReplaceOperation($operation, $oldRows, $newRows, $appliedRowSpans);
249
                    break;
250
            }
251
        }
252
    }
253
254
    /**
255
     * @param Operation $operation
256
     * @param array     $newRows
257
     * @param array     $appliedRowSpans
258
     * @param bool      $forceExpansion
259
     */
260
    protected function processInsertOperation(
Duplication introduced by
This method seems to be duplicated in your project.

261
        Operation $operation,
262
        $newRows,
263
        &$appliedRowSpans,
264
        $forceExpansion = false
265
    ) {
266
        $targetRows = array_slice($newRows, $operation->startInNew, $operation->endInNew - $operation->startInNew);
267
        foreach ($targetRows as $row) {
268
            $this->diffAndAppendRows(null, $row, $appliedRowSpans, $forceExpansion);
269
        }
270
    }
271
272
    /**
273
     * @param Operation $operation
274
     * @param array     $oldRows
275
     * @param array     $appliedRowSpans
276
     * @param bool      $forceExpansion
277
     */
278
    protected function processDeleteOperation(
Duplication introduced by
This method seems to be duplicated in your project.

279
        Operation $operation,
280
        $oldRows,
281
        &$appliedRowSpans,
282
        $forceExpansion = false
283
    ) {
284
        $targetRows = array_slice($oldRows, $operation->startInOld, $operation->endInOld - $operation->startInOld);
285
        foreach ($targetRows as $row) {
286
            $this->diffAndAppendRows($row, null, $appliedRowSpans, $forceExpansion);
287
        }
288
    }
289
290
    /**
291
     * @param Operation $operation
292
     * @param array     $oldRows
293
     * @param array     $newRows
294
     * @param array     $appliedRowSpans
295
     */
296
    protected function processEqualOperation(Operation $operation, $oldRows, $newRows, &$appliedRowSpans)
297
    {
298
        $targetOldRows = array_values(
299
            array_slice($oldRows, $operation->startInOld, $operation->endInOld - $operation->startInOld)
300
        );
301
        $targetNewRows = array_values(
302
            array_slice($newRows, $operation->startInNew, $operation->endInNew - $operation->startInNew)
303
        );
304
305
        foreach ($targetNewRows as $index => $newRow) {
306
            if (!isset($targetOldRows[$index])) {
307
                continue;
308
            }
309
310
            $this->diffAndAppendRows($targetOldRows[$index], $newRow, $appliedRowSpans);
311
        }
312
    }
313
314
    /**
315
     * @param Operation $operation
316
     * @param array     $oldRows
317
     * @param array     $newRows
318
     * @param array     $appliedRowSpans
319
     */
320
    protected function processReplaceOperation(Operation $operation, $oldRows, $newRows, &$appliedRowSpans)
321
    {
322
        $this->processDeleteOperation($operation, $oldRows, $appliedRowSpans, true);
323
        $this->processInsertOperation($operation, $newRows, $appliedRowSpans, true);
324
    }
325
326
    /**
327
     * @param array $oldMatchData
328
     * @param array $newMatchData
329
     *
330
     * @return array
331
     */
332
    protected function getRowMatches($oldMatchData, $newMatchData)
333
    {
334
        $matches = array();
335
336
        $startInOld = 0;
337
        $startInNew = 0;
338
        $endInOld = count($oldMatchData);
339
        $endInNew = count($newMatchData);
340
341
        $this->findRowMatches($newMatchData, $startInOld, $endInOld, $startInNew, $endInNew, $matches);
342
343
        return $matches;
344
    }
345
346
    /**
347
     * @param array $newMatchData
348
     * @param int   $startInOld
349
     * @param int   $endInOld
350
     * @param int   $startInNew
351
     * @param int   $endInNew
352
     * @param array $matches
353
     */
354
    protected function findRowMatches($newMatchData, $startInOld, $endInOld, $startInNew, $endInNew, &$matches)
355
    {
356
        $match = $this->findRowMatch($newMatchData, $startInOld, $endInOld, $startInNew, $endInNew);
357
        if ($match !== null) {
358
            if ($startInOld < $match->getStartInOld() &&
359
                $startInNew < $match->getStartInNew()
360
            ) {
361
                $this->findRowMatches(
362
                    $newMatchData,
363
                    $startInOld,
364
                    $match->getStartInOld(),
365
                    $startInNew,
366
                    $match->getStartInNew(),
367
                    $matches
368
                );
369
            }
370
371
            $matches[] = $match;
372
373
            if ($match->getEndInOld() < $endInOld &&
374
                $match->getEndInNew() < $endInNew
375
            ) {
376
                $this->findRowMatches(
377
                    $newMatchData,
378
                    $match->getEndInOld(),
379
                    $endInOld,
380
                    $match->getEndInNew(),
381
                    $endInNew,
382
                    $matches
383
                );
384
            }
385
        }
386
    }
387
388
    /**
389
     * @param array $newMatchData
390
     * @param int   $startInOld
391
     * @param int   $endInOld
392
     * @param int   $startInNew
393
     * @param int   $endInNew
394
     *
395
     * @return RowMatch|null
396
     */
397
    protected function findRowMatch($newMatchData, $startInOld, $endInOld, $startInNew, $endInNew)
398
    {
399
        $bestMatch = null;
400
        $bestPercentage = 0;
401
402
        foreach ($newMatchData as $newIndex => $oldMatches) {
403
            if ($newIndex < $startInNew) {
404
                continue;
405
            }
406
407
            if ($newIndex >= $endInNew) {
408
                break;
409
            }
410
            foreach ($oldMatches as $oldIndex => $percentage) {
411
                if ($oldIndex < $startInOld) {
412
                    continue;
413
                }
414
415
                if ($oldIndex >= $endInOld) {
416
                    break;
417
                }
418
419
                if ($percentage > $bestPercentage) {
420
                    $bestPercentage = $percentage;
421
                    $bestMatch = array(
422
                        'oldIndex' => $oldIndex,
423
                        'newIndex' => $newIndex,
424
                        'percentage' => $percentage,
425
                    );
426
                }
427
            }
428
        }
429
430
        if ($bestMatch !== null) {
431
            return new RowMatch(
432
                $bestMatch['newIndex'],
433
                $bestMatch['oldIndex'],
434
                $bestMatch['newIndex'] + 1,
435
                $bestMatch['oldIndex'] + 1,
436
                $bestMatch['percentage']
437
            );
438
        }
439
440
        return null;
441
    }
442
443
    /**
444
     * @param TableRow|null $oldRow
445
     * @param TableRow|null $newRow
446
     * @param array         $appliedRowSpans
447
     * @param bool          $forceExpansion
448
     *
449
     * @return array
450
     */
451
    protected function diffRows($oldRow, $newRow, array &$appliedRowSpans, $forceExpansion = false)
452
    {
453
        // create tr dom element
454
        $rowToClone = $newRow ?: $oldRow;
455
        /* @var $diffRow \DOMElement */
456
        $diffRow = $this->diffDom->importNode($rowToClone->getDomNode()->cloneNode(false), false);
457
458
        $oldCells = $oldRow ? $oldRow->getCells() : array();
459
        $newCells = $newRow ? $newRow->getCells() : array();
460
461
        $position = new DiffRowPosition();
462
463
        $extraRow = null;
464
465
        /* @var $expandCells \DOMElement[] */
466
        $expandCells = array();
467
        /* @var $cellsWithMultipleRows \DOMElement[] */
468
        $cellsWithMultipleRows = array();
469
470
        $newCellCount = count($newCells);
471
        while ($position->getIndexInNew() < $newCellCount) {
472
            if (!$position->areColumnsEqual()) {
473
                $type = $position->getLesserColumnType();
474
                if ($type === 'new') {
475
                    $row = $newRow;
476
                    $targetRow = $extraRow;
477
                } else {
478
                    $row = $oldRow;
479
                    $targetRow = $diffRow;
480
                }
481
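                // NOTE: `!` binds tighter than `===`, so `!$type === 'old'` compares a bool
                // with a string and is always false; `$type !== 'old'` may have been intended.
                if ($row && $targetRow && (!$type === 'old' || isset($oldCells[$position->getIndexInOld()]))) {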
482
                    $this->syncVirtualColumns($row, $position, $cellsWithMultipleRows, $targetRow, $type, true);
483
484
                    continue;
485
                }
486
            }
487
488
            /* @var $newCell TableCell */
489
            $newCell = $newCells[$position->getIndexInNew()];
490
            /* @var $oldCell TableCell */
491
            $oldCell = isset($oldCells[$position->getIndexInOld()]) ? $oldCells[$position->getIndexInOld()] : null;
492
493
            if ($oldCell && $newCell->getColspan() != $oldCell->getColspan()) {
494
                if (null === $extraRow) {
495
                    /* @var $extraRow \DOMElement */
496
                    $extraRow = $this->diffDom->importNode($rowToClone->getDomNode()->cloneNode(false), false);
497
                }
498
499
                if ($oldCell->getColspan() > $newCell->getColspan()) {
500
                    $this->diffCellsAndIncrementCounters(
501
                        $oldCell,
502
                        null,
503
                        $cellsWithMultipleRows,
504
                        $diffRow,
505
                        $position,
506
                        true
507
                    );
508
                    $this->syncVirtualColumns($newRow, $position, $cellsWithMultipleRows, $extraRow, 'new', true);
Bug introduced by
It seems like $newRow defined by parameter $newRow on line 451 can be null; however, Caxy\HtmlDiff\Table\Tabl...f::syncVirtualColumns() does not accept null, maybe add an additional type check?

It seems that null may be passed for this parameter, but the function being called does not appear to accept null.

We recommend adding an additional type check (or disallowing null for the parameter):

function notNullable(stdClass $x) { }

// Unsafe
function withoutCheck(stdClass $x = null) {
    notNullable($x);
}

// Safe - Alternative 1: Adding Additional Type-Check
function withCheck(stdClass $x = null) {
    if ($x instanceof stdClass) {
        notNullable($x);
    }
}

// Safe - Alternative 2: Changing Parameter
function withNonNullableParam(stdClass $x) {
    notNullable($x);
}
509
                } else {
510
                    $this->diffCellsAndIncrementCounters(
511
                        null,
512
                        $newCell,
513
                        $cellsWithMultipleRows,
514
                        $extraRow,
515
                        $position,
516
                        true
517
                    );
518
                    $this->syncVirtualColumns($oldRow, $position, $cellsWithMultipleRows, $diffRow, 'old', true);
Bug introduced by
It seems like $oldRow defined by parameter $oldRow on line 451 can be null; however, Caxy\HtmlDiff\Table\Tabl...f::syncVirtualColumns() does not accept null, maybe add an additional type check?

519
                }
520
            } else {
521
                $diffCell = $this->diffCellsAndIncrementCounters(
522
                    $oldCell,
523
                    $newCell,
524
                    $cellsWithMultipleRows,
525
                    $diffRow,
526
                    $position
527
                );
528
                $expandCells[] = $diffCell;
529
            }
530
        }
531
532
        $oldCellCount = count($oldCells);
533
        while ($position->getIndexInOld() < $oldCellCount) {
534
            $diffCell = $this->diffCellsAndIncrementCounters(
535
                $oldCells[$position->getIndexInOld()],
536
                null,
537
                $cellsWithMultipleRows,
538
                $diffRow,
539
                $position
540
            );
541
            $expandCells[] = $diffCell;
542
        }
543
544
        if ($extraRow) {
545
            foreach ($expandCells as $expandCell) {
546
                $rowspan = $expandCell->getAttribute('rowspan') ?: 1;
547
                $expandCell->setAttribute('rowspan', 1 + $rowspan);
548
            }
549
        }
550
551
        if ($extraRow || $forceExpansion) {
552
            foreach ($appliedRowSpans as $rowSpanCells) {
553
                /* @var $rowSpanCells \DOMElement[] */
554
                foreach ($rowSpanCells as $extendCell) {
555
                    $rowspan = $extendCell->getAttribute('rowspan') ?: 1;
556
                    $extendCell->setAttribute('rowspan', 1 + $rowspan);
557
                }
558
            }
559
        }
560
561
        if (!$forceExpansion) {
562
            array_shift($appliedRowSpans);
563
            $appliedRowSpans = array_values($appliedRowSpans);
564
        }
565
        $appliedRowSpans = array_merge($appliedRowSpans, array_values($cellsWithMultipleRows));
566
567
        return array($diffRow, $extraRow);
568
    }
569
570
    /**
571
     * @param TableCell|null $oldCell
572
     * @param TableCell|null $newCell
573
     *
574
     * @return \DOMElement
575
     */
576
    protected function getNewCellNode(TableCell $oldCell = null, TableCell $newCell = null)
577
    {
578
        // If only one cell exists, use it
579
        if (!$oldCell || !$newCell) {
580
            $clone = $newCell
581
                ? $newCell->getDomNode()->cloneNode(false)
582
                : $oldCell->getDomNode()->cloneNode(false);
Bug introduced by
It seems like $oldCell is not always an object, but can also be of type null. Maybe add an additional type check?

If a variable is not always an object, we recommend adding an additional type check to ensure your method call is safe:

function someFunction(A $objectMaybe = null)
{
    if ($objectMaybe instanceof A) {
        $objectMaybe->doSomething();
    }
}
583
        } else {
584
            $oldNode = $oldCell->getDomNode();
585
            $newNode = $newCell->getDomNode();
586
587
            /* @var $clone \DOMElement */
588
            $clone = $newNode->cloneNode(false);
589
590
            $oldRowspan = $oldNode->getAttribute('rowspan') ?: 1;
591
            $oldColspan = $oldNode->getAttribute('colspan') ?: 1;
592
            $newRowspan = $newNode->getAttribute('rowspan') ?: 1;
593
            $newColspan = $newNode->getAttribute('colspan') ?: 1;
594
595
            $clone->setAttribute('rowspan', max($oldRowspan, $newRowspan));
596
            $clone->setAttribute('colspan', max($oldColspan, $newColspan));
597
        }
598
599
        return $this->diffDom->importNode($clone);
600
    }
601
602
    /**
603
     * @param TableCell|null $oldCell
604
     * @param TableCell|null $newCell
605
     * @param bool           $usingExtraRow
606
     *
607
     * @return \DOMElement
608
     */
609
    protected function diffCells($oldCell, $newCell, $usingExtraRow = false)
610
    {
611
        $diffCell = $this->getNewCellNode($oldCell, $newCell);
612
613
        $oldContent = $oldCell ? $this->getInnerHtml($oldCell->getDomNode()) : '';
614
        $newContent = $newCell ? $this->getInnerHtml($newCell->getDomNode()) : '';
615
616
        $htmlDiff = HtmlDiff::create(
617
            mb_convert_encoding($oldContent, 'UTF-8', 'HTML-ENTITIES'),
618
            mb_convert_encoding($newContent, 'UTF-8', 'HTML-ENTITIES'),
619
            $this->config
620
        );
621
        $diff = $htmlDiff->build();
622
623
        $this->setInnerHtml($diffCell, $diff);
624
625
        if (null === $newCell) {
626
            $diffCell->setAttribute('class', trim($diffCell->getAttribute('class').' del'));
627
        }
628
629
        if (null === $oldCell) {
630
            $diffCell->setAttribute('class', trim($diffCell->getAttribute('class').' ins'));
631
        }
632
633
        if ($usingExtraRow) {
634
            $diffCell->setAttribute('class', trim($diffCell->getAttribute('class').' extra-row'));
635
        }
636
637
        return $diffCell;
638
    }
639
640
    protected function buildTableDoms()
641
    {
642
        $this->oldTable = $this->parseTableStructure($this->oldText);
643
        $this->newTable = $this->parseTableStructure($this->newText);
644
    }
645
646
    /**
647
     * @param string $text
648
     *
649
     * @return \DOMDocument
650
     */
651
    protected function createDocumentWithHtml($text)
652
    {
653
        $dom = new \DOMDocument();
654
        $dom->loadHTML(mb_convert_encoding(
655
            $this->purifier->purify(mb_convert_encoding($text, $this->config->getEncoding(), mb_detect_encoding($text))),
Coding Style introduced by
This line exceeds maximum limit of 120 characters; contains 121 characters

Overly long lines are hard to read on any screen. Most code styles therefore impose a maximum limit on the number of characters in a line.
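
One way to shorten such a line, sketched under assumptions, is to give the inner steps their own variables. `normalizeEncoding()` is a hypothetical helper, the purifier step is left out, and the fallback to the target encoding when detection fails is an addition that the reviewed code does not have. The `mbstring` extension is assumed to be available.

```php
<?php

// Hypothetical helper: converts $text to $targetEncoding in two short steps.
function normalizeEncoding($text, $targetEncoding = 'UTF-8')
{
    // mb_detect_encoding() returns false on failure; fall back to the target encoding.
    $sourceEncoding = mb_detect_encoding($text) ?: $targetEncoding;

    return mb_convert_encoding($text, $targetEncoding, $sourceEncoding);
}

echo normalizeEncoding('<td>cell</td>'), PHP_EOL;
```

Naming the intermediate value also makes the detection step easier to test on its own.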

656
            'HTML-ENTITIES',
657
            $this->config->getEncoding()
658
        ));
659
660
        return $dom;
661
    }
662
663
    /**
664
     * @param string $text
665
     *
666
     * @return Table
667
     */
668
    protected function parseTableStructure($text)
669
    {
670
        $dom = $this->createDocumentWithHtml($text);
671
672
        $tableNode = $dom->getElementsByTagName('table')->item(0);
673
674
        $table = new Table($tableNode);
Documentation introduced by
$tableNode is of type object<DOMNode>, but the function expects a null|object<DOMElement>.

It seems like the type of the argument is not accepted by the function/method which you are calling.

In some cases, in particular if PHP’s automatic type-juggling kicks in, this might be fine. In other cases, however, this might be a bug.

We suggest adding an explicit type cast, as in the following example:

function acceptsInteger($int) { }

$x = '123'; // string "123"

// Instead of
acceptsInteger($x);

// we recommend to use
acceptsInteger((integer) $x);
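
A cast cannot turn a DOMNode into a DOMElement, so for the call flagged here an instanceof check may be a closer fit than the generic advice above. The following is a minimal sketch; `firstTableElement()` is a hypothetical helper, not part of the reviewed class.

```php
<?php

// Hypothetical helper: returns the first <table> as a DOMElement, or null.
function firstTableElement($html)
{
    $dom = new DOMDocument();
    $dom->loadHTML($html);

    // DOMNodeList::item() returns null when the index is out of range.
    $node = $dom->getElementsByTagName('table')->item(0);

    return $node instanceof DOMElement ? $node : null;
}

$table = firstTableElement('<table><tr><td>x</td></tr></table>');
echo $table === null ? 'no table' : $table->nodeName, PHP_EOL;
```

The caller can then decide what a missing table should mean, instead of passing an unchecked value on.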
675
676
        $this->parseTable($table);
677
678
        return $table;
679
    }
680
681
    /**
682
     * @param Table         $table
683
     * @param \DOMNode|null $node
684
     */
685
    protected function parseTable(Table $table, \DOMNode $node = null)
686
    {
687
        if ($node === null) {
688
            $node = $table->getDomNode();
689
        }
690
691
        if (!$node->childNodes) {
692
            return;
693
        }
694
695
        foreach ($node->childNodes as $child) {
696
            if ($child->nodeName === 'tr') {
697
                $row = new TableRow($child);
698
                $table->addRow($row);
699
700
                $this->parseTableRow($row);
701
            } else {
702
                $this->parseTable($table, $child);
703
            }
704
        }
705
    }
706
707
    /**
708
     * @param TableRow $row
709
     */
710
    protected function parseTableRow(TableRow $row)
711
    {
712
        $node = $row->getDomNode();
713
714
        foreach ($node->childNodes as $child) {
715
            if (in_array($child->nodeName, array('td', 'th'))) {
716
                $cell = new TableCell($child);
717
                $row->addCell($cell);
718
            }
719
        }
720
    }
721
722
    /**
723
     * @param \DOMNode $node
724
     *
725
     * @return string
726
     */
727
    protected function getInnerHtml($node)
728
    {
729
        $innerHtml = '';
730
        $children = $node->childNodes;
731
732
        foreach ($children as $child) {
733
            $innerHtml .= $this->htmlFromNode($child);
734
        }
735
736
        return $innerHtml;
737
    }
738
739
    /**
740
     * @param \DOMNode $node
741
     *
742
     * @return string
743
     */
744
    protected function htmlFromNode($node)
Duplication introduced by
This method seems to be duplicated in your project.

745
    {
746
        $domDocument = new \DOMDocument();
747
        $newNode = $domDocument->importNode($node, true);
748
        $domDocument->appendChild($newNode);
749
750
        return $domDocument->saveHTML();
751
    }
752
753
    /**
     * @param \DOMNode $node
     * @param string   $html
     */
    protected function setInnerHtml($node, $html)
    {
        // DOMDocument::loadHTML does not allow empty strings.
        if (strlen($html) === 0) {
            $html = '<span class="empty"></span>';
        }

        $doc = $this->createDocumentWithHtml($html);
        $fragment = $node->ownerDocument->createDocumentFragment();
        $root = $doc->getElementsByTagName('body')->item(0);
        foreach ($root->childNodes as $child) {
            $fragment->appendChild($node->ownerDocument->importNode($child, true));
        }

        $node->appendChild($fragment);
    }

    /**
     * @param Table $table
     */
    protected function indexCellValues(Table $table)
    {
        foreach ($table->getRows() as $rowIndex => $row) {
            foreach ($row->getCells() as $cellIndex => $cell) {
                $value = trim($cell->getDomNode()->textContent);

                if (!isset($this->cellValues[$value])) {
                    $this->cellValues[$value] = array();
                }

                $this->cellValues[$value][] = new TablePosition($rowIndex, $cellIndex);
            }
        }
    }

    /**
     * @param TableRow        $tableRow
     * @param DiffRowPosition $position
     * @param array           $cellsWithMultipleRows
     * @param \DOMNode        $diffRow
     * @param string          $diffType
     * @param bool            $usingExtraRow
     */
    protected function syncVirtualColumns(
        $tableRow,
        DiffRowPosition $position,
        &$cellsWithMultipleRows,
        $diffRow,
        $diffType,
        $usingExtraRow = false
    ) {
        $currentCell = $tableRow->getCell($position->getIndex($diffType));
        while ($position->isColumnLessThanOther($diffType) && $currentCell) {
            $diffCell = $diffType === 'new'
                ? $this->diffCells(null, $currentCell, $usingExtraRow)
                : $this->diffCells($currentCell, null, $usingExtraRow);
            // Store the cell in $cellsWithMultipleRows (keyed by rowspan) if it spans multiple rows.
            if ($diffCell->getAttribute('rowspan') > 1) {
                $cellsWithMultipleRows[$diffCell->getAttribute('rowspan')][] = $diffCell;
            }
            $diffRow->appendChild($diffCell);
            $position->incrementColumn($diffType, $currentCell->getColspan());
            $currentCell = $tableRow->getCell($position->incrementIndex($diffType));
        }
    }

    /**
     * @param null|TableCell  $oldCell
     * @param null|TableCell  $newCell
     * @param array           $cellsWithMultipleRows
     * @param \DOMElement     $diffRow
     * @param DiffRowPosition $position
     * @param bool            $usingExtraRow
     *
     * @return \DOMElement
     */
    protected function diffCellsAndIncrementCounters(
        $oldCell,
        $newCell,
        &$cellsWithMultipleRows,
        $diffRow,
        DiffRowPosition $position,
        $usingExtraRow = false
    ) {
        $diffCell = $this->diffCells($oldCell, $newCell, $usingExtraRow);
        // Store the cell in $cellsWithMultipleRows (keyed by rowspan) if it spans multiple rows.
        if ($diffCell->getAttribute('rowspan') > 1) {
            $cellsWithMultipleRows[$diffCell->getAttribute('rowspan')][] = $diffCell;
        }
        $diffRow->appendChild($diffCell);

        if ($newCell !== null) {
            $position->incrementIndexInNew();
            $position->incrementColumnInNew($newCell->getColspan());
        }

        if ($oldCell !== null) {
            $position->incrementIndexInOld();
            $position->incrementColumnInOld($oldCell->getColspan());
        }

        return $diffCell;
    }

    /**
     * @param TableRow|null $oldRow
     * @param TableRow|null $newRow
     * @param array         $appliedRowSpans
     * @param bool          $forceExpansion
     */
    protected function diffAndAppendRows($oldRow, $newRow, &$appliedRowSpans, $forceExpansion = false)
    {
        list($rowDom, $extraRow) = $this->diffRows(
            $oldRow,
            $newRow,
            $appliedRowSpans,
            $forceExpansion
        );

        $this->diffTable->appendChild($rowDom);

        if ($extraRow) {
            $this->diffTable->appendChild($extraRow);
        }
    }

    /**
     * @param TableRow $oldRow
     * @param TableRow $newRow
     * @param int      $oldIndex
     * @param int      $newIndex
     *
     * @return float|int
     */
    protected function getMatchPercentage(TableRow $oldRow, TableRow $newRow, $oldIndex, $newIndex)
    {
        $firstCellWeight = 1.5;
        $indexDeltaWeight = 0.25 * (abs($oldIndex - $newIndex));
        $thresholdCount = 0;
        $minCells = min(count($newRow->getCells()), count($oldRow->getCells()));
        $totalCount = ($minCells + $firstCellWeight + $indexDeltaWeight) * 100;
        foreach ($newRow->getCells() as $newIndex => $newCell) {
            $oldCell = $oldRow->getCell($newIndex);
Bug: Are you sure the assignment to $oldCell is correct? $oldRow->getCell($newIndex) (which targets Caxy\HtmlDiff\Table\TableRow::getCell()) seems to always return null.

This check looks for function or method calls that always return null and whose return value is assigned to a variable.

class A
{
    function getObject()
    {
        return null;
    }
}

$a = new A();
$object = $a->getObject();

The method getObject() can return nothing but null, so it makes no sense to assign that value to a variable.

The reason is most likely that a function or method is incomplete or has been reduced for debug purposes.

            if ($oldCell) {
                $percentage = null;
                similar_text($oldCell->getInnerHtml(), $newCell->getInnerHtml(), $percentage);

                if ($percentage > ($this->config->getMatchThreshold() * 0.50)) {
                    $increment = $percentage;
                    if ($newIndex === 0 && $percentage > 95) {
                        $increment = $increment * $firstCellWeight;
                    }
                    $thresholdCount += $increment;
                }
            }
        }

        return ($totalCount > 0) ? ($thresholdCount / $totalCount) : 0;
    }
}
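
The scoring in getMatchPercentage() is easier to follow with concrete numbers. The following is a minimal standalone sketch, not the library's code: the function name `rowMatchScore()` is hypothetical, cells are plain strings rather than TableCell objects, and `$matchThreshold` stands in for `$this->config->getMatchThreshold()` with an assumed value of 80. The loop key is named `$cellIndex` so that it does not reuse the `$newIndex` parameter name.

```php
<?php

/**
 * Hypothetical standalone version of the row-scoring logic.
 * $matchThreshold is an assumed stand-in for the config's match threshold.
 */
function rowMatchScore(array $oldCells, array $newCells, $oldIndex, $newIndex, $matchThreshold = 80)
{
    $firstCellWeight = 1.5;
    // Rows that are further apart enlarge the denominator and so lower the score.
    $indexDeltaWeight = 0.25 * abs($oldIndex - $newIndex);
    $minCells = min(count($oldCells), count($newCells));
    $totalCount = ($minCells + $firstCellWeight + $indexDeltaWeight) * 100;
    $thresholdCount = 0;

    foreach ($newCells as $cellIndex => $newCell) {
        if (!isset($oldCells[$cellIndex])) {
            continue; // no counterpart cell in the old row
        }

        $percentage = 0.0;
        similar_text($oldCells[$cellIndex], $newCell, $percentage);

        // Only cells above half the match threshold contribute.
        if ($percentage > $matchThreshold * 0.5) {
            $increment = $percentage;
            // A near-identical first cell counts 1.5 times.
            if ($cellIndex === 0 && $percentage > 95) {
                $increment *= $firstCellWeight;
            }
            $thresholdCount += $increment;
        }
    }

    return $totalCount > 0 ? $thresholdCount / $totalCount : 0;
}

// Identical two-cell rows at the same index: (150 + 100) / 350.
$same = rowMatchScore(array('Name', 'Qty'), array('Name', 'Qty'), 0, 0);
// The same rows four positions apart: (150 + 100) / 450.
$moved = rowMatchScore(array('Name', 'Qty'), array('Name', 'Qty'), 0, 4);

printf("%.3f %.3f\n", $same, $moved);
```

Under these assumptions, identical rows do not score 1.0: the denominator always includes the full first-cell weight (150) on top of the per-cell total, while a perfect first cell only adds 50 extra to the numerator. Callers comparing this value against a threshold may want to keep that ceiling in mind.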
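
The index built by indexCellValues() maps trimmed cell text to every position where that text occurs. A small sketch of the same idea, assuming rows are plain arrays of strings and using `array($rowIndex, $cellIndex)` pairs in place of TablePosition objects (the function name `indexCellText()` is hypothetical):

```php
<?php

/**
 * Hypothetical stand-in for indexCellValues(): builds a map from trimmed
 * cell text to the list of [rowIndex, cellIndex] positions holding it.
 */
function indexCellText(array $rows)
{
    $index = array();

    foreach ($rows as $rowIndex => $cells) {
        foreach ($cells as $cellIndex => $text) {
            $value = trim($text);
            // Appending with [] creates the inner array on first use,
            // so no explicit isset() check is needed here.
            $index[$value][] = array($rowIndex, $cellIndex);
        }
    }

    return $index;
}

$index = indexCellText(array(
    array('Name', ' Qty '),
    array('Apple', 'Qty'),
));

// 'Qty' appears in column 1 of both rows once whitespace is trimmed.
print_r($index['Qty']);
```

One thing to be aware of with this kind of index in PHP: array keys that look like decimal integers (for example a cell containing "42") are stored as integer keys, so lookups should not rely on strict string comparison of the keys.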