Completed
Push — master ( deae46...6c5f37 )
by Colin
01:01
created

Cursor   F

Complexity

Total Complexity 63

Size/Duplication

Total Lines 458
Duplicated Lines 0 %

Coupling/Cohesion

Components 1
Dependencies 1

Test Coverage

Coverage 99.44%

Importance

Changes 0
Metric Value
wmc 63
lcom 1
cbo 1
dl 0
loc 458
ccs 176
cts 177
cp 0.9944
rs 3.36
c 0
b 0
f 0

25 Methods

Rating  Name                              Duplication  Size  Complexity
A       __construct()                     0            7     2
B       getNextNonSpacePosition()         0            27    6
A       getNextNonSpaceCharacter()        0            4     1
A       getIndent()                       0            8     2
A       isIndented()                      0            4     1
B       getCharacter()                    0            21    6
A       peek()                            0            4     1
A       isBlank()                         0            4     2
A       advance()                         0            4     1
C       advanceBy()                       0            65    14
A       advanceWithoutTabCharacters()     0            9     1
A       advanceBySpaceOrTab()             0            12    3
A       advanceToNextNonSpaceOrTab()      0            8     1
A       advanceToNextNonSpaceOrNewline()  0            22    4
A       advanceToEnd()                    0            9     1
A       getRemainder()                    0            20    4
A       getLine()                         0            4     1
A       isAtEnd()                         0            4     1
A       match()                           0            26    3
A       saveState()                       0            11    1
A       restoreState()                    0            11    1
A       getPosition()                     0            4     1
A       getPreviousText()                 0            4     1
A       getSubstring()                    0            12    3
A       getColumn()                       0            4     1

How to fix   Complexity   

Complex Class

Complex classes like Cursor often do a lot of different things. To break such a class down, we need to identify a cohesive component within it. A common way to find such a component is to look for fields and methods that share the same prefix or suffix. You can also look at the cohesion graph to spot unconnected or weakly connected components.

Once you have determined which fields belong together, you can apply the Extract Class refactoring. If the component makes sense as a subclass, Extract Subclass is also a candidate, and is often faster.
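
As a rough illustration of Extract Class: in the listing further down, the fields `$line`, `$length`, `$isMultibyte` and `$charCache` are only concerned with reading characters from the line, so they are one possible candidate for extraction. The sketch below is an assumption about how that could look, not the package's actual design. The class name `LineReader` and its methods `length()` and `charAt()` are hypothetical, and the code assumes PHP 7.4+ with the mbstring extension.

```php
<?php

declare(strict_types=1);

// Hypothetical class, not part of league/commonmark: it groups the fields
// that deal only with reading characters from the line.
final class LineReader
{
    private string $line;
    private int $length;
    private bool $isMultibyte;

    /** @var array<int, string> */
    private array $charCache = [];

    public function __construct(string $line)
    {
        $this->line        = $line;
        $this->length      = \mb_strlen($line, 'UTF-8');
        $this->isMultibyte = $this->length !== \strlen($line);
    }

    /** Length of the line in characters (not bytes). */
    public function length(): int
    {
        return $this->length;
    }

    /** Returns the character at $index, or null when $index is out of bounds. */
    public function charAt(int $index): ?string
    {
        if ($index < 0 || $index >= $this->length) {
            return null;
        }

        // Pure ASCII lines can be indexed directly by byte offset.
        if (! $this->isMultibyte) {
            return $this->line[$index];
        }

        // Cache multibyte lookups, as getCharacter() does in the listing.
        return $this->charCache[$index] ??= \mb_substr($this->line, $index, 1, 'UTF-8');
    }
}

$reader = new LineReader("h\u{00E9}llo");
echo $reader->length(), "\n"; // 5
```

With such a class in place, Cursor would keep only the position, column and tab bookkeeping and delegate character access to the reader.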

While breaking up the class, it is a good idea to analyze how other classes use Cursor and, based on these observations, to apply Extract Interface as well.
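
A minimal sketch of Extract Interface, assuming some callers only inspect the cursor and never move it. The interface name `ReadOnlyCursor`, the stand-in class `AsciiCursor` and the function `startsWithHeadingMarker()` are all hypothetical and exist only to keep the example self-contained. The check itself is a simplified illustration, not CommonMark's real heading rule. The code assumes PHP 7.4+.

```php
<?php

declare(strict_types=1);

// Hypothetical interface: a read-only subset of Cursor's public methods.
interface ReadOnlyCursor
{
    public function getCharacter(?int $index = null): ?string;

    public function peek(int $offset = 1): ?string;

    public function isAtEnd(): bool;

    public function getPosition(): int;
}

// Minimal ASCII-only stand-in so the example runs on its own. In the real
// code base, Cursor itself would declare "implements ReadOnlyCursor".
final class AsciiCursor implements ReadOnlyCursor
{
    private string $line;
    private int $position;

    public function __construct(string $line, int $position = 0)
    {
        $this->line     = $line;
        $this->position = $position;
    }

    public function getCharacter(?int $index = null): ?string
    {
        $index = $index ?? $this->position;

        if ($index < 0 || $index >= \strlen($this->line)) {
            return null;
        }

        return $this->line[$index];
    }

    public function peek(int $offset = 1): ?string
    {
        return $this->getCharacter($this->position + $offset);
    }

    public function isAtEnd(): bool
    {
        return $this->position >= \strlen($this->line);
    }

    public function getPosition(): int
    {
        return $this->position;
    }
}

// A consumer that only inspects the cursor can depend on the narrow interface.
function startsWithHeadingMarker(ReadOnlyCursor $cursor): bool
{
    return $cursor->getCharacter() === '#' && $cursor->peek() === ' ';
}

\var_dump(startsWithHeadingMarker(new AsciiCursor('# Title'))); // bool(true)
```

Depending on the narrow interface makes it explicit which callers can never change the cursor state, which in turn narrows down where `saveState()` and `restoreState()` are actually needed.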

<?php

declare(strict_types=1);

/*
 * This file is part of the league/commonmark package.
 *
 * (c) Colin O'Dell <[email protected]>
 *
 * For the full copyright and license information, please view the LICENSE
 * file that was distributed with this source code.
 */

namespace League\CommonMark\Parser;

class Cursor
{
    public const INDENT_LEVEL = 4;

    /**
     * @var string
     *
     * @psalm-readonly
     */
    private $line;

    /**
     * @var int
     *
     * @psalm-readonly
     */
    private $length;

    /**
     * @var int
     *
     * It's possible for this to be 1 char past the end, meaning we've parsed all chars and have
     * reached the end.  In this state, any character-returning method MUST return null.
     */
    private $currentPosition = 0;

    /** @var int */
    private $column = 0;

    /** @var int */
    private $indent = 0;

    /** @var int */
    private $previousPosition = 0;

    /** @var int|null */
    private $nextNonSpaceCache;

    /** @var bool */
    private $partiallyConsumedTab = false;

    /**
     * @var bool
     *
     * @psalm-readonly
     */
    private $lineContainsTabs;

    /**
     * @var bool
     *
     * @psalm-readonly
     */
    private $isMultibyte;

    /** @var array<int, string> */
    private $charCache = [];

    /**
     * @param string $line The line being parsed (ASCII or UTF-8)
     */
    public function __construct(string $line)
    {
        $this->line             = $line;
        $this->length           = \mb_strlen($line, 'UTF-8') ?: 0;
        $this->isMultibyte      = $this->length !== \strlen($line);
        $this->lineContainsTabs = \strpos($line, "\t") !== false;
    }

    /**
     * Returns the position of the next character which is not a space (or tab)
     */
    public function getNextNonSpacePosition(): int
    {
        if ($this->nextNonSpaceCache !== null) {
            return $this->nextNonSpaceCache;
        }

        $c    = null;
        /*
         * Issue (Unused Code): the value assigned to $c above is not used, so you
         * could remove the assignment. The while condition below always reassigns
         * $c before it is read.
         *
         * This check looks for variable assignments that are either overwritten by
         * other assignments or where the variable is not used subsequently.
         *
         *     $myVar = 'Value';
         *     $higher = false;
         *
         *     if (rand(1, 6) > 3) {
         *         $higher = true;
         *     } else {
         *         $higher = false;
         *     }
         *
         * Both the $myVar assignment in line 1 and the $higher assignment in line 2
         * are dead. The first because $myVar is never used, and the second because
         * $higher is always overwritten on every possible code path.
         */
        $i    = $this->currentPosition;
        $cols = $this->column;

        while (($c = $this->getCharacter($i)) !== null) {
            if ($c === ' ') {
                $i++;
                $cols++;
            } elseif ($c === "\t") {
                $i++;
                $cols += 4 - ($cols % 4);
            } else {
                break;
            }
        }

        $nextNonSpace = $c === null ? $this->length : $i;
        $this->indent = $cols - $this->column;

        return $this->nextNonSpaceCache = $nextNonSpace;
    }

    /**
     * Returns the next character which isn't a space (or tab)
     */
    public function getNextNonSpaceCharacter(): ?string
    {
        return $this->getCharacter($this->getNextNonSpacePosition());
    }

    /**
     * Calculates the current indent (number of spaces after current position)
     */
    public function getIndent(): int
    {
        if ($this->nextNonSpaceCache === null) {
            $this->getNextNonSpacePosition();
        }

        return $this->indent;
    }

    /**
     * Whether the cursor is indented to INDENT_LEVEL
     */
    public function isIndented(): bool
    {
        return $this->getIndent() >= self::INDENT_LEVEL;
    }

    public function getCharacter(?int $index = null): ?string
    {
        if ($index === null) {
            $index = $this->currentPosition;
        }

        // Index out-of-bounds, or we're at the end
        if ($index < 0 || $index >= $this->length) {
            return null;
        }

        if ($this->isMultibyte) {
            if (isset($this->charCache[$index])) {
                return $this->charCache[$index];
            }

            return $this->charCache[$index] = \mb_substr($this->line, $index, 1, 'UTF-8');
        }

        return $this->line[$index];
    }

    /**
     * Returns the next character (or null, if none) without advancing forwards
     */
    public function peek(int $offset = 1): ?string
    {
        return $this->getCharacter($this->currentPosition + $offset);
    }

    /**
     * Whether the remainder is blank
     */
    public function isBlank(): bool
    {
        return $this->nextNonSpaceCache === $this->length || $this->getNextNonSpacePosition() === $this->length;
    }

    /**
     * Move the cursor forwards
     */
    public function advance(): void
    {
        $this->advanceBy(1);
    }

    /**
     * Move the cursor forwards
     *
     * @param int  $characters       Number of characters to advance by
     * @param bool $advanceByColumns Whether to advance by columns instead of spaces
     */
    public function advanceBy(int $characters, bool $advanceByColumns = false): void
    {
        if ($characters === 0) {
            $this->previousPosition = $this->currentPosition;

            return;
        }

        $this->previousPosition  = $this->currentPosition;
        $this->nextNonSpaceCache = null;

        // Optimization to avoid tab handling logic if we have no tabs
        if (! $this->lineContainsTabs) {
            $this->advanceWithoutTabCharacters($characters);

            return;
        }

        $nextFewChars = $this->isMultibyte ?
            \mb_substr($this->line, $this->currentPosition, $characters, 'UTF-8') :
            \substr($this->line, $this->currentPosition, $characters);

        // Optimization to avoid tab handling logic if the characters we're advancing over contain no tabs
        if (\strpos($nextFewChars, "\t") === false) {
            $this->advanceWithoutTabCharacters($characters);

            return;
        }

        if ($characters === 1 && ! empty($nextFewChars)) {
            $asArray = [$nextFewChars];
        } elseif ($this->isMultibyte) {
            /** @var string[] $asArray */
            $asArray = \preg_split('//u', $nextFewChars, -1, \PREG_SPLIT_NO_EMPTY);
        } else {
            $asArray = \str_split($nextFewChars);
        }

        foreach ($asArray as $relPos => $c) {
            if ($c === "\t") {
                $charsToTab = 4 - ($this->column % 4);
                if ($advanceByColumns) {
                    $this->partiallyConsumedTab = $charsToTab > $characters;
                    $charsToAdvance             = $charsToTab > $characters ? $characters : $charsToTab;
                    $this->column              += $charsToAdvance;
                    $this->currentPosition     += $this->partiallyConsumedTab ? 0 : 1;
                    $characters                -= $charsToAdvance;
                } else {
                    $this->partiallyConsumedTab = false;
                    $this->column              += $charsToTab;
                    $this->currentPosition++;
                    $characters--;
                }
            } else {
                $this->partiallyConsumedTab = false;
                $this->currentPosition++;
                $this->column++;
                $characters--;
            }

            if ($characters <= 0) {
                break;
            }
        }
    }

    private function advanceWithoutTabCharacters(int $characters): void
    {
        $length                     = \min($characters, $this->length - $this->currentPosition);
        $this->partiallyConsumedTab = false;
        $this->currentPosition     += $length;
        $this->column              += $length;

        return;
    }

    /**
     * Advances the cursor by a single space or tab, if present
     */
    public function advanceBySpaceOrTab(): bool
    {
        $character = $this->getCharacter();

        if ($character === ' ' || $character === "\t") {
            $this->advanceBy(1, true);

            return true;
        }

        return false;
    }

    /**
     * Parse zero or more space/tab characters
     *
     * @return int Number of positions moved
     */
    public function advanceToNextNonSpaceOrTab(): int
    {
        $newPosition = $this->getNextNonSpacePosition();
        $this->advanceBy($newPosition - $this->currentPosition);
        $this->partiallyConsumedTab = false;

        return $this->currentPosition - $this->previousPosition;
    }

    /**
     * Parse zero or more space characters, including at most one newline.
     *
     * Tab characters are not parsed with this function.
     *
     * @return int Number of positions moved
     */
    public function advanceToNextNonSpaceOrNewline(): int
    {
        $remainder = $this->getRemainder();

        // Optimization: Avoid the regex if we know there are no spaces or newlines
        if (empty($remainder) || ($remainder[0] !== ' ' && $remainder[0] !== "\n")) {
            $this->previousPosition = $this->currentPosition;

            return 0;
        }

        $matches = [];
        \preg_match('/^ *(?:\n *)?/', $remainder, $matches, \PREG_OFFSET_CAPTURE);

        // [0][0] contains the matched text
        // [0][1] contains the index of that match
        $increment = $matches[0][1] + \strlen($matches[0][0]);

        $this->advanceBy($increment);

        return $this->currentPosition - $this->previousPosition;
    }

    /**
     * Move the position to the very end of the line
     *
     * @return int The number of characters moved
     */
    public function advanceToEnd(): int
    {
        $this->previousPosition  = $this->currentPosition;
        $this->nextNonSpaceCache = null;

        $this->currentPosition = $this->length;

        return $this->currentPosition - $this->previousPosition;
    }

    public function getRemainder(): string
    {
        if ($this->currentPosition >= $this->length) {
            return '';
        }

        $prefix   = '';
        $position = $this->currentPosition;
        if ($this->partiallyConsumedTab) {
            $position++;
            $charsToTab = 4 - ($this->column % 4);
            $prefix     = \str_repeat(' ', $charsToTab);
        }

        $subString = $this->isMultibyte ?
            \mb_substr($this->line, $position, null, 'UTF-8') :
            \substr($this->line, $position);

        return $prefix . $subString;
    }

    public function getLine(): string
    {
        return $this->line;
    }

    public function isAtEnd(): bool
    {
        return $this->currentPosition >= $this->length;
    }

    /**
     * Try to match a regular expression
     *
     * Returns the matching text and advances to the end of that match
     */
    public function match(string $regex): ?string
    {
        $subject = $this->getRemainder();

        if (! \preg_match($regex, $subject, $matches, \PREG_OFFSET_CAPTURE)) {
            return null;
        }

        // $matches[0][0] contains the matched text
        // $matches[0][1] contains the index of that match

        if ($this->isMultibyte) {
            // PREG_OFFSET_CAPTURE always returns the byte offset, not the char offset, which is annoying
            $offset      = \mb_strlen(\substr($subject, 0, $matches[0][1]), 'UTF-8');
            $matchLength = \mb_strlen($matches[0][0], 'UTF-8');
        } else {
            $offset      = $matches[0][1];
            $matchLength = \strlen($matches[0][0]);
        }

        // [0][0] contains the matched text
        // [0][1] contains the index of that match
        $this->advanceBy($offset + $matchLength);

        return $matches[0][0];
    }

    /**
     * Encapsulates the current state of this cursor in case you need to rollback later.
     *
     * WARNING: Do not parse or use the return value for ANYTHING except for
     * passing it back into restoreState(), as the number of values and their
     * contents may change in any future release without warning.
     */
    public function saveState(): CursorState
    {
        return new CursorState([
            $this->currentPosition,
            $this->previousPosition,
            $this->nextNonSpaceCache,
            $this->indent,
            $this->column,
            $this->partiallyConsumedTab,
        ]);
    }

    /**
     * Restore the cursor to a previous state.
     *
     * Pass in the value previously obtained by calling saveState().
     */
    public function restoreState(CursorState $state): void
    {
        [
            $this->currentPosition,
            $this->previousPosition,
            $this->nextNonSpaceCache,
            $this->indent,
            $this->column,
            $this->partiallyConsumedTab,
        ] = $state->toArray();
    }

    public function getPosition(): int
    {
        return $this->currentPosition;
    }

    public function getPreviousText(): string
    {
        return \mb_substr($this->line, $this->previousPosition, $this->currentPosition - $this->previousPosition, 'UTF-8');
    }

    public function getSubstring(int $start, ?int $length = null): string
    {
        if ($this->isMultibyte) {
            return \mb_substr($this->line, $start, $length, 'UTF-8');
        }

        if ($length !== null) {
            return \substr($this->line, $start, $length);
        }

        return \substr($this->line, $start);
    }

    public function getColumn(): int
    {
        return $this->column;
    }
}
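
The expression `4 - ($cols % 4)` appears several times in the listing above and gives the number of columns a tab covers before the next tab stop. The standalone sketch below isolates that arithmetic. The constant `TAB_STOP` and the functions `columnsToNextTabStop()` and `leadingIndentWidth()` are hypothetical names introduced for illustration, and the helper only handles ASCII spaces and tabs.

```php
<?php

declare(strict_types=1);

const TAB_STOP = 4; // the same width the listing hard-codes as 4

// Hypothetical helper: number of columns a tab occupies when it starts at $column.
function columnsToNextTabStop(int $column): int
{
    return TAB_STOP - ($column % TAB_STOP);
}

// Hypothetical helper: width, in columns, of the leading spaces and tabs of $line.
function leadingIndentWidth(string $line): int
{
    $column = 0;
    $length = \strlen($line);

    for ($i = 0; $i < $length; $i++) {
        if ($line[$i] === ' ') {
            $column++;
        } elseif ($line[$i] === "\t") {
            $column += columnsToNextTabStop($column);
        } else {
            break;
        }
    }

    return $column;
}

echo leadingIndentWidth("  \tfoo"), "\n"; // two spaces, then a tab to column 4: prints 4
echo leadingIndentWidth("\t\tfoo"), "\n"; // two full-width tabs: prints 8
```

Because a tab can span several columns, a caller advancing by columns may stop in the middle of one, which is the situation the `$partiallyConsumedTab` flag in the listing records.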
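
The multibyte branch of `match()` in the listing above converts the byte offset reported by `PREG_OFFSET_CAPTURE` into a character offset. The sketch below shows that conversion on its own. The helper name `byteOffsetToCharOffset()` and the sample string are assumptions made for this example, and the code assumes PHP 7.1+ with the mbstring extension.

```php
<?php

declare(strict_types=1);

// Hypothetical helper mirroring the conversion in match(): count the
// characters contained in the bytes that precede the match.
function byteOffsetToCharOffset(string $subject, int $byteOffset): int
{
    return \mb_strlen(\substr($subject, 0, $byteOffset), 'UTF-8');
}

// "héllo wörld" written with escapes so the byte layout is unambiguous.
$subject = "h\u{00E9}llo w\u{00F6}rld";

\preg_match('/w\S+/u', $subject, $matches, \PREG_OFFSET_CAPTURE);

[$text, $byteOffset] = $matches[0];
$charOffset          = byteOffsetToCharOffset($subject, $byteOffset);

echo $byteOffset, "\n"; // 7, because the two-byte character before the match shifts the byte offset
echo $charOffset, "\n"; // 6, the position a character-based cursor needs
```

The character offset is what `advanceBy()` expects, since the cursor position in the listing counts characters rather than bytes.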