Completed
Push — cursor-optimizations ( fb314e )
by Colin
03:21
created

Cursor   C

Complexity

Total Complexity 58

Size/Duplication

Total Lines 535
Duplicated Lines 0 %

Coupling/Cohesion

Components 1
Dependencies 0

Test Coverage

Coverage 100%

Importance

Changes 11
Bugs 0 Features 0
Metric Value
wmc 58
c 11
b 0
f 0
lcom 1
cbo 0
dl 0
loc 535
ccs 184
cts 184
cp 1
rs 6.3005

27 Methods

Rating   Name   Duplication   Size   Complexity  
A __construct() 0 8 2
A getFirstNonSpacePosition() 0 6 1
B getNextNonSpacePosition() 0 26 6
A getFirstNonSpaceCharacter() 0 6 1
A getNextNonSpaceCharacter() 0 4 1
A getIndent() 0 6 1
A isIndented() 0 6 1
A getCharacter() 0 13 4
A peek() 0 4 1
A isBlank() 0 4 1
A advance() 0 4 1
C advanceBy() 0 56 12
A advanceBySpaceOrTab() 0 12 3
B advanceWhileMatches() 0 25 5
A advanceToFirstNonSpace() 0 6 1
A advanceToNextNonSpaceOrTab() 0 8 1
A advanceToNextNonSpaceOrNewline() 0 17 2
A advanceToEnd() 0 9 1
A getRemainder() 0 16 3
A getLine() 0 4 1
A isAtEnd() 0 4 1
B match() 0 25 3
A saveState() 0 11 1
A restoreState() 0 11 1
A getPosition() 0 4 1
A getPreviousText() 0 4 1
A getColumn() 0 4 1

How to fix: Complexity

Complex Class

Complex classes like Cursor often do many different things. To break such a class down, first identify a cohesive component within it. A common way to find one is to look for fields and methods that share a prefix or suffix. You can also inspect the cohesion graph to spot unconnected or weakly connected components.

Once you have determined the fields that belong together, you can apply the Extract Class refactoring. If the component makes sense as a subclass, Extract Subclass is also a candidate and is often faster.
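
As a rough illustration of Extract Class applied to this report: saveState() and restoreState() in the method list suggest that the cursor's position-tracking fields travel as a group. The sketch below pulls a few of those fields into a hypothetical CursorState value object. The class name, its methods, and the PHP 7.4+ typed properties are assumptions made for the example, not part of league/commonmark, and tab handling is deliberately left out.

```php
<?php

// Hypothetical value object; league/commonmark does not define this class.
final class CursorState
{
    private int $currentPosition;
    private int $previousPosition;
    private int $column;

    public function __construct(int $currentPosition = 0, int $previousPosition = 0, int $column = 0)
    {
        $this->currentPosition = $currentPosition;
        $this->previousPosition = $previousPosition;
        $this->column = $column;
    }

    // Returns a new state moved forward by $characters; the original is untouched.
    // Assumes the consumed text has no tabs, so one character equals one column.
    public function withAdvance(int $characters): self
    {
        return new self(
            $this->currentPosition + $characters,
            $this->currentPosition,
            $this->column + $characters
        );
    }

    public function getCurrentPosition(): int
    {
        return $this->currentPosition;
    }

    public function getPreviousPosition(): int
    {
        return $this->previousPosition;
    }

    public function getColumn(): int
    {
        return $this->column;
    }
}

$start = new CursorState();
$moved = $start->withAdvance(3)->withAdvance(2);
echo $moved->getCurrentPosition(), ' ', $moved->getPreviousPosition(), "\n"; // prints "5 3"
```

With state held in one immutable object, saving is a matter of keeping a reference and restoring is a matter of assigning it back, which is one possible way to shrink the surface of the main class.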

While breaking up the class, it is a good idea to analyze how other classes use Cursor and, based on these observations, apply Extract Interface as well.
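
To make Extract Interface concrete, here is a minimal sketch. It assumes that some callers only read from the cursor (getCharacter(), peek(), isAtEnd()) and never advance it. The ReadableCursor interface, the SimpleCursor stand-in, and the describe() helper are hypothetical names invented for the example; the stand-in is ASCII-only and uses PHP 7.4+ syntax, so it is not a substitute for the real class.

```php
<?php

// Hypothetical read-only view of a cursor; not part of league/commonmark.
interface ReadableCursor
{
    public function getCharacter(?int $index = null): ?string;

    public function peek(int $offset = 1): ?string;

    public function isAtEnd(): bool;
}

// Minimal ASCII-only stand-in used only to demonstrate the interface.
final class SimpleCursor implements ReadableCursor
{
    private string $line;
    private int $position = 0;

    public function __construct(string $line)
    {
        $this->line = $line;
    }

    public function getCharacter(?int $index = null): ?string
    {
        $index = $index ?? $this->position;

        // Out of bounds, or at the end: no character to return
        if ($index < 0 || $index >= strlen($this->line)) {
            return null;
        }

        return $this->line[$index];
    }

    public function peek(int $offset = 1): ?string
    {
        return $this->getCharacter($this->position + $offset);
    }

    public function isAtEnd(): bool
    {
        return $this->position >= strlen($this->line);
    }

    // Mutating method that is intentionally NOT part of the interface.
    public function advance(): void
    {
        if (!$this->isAtEnd()) {
            $this->position++;
        }
    }
}

// A consumer that only reads can depend on the narrow interface.
function describe(ReadableCursor $cursor): string
{
    return $cursor->isAtEnd() ? 'end' : 'at "' . $cursor->getCharacter() . '"';
}
```

A caller typed against ReadableCursor cannot move the cursor by accident, which is the kind of usage pattern the advice above suggests looking for before choosing where to cut.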

<?php

/*
 * This file is part of the league/commonmark package.
 *
 * (c) Colin O'Dell <[email protected]>
 *
 * For the full copyright and license information, please view the LICENSE
 * file that was distributed with this source code.
 */

namespace League\CommonMark;

class Cursor
{
    const INDENT_LEVEL = 4;

    /**
     * @var string
     */
    private $line;

    /**
     * @var int
     */
    private $length;

    /**
     * @var int
     *
     * It's possible for this to be 1 char past the end, meaning we've parsed all chars and have
     * reached the end.  In this state, any character-returning method MUST return null.
     */
    private $currentPosition = 0;

    /**
     * @var int
     */
    private $column = 0;

    /**
     * @var int
     */
    private $indent = 0;

    /**
     * @var int
     */
    private $previousPosition = 0;

    /**
     * @var int|null
     */
    private $nextNonSpaceCache;

    /**
     * @var bool
     */
    private $partiallyConsumedTab = false;

    /**
     * @var string
     */
    private $encoding;

    /**
     * @var bool
     */
    private $lineContainsTabs;

    /**
     * @var bool
     */
    private $isMultibyte;

    /**
     * @param string $line
     */
    public function __construct($line)
    {
        $this->line = $line;
        $this->encoding = mb_detect_encoding($line, 'ASCII,UTF-8', true) ?: 'ISO-8859-1';
        $this->length = mb_strlen($line, $this->encoding);
        $this->isMultibyte = $this->length !== strlen($line);
        $this->lineContainsTabs = preg_match('/\t/', $line) > 0;
    }

    /**
     * Returns the position of the next character which is not a space (or tab)
     *
     * @deprecated Use getNextNonSpacePosition() instead
     *
     * @return int
     */
    public function getFirstNonSpacePosition()
    {
        @trigger_error('Cursor::getFirstNonSpacePosition() will be removed in a future 0.x release.  Use getNextNonSpacePosition() instead. See https://github.com/thephpleague/commonmark/issues/280', E_USER_DEPRECATED);

        return $this->getNextNonSpacePosition();
    }

    /**
     * Returns the position of the next character which is not a space (or tab)
     *
     * @return int
     */
    public function getNextNonSpacePosition()
    {
        if ($this->nextNonSpaceCache !== null) {
            return $this->nextNonSpaceCache;
        }

        $i = $this->currentPosition;
        $cols = $this->column;

        while (($c = $this->getCharacter($i)) !== null) {
            if ($c === ' ') {
                $i++;
                $cols++;
            } elseif ($c === "\t") {
                $i++;
                $cols += (4 - ($cols % 4));
            } else {
                break;
            }
        }

        $nextNonSpace = ($c === null) ? $this->length : $i;
        $this->indent = $cols - $this->column;

        return $this->nextNonSpaceCache = $nextNonSpace;
    }

    /**
     * Returns the next character which isn't a space (or tab)
     *
     * @deprecated Use getNextNonSpaceCharacter() instead
     *
     * @return string
     */
    public function getFirstNonSpaceCharacter()
    {
        @trigger_error('Cursor::getFirstNonSpaceCharacter() will be removed in a future 0.x release.  Use getNextNonSpaceCharacter() instead. See https://github.com/thephpleague/commonmark/issues/280', E_USER_DEPRECATED);

        return $this->getNextNonSpaceCharacter();
    }

    /**
     * Returns the next character which isn't a space (or tab)
     *
     * @return string
     */
    public function getNextNonSpaceCharacter()
    {
        return $this->getCharacter($this->getNextNonSpacePosition());
    }

    /**
     * Calculates the current indent (number of spaces after current position)
     *
     * @return int
     */
    public function getIndent()
    {
        $this->getNextNonSpacePosition();

        return $this->indent;
    }

    /**
     * Whether the cursor is indented to INDENT_LEVEL
     *
     * @return bool
     */
    public function isIndented()
    {
        $this->getNextNonSpacePosition();

        return $this->indent >= self::INDENT_LEVEL;
    }

    /**
     * @param int|null $index
     *
     * @return string|null
     */
    public function getCharacter($index = null)
    {
        if ($index === null) {
            $index = $this->currentPosition;
        }

        // Index out-of-bounds, or we're at the end
        if ($index < 0 || $index >= $this->length) {
            return;
        }

        return mb_substr($this->line, $index, 1, $this->encoding);
    }

    /**
     * Returns the next character (or null, if none) without advancing forwards
     *
     * @param int $offset
     *
     * @return string|null
     */
    public function peek($offset = 1)
    {
        return $this->getCharacter($this->currentPosition + $offset);
    }

    /**
     * Whether the remainder is blank
     *
     * @return bool
     */
    public function isBlank()
    {
        return $this->getNextNonSpacePosition() === $this->length;
    }

    /**
     * Move the cursor forwards
     */
    public function advance()
    {
        $this->advanceBy(1);
    }

    /**
     * Move the cursor forwards
     *
     * @param int  $characters       Number of characters to advance by
     * @param bool $advanceByColumns Whether to advance by columns instead of spaces
     */
    public function advanceBy($characters, $advanceByColumns = false)
    {
        if ($characters === 0) {
            $this->previousPosition = $this->currentPosition;

            return;
        }

        $this->previousPosition = $this->currentPosition;
        $this->nextNonSpaceCache = null;

        $nextFewChars = mb_substr($this->line, $this->currentPosition, $characters, $this->encoding);

        // Optimization to avoid tab handling logic if we have no tabs
        if (!$this->lineContainsTabs || preg_match('/\t/', $nextFewChars) === 0) {
            $length = min($characters, $this->length - $this->currentPosition);
            $this->partiallyConsumedTab = false;
            $this->currentPosition += $length;
            $this->column += $length;

            return;
        }

        if ($characters === 1 && !empty($nextFewChars)) {
            $asArray = [$nextFewChars];
        } else {
            $asArray = preg_split('//u', $nextFewChars, null, PREG_SPLIT_NO_EMPTY);
        }

        foreach ($asArray as $relPos => $c) {
            if ($c === "\t") {
                $charsToTab = 4 - ($this->column % 4);
                if ($advanceByColumns) {
                    $this->partiallyConsumedTab = $charsToTab > $characters;
                    $charsToAdvance = $charsToTab > $characters ? $characters : $charsToTab;
                    $this->column += $charsToAdvance;
                    $this->currentPosition += $this->partiallyConsumedTab ? 0 : 1;
                    $characters -= $charsToAdvance;
                } else {
                    $this->partiallyConsumedTab = false;
                    $this->column += $charsToTab;
                    $this->currentPosition++;
                    $characters--;
                }
            } else {
                $this->partiallyConsumedTab = false;
                $this->currentPosition++;
                $this->column++;
                $characters--;
            }

            if ($characters <= 0) {
                break;
            }
        }
    }

    /**
     * Advances the cursor by a single space or tab, if present
     *
     * @return bool
     */
    public function advanceBySpaceOrTab()
    {
        $character = $this->getCharacter();

        if ($character === ' ' || $character === "\t") {
            $this->advanceBy(1, true);

            return true;
        }

        return false;
    }

    /**
     * Advances the cursor while the given character is matched
     *
     * @param string   $character                  Character to match
     * @param int|null $maximumCharactersToAdvance Maximum number of characters to advance before giving up
     *
     * @return int Number of positions moved (0 if unsuccessful)
     *
     * @deprecated Use match() instead
     */
    public function advanceWhileMatches($character, $maximumCharactersToAdvance = null)
    {
        @trigger_error('Cursor::advanceWhileMatches() will be removed in a future 0.x release.  Use match() instead.', E_USER_DEPRECATED);

        // Calculate how far to advance
        $start = $this->currentPosition;
        $newIndex = $start;
        if ($maximumCharactersToAdvance === null) {
            $maximumCharactersToAdvance = $this->length;
        }

        $max = min($start + $maximumCharactersToAdvance, $this->length);

        while ($newIndex < $max && $this->getCharacter($newIndex) === $character) {
            ++$newIndex;
        }

        if ($newIndex <= $start) {
            return 0;
        }

        $this->advanceBy($newIndex - $start);

        return $this->currentPosition - $this->previousPosition;
    }

    /**
     * Parse zero or more space characters, including at most one newline.
     *
     * @deprecated Use advanceToNextNonSpaceOrNewline() instead
     */
    public function advanceToFirstNonSpace()
    {
        @trigger_error('Cursor::advanceToFirstNonSpace() will be removed in a future 0.x release.  Use advanceToNextNonSpaceOrTab() or advanceToNextNonSpaceOrNewline() instead. See https://github.com/thephpleague/commonmark/issues/280', E_USER_DEPRECATED);

        return $this->advanceToNextNonSpaceOrNewline();
    }

    /**
     * Parse zero or more space/tab characters
     *
     * @return int Number of positions moved
     */
    public function advanceToNextNonSpaceOrTab()
    {
        $newPosition = $this->getNextNonSpacePosition();
        $this->advanceBy($newPosition - $this->currentPosition);
        $this->partiallyConsumedTab = false;

        return $this->currentPosition - $this->previousPosition;
    }

    /**
     * Parse zero or more space characters, including at most one newline.
     *
     * Tab characters are not parsed with this function.
     *
     * @return int Number of positions moved
     */
    public function advanceToNextNonSpaceOrNewline()
    {
        $matches = [];
        preg_match('/^ *(?:\n *)?/', $this->getRemainder(), $matches, PREG_OFFSET_CAPTURE);

        // [0][0] contains the matched text
        // [0][1] contains the index of that match
        $increment = $matches[0][1] + strlen($matches[0][0]);

        if ($increment === 0) {
            return 0;
        }

        $this->advanceBy($increment);

        return $this->currentPosition - $this->previousPosition;
    }

    /**
     * Move the position to the very end of the line
     *
     * @return int The number of characters moved
     */
    public function advanceToEnd()
    {
        $this->previousPosition = $this->currentPosition;
        $this->nextNonSpaceCache = null;

        $this->currentPosition = $this->length;

        return $this->currentPosition - $this->previousPosition;
    }

    /**
     * @return string
     */
    public function getRemainder()
    {
        if ($this->currentPosition >= $this->length) {
            return '';
        }

        $prefix = '';
        $position = $this->currentPosition;
        if ($this->partiallyConsumedTab) {
            $position++;
            $charsToTab = 4 - ($this->column % 4);
            $prefix = str_repeat(' ', $charsToTab);
        }

        return $prefix . mb_substr($this->line, $position, null, $this->encoding);
    }

    /**
     * @return string
     */
    public function getLine()
    {
        return $this->line;
    }

    /**
     * @return bool
     */
    public function isAtEnd()
    {
        return $this->currentPosition >= $this->length;
    }

    /**
     * Try to match a regular expression
     *
     * Returns the matching text and advances to the end of that match
     *
     * @param string $regex
     *
     * @return string|null
     */
    public function match($regex)
    {
        $subject = $this->getRemainder();

        $matches = [];
        if (!preg_match($regex, $subject, $matches, PREG_OFFSET_CAPTURE)) {
            return;
        }

        // $matches[0][0] contains the matched text
        // $matches[0][1] contains the index of that match

        if ($this->isMultibyte) {
            // PREG_OFFSET_CAPTURE always returns the byte offset, not the char offset, which is annoying
            $offset = mb_strlen(mb_strcut($subject, 0, $matches[0][1], $this->encoding), $this->encoding);
        } else {
            $offset = $matches[0][1];
        }

        // [0][0] contains the matched text
        // [0][1] contains the index of that match
        $this->advanceBy($offset + mb_strlen($matches[0][0], $this->encoding));

        return $matches[0][0];
    }

    /**
     * Encapsulates the current state of this cursor in case you need to rollback later.
     *
     * WARNING: Do not parse or use the return value for ANYTHING except for
     * passing it back into restoreState(), as the number of values and their
     * contents may change in any future release without warning.
     *
     * @return array
     */
    public function saveState()
    {
        return [
            $this->currentPosition,
            $this->previousPosition,
            $this->nextNonSpaceCache,
            $this->indent,
            $this->column,
            $this->partiallyConsumedTab,
        ];
    }

    /**
     * Restore the cursor to a previous state.
     *
     * Pass in the value previously obtained by calling saveState().
     *
     * @param array $state
     */
    public function restoreState($state)
    {
        list(
            $this->currentPosition,
            $this->previousPosition,
            $this->nextNonSpaceCache,
            $this->indent,
            $this->column,
            $this->partiallyConsumedTab
        ) = $state;
    }

    /**
     * @return int
     */
    public function getPosition()
    {
        return $this->currentPosition;
    }

    /**
     * @return string
     */
    public function getPreviousText()
    {
        return mb_substr($this->line, $this->previousPosition, $this->currentPosition - $this->previousPosition, $this->encoding);
    }

    /**
     * @return int
     */
    public function getColumn()
    {
        return $this->column;
    }
}
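
The tab handling in getNextNonSpacePosition() and advanceBy() relies on the expression 4 - ($column % 4), which gives the number of columns remaining until the next tab stop. The standalone sketch below isolates that arithmetic so it can be checked by hand. The TAB_STOP constant and the columnAfterWhitespace() function are hypothetical names introduced for illustration; the function works on bytes, so it assumes the leading whitespace is plain ASCII spaces and tabs.

```php
<?php

// Tab stop width; mirrors the literal 4 used in the listing above.
const TAB_STOP = 4;

/**
 * Hypothetical helper: the column reached after consuming leading
 * spaces and tabs, starting from $startColumn.
 */
function columnAfterWhitespace(string $line, int $startColumn = 0): int
{
    $column = $startColumn;
    $length = strlen($line);

    for ($i = 0; $i < $length; $i++) {
        if ($line[$i] === ' ') {
            // A space always occupies exactly one column
            $column++;
        } elseif ($line[$i] === "\t") {
            // A tab jumps to the next multiple of TAB_STOP
            $column += TAB_STOP - ($column % TAB_STOP);
        } else {
            break;
        }
    }

    return $column;
}

echo columnAfterWhitespace("\tfoo"), "\n";    // prints 4
echo columnAfterWhitespace(" \tfoo"), "\n";   // prints 4 (1 space, then 3 columns of tab)
echo columnAfterWhitespace("  \t\tfoo"), "\n"; // prints 8
```

Because a tab can span several columns, advancing by columns may stop partway through one, which appears to be the situation the partiallyConsumedTab flag in the class records.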
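
The comment in match() points out that PREG_OFFSET_CAPTURE reports byte offsets rather than character offsets. The sketch below reproduces that conversion in isolation, using the same mb_strcut() plus mb_strlen() combination as the listing. The byteOffsetToCharOffset() helper name is hypothetical, the encoding is assumed to be UTF-8, and the mbstring extension is assumed to be loaded.

```php
<?php

/**
 * Hypothetical helper: convert a byte offset reported by preg_match()
 * into a character offset, assuming a UTF-8 subject.
 */
function byteOffsetToCharOffset(string $subject, int $byteOffset): int
{
    // Cut the bytes before the match, then count them as characters
    return mb_strlen(mb_strcut($subject, 0, $byteOffset, 'UTF-8'), 'UTF-8');
}

// "héllo wörld" written with escapes so the file encoding does not matter
$subject = "h\u{e9}llo w\u{f6}rld";

$matches = [];
preg_match('/w\S+/u', $subject, $matches, PREG_OFFSET_CAPTURE);

$byteOffset = $matches[0][1];
$charOffset = byteOffsetToCharOffset($subject, $byteOffset);

echo $byteOffset, ' ', $charOffset, "\n"; // prints "7 6"
```

The two values differ by one here because the single two-byte character before the match counts twice in bytes but once in characters.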