ReferenceParser   B

Complexity

Total Complexity 43

Size/Duplication

Total Lines 325
Duplicated Lines 0 %

Test Coverage

Coverage 93.33%

Importance

Changes 0

Metric  Value
wmc     43
eloc    144
dl      0
loc     325
ccs     126
cts     135
cp      0.9333
rs      8.96
c       0
b       0
f       0

10 Methods

Rating  Name                    Duplication  Size  Complexity
B       parseLabel()            0            47    7
A       parseDestination()      0            25    4
B       parse()                 0            40    10
A       finishReference()       0            14    2
A       getReferences()         0            5     1
A       parseStartDefinition()  0            16    4
B       parseTitle()            0            42    6
A       getParagraphContent()   0            3     1
B       parseStartTitle()       0            36    7
A       hasReferences()         0            3     1

How to fix: Complexity

Complex Class

Complex classes like ReferenceParser often do a lot of different things. To break such a class down, we need to identify a cohesive component within that class. A common way to find such a component is to look for fields or methods that share the same prefix or suffix.

Once you have determined the fields that belong together, you can apply the Extract Class refactoring. If the component makes sense as a sub-class, Extract Subclass is also a candidate, and is often faster.
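
As a rough illustration of Extract Class, the sketch below pulls the two fields that share the `title` prefix ($title and $titleDelimiter) into their own small class. The class name PendingTitle and all of its methods are hypothetical and are not part of league/commonmark; the delimiter mapping mirrors the switch in parseStartTitle(). Typed properties assume PHP 7.4 or later.

```php
<?php

declare(strict_types=1);

// Hypothetical class (not part of league/commonmark): groups the title-related
// state that ReferenceParser keeps in $title and $titleDelimiter.
final class PendingTitle
{
    private string $content = '';

    private ?string $delimiter = null;

    // Records the closing delimiter for an opening character.
    // Returns false when the character cannot start a title.
    public function open(string $char): bool
    {
        switch ($char) {
            case '"':
            case "'":
                $this->delimiter = $char;

                return true;
            case '(':
                $this->delimiter = ')';

                return true;
            default:
                $this->delimiter = null;

                return false;
        }
    }

    // Appends a partial title, for example one line of a multi-line title.
    public function append(string $text): void
    {
        $this->content .= $text;
    }

    public function getDelimiter(): ?string
    {
        return $this->delimiter;
    }

    public function getContent(): string
    {
        return $this->content;
    }

    // Clears the state once a reference has been finished.
    public function reset(): void
    {
        $this->content   = '';
        $this->delimiter = null;
    }
}

// Usage: a title opened with "(" must be closed with ")".
$pending = new PendingTitle();
$pending->open('(');
$pending->append('my title');
```

With such a class in place, ReferenceParser would hold a single PendingTitle field instead of two loosely related ones, and finishReference() could call reset() on it. Whether the split is worthwhile depends on how much title logic can move with it.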

While breaking up the class, it is a good idea to analyze how other classes use ReferenceParser, and based on these observations, apply Extract Interface, too.
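
To make the Extract Interface suggestion concrete, here is a minimal sketch built from the four public methods of ReferenceParser. The interface name, the stand-in implementation, and the collectParagraph() helper are all hypothetical; they only show a caller depending on the interface rather than on the concrete class.

```php
<?php

declare(strict_types=1);

// Hypothetical interface (not part of league/commonmark) covering the public
// surface of ReferenceParser: parse(), getParagraphContent(), getReferences()
// and hasReferences().
interface LineReferenceParserInterface
{
    public function parse(string $line): void;

    public function getParagraphContent(): string;

    public function getReferences(): iterable;

    public function hasReferences(): bool;
}

// Stand-in implementation for illustration only: it never recognises a
// reference and simply accumulates the lines as paragraph content.
final class PassthroughReferenceParser implements LineReferenceParserInterface
{
    private string $paragraph = '';

    public function parse(string $line): void
    {
        if ($this->paragraph !== '') {
            $this->paragraph .= "\n";
        }

        $this->paragraph .= $line;
    }

    public function getParagraphContent(): string
    {
        return $this->paragraph;
    }

    public function getReferences(): iterable
    {
        return [];
    }

    public function hasReferences(): bool
    {
        return false;
    }
}

// A caller that depends only on the interface.
function collectParagraph(LineReferenceParserInterface $parser, array $lines): string
{
    foreach ($lines as $line) {
        $parser->parse($line);
    }

    return $parser->getParagraphContent();
}
```

A caller written this way can be tested with a stand-in like the one above, and the real parser can later be split up without touching its consumers.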

<?php

declare(strict_types=1);

/*
 * This file is part of the league/commonmark package.
 *
 * (c) Colin O'Dell <[email protected]>
 *
 * Original code based on the CommonMark JS reference parser (https://bitly.com/commonmark-js)
 *  - (c) John MacFarlane
 *
 * For the full copyright and license information, please view the LICENSE
 * file that was distributed with this source code.
 */

namespace League\CommonMark\Reference;

use League\CommonMark\Parser\Cursor;
use League\CommonMark\Util\LinkParserHelper;

final class ReferenceParser
{
    // Looking for the start of a definition, i.e. `[`
    private const START_DEFINITION = 0;
    // Looking for and parsing the label, i.e. `[foo]` within `[foo]`
    private const LABEL = 1;
    // Parsing the destination, i.e. `/url` in `[foo]: /url`
    private const DESTINATION = 2;
    // Looking for the start of a title, i.e. the first `"` in `[foo]: /url "title"`
    private const START_TITLE = 3;
    // Parsing the content of the title, i.e. `title` in `[foo]: /url "title"`
    private const TITLE = 4;
    // End state, no matter what kind of lines we add, they won't be references
    private const PARAGRAPH = 5;

    /**
     * @var string
     *
     * @psalm-readonly-allow-private-mutation
     */
    private $paragraph = '';

    /**
     * @var array<int, ReferenceInterface>
     *
     * @psalm-readonly-allow-private-mutation
     */
    private $references = [];

    /**
     * @var int
     *
     * @psalm-readonly-allow-private-mutation
     */
    private $state = self::START_DEFINITION;

    /**
     * @var string|null
     *
     * @psalm-readonly-allow-private-mutation
     */
    private $label;

    /**
     * @var string|null
     *
     * @psalm-readonly-allow-private-mutation
     */
    private $destination;

    /**
     * @var string
     *
     * @psalm-readonly-allow-private-mutation
     */
    private $title = '';

    /**
     * @var string|null
     *
     * @psalm-readonly-allow-private-mutation
     */
    private $titleDelimiter;

    /**
     * @var bool
     *
     * @psalm-readonly-allow-private-mutation
     */
    private $referenceValid = false;

    public function getParagraphContent(): string
    {
        return $this->paragraph;
    }

    /**
     * @return ReferenceInterface[]
     */
    public function getReferences(): iterable
    {
        $this->finishReference();

        return $this->references;
    }

    public function hasReferences(): bool
    {
        return $this->references !== [];
    }

    public function parse(string $line): void
    {
        if ($this->paragraph !== '') {
            $this->paragraph .= "\n";
        }

        $this->paragraph .= $line;

        $cursor = new Cursor($line);
        while (! $cursor->isAtEnd()) {
            $result = false;
            switch ($this->state) {
                case self::PARAGRAPH:
                    // We're in a paragraph now. Link reference definitions can only appear at the beginning, so once
                    // we're in a paragraph, there's no going back.
                    return;
                case self::START_DEFINITION:
                    $result = $this->parseStartDefinition($cursor);
                    break;
                case self::LABEL:
                    $result = $this->parseLabel($cursor);
                    break;
                case self::DESTINATION:
                    $result = $this->parseDestination($cursor);
                    break;
                case self::START_TITLE:
                    $result = $this->parseStartTitle($cursor);
                    break;
                case self::TITLE:
                    $result = $this->parseTitle($cursor);
                    break;
                default:
                    // this should never happen
                    break;
            }

            if (! $result) {
                $this->state = self::PARAGRAPH;

                return;
            }
        }
    }

    private function parseStartDefinition(Cursor $cursor): bool
    {
        $cursor->advanceToNextNonSpaceOrTab();
        if ($cursor->isAtEnd() || $cursor->getCharacter() !== '[') {
            return false;
        }

        $this->state = self::LABEL;
        $this->label = '';

        $cursor->advance();
        if ($cursor->isAtEnd()) {
            $this->label .= "\n";
        }

        return true;
    }

    private function parseLabel(Cursor $cursor): bool
    {
        $cursor->advanceToNextNonSpaceOrTab();

        $partialLabel = LinkParserHelper::parsePartialLinkLabel($cursor);
        if ($partialLabel === null) {
            return false;
        }

        \assert($this->label !== null);
        $this->label .= $partialLabel;

        if ($cursor->isAtEnd()) {
            // label might continue on next line
            $this->label .= "\n";

            return true;
        }

        if ($cursor->getCharacter() !== ']') {
            return false;
        }

        $cursor->advance();

        // end of label
        if ($cursor->getCharacter() !== ':') {
            return false;
        }

        $cursor->advance();

        // spec: A link label can have at most 999 characters inside the square brackets
        if (\mb_strlen($this->label, 'utf-8') > 999) {
            return false;
        }

        // spec: A link label must contain at least one non-whitespace character
        if (\trim($this->label) === '') {
            return false;
        }

        $cursor->advanceToNextNonSpaceOrTab();

        $this->state = self::DESTINATION;

        return true;
    }

    private function parseDestination(Cursor $cursor): bool
    {
        $cursor->advanceToNextNonSpaceOrTab();

        $destination = LinkParserHelper::parseLinkDestination($cursor);
        if ($destination === null) {
            return false;
        }

        $this->destination = $destination;

        $advanced = $cursor->advanceToNextNonSpaceOrTab();
        if ($cursor->isAtEnd()) {
            // Destination was at end of line, so this is a valid reference for sure (and maybe a title).
            // If not at end of line, wait for title to be valid first.
            $this->referenceValid = true;
            $this->paragraph      = '';
        } elseif ($advanced === 0) {
            // spec: The title must be separated from the link destination by whitespace
            return false;
        }

        $this->state = self::START_TITLE;

        return true;
    }

    private function parseStartTitle(Cursor $cursor): bool
    {
        $cursor->advanceToNextNonSpaceOrTab();
        if ($cursor->isAtEnd()) {
            $this->state = self::START_DEFINITION;

            return true;
        }

        $this->titleDelimiter = null;
        switch ($c = $cursor->getCharacter()) {
            case '"':
            case "'":
                $this->titleDelimiter = $c;
                break;
            case '(':
                $this->titleDelimiter = ')';
                break;
            default:
                // no title delimiter found
                break;
        }

        if ($this->titleDelimiter !== null) {
            $this->state = self::TITLE;
            $cursor->advance();
            if ($cursor->isAtEnd()) {
                $this->title .= "\n";
            }
        } else {
            $this->finishReference();
            // There might be another reference instead, try that for the same character.
            $this->state = self::START_DEFINITION;
        }

        return true;
    }

    private function parseTitle(Cursor $cursor): bool
    {
        \assert($this->titleDelimiter !== null);
        $title = LinkParserHelper::parsePartialLinkTitle($cursor, $this->titleDelimiter);

        if ($title === null) {
            // Invalid title, stop
            return false;
        }

        // Did we find the end delimiter?
        $endDelimiterFound = false;
        if (\substr($title, -1) === $this->titleDelimiter) {
            $endDelimiterFound = true;
            // Chop it off
            $title = \substr($title, 0, -1);
        }

        $this->title .= $title;

        if (! $endDelimiterFound && $cursor->isAtEnd()) {
            // Title still going, continue on next line
            $this->title .= "\n";

            return true;
        }

        // We either hit the end delimiter or some extra whitespace
        $cursor->advanceToNextNonSpaceOrTab();
        if (! $cursor->isAtEnd()) {
            // spec: No further non-whitespace characters may occur on the line.
            return false;
        }

        $this->referenceValid = true;
        $this->finishReference();
        $this->paragraph = '';

        // See if there's another definition
        $this->state = self::START_DEFINITION;

        return true;
    }

    private function finishReference(): void
    {
        if (! $this->referenceValid) {
            return;
        }

        /** @psalm-suppress PossiblyNullArgument -- these can't possibly be null if we're in this state */
        $this->references[] = new Reference($this->label, $this->destination, $this->title);

        $this->label          = null;
        $this->referenceValid = false;
        $this->destination    = null;
        $this->title          = '';
        $this->titleDelimiter = null;
    }
}

Issues reported for the `new Reference(...)` call in finishReference():

Bug introduced by Colin O'Dell
It seems like $this->label can also be of type null; however, parameter $label of League\CommonMark\Refere...eference::__construct() only seems to accept string. Maybe add an additional type check? (Ignorable by annotation)

If this is a false positive, you can also ignore this issue in your code via the ignore-type annotation:

        $this->references[] = new Reference(/** @scrutinizer ignore-type */ $this->label, $this->destination, $this->title);

Bug introduced by Colin O'Dell
It seems like $this->destination can also be of type null; however, parameter $destination of League\CommonMark\Refere...eference::__construct() only seems to accept string. Maybe add an additional type check? (Ignorable by annotation)

If this is a false positive, you can also ignore this issue in your code via the ignore-type annotation:

        $this->references[] = new Reference($this->label, /** @scrutinizer ignore-type */ $this->destination, $this->title);