Passed
Push — refactor ( 0a349f...63f2db )
by Florian
02:41 queued 31s
created

Parser::tokenizeSegment()   B

Complexity

Conditions 7
Paths 6

Size

Total Lines 31
Code Lines 21

Duplication

Lines 0
Ratio 0 %

Code Coverage

Tests 14
CRAP Score 7

Importance

Changes 0
<?php

declare(strict_types=1);

namespace Lead\Router;

use Lead\Router\Exception\ParserException;

/**
 * Parses route patterns.
 *
 * The parser can produce a tokens structure from a route pattern using `Parser::tokenize()`:
 *
 * ```php
 * $token = Parser::tokenize('/test/{param}');
 * ```
 *
 * The returned `$token` is the root node of the tokens structure and looks like the following:
 *
 * ```
 * [
 *     'optional' => false,
 *     'greedy'   => '',
 *     'repeat'   => false,
 *     'pattern'  => '/test/{param}',
 *     'tokens'   => [
 *         '/test/',
 *         [
 *             'name'      => 'param',
 *             'pattern'   => '[^/]+'
 *         ]
 *     ]
 * ]
 * ```
 *
 * A tokens structure can then be compiled to get its regex representation and the associated variables:
 *
 * ```php
 * $rule = Parser::compile($token);
 * ```
 *
 * `$rule` looks like the following:
 *
 * ```
 * [
 *     '/test/([^/]+)',
 *     ['param' => false]
 * ]
 * ```
 */
class Parser implements ParserInterface
{

    /**
     * Variable capturing block regex.
     */
    const PLACEHOLDER_REGEX = <<<EOD
\{
    (
        [a-zA-Z][a-zA-Z0-9_]*
    )
    (?:
        :(
            [^{}]*
            (?:
                \{(?-1)\}[^{}]*
            )*
        )
    )?
\}
EOD;
    /**
     * Tokenizes a route pattern. Optional segments are identified by square brackets.
     *
     * @param string $pattern   A route pattern.
     * @param string $delimiter The path delimiter.
     * @return array The tokens structure root node.
     */
    public static function tokenize(string $pattern, string $delimiter = '/'): array
    {
        // Split on every `[` found outside a placeholder; preg_split() returns false on error.
        $parts = preg_split('~' . static::PLACEHOLDER_REGEX . '(*SKIP)(*F)|\[~x', $pattern);

        // More than one part means the pattern has some optional segments.
        if ($parts !== false && count($parts) > 1) {
            $tokens = static::_tokenizePattern($pattern, $delimiter);
        } else {
            $tokens = static::tokenizeSegment($pattern, $delimiter);
        }
        return [
            'optional' => false,
            'greedy'   => '',
            'repeat'   => false,
            'pattern'  => $pattern,
            'tokens'   => $tokens
        ];
    }

    /**
     * Tokenizes patterns.
     *
     * @param string      $pattern   A route pattern.
     * @param string      $delimiter The path delimiter.
     * @param string|null $variable  Receives the name of the last placeholder found (passed by reference).
     * @return array An array of tokens structures.
     */
    protected static function _tokenizePattern(string $pattern, string $delimiter, &$variable = null): array
    {
        $tokens = [];
        $parts = static::split($pattern);
        foreach ($parts as $part) {
            if (is_string($part)) {
                $tokens = array_merge($tokens, static::tokenizeSegment($part, $delimiter, $variable));
                continue;
            }

            $greedy = $part[1];
            $repeat = $greedy === '+' || $greedy === '*';
            $optional = $greedy === '?' || $greedy === '*';
            $children = static::_tokenizePattern($part[0], $delimiter, $variable);
            $tokens[] = [
                'optional' => $optional,
                'greedy'   => $greedy ?: '?',
                'repeat'   => $repeat ? $variable : false,
                'pattern'  => $part[0],
                'tokens'   => $children
            ];
        }
        return $tokens;
    }

    /**
     * Tokenizes segments, which are patterns with optional segments filtered out.
     * Only classic placeholders are supported.
     *
     * @param string      $pattern   A route pattern with no optional segments.
     * @param string      $delimiter The path delimiter.
     * @param string|null $variable  Receives the name of the last placeholder found (passed by reference).
     * @return array An array of tokens structures.
     */
    protected static function tokenizeSegment($pattern, $delimiter, &$variable = null): array
    {
        $tokens = [];
        $index = 0;
        $path = '';
        if (preg_match_all('~' . static::PLACEHOLDER_REGEX . '()~x', $pattern, $matches, PREG_OFFSET_CAPTURE | PREG_SET_ORDER)) {
            foreach ($matches as $match) {
                $offset = $match[0][1];
                $path .= substr($pattern, $index, $offset - $index);
                $index = $offset + strlen($match[0][0]);
                // Compare against '' so that a literal "0" path is not dropped as falsy.
                if ($path !== '') {
                    $tokens[] = $path;
                    $path = '';
                }

                $variable = $match[1][0];
                $capture = $match[2][0] ?: '[^' . $delimiter . ']+';
                $tokens[] = [
                    'name'      => $variable,
                    'pattern'   => $capture
                ];
            }
        }

        if ($index < strlen($pattern)) {
            $path .= substr($pattern, $index);
            if ($path !== '') {
                $tokens[] = $path;
            }
        }
        return $tokens;
    }

    /**
     * Splits a pattern into segments and patterns.
     *
     * Segments are represented by a string value and patterns by an array containing
     * the string pattern as first value and the greedy value as second value.
     *
     * Example:
     * `/user[/{id}]*` gives `['/user', ['/{id}', '*']]`
     *
     * Unfortunately a recursive regex matcher can't help here, so this function is required.
     *
     * @param string $pattern A route pattern.
     * @return array The split pattern.
     */
    public static function split(string $pattern): array
    {
        $segments = [];
        $len = strlen($pattern);
        $buffer = '';
        $opened = 0;
        for ($i = 0; $i < $len; $i++) {
            if ($pattern[$i] === '{') {
                // Copy the placeholder verbatim up to its first closing brace.
                do {
                    $buffer .= $pattern[$i++];
                    // The bounds check avoids reading past the end on an unclosed `{`.
                    if ($i < $len && $pattern[$i] === '}') {
                        $buffer .= $pattern[$i];
                        break;
                    }
                } while ($i < $len);
            } elseif ($pattern[$i] === '[') {
                $opened++;
                if ($opened === 1) {
                    $segments[] = $buffer;
                    $buffer = '';
                } else {
                    $buffer .= $pattern[$i];
                }
            } elseif ($pattern[$i] === ']') {
                $opened--;
                if ($opened === 0) {
                    $greedy = '?';
                    if ($i < $len - 1) {
                        if ($pattern[$i + 1] === '*' || $pattern[$i + 1] === '+') {
                            $greedy = $pattern[$i + 1];
                            $i++;
                        }
                    }
                    $segments[] = [$buffer, $greedy];
                    $buffer = '';
                } else {
                    $buffer .= $pattern[$i];
                }
            } else {
                $buffer .= $pattern[$i];
            }
        }
        // Compare against '' so that a trailing literal "0" is not dropped as falsy.
        if ($buffer !== '') {
            $segments[] = $buffer;
        }
        if ($opened) {
            throw ParserException::squareBracketMismatch();
        }

        return $segments;
    }

    /**
     * Builds a regex from a tokens structure array.
     *
     * @param array $token A tokens structure root node.
     * @return array An array containing the regex pattern and its associated variable names.
     */
    public static function compile($token): array
    {
        $variables = [];
        $regex = '';
        foreach ($token['tokens'] as $child) {
            if (is_string($child)) {
                $regex .= preg_quote($child, '~');
            } elseif (isset($child['tokens'])) {
                $rule = static::compile($child);
                if ($child['repeat']) {
                    if (count($rule[1]) > 1) {
                        throw ParserException::placeholderExceeded();
                    }
                    $regex .= '((?:' . $rule[0] . ')' . $child['greedy'] . ')';
                } elseif ($child['optional']) {
                    $regex .= '(?:' . $rule[0] . ')?';
                }
                foreach ($rule[1] as $name => $pattern) {
                    if (isset($variables[$name])) {
                        throw ParserException::duplicatePlaceholder($name);
                    }
                    $variables[$name] = $pattern;
                }
            } else {
                $name = $child['name'];
                if (isset($variables[$name])) {
                    throw ParserException::duplicatePlaceholder($name);
                }
                if ($token['repeat']) {
                    $variables[$name] = $token['pattern'];
                    $regex .= $child['pattern'];
                } else {
                    $variables[$name] = false;
                    $regex .= '(' . $child['pattern'] . ')';
                }
            }
        }
        return [$regex, $variables];
    }
}
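`Parser::tokenize()` decides whether a pattern has optional segments by splitting on every `[` that sits outside a placeholder, using the PCRE `(*SKIP)(*F)` verbs. The standalone sketch below illustrates that idea only. The `hasOptionalSegment()` helper and its simplified placeholder regex (no nested braces) are assumptions made for the example and are not part of the class.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper, not part of the Parser class: tells whether a route
 * pattern contains a `[` outside of any `{...}` placeholder.
 */
function hasOptionalSegment(string $pattern): bool
{
    // Simplified placeholder regex (assumption: no nested braces).
    $placeholder = '\{[^{}]*\}';

    // A matched placeholder is discarded by (*SKIP)(*F), so the `\[`
    // alternative can only match brackets located outside placeholders.
    $parts = preg_split('~' . $placeholder . '(*SKIP)(*F)|\[~', $pattern);

    // preg_split() returns false on error; treat that as "no optional segment".
    return $parts !== false && count($parts) > 1;
}

var_dump(hasOptionalSegment('/user[/{id}]'));      // bool(true)
var_dump(hasOptionalSegment('/user/{id:[0-9]+}')); // bool(false)
```

The second call returns `false` because the only `[` belongs to the placeholder's own character class, which is the case the `(*SKIP)(*F)` prefix is there to exclude.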