Passed
Push — master ( 1265f7...db8617 )
by Alexander
01:41
created

BaseTokenizer::tokenizeOperator() (rating: B)

Complexity

Conditions 9
Paths 6

Size

Total Lines 81
Code Lines 53

Duplication

Lines 0
Ratio 0 %

Code Coverage

Tests 52
CRAP Score 9.0005

Importance

Changes 1
Bugs 0 Features 0
Metric Value
cc 9
eloc 53
c 1
b 0
f 0
nc 6
nop 1
dl 0
loc 81
ccs 52
cts 53
cp 0.9811
crap 9.0005
rs 7.4698

How to fix: Long Method

Small methods make your code easier to understand, in particular if combined with a good name. Besides, if your method is small, finding a good name is usually much easier.

For example, if you find yourself adding comments to a method's body, this is usually a good sign to extract the commented part to a new method, and use the comment as a starting point when coming up with a good name for this new method.

Commonly applied refactorings include:
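Extract Method is one refactoring that fits this advice. In the listing that follows, each branch of tokenizeOperator() builds an operator token with the same four setter calls, so that block is a natural candidate for extraction. The sketch below only illustrates the idea: `DemoToken` and `createOperatorToken()` are hypothetical names, and `DemoToken` is a simplified stand-in for SqlToken, not its real API.

```php
<?php

declare(strict_types=1);

// Hypothetical stand-in for SqlToken; it only carries the four values
// that the repeated block in tokenizeOperator() sets on each token.
final class DemoToken
{
    public string $type = '';
    public string $content = '';
    public int $startOffset = 0;
    public int $endOffset = 0;
}

// Extracted helper (assumed name): builds the operator token that each
// switch branch currently assembles by hand.
function createOperatorToken(string $content, int $offset, int $length): DemoToken
{
    $token = new DemoToken();
    $token->type = 'operator';
    $token->content = $content;
    $token->startOffset = $offset;
    $token->endOffset = $offset + $length;

    return $token;
}

// Each branch would then shrink to a single call.
$token = createOperatorToken('(', 7, 1);
echo $token->content, ' ', $token->startOffset, '-', $token->endOffset, PHP_EOL;
```

With a helper like this, the switch would keep only what differs between branches (the stack pushes and pops), which is likely to reduce the method's line count without changing its behavior.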

<?php

declare(strict_types=1);

namespace Yiisoft\Db\Sqlite\Token;

/**
 * BaseTokenizer splits an SQL query into individual SQL tokens.
 *
 * It can be used to obtain additional information from SQL code.
 *
 * Usage example:
 *
 * ```php
 * $tokenizer = new SqlTokenizer("SELECT * FROM user WHERE id = 1");
 * $root = $tokenizer->tokenize();
 * $sqlTokens = $root->getChildren();
 * ```
 *
 * Tokens are instances of {@see SqlToken}.
 */
abstract class BaseTokenizer
{
    /**
     * @var string SQL code.
     */
    private string $sql;

    /**
     * @var int SQL code string length.
     */
    protected int $length;

    /**
     * @var int SQL code string current offset.
     */
    protected int $offset;

    /**
     * @var \SplStack stack of active tokens.
     */
    private $tokenStack;

    /**
     * @var SqlToken|null active token. It's usually the top of the token stack.
     */
    private ?SqlToken $currentToken = null;

    /**
     * @var string[] cached substrings.
     */
    private array $substrings;

    /**
     * @var string current buffer value.
     */
    private string $buffer = '';

    /**
     * @var SqlToken|null resulting token of the last {@see tokenize()} call.
     */
    private ?SqlToken $token = null;

    public function __construct(string $sql)
    {
        $this->sql = $sql;
    }

    /**
     * Tokenizes and returns a code type token.
     *
     * @return SqlToken code type token.
     */
    public function tokenize(): SqlToken
    {
        $this->length = mb_strlen($this->sql, 'UTF-8');
        $this->offset = 0;
        $this->substrings = [];
        $this->buffer = '';

        $this->token = new SqlToken();
        $this->token->setType(SqlToken::TYPE_CODE);
        $this->token->setContent($this->sql);

        $this->tokenStack = new \SplStack();
        $this->tokenStack->push($this->token);

        $tk = new SqlToken();
        $tk->setType(SqlToken::TYPE_STATEMENT);
        $this->token[] = $tk;

        $this->tokenStack->push($this->token[0]);
        $this->currentToken = $this->tokenStack->top();

        while (!$this->isEof()) {
            if ($this->isWhitespace($length) || $this->isComment($length)) {
                $this->addTokenFromBuffer();
                $this->advance($length);

                continue;
            }

            if ($this->tokenizeOperator($length) || $this->tokenizeDelimitedString($length)) {
                $this->advance($length);

                continue;
            }

            $this->buffer .= $this->substring(1);
            $this->advance(1);
        }
        $this->addTokenFromBuffer();
        if ($this->token->getHasChildren() && !$this->token[-1]->getHasChildren()) {
Bug: The method getHasChildren() does not exist on null. (Ignorable by annotation.)

If this is a false positive, you can also ignore this issue in your code via the ignore-call annotation:

        if ($this->token->getHasChildren() && !$this->token[-1]->/** @scrutinizer ignore-call */ getHasChildren()) {

This check looks for calls to methods that do not seem to exist on a given type. It looks for the method on the type itself as well as in inherited classes or implemented interfaces.

This is most likely a typographical error or the method has been renamed.
            unset($this->token[-1]);
        }

        return $this->token;
    }

    /**
     * Returns whether there is whitespace at the current offset.
     *
     * If this method returns `true`, it has to set the `$length` parameter to the length of the matched string.
     *
     * @param int $length length of the matched string.
     *
     * @return bool whether there is whitespace at the current offset.
     */
    abstract protected function isWhitespace(int &$length): bool;

    /**
     * Returns whether there's a comment at the current offset.
     * If this method returns `true`, it has to set the `$length` parameter to the length of the matched string.
     *
     * @param int $length length of the matched string.
     *
     * @return bool whether there's a comment at the current offset.
     */
    abstract protected function isComment(int &$length): bool;

    /**
     * Returns whether there's an operator at the current offset.
     *
     * If this method returns `true`, it has to set the `$length` parameter to the length of the matched string.
     * It may also set `$content` to a string that will be used as a token content.
     *
     * @param int $length length of the matched string.
     * @param string|null $content optional content instead of the matched string.
     *
     * @return bool whether there's an operator at the current offset.
     */
    abstract protected function isOperator(int &$length, ?string &$content): bool;

    /**
     * Returns whether there's an identifier at the current offset.
     *
     * If this method returns `true`, it has to set the `$length` parameter to the length of the matched string.
     * It may also set `$content` to a string that will be used as a token content.
     *
     * @param int $length length of the matched string.
     * @param string|null $content optional content instead of the matched string.
     *
     * @return bool whether there's an identifier at the current offset.
     */
    abstract protected function isIdentifier(int &$length, ?string &$content): bool;

    /**
     * Returns whether there's a string literal at the current offset.
     *
     * If this method returns `true`, it has to set the `$length` parameter to the length of the matched string.
     * It may also set `$content` to a string that will be used as a token content.
     *
     * @param int $length length of the matched string.
     * @param string|null $content optional content instead of the matched string.
     *
     * @return bool whether there's a string literal at the current offset.
     */
    abstract protected function isStringLiteral(int &$length, ?string &$content): bool;

    /**
     * Returns whether the given string is a keyword.
     *
     * The method may set `$content` to a string that will be used as a token content.
     *
     * @param string $string string to be matched.
     * @param string|null $content optional content instead of the matched string.
     *
     * @return bool whether the given string is a keyword.
     */
    abstract protected function isKeyword(string $string, ?string &$content): bool;

    /**
     * @param string $sql
     */
    public function setSql(string $sql): void
    {
        $this->sql = $sql;
    }

    /**
     * Returns whether the longest common prefix equals the SQL code of the same length at the current offset.
     *
     * @param string[] $with strings to be tested. The method **will** modify this parameter to speed up lookups.
     * @param bool $caseSensitive whether to perform a case-sensitive comparison.
     * @param int|null $length length of the matched string.
     * @param string|null $content matched string.
     *
     * @return bool whether a match is found.
     */
    protected function startsWithAnyLongest(
        array &$with,
        bool $caseSensitive,
        ?int &$length = null,
        ?string &$content = null
    ): bool {
        if (empty($with)) {
            return false;
        }

        if (!is_array(reset($with))) {
            usort($with, function ($string1, $string2) {
                return mb_strlen($string2, 'UTF-8') - mb_strlen($string1, 'UTF-8');
            });

            $map = [];

            foreach ($with as $string) {
                $map[mb_strlen($string, 'UTF-8')][$caseSensitive ? $string : mb_strtoupper($string, 'UTF-8')] = true;
            }

            $with = $map;
        }
        foreach ($with as $testLength => $testValues) {
            $content = $this->substring($testLength, $caseSensitive);

            if (isset($testValues[$content])) {
                $length = $testLength;

                return true;
            }
        }

        return false;
    }

    /**
     * Returns a string of the given length starting with the specified offset.
     *
     * @param int $length string length to be returned.
     * @param bool $caseSensitive if it's `false`, the string will be uppercased.
     * @param int|null $offset SQL code offset, defaults to current if `null` is passed.
     *
     * @return string result string, it may be empty if there's nothing to return.
     */
    protected function substring(int $length, bool $caseSensitive = true, ?int $offset = null)
    {
        if ($offset === null) {
            $offset = $this->offset;
        }

        if ($offset + $length > $this->length) {
            return '';
        }

        $cacheKey = $offset . ',' . $length;

        if (!isset($this->substrings[$cacheKey . ',1'])) {
            $this->substrings[$cacheKey . ',1'] = mb_substr($this->sql, $offset, $length, 'UTF-8');
        }
        if (!$caseSensitive && !isset($this->substrings[$cacheKey . ',0'])) {
            $this->substrings[$cacheKey . ',0'] = mb_strtoupper($this->substrings[$cacheKey . ',1'], 'UTF-8');
        }

        return $this->substrings[$cacheKey . ',' . (int) $caseSensitive];
    }

    /**
     * Returns an index after the given string in the SQL code starting with the specified offset.
     *
     * @param string $string string to be found.
     * @param int|null $offset SQL code offset, defaults to current if `null` is passed.
     *
     * @return int index after the given string or end of string index.
     */
    protected function indexAfter(string $string, ?int $offset = null): int
    {
        if ($offset === null) {
            $offset = $this->offset;
        }

        if ($offset + mb_strlen($string, 'UTF-8') > $this->length) {
            return $this->length;
        }

        $afterIndexOf = mb_strpos($this->sql, $string, $offset, 'UTF-8');

        if ($afterIndexOf === false) {
            $afterIndexOf = $this->length;
        } else {
            $afterIndexOf += mb_strlen($string, 'UTF-8');
        }

        return $afterIndexOf;
    }

    /**
     * Determines whether there is a delimited string at the current offset and adds it to the token children.
     *
     * @param int $length length of the matched string, set when a match is found.
     *
     * @return bool whether a delimited string was found and added.
     */
    private function tokenizeDelimitedString(int &$length): bool
    {
        $isIdentifier = $this->isIdentifier($length, $content);
        $isStringLiteral = !$isIdentifier && $this->isStringLiteral($length, $content);

        if (!$isIdentifier && !$isStringLiteral) {
            return false;
        }

        $this->addTokenFromBuffer();

        $tk = new SqlToken();

        $tk->setType($isIdentifier ? SqlToken::TYPE_IDENTIFIER : SqlToken::TYPE_STRING_LITERAL);
        $tk->setContent(\is_string($content) ? $content : $this->substring($length));
        $tk->setStartOffset($this->offset);
        $tk->setEndOffset($this->offset + $length);

        $this->currentToken[] = $tk;

        return true;
    }

    /**
     * Determines whether there is an operator at the current offset and adds it to the token children.
     *
     * @param int $length length of the matched string, set when a match is found.
     *
     * @return bool whether an operator was found and added.
     */
    private function tokenizeOperator(int &$length): bool
    {
        if (!$this->isOperator($length, $content)) {
            return false;
        }

        $this->addTokenFromBuffer();

        switch ($this->substring($length)) {
            case '(':
                $tk = new SqlToken();

                $tk->setType(SqlToken::TYPE_OPERATOR);
                $tk->setContent(\is_string($content) ? $content : $this->substring($length));
                $tk->setStartOffset($this->offset);
                $tk->setEndOffset($this->offset + $length);

                $this->currentToken[] = $tk;

                $tk1 = new SqlToken();
                $tk1->setType(SqlToken::TYPE_PARENTHESIS);
                $this->currentToken[] = $tk1;

                $this->tokenStack->push($this->currentToken[-1]);
                $this->currentToken = $this->tokenStack->top();

                break;

            case ')':
                $this->tokenStack->pop();
                $this->currentToken = $this->tokenStack->top();

                $tk = new SqlToken();

                $tk->setType(SqlToken::TYPE_OPERATOR);
                $tk->setContent(')');
                $tk->setStartOffset($this->offset);
                $tk->setEndOffset($this->offset + $length);

                $this->currentToken[] = $tk;

                break;
            case ';':
                if (!$this->currentToken->getHasChildren()) {
Bug: The method getHasChildren() does not exist on null. (Ignorable by annotation.)

If this is a false positive, you can also ignore this issue in your code via the ignore-call annotation:

                if (!$this->currentToken->/** @scrutinizer ignore-call */ getHasChildren()) {

This check looks for calls to methods that do not seem to exist on a given type. It looks for the method on the type itself as well as in inherited classes or implemented interfaces.

This is most likely a typographical error or the method has been renamed.
                    break;
                }

                $tk = new SqlToken();

                $tk->setType(SqlToken::TYPE_OPERATOR);
                $tk->setContent(\is_string($content) ? $content : $this->substring($length));
                $tk->setStartOffset($this->offset);
                $tk->setEndOffset($this->offset + $length);

                $this->currentToken[] = $tk;

                $this->tokenStack->pop();
                $this->currentToken = $this->tokenStack->top();

                $tk1 = new SqlToken();
                $tk1->setType(SqlToken::TYPE_STATEMENT);

                $this->currentToken[] = $tk1;
                $this->tokenStack->push($this->currentToken[-1]);
                $this->currentToken = $this->tokenStack->top();

                break;
            default:
                $tk = new SqlToken();

                $tk->setType(SqlToken::TYPE_OPERATOR);
                $tk->setContent(\is_string($content) ? $content : $this->substring($length));
                $tk->setStartOffset($this->offset);
                $tk->setEndOffset($this->offset + $length);

                $this->currentToken[] = $tk;

                break;
        }

        return true;
    }

    /**
     * Determines a type of text in the buffer, tokenizes it and adds it to the token children.
     */
    private function addTokenFromBuffer(): void
    {
        if ($this->buffer === '') {
            return;
        }

        $isKeyword = $this->isKeyword($this->buffer, $content);

        $tk = new SqlToken();

        $tk->setType($isKeyword ? SqlToken::TYPE_KEYWORD : SqlToken::TYPE_TOKEN);
        $tk->setContent(\is_string($content) ? $content : $this->buffer);
        $tk->setStartOffset($this->offset - mb_strlen($this->buffer, 'UTF-8'));
        $tk->setEndOffset($this->offset);

        $this->currentToken[] = $tk;

        $this->buffer = '';
    }

    /**
     * Adds the specified length to the current offset.
     *
     * @param int $length
     *
     * @throws \InvalidArgumentException
     */
    private function advance(int $length): void
    {
        if ($length <= 0) {
            throw new \InvalidArgumentException('Length must be greater than 0.');
        }

        $this->offset += $length;
        $this->substrings = [];
    }

    /**
     * Returns whether the SQL code is completely traversed.
     *
     * @return bool
     */
    private function isEof(): bool
    {
        return $this->offset >= $this->length;
    }
}
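
The two getHasChildren() findings in this report both concern a value that the analyzer believes may be null. Besides the ignore-call annotation, an explicit null check is another way to satisfy such a check. The sketch below shows that pattern in isolation; `DemoNode`, `lastChild()` and `dropTrailingEmptyChild()` are hypothetical names chosen for illustration and are not part of the SqlToken API.

```php
<?php

declare(strict_types=1);

// Hypothetical minimal node type; it is not the real SqlToken class.
final class DemoNode
{
    /** @var DemoNode[] */
    public array $children = [];

    public function hasChildren(): bool
    {
        return $this->children !== [];
    }

    // Returns the last child, or null when there are no children.
    public function lastChild(): ?DemoNode
    {
        return $this->children === [] ? null : $this->children[count($this->children) - 1];
    }
}

// Removes a trailing child that has no children of its own,
// checking for null before calling a method on it.
function dropTrailingEmptyChild(DemoNode $root): void
{
    $last = $root->lastChild();

    if ($last !== null && !$last->hasChildren()) {
        array_pop($root->children);
    }
}

$root = new DemoNode();
$statement = new DemoNode();
$statement->children[] = new DemoNode();
$root->children = [$statement, new DemoNode()];

dropTrailingEmptyChild($root);
echo count($root->children), PHP_EOL;
```

Whether the guard or the annotation is preferable depends on whether null can actually occur at that point; if it cannot, the annotation documents the false positive without adding a branch.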