BufferedQuery   F

Complexity

Total Complexity 68

Size/Duplication

Total Lines 378
Duplicated Lines 0 %

Test Coverage

Coverage 100%

Importance

Changes 2
Bugs 0 Features 0
Metric Value
eloc 138
c 2
b 0
f 0
dl 0
loc 378
ccs 127
cts 127
cp 1
rs 2.96
wmc 68

3 Methods

Rating   Name             Duplication   Size   Complexity
A        setDelimiter()   0             4      1
A        __construct()    0             17     1
F        extract()        0             287    66


Complex Class

Complex classes like BufferedQuery often do a lot of different things. To break such a class down, we need to identify a cohesive component within it. A common approach is to look for fields or methods that share the same prefix or suffix; in BufferedQuery, for instance, the STATUS_STRING_* and STATUS_COMMENT_* constants form two such groups.

Once you have determined the fields that belong together, you can apply the Extract Class refactoring. If the component makes sense as a subclass, Extract Subclass is also a candidate and is often faster.

While breaking up the class, it is also a good idea to analyze how other classes use BufferedQuery and, based on these observations, apply Extract Interface.
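
As a rough illustration of Extract Class, the sketch below pulls the `DELIMITER` keyword check out of extract() into its own small class. The class name `DelimiterKeywordMatcher` and the method `matchesAt()` are hypothetical, not part of the library, and `ctype_space()` is used here only as a stand-in for `Context::isWhitespace()` so that the example runs on its own.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper extracted from BufferedQuery::extract().
 *
 * It answers one question only: does a `DELIMITER` keyword, followed by
 * whitespace, start at offset $i of the buffer?
 */
final class DelimiterKeywordMatcher
{
    private const KEYWORD = 'DELIMITER';

    public function matchesAt(string $buffer, int $i): bool
    {
        $keywordLen = strlen(self::KEYWORD);

        // Same bounds check as the original: the character right after the
        // keyword must exist so that it can be tested for whitespace.
        if ($i < 0 || $i + $keywordLen >= strlen($buffer)) {
            return false;
        }

        // Case-insensitive comparison of the 9 characters at offset $i.
        if (substr_compare($buffer, self::KEYWORD, $i, $keywordLen, true) !== 0) {
            return false;
        }

        // Assumption: ctype_space() stands in for Context::isWhitespace().
        return ctype_space($buffer[$i + $keywordLen]);
    }
}

$matcher = new DelimiterKeywordMatcher();
var_dump($matcher->matchesAt("delimiter //\n", 0));    // bool(true)
var_dump($matcher->matchesAt('SELECT delimiters', 7)); // bool(false)
```

With a helper like this, the ten-line condition in extract() would shrink to a single call, and the keyword check could be tested separately. The original comment reports that the hand-unrolled comparison is about 3 times faster than a `strtoupper(substr(...))` check, so any such change should be benchmarked first.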

<?php

declare(strict_types=1);

namespace PhpMyAdmin\SqlParser\Utils;

use PhpMyAdmin\SqlParser\Context;

use function array_merge;
use function strlen;
use function substr;
use function trim;

/**
 * Buffer query utilities.
 *
 * Implements a specialized lexer used to extract statements from large inputs
 * that are being buffered. After each statement has been extracted, a lexer or
 * a parser may be used.
 */
class BufferedQuery
{
    // Constants that describe the current status of the parser.

    // A string is being parsed.
    public const STATUS_STRING = 16; // 0001 0000
    public const STATUS_STRING_SINGLE_QUOTES = 17; // 0001 0001
    public const STATUS_STRING_DOUBLE_QUOTES = 18; // 0001 0010
    public const STATUS_STRING_BACKTICK = 20; // 0001 0100

    // A comment is being parsed.
    public const STATUS_COMMENT = 32; // 0010 0000
    public const STATUS_COMMENT_BASH = 33; // 0010 0001
    public const STATUS_COMMENT_C = 34; // 0010 0010
    public const STATUS_COMMENT_SQL = 36; // 0010 0100

    /**
     * The query that is being processed.
     *
     * This field can be modified just by appending to it!
     */
    public string $query = '';

    /**
     * The options of this parser.
     *
     * @var array<string, bool|string>
     * @psalm-var array{delimiter?: non-empty-string, parse_delimiter?: bool, add_delimiter?: bool}
     */
    public array $options = [];

    /**
     * The last delimiter used.
     */
    public string $delimiter;

    /**
     * The length of the delimiter.
     */
    public int $delimiterLen;

    /**
     * The current status of the parser.
     */
    public int|null $status = null;

    /**
     * The last incomplete query that was extracted.
     */
    public string $current = '';

    /**
     * @param string                     $query   the query to be parsed
     * @param array<string, bool|string> $options the options of this parser
     * @psalm-param array{delimiter?: non-empty-string, parse_delimiter?: bool, add_delimiter?: bool} $options
     */
    public function __construct(string $query = '', array $options = [])
    {
        // Merges specified options with defaults.
        $this->options = array_merge(
            [
                // The starting delimiter.
                'delimiter' => ';',
                // Whether `DELIMITER` statements should be parsed.
                'parse_delimiter' => false,
                // Whether a delimiter should be added at the end of the statement.
                'add_delimiter' => false,
            ],
            $options,
        );

        $this->query = $query;
        $this->setDelimiter($this->options['delimiter']);
    }

    /**
     * Sets the delimiter.
     *
     * Also updates the stored length of the delimiter.
     */
    public function setDelimiter(string $delimiter): void
    {
        $this->delimiter = $delimiter;
        $this->delimiterLen = strlen($delimiter);
    }

    /**
     * Extracts a statement from the buffer.
     *
     * @param bool $end whether the end of the buffer was reached
     */
    public function extract(bool $end = false): string|false
    {
        /**
         * The last parsed position.
         *
         * This is statically defined because it is not used anywhere
         * outside this method and there is probably a (minor) performance
         * improvement to it.
         *
         * @var int $i
         */
        static $i = 0;

        if (empty($this->query)) {
            return false;
        }

        /**
         * The length of the buffer.
         */
        $len = strlen($this->query);

        /**
         * The last index of the string that is going to be parsed.
         *
         * There must be a few characters left in the buffer so the parser can
         * avoid confusing some symbols that may have multiple meanings.
         *
         * For example, if the buffer ends in `-` that may be an operator or the
         * beginning of a comment.
         *
         * Another example is a buffer that ends in `DELIMITE`. The parser is
         * going to require a few more characters because that may be a part of
         * the `DELIMITER` keyword or just a column named `DELIMITE`.
         *
         * Those extra characters are required only if there is more data
         * expected (the end of the buffer was not reached).
         */
        $loopLen = $end ? $len : $len - 16;

        for (; $i < $loopLen; ++$i) {
            /*
             * Handling backslash.
             *
             * Even if the next character is a special character that should be
             * treated differently, because of the preceding backslash, it will
             * be ignored.
             */
            if ((($this->status & self::STATUS_COMMENT) === 0) && ($this->query[$i] === '\\')) {
                $this->current .= $this->query[$i] . ($i + 1 < $len ? $this->query[++$i] : '');
                continue;
            }

            /*
             * Handling special parser statuses.
             */
            if ($this->status === self::STATUS_STRING_SINGLE_QUOTES) {
                // Single-quoted strings like 'foo'.
                if ($this->query[$i] === '\'') {
                    $this->status = 0;
                }

                $this->current .= $this->query[$i];
                continue;
            }

            if ($this->status === self::STATUS_STRING_DOUBLE_QUOTES) {
                // Double-quoted strings like "bar".
                if ($this->query[$i] === '"') {
                    $this->status = 0;
                }

                $this->current .= $this->query[$i];
                continue;
            }

            if ($this->status === self::STATUS_STRING_BACKTICK) {
                // Backtick-quoted identifiers like `baz`.
                if ($this->query[$i] === '`') {
                    $this->status = 0;
                }

                $this->current .= $this->query[$i];
                continue;
            }

            if (($this->status === self::STATUS_COMMENT_BASH) || ($this->status === self::STATUS_COMMENT_SQL)) {
                // Bash-like (#) or SQL-like (-- ) comments end in new line.
                if ($this->query[$i] === "\n") {
                    $this->status = 0;
                }

                $this->current .= $this->query[$i];
                continue;
            }

            if ($this->status === self::STATUS_COMMENT_C) {
                // C-like comments end in */.
                if (($this->query[$i - 1] === '*') && ($this->query[$i] === '/')) {
                    $this->status = 0;
                }

                $this->current .= $this->query[$i];
                continue;
            }

            /*
             * Checking if a string started.
             */
            if ($this->query[$i] === '\'') {
                $this->status = self::STATUS_STRING_SINGLE_QUOTES;
                $this->current .= $this->query[$i];
                continue;
            }

            if ($this->query[$i] === '"') {
                $this->status = self::STATUS_STRING_DOUBLE_QUOTES;
                $this->current .= $this->query[$i];
                continue;
            }

            if ($this->query[$i] === '`') {
                $this->status = self::STATUS_STRING_BACKTICK;
                $this->current .= $this->query[$i];
                continue;
            }

            /*
             * Checking if a comment started.
             */
            if ($this->query[$i] === '#') {
                $this->status = self::STATUS_COMMENT_BASH;
                $this->current .= $this->query[$i];
                continue;
            }

            if ($i + 2 < $len) {
                if (
                    ($this->query[$i] === '-')
                    && ($this->query[$i + 1] === '-')
                    && Context::isWhitespace($this->query[$i + 2])
                ) {
                    $this->status = self::STATUS_COMMENT_SQL;
                    $this->current .= $this->query[$i];
                    continue;
                }

                if (($this->query[$i] === '/') && ($this->query[$i + 1] === '*') && ($this->query[$i + 2] !== '!')) {
                    $this->status = self::STATUS_COMMENT_C;
                    $this->current .= $this->query[$i];
                    continue;
                }
            }

            /*
             * Handling `DELIMITER` statement.
             *
             * The code below basically checks for
             *     `strtoupper(substr($this->query, $i, 9)) === 'DELIMITER'`
             *
             * This optimization makes the code about 3 times faster.
             *
             * `DELIMITER` is not being considered a keyword. The only context
             * in which it has a special meaning is when it is the beginning of
             * a statement. This is the reason for the last condition.
             */
            if (
                ($i + 9 < $len)
                && (($this->query[$i] === 'D') || ($this->query[$i] === 'd'))
                && (($this->query[$i + 1] === 'E') || ($this->query[$i + 1] === 'e'))
                && (($this->query[$i + 2] === 'L') || ($this->query[$i + 2] === 'l'))
                && (($this->query[$i + 3] === 'I') || ($this->query[$i + 3] === 'i'))
                && (($this->query[$i + 4] === 'M') || ($this->query[$i + 4] === 'm'))
                && (($this->query[$i + 5] === 'I') || ($this->query[$i + 5] === 'i'))
                && (($this->query[$i + 6] === 'T') || ($this->query[$i + 6] === 't'))
                && (($this->query[$i + 7] === 'E') || ($this->query[$i + 7] === 'e'))
                && (($this->query[$i + 8] === 'R') || ($this->query[$i + 8] === 'r'))
                && Context::isWhitespace($this->query[$i + 9])
            ) {
                // Saving the current index to be able to revert any parsing
                // done in this block.
                $iBak = $i;
                $i += 9; // Skipping `DELIMITER`.

                // Skipping whitespace.
                while (($i < $len) && Context::isWhitespace($this->query[$i])) {
                    ++$i;
                }

                // Parsing the delimiter.
                $delimiter = '';
                while (($i < $len) && (! Context::isWhitespace($this->query[$i]))) {
                    $delimiter .= $this->query[$i++];
                }

                // Checking if the delimiter definition ended.
                if (
                    ($delimiter !== '')
                    && (($i < $len) && Context::isWhitespace($this->query[$i])
                    || (($i === $len) && $end))
                ) {
                    // Saving the delimiter.
                    $this->setDelimiter($delimiter);

                    // Whether this statement should be returned or not.
                    $ret = '';
                    if (! empty($this->options['parse_delimiter'])) {
                        // Appending the `DELIMITER` statement that was just
                        // found to the current statement.
                        $ret = trim(
                            $this->current . ' ' . substr($this->query, $iBak, $i - $iBak),
                        );
                    }

                    // Removing the statement that was just extracted from the
                    // query.
                    $this->query = substr($this->query, $i);
                    $i = 0;

                    // Resetting the current statement.
                    $this->current = '';

                    return $ret;
                }

                // Incomplete statement. Reverting to the saved index.
                $i = $iBak;

                return false;
            }

            /*
             * Checking if the current statement finished.
             *
             * The first letter of the delimiter is being checked as an
             * optimization. This code is almost as fast as the one above.
             *
             * There is no point in checking if two strings match if not even
             * the first letter matches.
             */
            if (
                ($this->query[$i] === $this->delimiter[0])
                && (($this->delimiterLen === 1)
                || (substr($this->query, $i, $this->delimiterLen) === $this->delimiter))
            ) {
                // Saving the statement that just ended.
                $ret = $this->current;

                // If needed, adds a delimiter at the end of the statement.
                if (! empty($this->options['add_delimiter'])) {
                    $ret .= $this->delimiter;
                }

                // Removing the statement that was just extracted from the
                // query.
                $this->query = substr($this->query, $i + $this->delimiterLen);
                $i = 0;

                // Resetting the current statement.
                $this->current = '';

                // Returning the statement.
                return trim($ret);
            }

            /*
             * Appending current character to current statement.
             */
            $this->current .= $this->query[$i];
        }

        if ($end && ($i === $len)) {
            // If the end of the buffer was reached, the buffer is emptied and
            // the current statement that was extracted is returned.
            $ret = $this->current;

            // Emptying the buffer.
            $this->query = '';
            $i = 0;

            // Resetting the current statement.
            $this->current = '';

            // Returning the statement.
            return trim($ret);
        }

        return '';
    }
}
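
The status constants in the listing are laid out as bit flags: every string status has bit 16 set and every comment status has bit 32 set, which is why extract() can test `$this->status & self::STATUS_COMMENT` to cover all three comment kinds at once. The standalone sketch below copies four of the values to show the idea; the helper functions `isCommentStatus()` and `isStringStatus()` are hypothetical and do not exist in the class.

```php
<?php

declare(strict_types=1);

// Values copied from the BufferedQuery constants in the listing.
const STATUS_STRING = 16;               // 0001 0000
const STATUS_STRING_SINGLE_QUOTES = 17; // 0001 0001
const STATUS_COMMENT = 32;              // 0010 0000
const STATUS_COMMENT_C = 34;            // 0010 0010

/** Hypothetical helper: true if the status belongs to the comment family. */
function isCommentStatus(?int $status): bool
{
    // A null status (nothing parsed yet) is treated as 0, i.e. no flag set.
    return (($status ?? 0) & STATUS_COMMENT) !== 0;
}

/** Hypothetical helper: true if the status belongs to the string family. */
function isStringStatus(?int $status): bool
{
    return (($status ?? 0) & STATUS_STRING) !== 0;
}

var_dump(isCommentStatus(STATUS_COMMENT_C));            // bool(true)
var_dump(isCommentStatus(STATUS_STRING_SINGLE_QUOTES)); // bool(false)
var_dump(isStringStatus(STATUS_STRING_SINGLE_QUOTES));  // bool(true)
var_dump(isCommentStatus(null));                        // bool(false)
```

Grouping statuses this way keeps the backslash check in extract() to a single mask test instead of three equality comparisons.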