SQLParser::processSQL()   F
last analyzed

Complexity

Conditions 138
Paths 1193

Size

Total Lines 290
Code Lines 213

Duplication

Lines 12
Ratio 4.14 %

Importance

Changes 2
Bugs 0 Features 0
Metric Value
c 2
b 0
f 0
dl 12
loc 290
rs 2
cc 138
eloc 213
nc 1193
nop 1

How to fix   Long Method    Complexity   

Long Method

Small methods make your code easier to understand, in particular if combined with a good name. Besides, if your method is small, finding a good name is usually much easier.

For example, if you find yourself adding comments to a method's body, this is usually a good sign to extract the commented part to a new method, and use the comment as a starting point when coming up with a good name for this new method.

Commonly applied refactorings include:

1
<?php
2
3
/**
4
 * php-sql-parser.php.
5
 *
6
 * A pure PHP SQL (non validating) parser w/ focus on MySQL dialect of SQL
7
 *
8
 * Copyright (c) 2010-2012, Justin Swanhart
9
 * with contributions by André Rothe <[email protected], [email protected]>
10
 *
11
 * All rights reserved.
12
 *
13
 * Redistribution and use in source and binary forms, with or without modification,
14
 * are permitted provided that the following conditions are met:
15
 *
16
 *   * Redistributions of source code must retain the above copyright notice,
17
 *	 this list of conditions and the following disclaimer.
18
 *   * Redistributions in binary form must reproduce the above copyright notice,
19
 *	 this list of conditions and the following disclaimer in the documentation
20
 *	 and/or other materials provided with the distribution.
21
 *
22
 * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY
23
 * EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
24
 * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT
25
 * SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
26
 * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED
27
 * TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
28
 * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
29
 * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN
30
 * ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH
31
 * DAMAGE.
32
 */
33
namespace SQLParser;
34
35
/**
36
 * This class implements the parser functionality.
37
 *
38
 * @author [email protected]
39
 * @author [email protected]
40
 */
41
class SQLParser extends PHPSQLParserUtils
42
{
43
    private $lexer;
44
45
    public function __construct($sql = false, $calcPositions = false)
46
    {
47
        $this->lexer = new PHPSQLLexer();
48
        if ($sql) {
49
            $this->parse($sql, $calcPositions);
50
        }
51
    }
52
53
    public function parse($sql, $calcPositions = false)
54
    {
55
        #lex the SQL statement
56
        $inputArray = $this->splitSQLIntoTokens($sql);
57
58
        #This is the highest level lexical analysis.  This is the part of the
59
        #code which finds UNION and UNION ALL query parts
60
        $queries = $this->processUnion($inputArray);
61
62
        # If there was no UNION or UNION ALL in the query, then the query is
63
        # stored at $queries[0].
0 ignored issues
show
Unused Code Comprehensibility introduced by
40% of this comment could be valid code. Did you maybe forget this after debugging?

Sometimes obsolete code just ends up commented out instead of removed. In this case it is better to remove the code once you have checked you do not need it.

The code might also have been commented out for debugging purposes. In this case it is vital that someone uncomments it again or your project may behave in very unexpected ways in production.

This check looks for comments that seem to be mostly valid code and reports them.

Loading history...
64
        if (!$this->isUnion($queries)) {
65
            $queries = $this->processSQL($queries[0]);
66
        }
67
68
        # calc the positions of some important tokens
69
        if ($calcPositions) {
70
            $calculator = new PositionCalculator();
71
            $queries = $calculator->setPositionsWithinSQL($sql, $queries);
72
        }
73
74
        # store the parsed queries
75
        $this->parsed = $queries;
0 ignored issues
show
Bug introduced by
The property parsed does not exist. Did you maybe forget to declare it?

In PHP it is possible to write to properties without declaring them. For example, the following is perfectly valid PHP code:

class MyClass { }

$x = new MyClass();
$x->foo = true;

Generally, it is a good practice to explictly declare properties to avoid accidental typos and provide IDE auto-completion:

class MyClass {
    public $foo;
}

$x = new MyClass();
$x->foo = true;
Loading history...
76
77
        return $this->parsed;
78
    }
79
80
    private function processUnion($inputArray)
81
    {
82
        $outputArray = array();
83
84
        #sometimes the parser needs to skip ahead until a particular
85
        #token is found
86
        $skipUntilToken = false;
87
88
        #This is the last type of union used (UNION or UNION ALL)
89
        #indicates a) presence of at least one union in this query
90
        #		  b) the type of union if this is the first or last query
91
        $unionType = false;
92
93
        #Sometimes a "query" consists of more than one query (like a UNION query)
94
        #this array holds all the queries
95
        $queries = array();
96
97
        foreach ($inputArray as $key => $token) {
98
            $trim = trim($token);
99
100
            # overread all tokens till that given token
101
            if ($skipUntilToken) {
0 ignored issues
show
Bug Best Practice introduced by
The expression $skipUntilToken of type false|string is loosely compared to true; this is ambiguous if the string can be empty. You might want to explicitly use !== false instead.

In PHP, under loose comparison (like ==, or !=, or switch conditions), values of different types might be equal.

For string values, the empty string '' is a special case, in particular the following results might be unexpected:

''   == false // true
''   == null  // true
'ab' == false // false
'ab' == null  // false

// It is often better to use strict comparison
'' === false // false
'' === null  // false
Loading history...
102
                if ($trim === '') {
103
                    continue; # read the next token
104
                }
105
                if (strtoupper($trim) === $skipUntilToken) {
106
                    $skipUntilToken = false;
107
                    continue; # read the next token
108
                }
109
            }
110
111
            if (strtoupper($trim) !== 'UNION') {
112
                $outputArray[] = $token; # here we get empty tokens, if we remove these, we get problems in parse_sql()
113
                continue;
114
            }
115
116
            $unionType = 'UNION';
117
118
            # we are looking for an ALL token right after UNION
119
            for ($i = $key + 1; $i < count($inputArray); ++$i) {
0 ignored issues
show
Performance Best Practice introduced by
It seems like you are calling the size function count() as part of the test condition. You might want to compute the size beforehand, and not on each iteration.

If the size of the collection does not change during the iteration, it is generally a good practice to compute it beforehand, and not on each iteration:

for ($i=0; $i<count($array); $i++) { // calls count() on each iteration
}

// Better
for ($i=0, $c=count($array); $i<$c; $i++) { // calls count() just once
}
Loading history...
120
                if (trim($inputArray[$i]) === '') {
121
                    continue;
122
                }
123
                if (strtoupper($inputArray[$i]) !== 'ALL') {
124
                    break;
125
                }
126
                # the other for-loop should overread till "ALL"
127
                $skipUntilToken = 'ALL';
128
                $unionType = 'UNION ALL';
129
            }
130
131
            # store the tokens related to the unionType
132
            $queries[$unionType][] = $outputArray;
133
            $outputArray = array();
134
        }
135
136
        # the query tokens after the last UNION or UNION ALL
137
        # or we don't have an UNION/UNION ALL
138
        if (!empty($outputArray)) {
139
            if ($unionType) {
0 ignored issues
show
Bug Best Practice introduced by
The expression $unionType of type false|string is loosely compared to true; this is ambiguous if the string can be empty. You might want to explicitly use !== false instead.

In PHP, under loose comparison (like ==, or !=, or switch conditions), values of different types might be equal.

For string values, the empty string '' is a special case, in particular the following results might be unexpected:

''   == false // true
''   == null  // true
'ab' == false // false
'ab' == null  // false

// It is often better to use strict comparison
'' === false // false
'' === null  // false
Loading history...
140
                $queries[$unionType][] = $outputArray;
141
            } else {
142
                $queries[] = $outputArray;
143
            }
144
        }
145
146
        return $this->processMySQLUnion($queries);
147
    }
148
149
    /** MySQL supports a special form of UNION:
150
     * (select ...)
151
     * union
152
     * (select ...).
153
     *
154
     * This function handles this query syntax.  Only one such subquery
155
     * is supported in each UNION block.  (select)(select)union(select) is not legal.
156
     * The extra queries will be silently ignored.
157
     */
158
    private function processMySQLUnion($queries)
159
    {
160
        $unionTypes = array('UNION', 'UNION ALL');
161
        foreach ($unionTypes as $unionType) {
162
            if (empty($queries[$unionType])) {
163
                continue;
164
            }
165
166
            foreach ($queries[$unionType] as $key => $tokenList) {
167
                foreach ($tokenList as $z => $token) {
168
                    $token = trim($token);
169
                    if ($token === '') {
170
                        continue;
171
                    }
172
173
                    # starts with "(select"
174
                    if (preg_match('/^\\(\\s*select\\s*/i', $token)) {
175
                        $queries[$unionType][$key] = $this->parse($this->removeParenthesisFromStart($token));
176
                        break;
177
                    }
178
179
                    $queries[$unionType][$key] = $this->processSQL($queries[$unionType][$key]);
180
                    break;
181
                }
182
            }
183
        }
184
        # it can be parsed or not
185
        return $queries;
186
    }
187
188
    private function isUnion($queries)
189
    {
190
        $unionTypes = array('UNION', 'UNION ALL');
191
        foreach ($unionTypes as $unionType) {
192
            if (!empty($queries[$unionType])) {
193
                return true;
194
            }
195
        }
196
197
        return false;
198
    }
199
200
    #this function splits up a SQL statement into easy to "parse"
201
    #tokens for the SQL processor
202
    private function splitSQLIntoTokens($sql)
203
    {
204
        return $this->lexer->split($sql);
205
    }
206
207
    /* This function breaks up the SQL statement into logical sections.
208
     Some sections are then further handled by specialized functions.
209
     */
210
    private function processSQL(&$tokens)
211
    {
212
        $prev_category = '';
213
        $token_category = '';
214
        $skip_next = false;
215
        $out = false;
216
217
        $tokenCount = count($tokens);
218
        for ($tokenNumber = 0; $tokenNumber < $tokenCount; ++$tokenNumber) {
219
            $token = $tokens[$tokenNumber];
220
            $trim = trim($token); # this removes also \n and \t!
221
222
            # if it starts with an "(", it should follow a SELECT
223
            if ($trim !== '' && $trim[0] === '(' && $token_category === '') {
224
                $token_category = 'SELECT';
225
            }
226
227
            /* If it isn't obvious, when $skip_next is set, then we ignore the next real
228
             token, that is we ignore whitespace.
229
             */
230
            if ($skip_next) {
231
                if ($trim === '') {
232
                    if ($token_category !== '') { # is this correct??
233
                        $out[$token_category][] = $token;
234
                    }
235
                    continue;
236
                }
237
                #to skip the token we replace it with whitespace
238
                $trim = '';
239
                $token = '';
240
                $skip_next = false;
241
            }
242
243
            $upper = strtoupper($trim);
244
            switch ($upper) {
245
246
            /* Tokens that get their own sections. These keywords have subclauses. */
247
            case 'SELECT':
248
            case 'ORDER':
249
            case 'LIMIT':
250
            case 'SET':
251
            case 'DUPLICATE':
252
            case 'VALUES':
253
            case 'GROUP':
254
            case 'HAVING':
255
            case 'WHERE':
256
            case 'RENAME':
257
            case 'CALL':
258
            case 'PROCEDURE':
259
            case 'FUNCTION':
260
            case 'SERVER':
261
            case 'LOGFILE':
262
            case 'DEFINER':
263
            case 'RETURNS':
264
            case 'TABLESPACE':
265
            case 'TRIGGER':
266
            case 'DO':
267
            case 'PLUGIN':
268
            case 'FROM':
269
            case 'FLUSH':
270
            case 'KILL':
271
            case 'RESET':
272
            case 'START':
273
            case 'STOP':
274
            case 'PURGE':
275
            case 'EXECUTE':
276
            case 'PREPARE':
277
            case 'DEALLOCATE':
278
                if ($trim === 'DEALLOCATE') {
279
                    $skip_next = true;
280
                }
281
                /* this FROM is different from FROM in other DML (not join related) */
282
                if ($token_category === 'PREPARE' && $upper === 'FROM') {
283
                    continue 2;
284
                }
285
286
                $token_category = $upper;
287
                break;
288
289
            case 'DATABASE':
290
            case 'SCHEMA':
291
                if ($prev_category === 'DROP') {
292
                    continue;
293
                }
294
                $token_category = $upper;
295
                break;
296
297 View Code Duplication
            case 'EVENT':
0 ignored issues
show
Duplication introduced by
This code seems to be duplicated across your project.

Duplicated code is one of the most pungent code smells. If you need to duplicate the same code in three or more different places, we strongly encourage you to look into extracting the code into a single class or operation.

You can also find more detailed suggestions in the “Code” section of your repository.

Loading history...
298
            # issue 71
299
                if ($prev_category === 'DROP' || $prev_category === 'ALTER' || $prev_category === 'CREATE') {
300
                    $token_category = $upper;
301
                }
302
                break;
303
304
            case 'DATA':
305
            # prevent wrong handling of DATA as keyword
306
                if ($prev_category === 'LOAD') {
307
                    $token_category = $upper;
308
                }
309
                break;
310
311
            case 'INTO':
312
            # prevent wrong handling of CACHE within LOAD INDEX INTO CACHE...
313
                if ($prev_category === 'LOAD') {
314
                    $out[$prev_category][] = $upper;
315
                    continue 2;
316
                }
317
                $token_category = $upper;
318
                break;
319
320
            case 'USER':
321
            # prevent wrong processing as keyword
322
                if ($prev_category === 'CREATE' || $prev_category === 'RENAME' || $prev_category === 'DROP') {
323
                    $token_category = $upper;
324
                }
325
                break;
326
327 View Code Duplication
            case 'VIEW':
0 ignored issues
show
Duplication introduced by
This code seems to be duplicated across your project.

Duplicated code is one of the most pungent code smells. If you need to duplicate the same code in three or more different places, we strongly encourage you to look into extracting the code into a single class or operation.

You can also find more detailed suggestions in the “Code” section of your repository.

Loading history...
328
            # prevent wrong processing as keyword
329
                if ($prev_category === 'CREATE' || $prev_category === 'ALTER' || $prev_category === 'DROP') {
330
                    $token_category = $upper;
331
                }
332
                break;
333
334
            /* These tokens get their own section, but have no subclauses.
335
             These tokens identify the statement but have no specific subclauses of their own. */
336
            case 'DELETE':
337
            case 'ALTER':
338
            case 'INSERT':
339
            case 'REPLACE':
340
            case 'TRUNCATE':
341
            case 'CREATE':
342
            case 'TRUNCATE':
343
            case 'OPTIMIZE':
344
            case 'GRANT':
345
            case 'REVOKE':
346
            case 'SHOW':
347
            case 'HANDLER':
348
            case 'LOAD':
349
            case 'ROLLBACK':
350
            case 'SAVEPOINT':
351
            case 'UNLOCK':
352
            case 'INSTALL':
353
            case 'UNINSTALL':
354
            case 'ANALZYE':
355
            case 'BACKUP':
356
            case 'CHECK':
357
            case 'CHECKSUM':
358
            case 'REPAIR':
359
            case 'RESTORE':
360
            case 'DESCRIBE':
361
            case 'EXPLAIN':
362
            case 'USE':
363
            case 'HELP':
364
                $token_category = $upper; /* set the category in case these get subclauses
365
                                          in a future version of MySQL */
366
                $out[$upper][0] = $upper;
367
                continue 2;
368
                break;
0 ignored issues
show
Unused Code introduced by
break; does not seem to be reachable.

This check looks for unreachable code. It uses sophisticated control flow analysis techniques to find statements which will never be executed.

Unreachable code is most often the result of return, die or exit statements that have been added for debug purposes.

function fx() {
    try {
        doSomething();
        return true;
    }
    catch (\Exception $e) {
        return false;
    }

    return false;
}

In the above example, the last return false will never be executed, because a return statement has already been met in every possible execution path.

Loading history...
369
370
            case 'CACHE':
371
                if ($prev_category === '' || $prev_category === 'RESET' || $prev_category === 'FLUSH'
372
                        || $prev_category === 'LOAD') {
373
                    $token_category = $upper;
374
                    continue 2;
375
                }
376
                break;
377
378
            /* This is either LOCK TABLES or SELECT ... LOCK IN SHARE MODE*/
379
            case 'LOCK':
380
                if ($token_category === '') {
381
                    $token_category = $upper;
382
                    $out[$upper][0] = $upper;
383
                } else {
384
                    $trim = 'LOCK IN SHARE MODE';
385
                    $skip_next = true;
386
                    $out['OPTIONS'][] = $trim;
387
                }
388
                continue 2;
389
                break;
0 ignored issues
show
Unused Code introduced by
break; does not seem to be reachable.

This check looks for unreachable code. It uses sophisticated control flow analysis techniques to find statements which will never be executed.

Unreachable code is most often the result of return, die or exit statements that have been added for debug purposes.

function fx() {
    try {
        doSomething();
        return true;
    }
    catch (\Exception $e) {
        return false;
    }

    return false;
}

In the above example, the last return false will never be executed, because a return statement has already been met in every possible execution path.

Loading history...
390
391
            case 'USING': /* USING in FROM clause is different from USING w/ prepared statement*/
392
                if ($token_category === 'EXECUTE') {
393
                    $token_category = $upper;
394
                    continue 2;
395
                }
396
                if ($token_category === 'FROM' && !empty($out['DELETE'])) {
397
                    $token_category = $upper;
398
                    continue 2;
399
                }
400
                break;
401
402
            /* DROP TABLE is different from ALTER TABLE DROP ... */
403
            case 'DROP':
404
                if ($token_category !== 'ALTER') {
405
                    $token_category = $upper;
406
                    continue 2;
407
                }
408
                break;
409
410
            case 'FOR':
411
                $skip_next = true;
412
                $out['OPTIONS'][] = 'FOR UPDATE';
413
                continue 2;
414
                break;
0 ignored issues
show
Unused Code introduced by
break; does not seem to be reachable.

This check looks for unreachable code. It uses sophisticated control flow analysis techniques to find statements which will never be executed.

Unreachable code is most often the result of return, die or exit statements that have been added for debug purposes.

function fx() {
    try {
        doSomething();
        return true;
    }
    catch (\Exception $e) {
        return false;
    }

    return false;
}

In the above example, the last return false will never be executed, because a return statement has already been met in every possible execution path.

Loading history...
415
416
            case 'UPDATE':
417
                if ($token_category === '') {
418
                    $token_category = $upper;
419
                    continue 2;
420
                }
421
                if ($token_category === 'DUPLICATE') {
422
                    continue 2;
423
                }
424
                break;
425
426
            case 'START':
427
                $trim = 'BEGIN';
0 ignored issues
show
Unused Code introduced by
$trim is not used, you could remove the assignment.

This check looks for variable assignements that are either overwritten by other assignments or where the variable is not used subsequently.

$myVar = 'Value';
$higher = false;

if (rand(1, 6) > 3) {
    $higher = true;
} else {
    $higher = false;
}

Both the $myVar assignment in line 1 and the $higher assignment in line 2 are dead. The first because $myVar is never used and the second because $higher is always overwritten for every possible time line.

Loading history...
428
                $out[$upper][0] = $upper;
429
                $skip_next = true;
430
                break;
431
432
            /* These tokens are ignored. */
433
            case 'BY':
434
            case 'ALL':
435
            case 'SHARE':
436
            case 'MODE':
437
            case 'TO':
438
            case ';':
439
                continue 2;
440
                break;
0 ignored issues
show
Unused Code introduced by
break; does not seem to be reachable.

This check looks for unreachable code. It uses sophisticated control flow analysis techniques to find statements which will never be executed.

Unreachable code is most often the result of return, die or exit statements that have been added for debug purposes.

function fx() {
    try {
        doSomething();
        return true;
    }
    catch (\Exception $e) {
        return false;
    }

    return false;
}

In the above example, the last return false will never be executed, because a return statement has already been met in every possible execution path.

Loading history...
441
442
            case 'KEY':
443
                if ($token_category === 'DUPLICATE') {
444
                    continue 2;
445
                }
446
                break;
447
448
            /* These tokens set particular options for the statement.  They never stand alone.*/
449
            case 'DISTINCTROW':
0 ignored issues
show
Coding Style introduced by
There must be a comment when fall-through is intentional in a non-empty case body
Loading history...
450
                $trim = 'DISTINCT';
0 ignored issues
show
Unused Code introduced by
$trim is not used, you could remove the assignment.

This check looks for variable assignements that are either overwritten by other assignments or where the variable is not used subsequently.

$myVar = 'Value';
$higher = false;

if (rand(1, 6) > 3) {
    $higher = true;
} else {
    $higher = false;
}

Both the $myVar assignment in line 1 and the $higher assignment in line 2 are dead. The first because $myVar is never used and the second because $higher is always overwritten for every possible time line.

Loading history...
451
            case 'DISTINCT':
452
            case 'HIGH_PRIORITY':
453
            case 'LOW_PRIORITY':
454
            case 'DELAYED':
455
            case 'IGNORE':
456
            case 'FORCE':
457
            case 'STRAIGHT_JOIN':
458
            case 'SQL_SMALL_RESULT':
459
            case 'SQL_BIG_RESULT':
460
            case 'QUICK':
461
            case 'SQL_BUFFER_RESULT':
462
            case 'SQL_CACHE':
463
            case 'SQL_NO_CACHE':
464
            case 'SQL_CALC_FOUND_ROWS':
465
                $out['OPTIONS'][] = $upper;
466
                continue 2;
467
                break;
0 ignored issues
show
Unused Code introduced by
break; does not seem to be reachable.

This check looks for unreachable code. It uses sophisticated control flow analysis techniques to find statements which will never be executed.

Unreachable code is most often the result of return, die or exit statements that have been added for debug purposes.

function fx() {
    try {
        doSomething();
        return true;
    }
    catch (\Exception $e) {
        return false;
    }

    return false;
}

In the above example, the last return false will never be executed, because a return statement has already been met in every possible execution path.

Loading history...
468
469
            case 'WITH':
470
                if ($token_category === 'GROUP') {
471
                    $skip_next = true;
472
                    $out['OPTIONS'][] = 'WITH ROLLUP';
473
                    continue 2;
474
                }
475
                break;
476
477
            case 'AS':
478
                break;
479
480
            case '':
481
            case ',':
482
            case ';':
483
                break;
484
485
            default:
486
                break;
487
            }
488
489
            # remove obsolete category after union (empty category because of
490
            # empty token before select)
491
            if ($token_category !== '' && ($prev_category === $token_category)) {
492
                $out[$token_category][] = $token;
493
            }
494
495
            $prev_category = $token_category;
496
        }
497
498
        return $this->processSQLParts($out);
499
    }
500
501
    private function processSQLParts($out)
502
    {
503
        if (!$out) {
504
            return false;
505
        }
506
        if (!empty($out['SELECT'])) {
507
            $out['SELECT'] = $this->process_select($out['SELECT']);
508
        }
509
        if (!empty($out['FROM'])) {
510
            $out['FROM'] = $this->process_from($out['FROM']);
511
        }
512
        if (!empty($out['USING'])) {
513
            $out['USING'] = $this->process_from($out['USING']);
514
        }
515
        if (!empty($out['UPDATE'])) {
516
            $out['UPDATE'] = $this->processUpdate($out['UPDATE']);
517
        }
518
        if (!empty($out['GROUP'])) {
519
            # set empty array if we have partial SQL statement
520
            $out['GROUP'] = $this->process_group($out['GROUP'], isset($out['SELECT']) ? $out['SELECT'] : array());
521
        }
522
        if (!empty($out['ORDER'])) {
523
            # set empty array if we have partial SQL statement
524
            $out['ORDER'] = $this->process_order($out['ORDER'], isset($out['SELECT']) ? $out['SELECT'] : array());
525
        }
526
        if (!empty($out['LIMIT'])) {
527
            $out['LIMIT'] = $this->process_limit($out['LIMIT']);
528
        }
529
        if (!empty($out['WHERE'])) {
530
            $out['WHERE'] = $this->process_expr_list($out['WHERE']);
531
        }
532
        if (!empty($out['HAVING'])) {
533
            $out['HAVING'] = $this->process_expr_list($out['HAVING']);
534
        }
535
        if (!empty($out['SET'])) {
536
            $out['SET'] = $this->process_set_list($out['SET'], isset($out['UPDATE']));
537
        }
538
        if (!empty($out['DUPLICATE'])) {
539
            $out['ON DUPLICATE KEY UPDATE'] = $this->process_set_list($out['DUPLICATE']);
0 ignored issues
show
Bug introduced by
The call to process_set_list() misses a required argument $isUpdate.

This check looks for function calls that miss required arguments.

Loading history...
540
            unset($out['DUPLICATE']);
541
        }
542
        if (!empty($out['INSERT'])) {
543
            $out = $this->processInsert($out);
544
        }
545
        if (!empty($out['REPLACE'])) {
546
            $out = $this->processReplace($out);
547
        }
548
        if (!empty($out['DELETE'])) {
549
            $out = $this->process_delete($out);
550
        }
551
        if (!empty($out['VALUES'])) {
552
            $out = $this->process_values($out);
553
        }
554
        if (!empty($out['INTO'])) {
555
            $out = $this->processInto($out);
556
        }
557
        if (!empty($out['DROP'])) {
558
            $out['DROP'] = $this->processDrop($out['DROP']);
559
        }
560
561
        return $out;
562
    }
563
564
    /**
565
     *  A SET list is simply a list of key = value expressions separated by comma (,).
566
     *  This function produces a list of the key/value expressions.
567
     */
568
    private function getAssignment($base_expr)
569
    {
570
        $assignment = $this->process_expr_list($this->splitSQLIntoTokens($base_expr));
571
572
        return array('expr_type' => ExpressionType::EXPRESSION, 'base_expr' => trim($base_expr),
573
                     'sub_tree' => $assignment, );
574
    }
575
576
    private function getVariableType($expression)
577
    {
578
        // $expression must contain only upper-case characters
579
        if ($expression[1] !== '@') {
580
            return ExpressionType::USER_VARIABLE;
581
        }
582
583
        $type = substr($expression, 2, strpos($expression, '.', 2));
584
585
        switch ($type) {
586
        case 'GLOBAL':
587
            $type = ExpressionType::GLOBAL_VARIABLE;
588
            break;
589
        case 'LOCAL':
590
            $type = ExpressionType::LOCAL_VARIABLE;
591
            break;
592
        case 'SESSION':
593
        default:
594
            $type = ExpressionType::SESSION_VARIABLE;
595
            break;
596
        }
597
598
        return $type;
599
    }
600
601
    /**
602
     * It can be UPDATE SET or SET alone.
603
     */
604
    private function process_set_list($tokens, $isUpdate)
605
    {
606
        $result = array();
607
        $baseExpr = '';
608
        $assignment = false;
0 ignored issues
show
Unused Code introduced by
$assignment is not used, you could remove the assignment.

This check looks for variable assignements that are either overwritten by other assignments or where the variable is not used subsequently.

$myVar = 'Value';
$higher = false;

if (rand(1, 6) > 3) {
    $higher = true;
} else {
    $higher = false;
}

Both the $myVar assignment in line 1 and the $higher assignment in line 2 are dead. The first because $myVar is never used and the second because $higher is always overwritten for every possible time line.

Loading history...
609
        $varType = false;
610
611
        foreach ($tokens as $token) {
612
            $upper = strtoupper(trim($token));
613
614
            switch ($upper) {
615
            case 'LOCAL':
616
            case 'SESSION':
617
            case 'GLOBAL':
618
                if (!$isUpdate) {
619
                    $varType = $this->getVariableType('@@'.$upper.'.');
620
                    $baseExpr = '';
621
                    continue 2;
622
                }
623
                break;
624
625
            case ',':
626
                $assignment = $this->getAssignment($baseExpr);
627 View Code Duplication
                if (!$isUpdate && $varType !== false) {
0 ignored issues
show
Duplication introduced by
This code seems to be duplicated across your project.

Duplicated code is one of the most pungent code smells. If you need to duplicate the same code in three or more different places, we strongly encourage you to look into extracting the code into a single class or operation.

You can also find more detailed suggestions in the “Code” section of your repository.

Loading history...
628
                    $assignment['sub_tree'][0]['expr_type'] = $varType;
629
                }
630
                $result[] = $assignment;
631
                $baseExpr = '';
632
                $varType = false;
633
                continue 2;
634
635
            default:
636
            }
637
            $baseExpr .= $token;
638
        }
639
640
        if (trim($baseExpr) !== '') {
641
            $assignment = $this->getAssignment($baseExpr);
642 View Code Duplication
            if (!$isUpdate && $varType !== false) {
0 ignored issues
show
Duplication introduced by
This code seems to be duplicated across your project.

Duplicated code is one of the most pungent code smells. If you need to duplicate the same code in three or more different places, we strongly encourage you to look into extracting the code into a single class or operation.

You can also find more detailed suggestions in the “Code” section of your repository.

Loading history...
643
                $assignment['sub_tree'][0]['expr_type'] = $varType;
644
            }
645
            $result[] = $assignment;
646
        }
647
648
        return $result;
649
    }
650
651
    /**
652
     * This function processes the LIMIT section.
653
     * start,end are set.  If only end is provided in the query
654
     * then start is set to 0.
655
     */
656
    private function process_limit($tokens)
657
    {
658
        $rowcount = '';
659
        $offset = '';
660
661
        $comma = -1;
662
        $exchange = false;
663
664
        for ($i = 0; $i < count($tokens); ++$i) {
0 ignored issues
show
Performance Best Practice introduced by
It seems like you are calling the size function count() as part of the test condition. You might want to compute the size beforehand, and not on each iteration.

If the size of the collection does not change during the iteration, it is generally a good practice to compute it beforehand, and not on each iteration:

for ($i=0; $i<count($array); $i++) { // calls count() on each iteration
}

// Better
for ($i=0, $c=count($array); $i<$c; $i++) { // calls count() just once
}
Loading history...
665
            $trim = trim($tokens[$i]);
666
            if ($trim === ',') {
667
                $comma = $i;
668
                break;
669
            }
670
            if ($trim === 'OFFSET') {
671
                $comma = $i;
672
                $exchange = true;
673
                break;
674
            }
675
        }
676
677 View Code Duplication
        for ($i = 0; $i < $comma; ++$i) {
0 ignored issues
show
Duplication introduced by
This code seems to be duplicated across your project.

Duplicated code is one of the most pungent code smells. If you need to duplicate the same code in three or more different places, we strongly encourage you to look into extracting the code into a single class or operation.

You can also find more detailed suggestions in the “Code” section of your repository.

Loading history...
678
            if ($exchange) {
679
                $rowcount .= $tokens[$i];
680
            } else {
681
                $offset .= $tokens[$i];
682
            }
683
        }
684
685 View Code Duplication
        for ($i = $comma + 1; $i < count($tokens); ++$i) {
0 ignored issues
show
Performance Best Practice introduced by
It seems like you are calling the size function count() as part of the test condition. You might want to compute the size beforehand, and not on each iteration.

If the size of the collection does not change during the iteration, it is generally a good practice to compute it beforehand, and not on each iteration:

for ($i=0; $i<count($array); $i++) { // calls count() on each iteration
}

// Better
for ($i=0, $c=count($array); $i<$c; $i++) { // calls count() just once
}
Loading history...
Duplication introduced by
This code seems to be duplicated across your project.

Duplicated code is one of the most pungent code smells. If you need to duplicate the same code in three or more different places, we strongly encourage you to look into extracting the code into a single class or operation.

You can also find more detailed suggestions in the “Code” section of your repository.

Loading history...
686
            if ($exchange) {
687
                $offset .= $tokens[$i];
688
            } else {
689
                $rowcount .= $tokens[$i];
690
            }
691
        }
692
693
        return [
694
            ['expr_type' => 'limit_const', 'base_expr' => trim($offset), 'sub_tree' => false],
695
            ['expr_type' => 'limit_const', 'base_expr' => trim($rowcount), 'sub_tree' => false],
696
        ];
697
    }
698
699
    /**
700
     * This function processes the SELECT section.  It splits the clauses at the commas.
701
     * Each clause is then processed by process_select_expr() and the results are added to
702
     * the expression list.
703
     *
704
     * Finally, at the end, the epxression list is returned.
705
     */
706
    private function process_select(&$tokens)
707
    {
708
        $expression = '';
709
        $expressionList = array();
710
        foreach ($tokens as $token) {
711
            if ($this->isCommaToken($token)) {
712
                $expressionList[] = $this->process_select_expr(trim($expression));
713
                $expression = '';
714
            } else {
715
                $expression .= $token;
716
            }
717
        }
718
        if ($expression) {
719
            $expressionList[] = $this->process_select_expr(trim($expression));
720
        }
721
722
        return $expressionList;
723
    }
724
725
    private function isCommaToken($token)
726
    {
727
        return (trim($token) === ',');
728
    }
729
730
    private function isWhitespaceToken($token)
731
    {
732
        return (trim($token) === '');
733
    }
734
735
    private function isCommentToken($token)
736
    {
737
        return isset($token[0]) && isset($token[1])
738
                && (($token[0] === '-' && $token[1] === '-') || ($token[0] === '/' && $token[1] === '*'));
739
    }
740
741
    private function isColumnReference($out)
742
    {
743
        return (isset($out['expr_type']) && $out['expr_type'] === ExpressionType::COLREF);
744
    }
745
746
    private function isReserved($out)
747
    {
748
        return (isset($out['expr_type']) && $out['expr_type'] === ExpressionType::RESERVED);
749
    }
750
751
    private function isConstant($out)
752
    {
753
        return (isset($out['expr_type']) && $out['expr_type'] === ExpressionType::CONSTANT);
754
    }
755
756
    private function isAggregateFunction($out)
757
    {
758
        return (isset($out['expr_type']) && $out['expr_type'] === ExpressionType::AGGREGATE_FUNCTION);
759
    }
760
761
    private function isFunction($out)
762
    {
763
        return (isset($out['expr_type']) && $out['expr_type'] === ExpressionType::SIMPLE_FUNCTION);
764
    }
765
766
    private function isExpression($out)
767
    {
768
        return (isset($out['expr_type']) && $out['expr_type'] === ExpressionType::EXPRESSION);
769
    }
770
771
    private function isBracketExpression($out)
772
    {
773
        return (isset($out['expr_type']) && $out['expr_type'] === ExpressionType::BRACKET_EXPRESSION);
774
    }
775
776
    private function isSubQuery($out)
777
    {
778
        return (isset($out['expr_type']) && $out['expr_type'] === ExpressionType::SUBQUERY);
779
    }
780
781
    /**
782
     * This fuction processes each SELECT clause.  We determine what (if any) alias
783
     * is provided, and we set the type of expression.
784
     */
785
    private function process_select_expr($expression)
786
    {
787
        $tokens = $this->splitSQLIntoTokens($expression);
788
        $token_count = count($tokens);
789
790
        /* Determine if there is an explicit alias after the AS clause.
791
         If AS is found, then the next non-whitespace token is captured as the alias.
792
         The tokens after (and including) the AS are removed.
793
         */
794
        $base_expr = '';
795
        $stripped = array();
796
        $capture = false;
797
        $alias = false;
798
        $processed = false;
0 ignored issues
show
Unused Code introduced by
$processed is not used, you could remove the assignment.

This check looks for variable assignements that are either overwritten by other assignments or where the variable is not used subsequently.

$myVar = 'Value';
$higher = false;

if (rand(1, 6) > 3) {
    $higher = true;
} else {
    $higher = false;
}

Both the $myVar assignment in line 1 and the $higher assignment in line 2 are dead. The first because $myVar is never used and the second because $higher is always overwritten for every possible time line.

Loading history...
799
800
        for ($i = 0; $i < $token_count; ++$i) {
801
            $token = $tokens[$i];
802
            $upper = strtoupper($token);
803
804
            if ($upper === 'AS') {
805
                $alias = array('as' => true, 'name' => '', 'base_expr' => $token);
806
                $tokens[$i] = '';
807
                $capture = true;
808
                continue;
809
            }
810
811
            if (!$this->isWhitespaceToken($upper)) {
812
                $stripped[] = $token;
813
            }
814
815
            // we have an explicit AS, next one can be the alias
816
            // but also a comment!
817
            if ($capture) {
818
                if (!$this->isWhitespaceToken($upper) && !$this->isCommentToken($upper)) {
819
                    $alias['name'] .= $token;
820
                    array_pop($stripped);
821
                }
822
                $alias['base_expr'] .= $token;
823
                $tokens[$i] = '';
824
                continue;
825
            }
826
827
            $base_expr .= $token;
828
        }
829
830
        $stripped = $this->process_expr_list($stripped);
831
832
        # TODO: the last part can also be a comment, don't use array_pop
833
834
        # we remove the last token, if it is a colref,
835
        # it can be an alias without an AS
836
        $last = array_pop($stripped);
837
        if (!$alias && $this->isColumnReference($last)) {
838
839
            # TODO: it can be a comment, don't use array_pop
840
841
            # check the token before the colref
842
            $prev = array_pop($stripped);
843
844
            if ($this->isReserved($prev) || $this->isConstant($prev) || $this->isAggregateFunction($prev)
845
                    || $this->isFunction($prev) || $this->isExpression($prev) || $this->isSubQuery($prev)
846
                    || $this->isColumnReference($prev) || $this->isBracketExpression($prev)) {
847
                $alias = array('as' => false, 'name' => trim($last['base_expr']),
848
                               'base_expr' => trim($last['base_expr']), );
849
                #remove the last token
850
                array_pop($tokens);
851
                $base_expr = implode('', $tokens);
852
            }
853
        }
854
855
        if (!$alias) {
856
            $base_expr = implode('', $tokens);
857
        } else {
858
            /* remove escape from the alias */
859
            $alias['name'] = $this->revokeEscaping(trim($alias['name']));
860
            $alias['base_expr'] = trim($alias['base_expr']);
861
        }
862
863
        # TODO: this is always done with $stripped, how we do it twice?
864
        $processed = $this->process_expr_list($tokens);
865
866
        # if there is only one part, we copy the expr_type
867
        # in all other cases we use "expression" as global type
868
        $type = ExpressionType::EXPRESSION;
869
        if (count($processed) === 1) {
870
            if (!$this->isSubQuery($processed[0])) {
871
                $type = $processed[0]['expr_type'];
872
                $base_expr = $processed[0]['base_expr'];
873
                $processed = $processed[0]['sub_tree']; // it can be FALSE
874
            }
875
        }
876
877
        return array('expr_type' => $type, 'alias' => $alias, 'base_expr' => trim($base_expr),
878
                     'sub_tree' => $processed, );
879
    }
880
881
    /**
882
     * This method handles the FROM clause.
883
     */
884
    private function process_from(&$tokens)
885
    {
886
        $parseInfo = $this->initParseInfoForFrom();
887
        $expr = array();
888
889
        $skip_next = false;
890
        $i = 0;
891
892
        foreach ($tokens as $token) {
893
            $upper = strtoupper(trim($token));
894
895
            if ($skip_next && $token !== '') {
896
                ++$parseInfo['token_count'];
897
                $skip_next = false;
898
                continue;
899
            } else {
900
                if ($skip_next) {
901
                    continue;
902
                }
903
            }
904
905
            switch ($upper) {
906
            case 'OUTER':
907
            case 'LEFT':
908
            case 'RIGHT':
909
            case 'NATURAL':
910
            case 'CROSS':
911
            case ',':
912
            case 'JOIN':
913
            case 'INNER':
914
                break;
915
916
            default:
917
                $parseInfo['expression'] .= $token;
918
                if ($parseInfo['ref_type'] !== false) { # all after ON / USING
919
                    $parseInfo['ref_expr'] .= $token;
920
                }
921
                break;
922
            }
923
924
            switch ($upper) {
925
            case 'AS':
926
                $parseInfo['alias'] = array('as' => true, 'name' => '', 'base_expr' => $token);
927
                ++$parseInfo['token_count'];
928
                $n = 1;
929
                $str = '';
930
                while ($str == '') {
931
                    $parseInfo['alias']['base_expr'] .= ($tokens[$i + $n] === '' ? ' ' : $tokens[$i + $n]);
932
                    $str = trim($tokens[$i + $n]);
933
                    ++$n;
934
                }
935
                $parseInfo['alias']['name'] = $str;
936
                $parseInfo['alias']['base_expr'] = trim($parseInfo['alias']['base_expr']);
937
                continue;
938
939
            case 'INDEX':
940
                if ($token_category == 'CREATE') {
0 ignored issues
show
Bug introduced by
The variable $token_category does not seem to be defined for all execution paths leading up to this point.

If you define a variable conditionally, it can happen that it is not defined for all execution paths.

Let’s take a look at an example:

function myFunction($a) {
    switch ($a) {
        case 'foo':
            $x = 1;
            break;

        case 'bar':
            $x = 2;
            break;
    }

    // $x is potentially undefined here.
    echo $x;
}

In the above example, the variable $x is defined if you pass “foo” or “bar” as argument for $a. However, since the switch statement has no default case statement, if you pass any other value, the variable $x would be undefined.

Available Fixes

  1. Check for existence of the variable explicitly:

    function myFunction($a) {
        switch ($a) {
            case 'foo':
                $x = 1;
                break;
    
            case 'bar':
                $x = 2;
                break;
        }
    
        if (isset($x)) { // Make sure it's always set.
            echo $x;
        }
    }
    
  2. Define a default value for the variable:

    function myFunction($a) {
        $x = ''; // Set a default which gets overridden for certain paths.
        switch ($a) {
            case 'foo':
                $x = 1;
                break;
    
            case 'bar':
                $x = 2;
                break;
        }
    
        echo $x;
    }
    
  3. Add a value for the missing path:

    function myFunction($a) {
        switch ($a) {
            case 'foo':
                $x = 1;
                break;
    
            case 'bar':
                $x = 2;
                break;
    
            // We add support for the missing case.
            default:
                $x = '';
                break;
        }
    
        echo $x;
    }
    
Loading history...
941
                    $token_category = $upper;
942
                    continue 2;
943
                }
944
945
                break;
946
947
            case 'USING':
948
            case 'ON':
0 ignored issues
show
Coding Style introduced by
There must be a comment when fall-through is intentional in a non-empty case body
Loading history...
949
                $parseInfo['ref_type'] = $upper;
950
                $parseInfo['ref_expr'] = '';
951
952
            case 'CROSS':
953
            case 'USE':
954
            case 'FORCE':
955
            case 'IGNORE':
956
            case 'INNER':
957
            case 'OUTER':
958
                $parseInfo['token_count']++;
959
                continue;
960
                break;
0 ignored issues
show
Unused Code introduced by
break; does not seem to be reachable.

This check looks for unreachable code. It uses sophisticated control flow analysis techniques to find statements which will never be executed.

Unreachable code is most often the result of return, die or exit statements that have been added for debug purposes.

function fx() {
    try {
        doSomething();
        return true;
    }
    catch (\Exception $e) {
        return false;
    }

    return false;
}

In the above example, the last return false will never be executed, because a return statement has already been met in every possible execution path.

Loading history...
961
962
            case 'FOR':
963
                $parseInfo['token_count']++;
964
                $skip_next = true;
965
                continue;
966
                break;
0 ignored issues
show
Unused Code introduced by
break; does not seem to be reachable.

This check looks for unreachable code. It uses sophisticated control flow analysis techniques to find statements which will never be executed.

Unreachable code is most often the result of return, die or exit statements that have been added for debug purposes.

function fx() {
    try {
        doSomething();
        return true;
    }
    catch (\Exception $e) {
        return false;
    }

    return false;
}

In the above example, the last return false will never be executed, because a return statement has already been met in every possible execution path.

Loading history...
967
968
            case 'LEFT':
969
            case 'RIGHT':
970
            case 'STRAIGHT_JOIN':
971
                $parseInfo['next_join_type'] = $upper;
972
                break;
973
974
            case ',':
0 ignored issues
show
Coding Style introduced by
There must be a comment when fall-through is intentional in a non-empty case body
Loading history...
975
                $parseInfo['next_join_type'] = 'CROSS';
976
977
            case 'JOIN':
978
                if ($parseInfo['subquery']) {
979
                    $parseInfo['sub_tree'] = $this->parse($this->removeParenthesisFromStart($parseInfo['subquery']));
980
                    $parseInfo['expression'] = $parseInfo['subquery'];
981
                }
982
983
                $expr[] = $this->processFromExpression($parseInfo);
984
                $parseInfo = $this->initParseInfoForFrom($parseInfo);
985
                break;
986
987
            default:
988
                if ($upper === '') {
989
                    continue; # ends the switch statement!
990
                }
991
992
                if ($parseInfo['token_count'] === 0) {
993
                    if ($parseInfo['table'] === '') {
994
                        $parseInfo['table'] = $token;
995
                    }
996
                } elseif ($parseInfo['token_count'] === 1) {
997
                    $parseInfo['alias'] = array('as' => false, 'name' => trim($token), 'base_expr' => trim($token));
998
                }
999
                ++$parseInfo['token_count'];
1000
                break;
1001
            }
1002
            ++$i;
1003
        }
1004
1005
        $expr[] = $this->processFromExpression($parseInfo);
1006
1007
        return $expr;
1008
    }
1009
1010
    private function initParseInfoForFrom($parseInfo = false)
1011
    {
1012
        # first init
1013
        if ($parseInfo === false) {
1014
            $parseInfo = array('join_type' => '', 'saved_join_type' => 'JOIN');
1015
        }
1016
        # loop init
1017
        return array('expression' => '', 'token_count' => 0, 'table' => '', 'alias' => false, 'join_type' => '',
1018
                     'next_join_type' => '', 'saved_join_type' => $parseInfo['saved_join_type'],
1019
                     'ref_type' => false, 'ref_expr' => false, 'base_expr' => false, 'sub_tree' => false,
1020
                     'subquery' => '', );
1021
    }
1022
1023
    private function processFromExpression(&$parseInfo)
1024
    {
1025
        $res = array();
1026
1027
        # exchange the join types (join_type is save now, saved_join_type holds the next one)
1028
        $parseInfo['join_type'] = $parseInfo['saved_join_type']; # initialized with JOIN
1029
        $parseInfo['saved_join_type'] = ($parseInfo['next_join_type'] ? $parseInfo['next_join_type'] : 'JOIN');
1030
1031
        # we have a reg_expr, so we have to parse it
1032
        if ($parseInfo['ref_expr'] !== false) {
1033
            $unparsed = $this->splitSQLIntoTokens($this->removeParenthesisFromStart($parseInfo['ref_expr']));
1034
1035
            // here we can get a comma separated list
1036
            foreach ($unparsed as $k => $v) {
1037
                if ($this->isCommaToken($v)) {
1038
                    $unparsed[$k] = '';
1039
                }
1040
            }
1041
            $parseInfo['ref_expr'] = $this->process_expr_list($unparsed);
1042
        }
1043
1044
        # there is an expression, we have to parse it
1045
        if (substr(trim($parseInfo['table']), 0, 1) == '(') {
1046
            $parseInfo['expression'] = $this->removeParenthesisFromStart($parseInfo['table']);
1047
1048
            if (preg_match('/^\\s*select/i', $parseInfo['expression'])) {
1049
                $parseInfo['sub_tree'] = $this->parse($parseInfo['expression']);
1050
                $res['expr_type'] = ExpressionType::SUBQUERY;
1051
            } else {
1052
                $tmp = $this->splitSQLIntoTokens($parseInfo['expression']);
1053
                $parseInfo['sub_tree'] = $this->process_from($tmp);
1054
                $res['expr_type'] = ExpressionType::TABLE_EXPRESSION;
1055
            }
1056
        } else {
1057
            $res['expr_type'] = ExpressionType::TABLE;
1058
            $res['table'] = $parseInfo['table'];
1059
        }
1060
1061
        $res['alias'] = $parseInfo['alias'];
1062
        $res['join_type'] = $parseInfo['join_type'];
1063
        $res['ref_type'] = $parseInfo['ref_type'];
1064
        $res['ref_clause'] = $parseInfo['ref_expr'];
1065
        $res['base_expr'] = trim($parseInfo['expression']);
1066
        $res['sub_tree'] = $parseInfo['sub_tree'];
1067
1068
        return $res;
1069
    }
1070
1071
    private function processOrderExpression(&$parseInfo, $select)
1072
    {
1073
        $parseInfo['expr'] = trim($parseInfo['expr']);
1074
1075
        if ($parseInfo['expr'] === '') {
1076
            return false;
1077
        }
1078
1079
        $parseInfo['expr'] = trim($this->revokeEscaping($parseInfo['expr']));
1080
1081
        if (is_numeric($parseInfo['expr'])) {
1082
            $parseInfo['type'] = ExpressionType::POSITION;
1083
        } else {
1084
            #search to see if the expression matches an alias
1085
            foreach ($select as $clause) {
1086
                if (!$clause['alias']) {
1087
                    continue;
1088
                }
1089
                if ($clause['alias']['name'] === $parseInfo['expr']) {
1090
                    $parseInfo['type'] = ExpressionType::ALIAS;
1091
                }
1092
            }
1093
        }
1094
1095
        if ($parseInfo['type'] === ExpressionType::EXPRESSION) {
1096
            $expr = $this->process_select_expr($parseInfo['expr']);
1097
            $expr['direction'] = $parseInfo['dir'];
1098
            unset($expr['alias']);
1099
1100
            return $expr;
1101
        }
1102
1103
        return array('expr_type' => $parseInfo['type'], 'base_expr' => $parseInfo['expr'],
1104
                     'direction' => $parseInfo['dir'], );
1105
    }
1106
1107
    private function initParseInfoForOrder()
1108
    {
1109
        return array('expr' => '', 'dir' => 'ASC', 'type' => ExpressionType::EXPRESSION);
1110
    }
1111
1112
    /**
1113
     * This method handles the ORDER BY clause.
1114
     */
1115
    private function process_order($tokens, $select)
1116
    {
1117
        $out = array();
1118
        $parseInfo = $this->initParseInfoForOrder();
1119
1120
        if (!$tokens) {
1121
            return false;
1122
        }
1123
1124
        foreach ($tokens as $token) {
1125
            $upper = strtoupper(trim($token));
1126
            switch ($upper) {
1127
            case ',':
1128
                $out[] = $this->processOrderExpression($parseInfo, $select);
1129
                $parseInfo = $this->initParseInfoForOrder();
1130
                break;
1131
1132
            case 'DESC':
1133
                $parseInfo['dir'] = 'DESC';
1134
                break;
1135
1136
            case 'ASC':
1137
                $parseInfo['dir'] = 'ASC';
1138
                break;
1139
1140
            default:
1141
                $parseInfo['expr'] .= $token;
1142
1143
            }
1144
        }
1145
1146
        $out[] = $this->processOrderExpression($parseInfo, $select);
1147
1148
        return $out;
1149
    }
1150
1151
    /**
1152
     * This method handles the GROUP BY clause.
1153
     */
1154
    private function process_group($tokens, $select)
1155
    {
1156
        $out = array();
1157
        $parseInfo = $this->initParseInfoForOrder();
1158
1159
        if (!$tokens) {
1160
            return false;
1161
        }
1162
1163
        foreach ($tokens as $token) {
1164
            $trim = strtoupper(trim($token));
1165
            switch ($trim) {
1166
            case ',':
1167
                $parsed = $this->processOrderExpression($parseInfo, $select);
1168
                unset($parsed['direction']);
1169
1170
                $out[] = $parsed;
1171
                $parseInfo = $this->initParseInfoForOrder();
1172
                break;
1173
            default:
1174
                $parseInfo['expr'] .= $token;
1175
1176
            }
1177
        }
1178
1179
        $parsed = $this->processOrderExpression($parseInfo, $select);
1180
        unset($parsed['direction']);
1181
        $out[] = $parsed;
1182
1183
        return $out;
1184
    }
1185
1186
    /**
1187
     * Some sections are just lists of expressions, like the WHERE and HAVING clauses.
1188
     * This function processes these sections.  Recursive.
1189
     */
1190
    private function process_expr_list($tokens)
1191
    {
1192
        $resultList = array();
1193
        $skip_next = false;
1194
        $prev = new ExpressionToken();
1195
1196
        foreach ($tokens as $k => $v) {
1197
            $curr = new ExpressionToken($k, $v);
1198
1199
            if ($curr->isWhitespaceToken()) {
1200
                continue;
1201
            }
1202
1203
            if ($skip_next) {
1204
                # skip the next non-whitespace token
1205
                $skip_next = false;
1206
                continue;
1207
            }
1208
1209
            /* is it a subquery?*/
1210
            if ($curr->isSubQueryToken()) {
1211
                $curr->setSubTree($this->parse($this->removeParenthesisFromStart($curr->getTrim())));
1212
                $curr->setTokenType(ExpressionType::SUBQUERY);
1213
            } elseif ($curr->isEnclosedWithinParenthesis()) {
1214
                /* is it an in-list? */
1215
1216
                $localTokenList = $this->splitSQLIntoTokens($this->removeParenthesisFromStart($curr->getTrim()));
1217
1218
                if ($prev->getUpper() === 'IN') {
1219 View Code Duplication
                    foreach ($localTokenList as $k => $v) {
0 ignored issues
show
Duplication introduced by
This code seems to be duplicated across your project.

Duplicated code is one of the most pungent code smells. If you need to duplicate the same code in three or more different places, we strongly encourage you to look into extracting the code into a single class or operation.

You can also find more detailed suggestions in the “Code” section of your repository.

Loading history...
1220
                        $tmpToken = new ExpressionToken($k, $v);
1221
                        if ($tmpToken->isCommaToken()) {
1222
                            unset($localTokenList[$k]);
1223
                        }
1224
                    }
1225
1226
                    $localTokenList = array_values($localTokenList);
1227
                    $curr->setSubTree($this->process_expr_list($localTokenList));
1228
                    $curr->setTokenType(ExpressionType::IN_LIST);
1229
                } elseif ($prev->getUpper() === 'AGAINST') {
1230
                    $match_mode = false;
1231
                    foreach ($localTokenList as $k => $v) {
1232
                        $tmpToken = new ExpressionToken($k, $v);
1233
                        switch ($tmpToken->getUpper()) {
1234
                        case 'WITH':
1235
                            $match_mode = 'WITH QUERY EXPANSION';
1236
                            break;
1237
                        case 'IN':
1238
                            $match_mode = 'IN BOOLEAN MODE';
1239
                            break;
1240
1241
                        default:
1242
                        }
1243
1244
                        if ($match_mode !== false) {
1245
                            unset($localTokenList[$k]);
1246
                        }
1247
                    }
1248
1249
                    $tmpToken = $this->process_expr_list($localTokenList);
1250
1251
                    if ($match_mode !== false) {
1252
                        $match_mode = new ExpressionToken(0, $match_mode);
1253
                        $match_mode->setTokenType(ExpressionType::MATCH_MODE);
1254
                        $tmpToken[] = $match_mode->toArray();
1255
                    }
1256
1257
                    $curr->setSubTree($tmpToken);
1258
                    $curr->setTokenType(ExpressionType::MATCH_ARGUMENTS);
1259
                    $prev->setTokenType(ExpressionType::SIMPLE_FUNCTION);
1260
                } elseif ($prev->isColumnReference() || $prev->isFunction() || $prev->isAggregateFunction()) {
1261
1262
                    # if we have a colref followed by a parenthesis pair,
1263
                    # it isn't a colref, it is a user-function
1264 View Code Duplication
                    foreach ($localTokenList as $k => $v) {
0 ignored issues
show
Duplication introduced by
This code seems to be duplicated across your project.

Duplicated code is one of the most pungent code smells. If you need to duplicate the same code in three or more different places, we strongly encourage you to look into extracting the code into a single class or operation.

You can also find more detailed suggestions in the “Code” section of your repository.

Loading history...
1265
                        $tmpToken = new ExpressionToken($k, $v);
1266
                        if ($tmpToken->isCommaToken()) {
1267
                            unset($localTokenList[$k]);
1268
                        }
1269
                    }
1270
1271
                    $localTokenList = array_values($localTokenList);
1272
                    $curr->setSubTree($this->process_expr_list($localTokenList));
1273
1274
                    $prev->setSubTree($curr->getSubTree());
1275
                    if ($prev->isColumnReference()) {
1276
                        $prev->setTokenType(ExpressionType::SIMPLE_FUNCTION);
1277
                    }
1278
1279
                    array_pop($resultList);
1280
                    $curr = $prev;
1281
                }
1282
1283
                # we have parenthesis, but it seems to be an expression
1284
                if ($curr->isUnspecified()) {
1285
                    $curr->setSubTree($this->process_expr_list($localTokenList));
1286
                    $curr->setTokenType(ExpressionType::BRACKET_EXPRESSION);
1287
                }
1288
            } elseif ($curr->isVariableToken()) {
1289
                // a variable
1290
                $curr->setTokenType($this->getVariableType($curr->getUpper()));
1291
                $curr->setSubTree(false);
1292
            } else {
1293
                /* it is either an operator, a colref or a constant */
1294
                switch ($curr->getUpper()) {
1295
1296
                case '*':
1297
                    $curr->setSubTree(false); #no subtree
1298
1299
                    # single or first element of expression list -> all-column-alias
1300
                    if (empty($resultList)) {
1301
                        $curr->setTokenType(ExpressionType::COLREF);
1302
                        break;
1303
                    }
1304
1305
                    # if the last token is colref, const or expression
1306
                    # then * is an operator
1307
                    # but if the previous colref ends with a dot, the * is the all-columns-alias
1308
                    if (!$prev->isColumnReference() && !$prev->isConstant() && !$prev->isExpression()
1309
                            && !$prev->isBracketExpression()) {
1310
                        $curr->setTokenType(ExpressionType::COLREF);
1311
                        break;
1312
                    }
1313
1314
                    if ($prev->isColumnReference() && $prev->endsWith('.')) {
1315
                        $prev->addToken('*'); # tablealias dot *
1316
                        continue 2; # skip the current token
1317
                    }
1318
1319
                    $curr->setTokenType(ExpressionType::OPERATOR);
1320
                    break;
1321
1322
                case 'AND':
1323
                case '&&':
1324
                case 'BETWEEN':
1325
                case 'AND':
1326
                case 'BINARY':
1327
                case '&':
1328
                case '~':
1329
                case '|':
1330
                case '^':
1331
                case 'DIV':
1332
                case '/':
1333
                case '<=>':
1334
                case '=':
1335
                case '>=':
1336
                case '>':
1337
                case 'IS':
1338
                case 'NOT':
1339
                case '<<':
1340
                case '<=':
1341
                case '<':
1342
                case 'LIKE':
1343
                case '%':
1344
                case '!=':
1345
                case '<>':
1346
                case 'REGEXP':
1347
                case '!':
1348
                case '||':
1349
                case 'OR':
1350
                case '>>':
1351
                case 'RLIKE':
1352
                case 'SOUNDS':
1353
                case 'XOR':
1354
                case 'IN':
1355
                    $curr->setSubTree(false);
1356
                    $curr->setTokenType(ExpressionType::OPERATOR);
1357
                    break;
1358
1359
                case 'NULL':
1360
                    $curr->setSubTree(false);
1361
                    $curr->setTokenType(ExpressionType::CONSTANT);
1362
                    break;
1363
1364
                case '-':
1365
                case '+':
1366
                // differ between preceding sign and operator
1367
                    $curr->setSubTree(false);
1368
1369
                    if ($prev->isColumnReference() || $prev->isFunction() || $prev->isAggregateFunction()
1370
                            || $prev->isConstant() || $prev->isSubQuery() || $prev->isExpression()
1371
                            || $prev->isBracketExpression()) {
1372
                        $curr->setTokenType(ExpressionType::OPERATOR);
1373
                    } else {
1374
                        $curr->setTokenType(ExpressionType::SIGN);
1375
                    }
1376
                    break;
1377
1378
                default:
1379
                    $curr->setSubTree(false);
1380
1381
                    switch ($curr->getToken(0)) {
0 ignored issues
show
Documentation introduced by
0 is of type integer, but the function expects a boolean.

It seems like the type of the argument is not accepted by the function/method which you are calling.

In some cases, in particular if PHP’s automatic type-juggling kicks in this might be fine. In other cases, however this might be a bug.

We suggest to add an explicit type cast like in the following example:

function acceptsInteger($int) { }

$x = '123'; // string "123"

// Instead of
acceptsInteger($x);

// we recommend to use
acceptsInteger((integer) $x);
Loading history...
1382
                    case "'":
1383
                    case '"':
1384
                    # it is a string literal
1385
                        $curr->setTokenType(ExpressionType::CONSTANT);
1386
                        break;
1387
                    case '`':
1388
                    # it is an escaped colum name
1389
                        $curr->setTokenType(ExpressionType::COLREF);
1390
                        break;
1391
1392
                    default:
1393
                        if (is_numeric($curr->getToken())) {
1394
                            if ($prev->isSign()) {
1395
                                $prev->addToken($curr->getToken()); # it is a negative numeric constant
1396
                                $prev->setTokenType(ExpressionType::CONSTANT);
1397
                                continue 3;
1398
                                # skip current token
1399
                            } else {
1400
                                $curr->setTokenType(ExpressionType::CONSTANT);
1401
                            }
1402
                        } else {
1403
                            $curr->setTokenType(ExpressionType::COLREF);
1404
                        }
1405
                        break;
1406
                    }
1407
                }
1408
            }
1409
1410
            /* is a reserved word? */
1411
            if (!$curr->isOperator() && !$curr->isInList() && !$curr->isFunction() && !$curr->isAggregateFunction()
1412
                    && in_array($curr->getUpper(), parent::$reserved)) {
1413
                if (in_array($curr->getUpper(), parent::$aggregateFunctions)) {
1414
                    $curr->setTokenType(ExpressionType::AGGREGATE_FUNCTION);
1415
                } elseif ($curr->getUpper() === 'NULL') {
1416
                    // it is a reserved word, but we would like to set it as constant
1417
                    $curr->setTokenType(ExpressionType::CONSTANT);
1418
                } else {
1419
                    if (in_array($curr->getUpper(), parent::$parameterizedFunctions)) {
1420
                        // issue 60: check functions with parameters
1421
                        // -> colref (we check parameters later)
1422
                        // -> if there is no parameter, we leave the colref
1423
                        $curr->setTokenType(ExpressionType::COLREF);
1424
                    } elseif (in_array($curr->getUpper(), parent::$functions)) {
1425
                        $curr->setTokenType(ExpressionType::SIMPLE_FUNCTION);
1426
                    } else {
1427
                        $curr->setTokenType(ExpressionType::RESERVED);
1428
                    }
1429
                }
1430
            }
1431
1432
            if ($curr->isUnspecified()) {
1433
                $curr->setTokenType(ExpressionType::EXPRESSION);
1434
                $curr->setSubTree($this->process_expr_list($this->splitSQLIntoTokens($curr->getTrim())));
1435
            }
1436
1437
            $resultList[] = $curr;
1438
            $prev = $curr;
1439
        } // end of for-loop
1440
1441
        return $this->toArray($resultList);
1442
    }
1443
1444
    /**
1445
     * This method processes UPDATE statements.
1446
     */
1447
    private function processUpdate($tokenList)
1448
    {
1449
        return $this->process_from($tokenList);
1450
    }
1451
1452
    /**
1453
     * This method handles DELETE statements.
1454
     */
1455
    private function process_delete($tokens)
1456
    {
1457
        $tables = array();
1458
        $del = $tokens['DELETE'];
0 ignored issues
show
Unused Code introduced by
$del is not used, you could remove the assignment.

This check looks for variable assignements that are either overwritten by other assignments or where the variable is not used subsequently.

$myVar = 'Value';
$higher = false;

if (rand(1, 6) > 3) {
    $higher = true;
} else {
    $higher = false;
}

Both the $myVar assignment in line 1 and the $higher assignment in line 2 are dead. The first because $myVar is never used and the second because $higher is always overwritten for every possible time line.

Loading history...
1459
1460
        foreach ($tokens['DELETE'] as $expression) {
1461
            if ($expression !== 'DELETE' && trim($expression, ' .*') !== '' && !$this->isCommaToken($expression)) {
1462
                $tables[] = trim($expression, '.* ');
1463
            }
1464
        }
1465
1466
        if (empty($tables)) {
1467
            foreach ($tokens['FROM'] as $table) {
1468
                $tables[] = $table['table'];
1469
            }
1470
        }
1471
1472
        $tokens['DELETE'] = array('TABLES' => $tables);
1473
1474
        return $tokens;
1475
    }
1476
1477
    /**
1478
     * This method handles REPLACE statements.
1479
     */
1480
    private function processReplace($tokenList)
1481
    {
1482
        return $this->processInsert($tokenList, 'REPLACE');
0 ignored issues
show
Unused Code introduced by
The call to SQLParser::processInsert() has too many arguments starting with 'REPLACE'.

This check compares calls to functions or methods with their respective definitions. If the call has more arguments than are defined, it raises an issue.

If a function is defined several times with a different number of parameters, the check may pick up the wrong definition and report false positives. One codebase where this has been known to happen is Wordpress.

In this case you can add the @ignore PhpDoc annotation to the duplicate definition and it will be ignored.

Loading history...
1483
    }
1484
1485
    /**
1486
     * This method handles INSERT statements.
1487
     */
1488
    private function processInsert($tokenList)
1489
    {
1490
        return $this->processInsertOrReplace($tokenList, 'INSERT');
1491
    }
1492
1493
    /**
1494
     * This method handles INSERT and REPLACE statements.
1495
     */
1496
    private function processInsertOrReplace($tokenList, $token_category)
1497
    {
1498
        $table = '';
1499
        $cols = array();
1500
1501
        $into = $tokenList['INTO'];
1502
        foreach ($into as $token) {
1503
            if ($this->isWhitespaceToken($token)) {
1504
                continue;
1505
            }
1506
            if ($table === '') {
1507
                $table = $token;
1508
            } elseif (empty($cols)) {
1509
                $cols[] = $token;
1510
            }
1511
        }
1512
1513
        if (empty($cols)) {
1514
            $cols = false;
1515
        } else {
1516
            $columns = explode(',', $this->removeParenthesisFromStart($cols[0]));
1517
            $cols = array();
1518
            foreach ($columns as $k => $v) {
1519
                $cols[] = array('expr_type' => ExpressionType::COLREF, 'base_expr' => trim($v));
1520
            }
1521
        }
1522
1523
        unset($tokenList['INTO']);
1524
        $tokenList[$token_category][0] = array('table' => $table, 'columns' => $cols, 'base_expr' => $table);
1525
1526
        return $tokenList;
1527
    }
1528
1529
    private function process_record($unparsed)
1530
    {
1531
        $unparsed = $this->removeParenthesisFromStart($unparsed);
1532
        $values = $this->splitSQLIntoTokens($unparsed);
1533
1534
        foreach ($values as $k => $v) {
1535
            if ($this->isCommaToken($v)) {
1536
                $values[$k] = '';
1537
            }
1538
        }
1539
1540
        return $this->process_expr_list($values);
1541
    }
1542
1543
    /**
1544
     * This method handles VALUES parts (from INSERT).
1545
     */
1546
    private function process_values($tokens)
1547
    {
1548
        $unparsed = '';
1549
        foreach ($tokens['VALUES'] as $k => $v) {
1550
            if ($this->isWhitespaceToken($v)) {
1551
                continue;
1552
            }
1553
            $unparsed .= $v;
1554
        }
1555
1556
        $values = $this->splitSQLIntoTokens($unparsed);
1557
1558
        $parsed = array();
0 ignored issues
show
Unused Code introduced by
$parsed is not used, you could remove the assignment.

This check looks for variable assignements that are either overwritten by other assignments or where the variable is not used subsequently.

$myVar = 'Value';
$higher = false;

if (rand(1, 6) > 3) {
    $higher = true;
} else {
    $higher = false;
}

Both the $myVar assignment in line 1 and the $higher assignment in line 2 are dead. The first because $myVar is never used and the second because $higher is always overwritten for every possible time line.

Loading history...
1559
        foreach ($values as $k => $v) {
1560
            if ($this->isCommaToken($v)) {
1561
                unset($values[$k]);
1562
            } else {
1563
                $values[$k] = array('expr_type' => ExpressionType::RECORD, 'base_expr' => $v,
1564
                                    'data' => $this->process_record($v), );
1565
            }
1566
        }
1567
1568
        $tokens['VALUES'] = array_values($values);
1569
1570
        return $tokens;
1571
    }
1572
1573
    /**
1574
     * TODO: This is a dummy function, we cannot parse INTO as part of SELECT
1575
     * at the moment.
1576
     */
1577
    private function processInto($tokenList)
1578
    {
1579
        $unparsed = $tokenList['INTO'];
1580
        foreach ($unparsed as $k => $token) {
1581
            if ($this->isWhitespaceToken($token) || $this->isCommaToken($token)) {
1582
                unset($unparsed[$k]);
1583
            }
1584
        }
1585
        $tokenList['INTO'] = array_values($unparsed);
1586
1587
        return $tokenList;
1588
    }
1589
1590
    private function processDrop($tokenList)
1591
    {
1592
        $skip = false;
1593
        $warning = true;
1594
        $base_expr = '';
1595
        $expr_type = false;
1596
        $option = false;
1597
        $resultList = array();
1598
1599
        foreach ($tokenList as $k => $v) {
1600
            $token = new ExpressionToken($k, $v);
1601
1602
            if ($token->isWhitespaceToken()) {
1603
                continue;
1604
            }
1605
1606
            if ($skip === true) {
1607
                $skip = false;
1608
                continue;
1609
            }
1610
1611
            switch ($token->getUpper()) {
1612
            case 'VIEW':
1613
            case 'SCHEMA':
1614
            case 'DATABASE':
1615
            case 'TABLE':
1616
                $expr_type = strtolower($token->getTrim());
1617
                break;
1618
1619
            case 'IF':
1620
                $warning = false;
1621
                $skip = true;
1622
                break;
1623
1624
            case 'TEMPORARY':
1625
                $expr_type = ExpressionType::TEMPORARY_TABLE;
1626
                $skip = true;
1627
                break;
1628
1629
            case 'RESTRICT':
1630
            case 'CASCADE':
1631
                $option = $token->getUpper();
1632
                break;
1633
1634
            case ',':
1635
                $resultList[] = array('expr_type' => $expr_type, 'base_expr' => $base_expr);
1636
                $base_expr = '';
1637
                break;
1638
1639
            default:
1640
                $base_expr .= $token->getToken();
1641
            }
1642
        }
1643
1644
        if ($base_expr !== '') {
1645
            $resultList[] = array('expr_type' => $expr_type, 'base_expr' => $base_expr);
1646
        }
1647
1648
        return array('option' => $option, 'warning' => $warning, 'object_list' => $resultList);
1649
    }
1650
}
1651