PHPSQLLexer   C

Complexity

Total Complexity 69

Size/Duplication

Total Lines 326
Duplicated Lines 10.12 %

Coupling/Cohesion

Components 1
Dependencies 3

Importance

Changes 2
Bugs 0 Features 0
Metric Value
wmc 69
c 2
b 0
f 0
lcom 1
cbo 3
dl 33
loc 326
rs 5.6445

11 Methods

Rating   Name   Duplication   Size   Complexity  
A __construct() 0 4 1
C split() 0 48 7
C concatUserDefinedVariables() 7 31 7
D concatInlineComments() 17 38 10
C balanceMultilineComments() 9 39 11
A isBacktick() 0 4 3
A balanceBackticks() 0 21 4
B balanceCharacter() 0 23 4
C concatColReferences() 0 47 11
A concatEscapeSequences() 0 17 4
C balanceParenthesis() 0 30 7

How to fix

Duplicated Code

Duplicate code is one of the most pungent code smells. A common rule of thumb is to restructure code once it is duplicated in three or more places.
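As an illustration of that rule, concatInlineComments() and balanceMultilineComments() in the listing further down each spell out the same two-token lookahead (a '-' followed by '-', a '/' followed by '*'). One way to remove that duplication is a shared helper. This is only a sketch: the function name mergePairAt is hypothetical and not part of PHPSQLLexer.

```php
<?php

/**
 * Hypothetical helper, not part of PHPSQLLexer: if the token at $i equals
 * $first and the next token equals $second, merge both into one token at $i
 * and remove the second. Returns true when a merge happened.
 */
function mergePairAt(array &$tokens, int $i, string $first, string $second): bool
{
    if (($tokens[$i] ?? null) !== $first || ($tokens[$i + 1] ?? null) !== $second) {
        return false;
    }

    $tokens[$i] = $first . $second;
    unset($tokens[$i + 1]);

    return true;
}

// Both comment openers go through the same code path.
$inline = ['a', '-', '-', 'note'];
mergePairAt($inline, 1, '-', '-');   // $inline[1] is now '--'

$multi = ['/', '*', 'x', '*', '/'];
mergePairAt($multi, 0, '/', '*');    // $multi[0] is now '/*'
```

Like the original methods, the helper leaves a gap in the array keys, so a caller would still finish with array_values().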

Common duplication problems, and corresponding solutions are:

Complex Class

Tip: Before tackling complexity, make sure that you eliminate any duplication first. This can often reduce the size of classes significantly.

Complex classes like PHPSQLLexer often do a lot of different things. To break such a class down, we need to identify a cohesive component within that class. A common approach to finding such a component is to look for fields/methods that share the same prefixes or suffixes. You can also look at the cohesion graph to spot any unconnected or weakly connected components.

Once you have determined the fields that belong together, you can apply the Extract Class refactoring. If the component makes sense as a sub-class, Extract Subclass is also a candidate, and is often faster.

While breaking up the class, it is a good idea to analyze how other classes use PHPSQLLexer, and based on these observations, apply Extract Interface, too.
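The refactorings named above can be sketched as follows. In PHPSQLLexer the concat*/balance* prefixes suggest that each post-processing step could become its own small class behind a shared interface. All names here (TokenPass, UserDefinedVariablePass, PassPipeline) are hypothetical and chosen for illustration. Only the merging rule mirrors concatUserDefinedVariables() from the listing.

```php
<?php

// Hypothetical interface, extracted from how split() chains its passes:
// every pass takes a token list and returns a new token list.
interface TokenPass
{
    public function apply(array $tokens): array;
}

// Extracted class: same merging rule as concatUserDefinedVariables().
final class UserDefinedVariablePass implements TokenPass
{
    public function apply(array $tokens): array
    {
        $userdef = false;
        foreach ($tokens as $i => $token) {
            if ($userdef !== false) {
                // Glue the token onto the variable that started at $userdef.
                $tokens[$userdef] .= $token;
                unset($tokens[$i]);
                if ($token !== '@') {
                    $userdef = false;
                }
            } elseif ($token === '@') {
                $userdef = $i;
            }
        }

        return array_values($tokens);
    }
}

// Hypothetical runner that replaces the chain of private method calls.
final class PassPipeline
{
    /** @var TokenPass[] */
    private $passes;

    public function __construct(TokenPass ...$passes)
    {
        $this->passes = $passes;
    }

    public function run(array $tokens): array
    {
        foreach ($this->passes as $pass) {
            $tokens = $pass->apply($tokens);
        }

        return $tokens;
    }
}

$pipeline = new PassPipeline(new UserDefinedVariablePass());
$result = $pipeline->run(['SELECT', ' ', '@', '@', 'version']);
// $result is ['SELECT', ' ', '@@version']
```

With this shape each pass can be tested in isolation, and the order of passes becomes data rather than seven hard-coded lines in split().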

<?php

/**
 * lexer.php.
 *
 * This file contains the lexer, which splits the SQL statement just before parsing.
 *
 * Copyright (c) 2010-2012, Justin Swanhart
 * with contributions by André Rothe <[email protected], [email protected]>
 *
 * All rights reserved.
 *
 * Redistribution and use in source and binary forms, with or without modification,
 * are permitted provided that the following conditions are met:
 *
 *   * Redistributions of source code must retain the above copyright notice,
 *     this list of conditions and the following disclaimer.
 *   * Redistributions in binary form must reproduce the above copyright notice,
 *     this list of conditions and the following disclaimer in the documentation
 *     and/or other materials provided with the distribution.
 *
 * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY
 * EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
 * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT
 * SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
 * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED
 * TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
 * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
 * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN
 * ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH
 * DAMAGE.
 */
namespace SQLParser;

/**
 * This class splits the SQL string into little parts, which the parser can
 * use to build the result array.
 *
 * @author arothe
 */
class PHPSQLLexer extends PHPSQLParserUtils
{
    private $splitters;

    public function __construct()
    {
        $this->splitters = new LexerSplitter();
    }

    public function split($sql)
    {
        if (!is_string($sql)) {
            throw new InvalidParameterException($sql);
        }

        $tokens = array();
        $token = '';

        $splitLen = $this->splitters->getMaxLengthOfSplitter();
        $found = false;

Unused Code: $found is not used, you could remove the assignment.

This check looks for variable assignments that are either overwritten by other assignments or where the variable is not used subsequently.

$myVar = 'Value';
$higher = false;

if (rand(1, 6) > 3) {
    $higher = true;
} else {
    $higher = false;
}

Both the $myVar assignment in line 1 and the $higher assignment in line 2 are dead: the first because $myVar is never used, and the second because $higher is always overwritten on every possible execution path.
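
For comparison, a version of the sample above with both dead assignments removed might look like the sketch below. In split() itself the corresponding fix is simply to delete the $found = false; line, since nothing reads $found afterwards.

```php
<?php

// $myVar is dropped because nothing reads it; $higher is assigned exactly once.
$higher = rand(1, 6) > 3;
```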

        $len = strlen($sql);
        $pos = 0;

        while ($pos < $len) {
            for ($i = $splitLen; $i > 0; --$i) {
                $substr = substr($sql, $pos, $i);
                if ($this->splitters->isSplitter($substr)) {
                    if ($token !== '') {
                        $tokens[] = $token;
                    }

                    $tokens[] = $substr;
                    $pos += $i;
                    $token = '';

                    continue 2;
                }
            }

            $token .= $sql[$pos];
            ++$pos;
        }

        if ($token !== '') {
            $tokens[] = $token;
        }

        $tokens = $this->concatEscapeSequences($tokens);
        $tokens = $this->balanceBackticks($tokens);
        $tokens = $this->concatColReferences($tokens);
        $tokens = $this->balanceParenthesis($tokens);
        $tokens = $this->balanceMultilineComments($tokens);
        $tokens = $this->concatInlineComments($tokens);
        $tokens = $this->concatUserDefinedVariables($tokens);

        return $tokens;
    }

    private function concatUserDefinedVariables($tokens)
    {
        $i = 0;
        $cnt = count($tokens);
        $userdef = false;

        while ($i < $cnt) {
            if (!isset($tokens[$i])) {
                ++$i;
                continue;
            }

            $token = $tokens[$i];

            if ($userdef !== false) {

Duplication: This code seems to be duplicated across your project.

Duplicated code is one of the most pungent code smells. If you need to duplicate the same code in three or more different places, we strongly encourage you to look into extracting the code into a single class or operation.

You can also find more detailed suggestions in the “Code” section of your repository.

                $tokens[$userdef] .= $token;
                unset($tokens[$i]);
                if ($token !== '@') {
                    $userdef = false;
                }
            }

            if ($userdef === false && $token === '@') {
                $userdef = $i;
            }

            ++$i;
        }

        return array_values($tokens);
    }

    private function concatInlineComments($tokens)
    {
        $i = 0;
        $cnt = count($tokens);
        $comment = false;

        while ($i < $cnt) {
            if (!isset($tokens[$i])) {
                ++$i;
                continue;
            }

            $token = $tokens[$i];

            if ($comment !== false) {

Duplication: This code seems to be duplicated across your project.

                if ($token === "\n" || $token === "\r\n") {
                    $comment = false;
                } else {
                    unset($tokens[$i]);
                    $tokens[$comment] .= $token;
                }
            }

            if (($comment === false) && ($token === '-')) {

Duplication: This code seems to be duplicated across your project.

                if (isset($tokens[$i + 1]) && $tokens[$i + 1] === '-') {
                    $comment = $i;
                    $tokens[$i] = '--';
                    ++$i;
                    unset($tokens[$i]);
                    continue;
                }
            }

            ++$i;
        }

        return array_values($tokens);
    }

    private function balanceMultilineComments($tokens)
    {
        $i = 0;
        $cnt = count($tokens);
        $comment = false;

        while ($i < $cnt) {
            if (!isset($tokens[$i])) {
                ++$i;
                continue;
            }

            $token = $tokens[$i];

            if ($comment !== false) {
                unset($tokens[$i]);
                $tokens[$comment] .= $token;
                if ($token === '*' && isset($tokens[$i + 1]) && $tokens[$i + 1] === '/') {
                    unset($tokens[$i + 1]);
                    $tokens[$comment] .= '/';
                    $comment = false;
                }
            }

            if (($comment === false) && ($token === '/')) {

Duplication: This code seems to be duplicated across your project.

                if (isset($tokens[$i + 1]) && $tokens[$i + 1] === '*') {
                    $comment = $i;
                    $tokens[$i] = '/*';
                    ++$i;
                    unset($tokens[$i]);
                    continue;
                }
            }

            ++$i;
        }

        return array_values($tokens);
    }

    private function isBacktick($token)
    {
        return ($token === "'" || $token === '"' || $token === '`');
    }

    private function balanceBackticks($tokens)
    {
        $i = 0;
        $cnt = count($tokens);
        while ($i < $cnt) {
            if (!isset($tokens[$i])) {
                ++$i;
                continue;
            }

            $token = $tokens[$i];

            if ($this->isBacktick($token)) {
                $tokens = $this->balanceCharacter($tokens, $i, $token);
            }

            ++$i;
        }

        return $tokens;
    }

    # backticks are not balanced within one token, so we have
    # to re-combine some tokens
    private function balanceCharacter($tokens, $idx, $char)
    {
        $token_count = count($tokens);
        $i = $idx + 1;
        while ($i < $token_count) {
            if (!isset($tokens[$i])) {
                ++$i;
                continue;
            }

            $token = $tokens[$i];
            $tokens[$idx] .= $token;
            unset($tokens[$i]);

            if ($token === $char) {
                break;
            }

            ++$i;
        }

        return array_values($tokens);
    }

    /*
     * does the token end with a dot?
     * concat it with the next token
     *
     * does the token start with a dot?
     * concat it with the previous token
     */
    private function concatColReferences($tokens)
    {
        $cnt = count($tokens);
        $i = 0;
        while ($i < $cnt) {
            if (!isset($tokens[$i])) {
                ++$i;
                continue;
            }

            if ($tokens[$i][0] === '.') {

                // concat the previous tokens, till the token has been changed
                $k = $i - 1;
                $len = strlen($tokens[$i]);
                while (($k >= 0) && ($len == strlen($tokens[$i]))) {
                    if (!isset($tokens[$k])) { # FIXME: this can be wrong if we have schema . table . column
                        --$k;
                        continue;
                    }
                    $tokens[$i] = $tokens[$k].$tokens[$i];
                    unset($tokens[$k]);
                    --$k;
                }
            }

            if ($this->endsWith($tokens[$i], '.')) {

                // concat the next tokens, till the token has been changed
                $k = $i + 1;
                $len = strlen($tokens[$i]);
                while (($k < $cnt) && ($len == strlen($tokens[$i]))) {
                    if (!isset($tokens[$k])) {
                        ++$k;
                        continue;
                    }
                    $tokens[$i] .= $tokens[$k];
                    unset($tokens[$k]);
                    ++$k;
                }
            }

            ++$i;
        }

        return array_values($tokens);
    }

    private function concatEscapeSequences($tokens)
    {
        $tokenCount = count($tokens);
        $i = 0;
        while ($i < $tokenCount) {
            if ($this->endsWith($tokens[$i], '\\')) {
                ++$i;
                if (isset($tokens[$i])) {
                    $tokens[$i - 1] .= $tokens[$i];
                    unset($tokens[$i]);
                }
            }
            ++$i;
        }

        return array_values($tokens);
    }

    private function balanceParenthesis($tokens)
    {
        $token_count = count($tokens);
        $i = 0;
        while ($i < $token_count) {
            if ($tokens[$i] !== '(') {
                ++$i;
                continue;
            }
            $count = 1;
            for ($n = $i + 1; $n < $token_count; ++$n) {
                $token = $tokens[$n];
                if ($token === '(') {
                    ++$count;
                }
                if ($token === ')') {
                    --$count;
                }
                $tokens[$i] .= $token;
                unset($tokens[$n]);
                if ($count === 0) {
                    ++$n;
                    break;
                }
            }
            $i = $n;
        }

        return array_values($tokens);
    }
}
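
The scanning loop in split() tries the widest candidate substring first and shrinks the window until a splitter matches, so a two-character splitter wins over its one-character prefix. The sketch below isolates that loop as a plain function. LexerSplitter is not shown in this listing, so the splitter list is a made-up stand-in, and splitOnSplitters is a hypothetical name.

```php
<?php

/**
 * Longest-match tokenizer modelled on the loop in split().
 * $splitters is an assumed stand-in for whatever LexerSplitter provides;
 * it must contain at least one entry.
 */
function splitOnSplitters(string $sql, array $splitters): array
{
    $maxLen = max(array_map('strlen', $splitters));
    $tokens = [];
    $token = '';
    $pos = 0;
    $len = strlen($sql);

    while ($pos < $len) {
        // Try the widest window first so '>=' is preferred over '>'.
        for ($i = $maxLen; $i > 0; --$i) {
            $substr = substr($sql, $pos, $i);
            if (in_array($substr, $splitters, true)) {
                if ($token !== '') {
                    $tokens[] = $token;
                }
                $tokens[] = $substr;
                $pos += $i;
                $token = '';
                continue 2;
            }
        }

        // No splitter starts here: the character belongs to the current token.
        $token .= $sql[$pos];
        ++$pos;
    }

    if ($token !== '') {
        $tokens[] = $token;
    }

    return $tokens;
}

$tokens = splitOnSplitters('a>=b', ['>', '=', '>=', ' ']);
// $tokens is ['a', '>=', 'b']
```

Trying the shorter window first would instead emit '>' and '=' as two tokens, which is presumably why the loop counts down from the maximum splitter length.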