Tokenizer   F

Complexity

Total Complexity 98

Size/Duplication

Total Lines 372
Duplicated Lines 0 %

Coupling/Cohesion

Components 1
Dependencies 3

Test Coverage

Coverage 96.35%

Importance

Changes 0
Metric Value
wmc 98
lcom 1
cbo 3
dl 0
loc 372
ccs 211
cts 219
cp 0.9635
rs 2
c 0
b 0
f 0

21 Methods

Rating   Name   Duplication   Size   Complexity  
A initTokenizer() 0 5 1
A checkFragment() 0 18 3
B getKeyword() 0 27 8
A expect() 0 8 2
A match() 0 4 1
A createException() 0 4 1
A getColumn() 0 4 1
A getLine() 0 4 1
A end() 0 4 1
A peek() 0 4 1
A lex() 0 7 1
A createUnexpectedException() 0 4 1
A createUnexpectedTokenTypeException() 0 4 1
C skipWhitespace() 0 31 15
B scanWord() 0 19 10
A skipInteger() 0 11 4
A next() 0 6 1
D scan() 0 82 25
A scanNumber() 0 24 5
A getLocation() 0 4 1
C scanString() 0 56 14

How to fix   Complexity   

Complex Class

Complex classes like Tokenizer often do many different things. To break such a class down, first identify a cohesive component within it. A common way to find one is to look for fields and methods that share a prefix or suffix (in this class, for example, the scan*, skip* and create*Exception methods). You can also inspect the cohesion graph to spot unconnected or weakly connected components.

Once you have determined which fields belong together, you can apply the Extract Class refactoring. If the component makes sense as a subclass, Extract Subclass is also a candidate and is often quicker to apply.
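As a rough illustration of Extract Class, the sketch below moves the position-tracking fields ($pos, $line, $lineStart) into a small class of their own. SourceCursor and its method names are hypothetical and not part of the library; only the three fields and the getLine()/getColumn() arithmetic are taken from the listing further down.

```php
<?php

/**
 * Hypothetical class extracted from Tokenizer: it owns the position-tracking
 * state ($pos, $line, $lineStart) and nothing else.
 */
final class SourceCursor
{
    private $source;
    private $pos = 0;
    private $line = 1;
    private $lineStart = 0;

    public function __construct($source)
    {
        $this->source = $source;
    }

    /** True once every byte of the source has been consumed. */
    public function atEnd()
    {
        return $this->pos >= strlen($this->source);
    }

    /** Current byte, or null at end of input. */
    public function current()
    {
        return $this->atEnd() ? null : $this->source[$this->pos];
    }

    /** Moves one byte forward without touching the line bookkeeping. */
    public function advance()
    {
        $this->pos++;
    }

    /** Records that a line terminator has just been consumed. */
    public function newLine()
    {
        $this->line++;
        $this->lineStart = $this->pos;
    }

    public function getLine()
    {
        return $this->line;
    }

    public function getColumn()
    {
        return $this->pos - $this->lineStart;
    }
}

$cursor = new SourceCursor("a\nb");
$cursor->advance(); // consume "a"
$cursor->advance(); // consume "\n"
$cursor->newLine(); // the cursor now sits at the start of line 2
```

With a class along these lines, Tokenizer would hold one cursor object instead of four loosely related fields, and the scan* and skip* methods would call it rather than manipulating $pos directly.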

While breaking up the class, it is a good idea to analyze how other classes use Tokenizer, and based on these observations, apply Extract Interface, too.
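A minimal sketch of what Extract Interface could look like here, assuming callers only need the look-ahead operations peek(), lex(), match() and end(). TokenStream, ArrayTokenStream and countTokens() are hypothetical names used for illustration. In the listing below these methods are protected, so they would have to become public before an interface could declare them.

```php
<?php

/** Hypothetical interface: the token-stream operations a parser consumes. */
interface TokenStream
{
    public function peek();

    public function lex();

    public function match($type);

    public function end();
}

/** Toy in-memory implementation, used only to exercise the interface. */
final class ArrayTokenStream implements TokenStream
{
    private $types;
    private $index = 0;

    public function __construct(array $types)
    {
        $this->types   = array_values($types);
        $this->types[] = 'END'; // sentinel marking the end of input
    }

    /** Returns the current token type without consuming it. */
    public function peek()
    {
        return $this->types[$this->index];
    }

    /** Returns the current token type and moves on, never past the sentinel. */
    public function lex()
    {
        $current = $this->types[$this->index];
        if (!$this->end()) {
            $this->index++;
        }

        return $current;
    }

    public function match($type)
    {
        return $this->peek() === $type;
    }

    public function end()
    {
        return $this->peek() === 'END';
    }
}

/** Client code depends on the interface only, not on a concrete tokenizer. */
function countTokens(TokenStream $stream)
{
    $count = 0;
    while (!$stream->end()) {
        $stream->lex();
        $count++;
    }

    return $count;
}

$count = countTokens(new ArrayTokenStream(['{', 'IDENTIFIER', '}']));
```

Code written against such an interface can be tested with a stub stream like the one above, without running the real scanner.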

<?php
/**
 * Date: 23.11.15
 *
 * @author Portey Vasil <[email protected]>
 */

namespace Youshido\GraphQL\Parser;

use Youshido\GraphQL\Exception\Parser\SyntaxErrorException;

class Tokenizer
{
    protected $source;
    protected $pos = 0;
    protected $line = 1;
    protected $lineStart = 0;

    /** @var  Token */
    protected $lookAhead;

    protected function initTokenizer($source)
    {
        $this->source    = $source;
        $this->lookAhead = $this->next();
    }

    /**
     * @return Token
     */
    protected function next()
    {
        $this->skipWhitespace();

        return $this->scan();
    }

    protected function skipWhitespace()
    {
        while ($this->pos < strlen($this->source)) {
            $ch = $this->source[$this->pos];
            if ($ch === ' ' || $ch === "\t" || $ch === ',') {
                $this->pos++;
            } elseif ($ch === '#') {
                // Comment: skip up to, but not including, the line terminator.
                // ord() yields a single byte (0-255), so only LF (10) and CR (13)
                // can match here; a NUL byte also ends the loop.
                $this->pos++;
                while (
                    $this->pos < strlen($this->source) &&
                    ($code = ord($this->source[$this->pos])) &&
                    $code !== 10 && $code !== 13
                ) {
                    $this->pos++;
                }
            } elseif ($ch === "\r") {
                $this->pos++;
                // Treat "\r\n" as one terminator; isset() guards against "\r" at end of input.
                if (isset($this->source[$this->pos]) && $this->source[$this->pos] === "\n") {
                    $this->pos++;
                }
                $this->line++;
                $this->lineStart = $this->pos;
            } elseif ($ch === "\n") {
                $this->pos++;
                $this->line++;
                $this->lineStart = $this->pos;
            } else {
                break;
            }
        }
    }

    /**
     * @return Token
     *
     * @throws SyntaxErrorException
     */
    protected function scan()
    {
        if ($this->pos >= strlen($this->source)) {
            return new Token(Token::TYPE_END, $this->getLine(), $this->getColumn());
        }

        $ch = $this->source[$this->pos];
        switch ($ch) {
            case Token::TYPE_LPAREN:
                ++$this->pos;

                return new Token(Token::TYPE_LPAREN, $this->getLine(), $this->getColumn());
            case Token::TYPE_RPAREN:
                ++$this->pos;

                return new Token(Token::TYPE_RPAREN, $this->getLine(), $this->getColumn());
            case Token::TYPE_LBRACE:
                ++$this->pos;

                return new Token(Token::TYPE_LBRACE, $this->getLine(), $this->getColumn());
            case Token::TYPE_RBRACE:
                ++$this->pos;

                return new Token(Token::TYPE_RBRACE, $this->getLine(), $this->getColumn());
            case Token::TYPE_COMMA:
                ++$this->pos;

                return new Token(Token::TYPE_COMMA, $this->getLine(), $this->getColumn());
            case Token::TYPE_LSQUARE_BRACE:
                ++$this->pos;

                return new Token(Token::TYPE_LSQUARE_BRACE, $this->getLine(), $this->getColumn());
            case Token::TYPE_RSQUARE_BRACE:
                ++$this->pos;

                return new Token(Token::TYPE_RSQUARE_BRACE, $this->getLine(), $this->getColumn());
            case Token::TYPE_REQUIRED:
                ++$this->pos;

                return new Token(Token::TYPE_REQUIRED, $this->getLine(), $this->getColumn());
            case Token::TYPE_AT:
                ++$this->pos;

                return new Token(Token::TYPE_AT, $this->getLine(), $this->getColumn());
            case Token::TYPE_COLON:
                ++$this->pos;

                return new Token(Token::TYPE_COLON, $this->getLine(), $this->getColumn());

            case Token::TYPE_EQUAL:
                ++$this->pos;

                return new Token(Token::TYPE_EQUAL, $this->getLine(), $this->getColumn());

            case Token::TYPE_POINT:
                if ($this->checkFragment()) {
                    return new Token(Token::TYPE_FRAGMENT_REFERENCE, $this->getLine(), $this->getColumn());
                }

                return new Token(Token::TYPE_POINT, $this->getLine(), $this->getColumn());

            case Token::TYPE_VARIABLE:
                ++$this->pos;

                return new Token(Token::TYPE_VARIABLE, $this->getLine(), $this->getColumn());
        }

        if ($ch === '_' || ('a' <= $ch && $ch <= 'z') || ('A' <= $ch && $ch <= 'Z')) {
            return $this->scanWord();
        }

        if ($ch === '-' || ('0' <= $ch && $ch <= '9')) {
            return $this->scanNumber();
        }

        if ($ch === '"') {
            return $this->scanString();
        }

        throw $this->createException("Can't recognize token type");
    }

    protected function checkFragment()
    {
        // Read the two bytes after the first "."; a missing byte (end of input) counts as no match.
        $this->pos++;
        $ch = isset($this->source[$this->pos]) ? $this->source[$this->pos] : null;

        $this->pos++;
        $nextCh = isset($this->source[$this->pos]) ? $this->source[$this->pos] : null;

        $isset = $ch == Token::TYPE_POINT && $nextCh == Token::TYPE_POINT;

        if ($isset) {
            $this->pos++;

            return true;
        }

        return false;
    }

    protected function scanWord()
    {
        $start = $this->pos;
        $this->pos++;

        while ($this->pos < strlen($this->source)) {
            $ch = $this->source[$this->pos];

            if ($ch === '_' || $ch === '$' || ('a' <= $ch && $ch <= 'z') || ('A' <= $ch && $ch <= 'Z') || ('0' <= $ch && $ch <= '9')) {
                $this->pos++;
            } else {
                break;
            }
        }

        $value = substr($this->source, $start, $this->pos - $start);

        return new Token($this->getKeyword($value), $this->getLine(), $this->getColumn(), $value);
    }

    protected function getKeyword($name)
    {
        switch ($name) {
            case 'null':
                return Token::TYPE_NULL;

            case 'true':
                return Token::TYPE_TRUE;

            case 'false':
                return Token::TYPE_FALSE;

            case 'query':
                return Token::TYPE_QUERY;

            case 'fragment':
                return Token::TYPE_FRAGMENT;

            case 'mutation':
                return Token::TYPE_MUTATION;

            case 'on':
                return Token::TYPE_ON;
        }

        return Token::TYPE_IDENTIFIER;
    }

    protected function expect($type)
    {
        if ($this->match($type)) {
            return $this->lex();
        }

        throw $this->createUnexpectedException($this->peek());
    }

    protected function match($type)
    {
        return $this->peek()->getType() === $type;
    }

    protected function scanNumber()
    {
        $start = $this->pos;
        if ($this->source[$this->pos] === '-') {
            ++$this->pos;
        }

        $this->skipInteger();

        if (isset($this->source[$this->pos]) && $this->source[$this->pos] === '.') {
            $this->pos++;
            $this->skipInteger();
        }

        $value = substr($this->source, $start, $this->pos - $start);

        if (strpos($value, '.') === false) {
            $value = (int) $value;
        } else {
            $value = (float) $value;
        }

        return new Token(Token::TYPE_NUMBER, $this->getLine(), $this->getColumn(), $value);
    }

    protected function skipInteger()
    {
        while ($this->pos < strlen($this->source)) {
            $ch = $this->source[$this->pos];
            if ('0' <= $ch && $ch <= '9') {
                $this->pos++;
            } else {
                break;
            }
        }
    }

    protected function createException($message)
    {
        return new SyntaxErrorException(sprintf('%s', $message), $this->getLocation());
    }

    protected function getLocation()
    {
        return new Location($this->getLine(), $this->getColumn());
    }

    protected function getColumn()
    {
        return $this->pos - $this->lineStart;
    }

    protected function getLine()
    {
        return $this->line;
    }

    /*
     * http://facebook.github.io/graphql/October2016/#sec-String-Value
     */
    protected function scanString()
    {
        $len = strlen($this->source);
        $this->pos++;

        $value = '';
        while ($this->pos < $len) {
            $ch = $this->source[$this->pos];
            if ($ch === '"') {
                $token = new Token(Token::TYPE_STRING, $this->getLine(), $this->getColumn(), $value);
                $this->pos++;

                return $token;
            }

            if ($ch === '\\' && ($this->pos < ($len - 1))) {
                $this->pos++;
                $ch = $this->source[$this->pos];
                switch ($ch) {
                    case '"':
                    case '\\':
                    case '/':
                        break;
                    case 'b':
                        $ch = sprintf("%c", 8);
                        break;
                    case 'f':
                        $ch = "\f";
                        break;
                    case 'n':
                        $ch = "\n";
                        break;
                    case 'r':
                        $ch = "\r";
                        break;
                    case 't':
                        // \t is a valid escape in the string grammar referenced above.
                        $ch = "\t";
                        break;
                    case 'u':
                        $codepoint = substr($this->source, $this->pos + 1, 4);
                        if (!preg_match('/^[0-9A-Fa-f]{4}$/', $codepoint)) {
                            throw $this->createException(sprintf('Invalid string unicode escape sequence "%s"', $codepoint));
                        }
                        $ch = html_entity_decode("&#x{$codepoint};", ENT_QUOTES, 'UTF-8');
                        $this->pos += 4;
                        break;
                    default:
                        throw $this->createException(sprintf('Unexpected string escaped character "%s"', $ch));
                        break;
Unused Code: break; does not seem to be reachable.

This check looks for unreachable code. It uses sophisticated control flow analysis techniques to find statements which will never be executed.

Unreachable code is most often the result of return, die or exit statements that have been added for debug purposes.

function fx() {
    try {
        doSomething();
        return true;
    }
    catch (\Exception $e) {
        return false;
    }

    return false;
}

In the above example, the last return false will never be executed, because a return statement has already been met in every possible execution path.

                }
            }

            $value .= $ch;
            $this->pos++;
        }

        throw $this->createUnexpectedTokenTypeException(Token::TYPE_END);
    }

    protected function end()
    {
        return $this->lookAhead->getType() === Token::TYPE_END;
    }

    protected function peek()
    {
        return $this->lookAhead;
    }

    protected function lex()
    {
        $prev            = $this->lookAhead;
        $this->lookAhead = $this->next();

        return $prev;
    }

    protected function createUnexpectedException(Token $token)
    {
        return $this->createUnexpectedTokenTypeException($token->getType());
    }

    protected function createUnexpectedTokenTypeException($tokenType)
    {
        return $this->createException(sprintf('Unexpected token "%s"', Token::tokenName($tokenType)));
    }
}