Passed
Pull Request — master (#315)
by Théo
02:34
created

Tokenizer::getValue()   A

Complexity

Conditions 2
Paths 2

Size

Total Lines 10
Code Lines 4

Duplication

Lines 0
Ratio 0 %

Importance

Changes 0
Metric  Value
eloc    4
dl      0
loc     10
rs      10
c       0
b       0
f       0
cc      2
nc      2
nop     1

<?php

declare(strict_types=1);

/*
 * This file is part of the box project.
 *
 * (c) Kevin Herrera <[email protected]>
 *     Théo Fidry <[email protected]>
 *
 * This source file is subject to the MIT license that is bundled
 * with this source code in the file LICENSE.
 */

namespace KevinGH\Box\Annotation;

use function array_keys;
use function array_merge;
use function array_values;
use Assert\Assertion;
use Doctrine\Common\Annotations\DocLexer;
use Hoa\Compiler\Llk\Llk;
use Hoa\Compiler\Llk\TreeNode;
use Hoa\Compiler\Visitor\Dump;

Bug: This use statement conflicts with another class in this namespace, KevinGH\Box\Annotation\Dump. Consider defining an alias.

Let's assume that you have a directory layout like this:

.
|-- OtherDir
|   |-- Bar.php
|   `-- Foo.php
`-- SomeDir
    `-- Foo.php

and let's assume the following content of Bar.php:

// Bar.php
namespace OtherDir;

use SomeDir\Foo; // This now conflicts with the class OtherDir\Foo

If both files OtherDir/Foo.php and SomeDir/Foo.php are loaded in the same runtime, you will see a PHP error such as the following:

PHP Fatal error:  Cannot use SomeDir\Foo as Foo because the name is already in use in OtherDir/Foo.php

However, because OtherDir/Foo.php is not necessarily loaded, and the error is only triggered if it is loaded before OtherDir/Bar.php, this problem might go unnoticed for a while. To prevent the error from surfacing, import the class under a different alias:

// Bar.php
namespace OtherDir;

use SomeDir\Foo as SomeDirFoo; // There is no conflict anymore.

use Hoa\File\Read;
use function in_array;
use KevinGH\Box\Annotation\Exception\Exception;
use KevinGH\Box\Annotation\Exception\SyntaxException;
use function ltrim;
use function strlen;
use function strpos;
use function trim;

/**
 * Parses annotation tokens from a docblock.
 *
 * This class will use a lexer to parse out a series of tokens from a given docblock. Each token in the series
 * represents a portion of an annotation that was parsed. These tokens can be used to generate alternative
 * representations, such as native values.
 *
 * @private
 */
final class Tokenizer
{
    /**
     * The namespace aliases.
     */
    private $aliases = [];

    /**
     * @var int[] The list of valid class identifiers
     */
    private static $classIdentifiers = [
        DocLexer::T_IDENTIFIER,
        DocLexer::T_TRUE,
        DocLexer::T_FALSE,
        DocLexer::T_NULL,
    ];

    /**
     * The list of ignored annotation identifiers.
     */
    private $ignored = [];

    /**
     * Parses the docblock and returns its annotation tokens.
     *
     * @param string $docblock
     * @param array<string, string>  $annotationAliases
     *
     * Annotation aliases are for when the annotations are imported and aliased via a use statement, for example:
     *
     * ```php
     * use Doctrine\ORM\Mapping as ORM;
     * ```
     *
     * Would require the following alias configuration:
     *
     * ```php
     * $annotationAliases = [
     *     'ORM' => 'Doctrine\ORM\Mapping',
     * ];
     * ```
     *
     * @return array the list of tokens
     */
    public function parse(string $docblock, array $annotationAliases = []): TreeNode
    {
        Assertion::allString($annotationAliases);
        Assertion::allString(array_keys($annotationAliases));

        if (0 !== strpos(ltrim($docblock), '/**')) {
            return [];

Bug, Best Practice: The expression return array() returns the type array, which is incompatible with the type-hinted return Hoa\Compiler\Llk\TreeNode.
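
To make the consequence of this finding concrete, here is a minimal sketch of the same shape outside the project. The `Node` class and the functions `parseMismatched()` and `parseNullable()` are hypothetical stand-ins invented for illustration; they are not part of box or Hoa. The nullable return type is only one possible way to resolve the mismatch, and the review does not say which resolution the author intends.

```php
<?php

declare(strict_types=1);

// Hypothetical stand-in for Hoa\Compiler\Llk\TreeNode, so the sketch runs on its own.
final class Node
{
    public $id;

    public function __construct(string $id)
    {
        $this->id = $id;
    }
}

// Mirrors the flagged shape: the signature promises a Node, the guard returns an array.
function parseMismatched(string $docblock): Node
{
    if (0 !== strpos(ltrim($docblock), '/**')) {
        return []; // PHP throws a TypeError here at runtime
    }

    return new Node('#docblock');
}

// One possible fix (an assumption): make "nothing to parse" part of the declared type.
function parseNullable(string $docblock): ?Node
{
    if (0 !== strpos(ltrim($docblock), '/**')) {
        return null;
    }

    return new Node('#docblock');
}

$caught = null;

try {
    parseMismatched('// not a docblock');
} catch (TypeError $error) {
    $caught = $error;
}

$empty = parseNullable('// not a docblock');
$node = parseNullable('/** @Annotation */');
```

The mismatch only surfaces when the guard branch is actually taken, which is why a static check like this one can catch it before any input without a leading `/**` does.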

        }

        $dumper   = new Dump();

Unused Code: The assignment to $dumper is dead and can be removed.

        $compiler = Llk::load(new Read(__DIR__ . '/../../res/annotation-grammar.pp'));
        return $compiler->parse($docblock);

Bug, Best Practice: The expression return $compiler->parse($docblock) returns the type Hoa\Compiler\Llk\TreeNode, which is incompatible with the documented return type array.

        $nodes = $this->parseDocblock($docblock);

Unused Code: $nodes = $this->parseDocblock($docblock) is not reachable.

This check looks for unreachable code. It uses sophisticated control flow analysis techniques to find statements which will never be executed.

Unreachable code is most often the result of return, die or exit statements that have been added for debug purposes.

function fx() {
    try {
        doSomething();
        return true;
    }
    catch (\Exception $e) {
        return false;
    }

    return false;
}

In the above example, the last return false will never be executed, because a return statement has already been met in every possible execution path.


        if (false === ($position = strpos($docblock, '@'))) {
            return [];
        }

        if (0 < $position) {
            --$position;
        }

        $docblock = substr($docblock, $position);
        $docblock = trim($docblock, '*/ ');

        $lexer = $this->createLexer($docblock);

        return $this->retrieveAnnotations($lexer, $annotationAliases);
    }

    private function parseDocblock(string $docblock): TreeNode
    {
        $dumper   = new Dump();

Unused Code: The assignment to $dumper is dead and can be removed.

        $compiler = Llk::load(new Read(__DIR__ . '/../../res/annotation-grammar.pp'));
        $ast   = $compiler->parse($docblock);

        return $ast;
    }

    private function createLexer(string $docblock): DocLexer
    {
        $lexer = new DocLexer();

        $lexer->setInput($docblock);

        // Start at the first @ symbol
        while (null === $lexer->token || DocLexer::T_AT !== $lexer->token['type']) {
            $lexer->moveNext();
        }

        return $lexer;
    }

    /**
     * Sets the annotation identifiers to ignore.
     *
     * @param array $ignore the list of ignored identifiers
     */
    public function ignore(array $ignore): void
    {
        $this->ignored = $ignore;
    }

    /**
     * Returns the tokens for the next annotation.
     *
     * @return array the tokens
     */
    private function retrieveAnnotation(DocLexer $lexer): ?array
    {
        // Get the complete name
        $identifier = $this->retrieveIdentifier($lexer);

        // Skip if necessary
        if (in_array($identifier, $this->ignored, true)) {
            return null;
        }

        // use alias if applicable
        if (false !== ($pos = strpos($identifier, '\\'))) {
            $alias = substr($identifier, 0, $pos);

            if (isset($this->aliases[$alias])) {
                $identifier = $this->aliases[$alias]
                            .'\\'
                            .substr($identifier, $pos + 1);
            }
        } elseif (isset($this->aliases[$identifier])) {
            $identifier = $this->aliases[$identifier];
        }

        // return the @, name, and any values found
        return array_merge(
            [
                [DocLexer::T_AT],
                [DocLexer::T_IDENTIFIER, $identifier],
            ],
            $this->retrieveAnnotationValues($lexer)
        );
    }

    private function retrieveAnnotations(DocLexer $lexer, array $annotationAliases): array
    {
        $tokens = [];

        while (true) {
            /**
             * @var string $value
             * @var int $type
             * @var int $position
             */
            [$value, $type, $position] = array_values($lexer->token);

Comprehensibility, Best Practice: This list assignment is not used and could be removed.

            // something about being preceded by a non-catchable pattern
            // TODO: what is that comment about?
            $position = $lexer->token['position'] + strlen($lexer->token['value']);

Unused Code: The assignment to $position is dead and can be removed.

//            if ((null !== $this->lexer->token)
//                && ($this->lexer->lookahead['position'] === $position)) {
//                $this->lexer->moveNext();
//
//                continue;
//            }

//            // make sure we get a valid annotation name
//            if ((null === ($glimpse = $this->lexer->glimpse()))
//                || ((DocLexer::T_NAMESPACE_SEPARATOR !== $glimpse['type'])
//                    && !in_array($glimpse['type'], self::$classIdentifiers, true))) {
//                $this->lexer->moveNext();
//
//                continue;
//            }

            // find them all and merge them to the list
            if (null !== ($annotationToken = $this->retrieveAnnotation($lexer))) {
                $tokens = array_merge($tokens, $annotationToken);
            }

            if (null === $lexer->lookahead) {
                break;
            }
        }

        return $tokens;
    }

    /**
     * Returns the tokens for the next array of values.
     *
     * @return array the tokens
     */
    private function getArray(DocLexer $lexer): array
    {
        $this->match($lexer, DocLexer::T_OPEN_CURLY_BRACES);

        $tokens = [
            [DocLexer::T_OPEN_CURLY_BRACES],
        ];

        // check if empty array, bail early if it is
        if ($lexer->isNextToken(DocLexer::T_CLOSE_CURLY_BRACES)) {
            $this->match($lexer, DocLexer::T_CLOSE_CURLY_BRACES);

            $tokens[] = [DocLexer::T_CLOSE_CURLY_BRACES];

            return $tokens;
        }

        // collect the first value
        $tokens = array_merge($tokens, $this->getArrayEntry($lexer));

        // collect the remaining values
        while ($lexer->isNextToken(DocLexer::T_COMMA)) {
            $this->match($lexer, DocLexer::T_COMMA);

            $tokens[] = [DocLexer::T_COMMA];

            if ($lexer->isNextToken(DocLexer::T_CLOSE_CURLY_BRACES)) {
                break;
            }

            $tokens = array_merge($tokens, $this->getArrayEntry($lexer));
        }

        // end the collection
        $this->match($lexer, DocLexer::T_CLOSE_CURLY_BRACES);

        $tokens[] = [DocLexer::T_CLOSE_CURLY_BRACES];

        return $tokens;
    }

    /**
     * Returns the tokens for the next array entry.
     *
     * @return array the tokens
     */
    private function getArrayEntry(DocLexer $lexer): array
    {
        $glimpse = $lexer->glimpse();
        $tokens = [];

        // append the correct assignment token: ":" or "="
        if (DocLexer::T_COLON === $glimpse['type']) {
            $token = [DocLexer::T_COLON];
        } elseif (DocLexer::T_EQUALS === $glimpse['type']) {
            $token = [DocLexer::T_EQUALS];
        }

        // is it an assignment?
        if (isset($token)) {
            // if the key is a constant, hand off
            if ($lexer->isNextToken(DocLexer::T_IDENTIFIER)) {
                $tokens = $this->getConstant($lexer);

            // match only integer and string keys
            } else {
                $this->matchAny(
                    $lexer,
                    [
                        DocLexer::T_INTEGER,
                        DocLexer::T_STRING,
                    ]
                );

                $tokens = [
                    [
                        $lexer->token['type'],
                        $lexer->token['value'],
                    ],
                ];
            }

            $tokens[] = $token;

            $this->matchAny(
                $lexer,
                [
                    DocLexer::T_COLON,
                    DocLexer::T_EQUALS,
                ]
            );
        }

        // merge in the value
        return array_merge($tokens, $this->getPlainValue($lexer));
    }

    /**
     * Returns the tokens for the next assigned (key/value) value.
     *
     * @return array the tokens
     */
    private function getAssignedValue(DocLexer $lexer): array
    {
        $this->match($lexer, DocLexer::T_IDENTIFIER);

        $tokens = [
            [DocLexer::T_IDENTIFIER, $lexer->token['value']],
            [DocLexer::T_EQUALS],
        ];

        $this->match($lexer, DocLexer::T_EQUALS);

        return array_merge($tokens, $this->getPlainValue($lexer));
    }

    /**
     * Returns the current constant value for the current annotation.
     *
     * @return array the tokens
     */
    private function getConstant(DocLexer $lexer): array
    {
        $identifier = $this->retrieveIdentifier($lexer);
        $tokens = [];

        // check for a special constant type
        switch (strtolower($identifier)) {
            case 'true':
                $tokens[] = [DocLexer::T_TRUE, $identifier];
                break;
            case 'false':
                $tokens[] = [DocLexer::T_FALSE, $identifier];
                break;
            case 'null':
                $tokens[] = [DocLexer::T_NULL, $identifier];
                break;
            default:
                $tokens[] = [DocLexer::T_IDENTIFIER, $identifier];
        }

        return $tokens;
    }

    /**
     * Returns the next identifier.
     *
     * @throws Exception
     * @throws SyntaxException if a syntax error is found
     *
     * @return string the identifier
     */
    private function retrieveIdentifier(DocLexer $lexer): string
    {
        // grab the first bit of the identifier
        if ($lexer->isNextTokenAny(self::$classIdentifiers)) {
            $lexer->moveNext();

            $name = $lexer->token['value'];
        } else {
            throw SyntaxException::expectedToken(
                'namespace separator or identifier',
                null,
                $lexer
            );
        }

        // grab the remaining bits
        $position = $lexer->token['position']
                  + strlen($lexer->token['value']);

        while (($lexer->lookahead['position'] === $position)
            && $lexer->isNextToken(DocLexer::T_NAMESPACE_SEPARATOR)) {
            $this->match($lexer, DocLexer::T_NAMESPACE_SEPARATOR);
            $this->matchAny(self::$classIdentifiers);

Bug: self::classIdentifiers of type integer[] is incompatible with the type Doctrine\Common\Annotations\DocLexer expected by parameter $lexer of KevinGH\Box\Annotation\Tokenizer::matchAny(). (Ignorable by Annotation)

If this is a false positive, you can also ignore this issue in your code via the ignore-type annotation:

            $this->matchAny(/** @scrutinizer ignore-type */ self::$classIdentifiers);

Bug: The call to KevinGH\Box\Annotation\Tokenizer::matchAny() has too few arguments starting with tokens. (Ignorable by Annotation)

If this is a false positive, you can also ignore this issue in your code via the ignore-call annotation:

            $this->/** @scrutinizer ignore-call */
                   matchAny(self::$classIdentifiers);

This check compares calls to functions or methods with their respective definitions. If the call has fewer arguments than are defined, it raises an issue.

If a function is defined several times with a different number of parameters, the check may pick up the wrong definition and report false positives. One codebase where this has been known to happen is WordPress. Please note the @ignore annotation hint above.
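
Both findings on this call point at the same thing: the token list is passed in the position where the lexer is expected. The sketch below reproduces that call shape in isolation. `FakeLexer` and the free function `matchAny()` are hypothetical reductions written for this illustration, not Doctrine's DocLexer or the project's method, and the "intended" call at the end is an assumption based on how matchAny() is called elsewhere in this file.

```php
<?php

declare(strict_types=1);

// Hypothetical, minimal stand-in for Doctrine's DocLexer: it only tracks token types.
final class FakeLexer
{
    private $types;
    private $next = 0;

    public function __construct(array $types)
    {
        $this->types = $types;
    }

    public function isNextTokenAny(array $types): bool
    {
        return isset($this->types[$this->next])
            && in_array($this->types[$this->next], $types, true);
    }

    public function moveNext(): bool
    {
        ++$this->next;

        return isset($this->types[$this->next]);
    }
}

// Same parameter order as Tokenizer::matchAny(): the lexer first, then the accepted token types.
function matchAny(FakeLexer $lexer, array $tokens): bool
{
    if (!$lexer->isNextTokenAny($tokens)) {
        throw new RuntimeException('Unexpected token.');
    }

    return $lexer->moveNext();
}

$identifiers = [1, 2];
$lexer = new FakeLexer([1, 2]);

$caught = null;

try {
    // The flagged call shape: the token list sits where the lexer belongs.
    matchAny($identifiers);
} catch (TypeError $error) {
    // ArgumentCountError extends TypeError, so either failure lands here.
    $caught = $error;
}

// The presumably intended call passes the lexer as well.
$hasMore = matchAny($lexer, $identifiers);
```

Because the faulty call sits inside the `while` loop that handles namespace separators, it would only fail for identifiers that contain a backslash, which is one reason a static check is useful here.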

            $name .= '\\'.$lexer->token['value'];
        }

        return $name;
    }

    /**
     * Returns the tokens for the next "plain" value.
     *
     * @throws Exception
     * @throws SyntaxException if a syntax error is found
     *
     * @return array the tokens
     */
    private function getPlainValue(DocLexer $lexer): array
    {
        // check if array, then hand off
        if ($lexer->isNextToken(DocLexer::T_OPEN_CURLY_BRACES)) {
            return $this->getArray($lexer);
        }

        // check if nested annotation, then hand off
        if ($lexer->isNextToken(DocLexer::T_AT)) {
            return $this->retrieveAnnotation($lexer);

Bug, Best Practice: The expression return $this->retrieveAnnotation($lexer) could return the type null, which is incompatible with the type-hinted return array. Consider adding an additional type check to rule it out.
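
The suggested extra type check can be sketched as follows. All three functions are hypothetical reductions written for this illustration: `retrieveAnnotation()` here only models the "ignored annotation returns null" branch, and falling back to an empty list with `??` is an assumption about the desired behavior, not the project's confirmed fix.

```php
<?php

declare(strict_types=1);

// Hypothetical reduction of retrieveAnnotation(): null stands for an ignored annotation.
function retrieveAnnotation(string $name, array $ignored): ?array
{
    if (in_array($name, $ignored, true)) {
        return null;
    }

    return [['@'], [$name]];
}

// Mirrors the flagged shape: a nullable result is returned from a function declared ": array".
function plainValueUnchecked(string $name, array $ignored): array
{
    return retrieveAnnotation($name, $ignored);
}

// One possible guard (an assumption): treat an ignored annotation as "no tokens".
function plainValueChecked(string $name, array $ignored): array
{
    return retrieveAnnotation($name, $ignored) ?? [];
}

$caught = null;

try {
    plainValueUnchecked('author', ['author']);
} catch (TypeError $error) {
    $caught = $error;
}

$ignoredTokens = plainValueChecked('author', ['author']);
$keptTokens = plainValueChecked('ORM\Column', ['author']);
```

Whether an empty list is the right fallback depends on the callers: retrieveAnnotationValues() in this file throws a syntax error when a value after a comma comes back empty, so an ignored nested annotation might need different handling there.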

        }

        // check if constant, then hand off
        if ($lexer->isNextToken(DocLexer::T_IDENTIFIER)) {
            return $this->getConstant($lexer);
        }

        $tokens = [];

        // determine type, or throw syntax error if unrecognized
        switch ($lexer->lookahead['type']) {
            case DocLexer::T_FALSE:
            case DocLexer::T_FLOAT:
            case DocLexer::T_INTEGER:
            case DocLexer::T_NULL:
            case DocLexer::T_STRING:
            case DocLexer::T_TRUE:
                $this->match($lexer, $lexer->lookahead['type']);

                $tokens[] = [
                    $lexer->token['type'],
                    $lexer->token['value'],
                ];

                break;
            default:
                throw SyntaxException::expectedToken(
                    'PlainValue',
                    null,
                    $lexer
                );
        }

        return $tokens;
    }

    /**
     * Returns the tokens for the next value.
     *
     * @return array the tokens
     */
    private function getValue(DocLexer $lexer): array
    {
        $glimpse = $lexer->glimpse();

        // check if it's an assigned value: @example(assigned="value")
        if (DocLexer::T_EQUALS === $glimpse['type']) {
            return $this->getAssignedValue($lexer);
        }

        return $this->getPlainValue($lexer);
    }

    /**
     * Returns the tokens for all of the values for the current annotation.
     *
     * @throws Exception
     * @throws SyntaxException if a syntax error is found
     *
     * @return array the tokens
     */
    private function retrieveAnnotationValues(DocLexer $lexer): array
    {
        $tokens = [];

        // check if a value list is given
        if ($lexer->isNextToken(DocLexer::T_OPEN_PARENTHESIS)) {
            $this->match($lexer, DocLexer::T_OPEN_PARENTHESIS);

            $tokens[] = [DocLexer::T_OPEN_PARENTHESIS];

            // skip if we are given an empty list: @example()
            if ($lexer->isNextToken(DocLexer::T_CLOSE_PARENTHESIS)) {
                $this->match($lexer, DocLexer::T_CLOSE_PARENTHESIS);

                $tokens[] = [DocLexer::T_CLOSE_PARENTHESIS];

                return $tokens;
            }

            // skip if no list is given
        } else {
            return $tokens;
        }

        // collect the first value
        $tokens = array_merge($tokens, $this->getValue($lexer));

        // check for comma separated values and collect those too
        while ($lexer->isNextToken(DocLexer::T_COMMA)) {
            $this->match($lexer, DocLexer::T_COMMA);

            $tokens[] = [DocLexer::T_COMMA];

            $token = $lexer->lookahead;
            $value = $this->getValue($lexer);

            // no multiple trailing commas
            if (empty($value)) {
                throw SyntaxException::expectedToken('Value', $token);
            }

            $tokens = array_merge($tokens, $value);
        }

        // end the list
        $this->match($lexer, DocLexer::T_CLOSE_PARENTHESIS);

        $tokens[] = [DocLexer::T_CLOSE_PARENTHESIS];

        return $tokens;
    }

    /**
     * Matches the next token and advances.
     *
     * @param int $token the next token to match
     *
     * @throws Exception
     * @throws SyntaxException if a syntax error is found
     *
     * @return null|array TRUE if the next token matches, FALSE if not
     */
    private function match(DocLexer $lexer, $token): bool
    {
        if (!$lexer->isNextToken($token)) {
            throw SyntaxException::expectedToken(
                $lexer->getLiteral($token),
                null,
                $lexer
            );
        }

        return $lexer->moveNext();

Bug, Best Practice: The expression return $lexer->moveNext() returns the type boolean, which is incompatible with the documented return type array|null.

    }

    /**
     * Matches any one of the tokens and advances.
     *
     * @param array $tokens the list of tokens
     *
     * @throws Exception
     * @throws SyntaxException if a syntax error is found
     *
     * @return bool TRUE if the next token matches, FALSE if not
     */
    private function matchAny(DocLexer $lexer, array $tokens): bool
    {
        if (!$lexer->isNextTokenAny($tokens)) {
            throw SyntaxException::expectedToken(
                implode(
                    ' or ',
                    array_map([$lexer, 'getLiteral'], $tokens)
                ),
                null,
                $lexer
            );
        }

        return $lexer->moveNext();
    }
}