Completed
Push — master ( e8aa43...f91003 )
by personal
19s queued 13s
created

Tokenizer::__construct()   A

Complexity

Conditions 2
Paths 2

Size

Total Lines 7
Code Lines 4

Duplication

Lines 0
Ratio 0 %

Importance

Changes 1
Bugs 1 Features 0
Metric Value
dl 0
loc 7
rs 9.4285
c 1
b 1
f 0
cc 2
eloc 4
nc 2
nop 1
1
<?php
2
3
/*
4
 * (c) Jean-François Lépine <https://twitter.com/Halleck45>
5
 *
6
 * For the full copyright and license information, please view the LICENSE
7
 * file that was distributed with this source code.
8
 */
9
10
namespace Hal\Component\Token;
11
12
/**
13
 * Tokenize file
14
 *
15
 * @author Jean-François Lépine <https://twitter.com/Halleck45>
16
 */
17
class Tokenizer {
18
19
    /**
20
     * Tokenize file
21
     *
22
     * @param $filename
23
     * @return TokenCollection
24
     */
25
    public function tokenize($filename) {
26
27
        $size = filesize($filename);
28
        $limit = 102400; // about 100 KB
29
        if ($size > $limit) {
30
            $tokens = $this->tokenizeLargeFile($filename);
31
        } else {
32
            $tokens = token_get_all($this->cleanup(file_get_contents($filename)));
33
        }
34
35
        return new TokenCollection($tokens);
It seems that $tokens, as assigned from $this->tokenizeLargeFile($filename) on line 30, can also be of type object<Hal\Component\Token\TokenCollection>; however, Hal\Component\Token\TokenCollection::__construct() only seems to accept array. Consider adding a type check.

If a method or function can return values of several different types, and you cannot be sure which one you will receive in this context, we recommend adding a type check:

/**
 * @return array|string
 */
function returnsDifferentValues($x) {
    if ($x) {
        return 'foo';
    }

    return array();
}

$x = returnsDifferentValues($y);
if (is_array($x)) {
    // $x is an array.
}

If this is a common case that PHP Analyzer should handle natively, please let us know by opening an issue.
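
Applied to the case flagged here, such a check could look roughly like the sketch below. It is an illustration only: the TokenCollection class is a simplified stand-in reduced to the array-only constructor described above, and toTokenCollection() is a hypothetical helper; neither is taken from the PhpMetrics source.

```php
<?php
// Simplified stand-in for Hal\Component\Token\TokenCollection (assumption:
// the real class has more behaviour; only the array-only constructor matters here).
final class TokenCollection
{
    private $tokens;

    public function __construct(array $tokens)
    {
        $this->tokens = $tokens;
    }

    public function count(): int
    {
        return count($this->tokens);
    }
}

// Hypothetical helper: accepts either shape the analyzer thinks is possible
// and always hands back a TokenCollection.
function toTokenCollection($tokens): TokenCollection
{
    if ($tokens instanceof TokenCollection) {
        // Already wrapped, so do not pass it to the array-only constructor.
        return $tokens;
    }
    if (!is_array($tokens)) {
        throw new InvalidArgumentException('Expected an array of tokens or a TokenCollection.');
    }

    return new TokenCollection($tokens);
}

$collection = toTokenCollection(token_get_all('<?php echo 1;'));
```

If tokenizeLargeFile() in fact always returns an array, correcting its @return annotation may be enough on its own, and a helper like this would be unnecessary.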

36
    }
37
38
    /**
39
     * Tokenize large files
40
     *
41
     * @param $filename
42
     * @return TokenCollection
43
     */
44
    protected function tokenizeLargeFile($filename) {
45
        // run in another process to allow catching fatal errors due to memory issues with "token_get_all()" function
46
        // @see https://github.com/Halleck45/PhpMetrics/issues/139
47
        // @see https://github.com/Halleck45/PhpMetrics/issues/13
48
        $code = <<<EOT
49
\$c = file_get_contents("$filename");
50
\$c = preg_replace("!(<\?\s)!", "<?php ", \$c);
51
echo serialize(token_get_all(\$c));
52
EOT;
53
        $output = shell_exec('php -r \''.$code.'\'');
54
        $tokens = unserialize($output);
55
        if (false === $tokens) {
56
            throw new NoTokenizableException(sprintf('Cannot tokenize "%s". This file is probably too big. Please try to increase memory_limit', $filename));
57
        }
58
        return $tokens;
59
    }
60
61
    /**
62
     * Clean php source
63
     *
64
     * @param $content
65
     * @return string
66
     */
67
    private function cleanup($content) {
68
        // replace short open tags with <?php
69
        // if the file contains short open tags but short_open_tag is Off in php.ini, tokenization bugs can occur
70
        // @see https://github.com/Halleck45/PhpMetrics/issues/154
71
        return preg_replace('!(<\?\s)!', '<?php ', $content);
72
    }
73
74
}
75
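
tokenizeLargeFile() above interpolates $filename directly into the PHP source that it passes to the shell, so a filename containing a quote character would probably break the command. The following sketch shows one way to run the same child-process tokenization with the filename passed as a separate, escaped argument. The function name tokenizeInChildProcess() is hypothetical, and the code assumes a PHP CLI binary and a POSIX-style shell; it is not the PhpMetrics implementation.

```php
<?php
// Hypothetical helper, not part of PhpMetrics: tokenize a file in a child PHP
// process so that a fatal error there (for example memory exhaustion in
// token_get_all()) does not terminate the calling process.
function tokenizeInChildProcess(string $filename): array
{
    // The child reads the filename from $argv[1] instead of having it
    // interpolated into the source string. It applies the same short-open-tag
    // replacement as the original listing.
    $code = '$c = file_get_contents($argv[1]);'
          . '$c = preg_replace(\'!(<\?\s)!\', \'<?php \', $c);'
          . 'echo serialize(token_get_all($c));';

    // Assumption: fall back to "php" on the PATH if PHP_BINARY is empty.
    $php = PHP_BINARY !== '' ? PHP_BINARY : 'php';

    // escapeshellarg() quotes both the code and the filename for the shell.
    $command = escapeshellarg($php) . ' -r ' . escapeshellarg($code)
             . ' -- ' . escapeshellarg($filename);
    $output = shell_exec($command);

    // shell_exec() does not return a string when the command fails or prints nothing.
    $tokens = is_string($output)
        ? @unserialize($output, ['allowed_classes' => false])
        : false;

    if (!is_array($tokens)) {
        throw new RuntimeException(sprintf('Cannot tokenize "%s".', $filename));
    }

    return $tokens;
}
```

Checking the result with is_array() rather than comparing against false also rejects output that unserializes to something other than a token list.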