Extractor::scan()   C

Complexity

Conditions 14
Paths 20

Size

Total Lines 54
Code Lines 27

Duplication

Lines 0
Ratio 0 %

Importance

Changes 0
Metric Value
cc 14
eloc 27
c 0
b 0
f 0
nc 20
nop 1
dl 0
loc 54
rs 6.7343

How to fix

Long Method

Small methods make your code easier to understand, in particular if combined with a good name. Besides, if your method is small, finding a good name is usually much easier.

For example, if you find yourself adding comments to a method's body, this is usually a good sign to extract the commented part to a new method, and use the comment as a starting point when coming up with a good name for this new method.
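As a rough illustration of this advice applied to scan(), the checks inside its directory loop could be pulled out into small named helpers. The sketch below is not part of Koch Framework: the function names isIgnoredEntry and hasExtractorFor are hypothetical, and the extractor map only mirrors the default value of the class's $extractors property.

```php
<?php

// Hypothetical helper extracted from the loop body of scan():
// decides whether a directory entry should be skipped.
function isIgnoredEntry(string $entry): bool
{
    return in_array($entry, ['.', '..', '.svn'], true);
}

// Hypothetical helper: true if an extractor is registered for the
// file's extension. Files without an extension yield '' and are rejected.
function hasExtractorFor(string $path, array $extractors): bool
{
    $extension = pathinfo($path, PATHINFO_EXTENSION);

    return isset($extractors[$extension]);
}

// Mirrors the default $extractors property of the reviewed class.
$extractors = [
    'php' => ['PHP'],
    'tpl' => ['PHP', 'Template'],
];

var_dump(isIgnoredEntry('.svn'));
var_dump(hasExtractorFor('index.tpl', $extractors));
var_dump(hasExtractorFor('README', $extractors));
```

With helpers like these, the loop in scan() would read as a sequence of named decisions, and each helper can be tested without touching the file system.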

Commonly applied refactorings include:

<?php

/**
 * Koch Framework
 * Jens-André Koch © 2005 - onwards.
 *
 * This file is part of "Koch Framework".
 *
 * License: GNU/GPL v2 or any later version, see LICENSE file.
 *
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation; either version 2 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program. If not, see <http://www.gnu.org/licenses/>.
 */

namespace Koch\Localization\Adapter\Gettext;

/**
 * Gettext Extractor.
 *
 * Gettext extraction is normally performed by the "xgettext" tool.
 * http://www.gnu.org/software/hello/manual/gettext/xgettext-Invocation.html
 *
 * This is a php implementation of a gettext extractor on basis of preg_matching.
 * The extractor matches certain translation functions, like translate('term') or t('term') or _('term').
 * and their counterparts in templates, often {t('term')} or {_('term')}.
 */
class Extractor
{
    /**
     * @var resource
     */
    public $logHandler;

    /**
     * @var array
     */
    public $inputFiles = [];

    /**
     * @var array
     */
    public $extractors = [
        'php' => ['PHP'],
        'tpl' => ['PHP', 'Template'],
    ];

    /**
     * @var array
     */
    public $data = [];

    /**
     *  @var array
     */
    protected $extractorStore = [];

    /**
     * Log setup.
     *
     * @param string|bool $logToFile Bool or path of custom log file
     */
    public function __construct($logToFile = false)
    {
        // default log file
        if (false === $logToFile) {
            $this->logHandler = fopen(ROOT_LOGS . 'gettext-extractor.log', 'w');
        } else { // custom log file
            $this->logHandler = fopen($logToFile, 'w');
        }
    }

    /**
     * Close the log handler if needed.
     */
    public function __destruct()
    {
        if (is_resource($this->logHandler)) {
            fclose($this->logHandler);
        }
    }

    /**
     * Writes messages into log or dumps them on screen.
     *
     * @param string $message
     *
     * @return string Html Log Message if logHandler resource is false.
Documentation
Should the return type not be string|null?

This check compares the return type specified in the @return annotation of a function or method doc comment with the types returned by the function and raises an issue if they mismatch.
     */
    public function log($message)
    {
        if (is_resource($this->logHandler)) {
            fwrite($this->logHandler, $message . "\n");
        } else {
            echo $message . "\n <br/>";
        }
    }

    /**
     * Exception factory.
     *
     * @param string $message
     *
     * @throws \Koch\Exception\Exception
     */
    protected function throwException($message)
    {
        if (empty($message)) {
            $message = 'Something unexpected occured. See Koch_Gettext_Extractor log for details.';
        }

        $this->log($message);

        throw new \Koch\Exception\Exception($message);
    }

    /**
     * Scans given files or directories (recursively) for input files.
     *
     * @param string $resource File or directory
     */
    protected function scan($resource)
    {
        if (false === is_dir($resource) and false === is_file($resource)) {
Comprehensibility Best Practice
Using logical operators such as and instead of && is generally not recommended.

PHP has two types of connecting operators (logical operators and boolean operators):

                 Logical Operators   Boolean Operator
AND - meaning    and                 &&
OR - meaning     or                  ||

The difference between these is their precedence, that is, the order in which they are evaluated relative to other operators such as assignment. In most cases, you would want to use a boolean operator like && or ||.

Let’s take a look at a few examples:

// Logical operators have lower precedence:
$f = false or true;

// is executed like this:
($f = false) or true;


// Boolean operators have higher precedence:
$f = false || true;

// is executed like this:
$f = (false || true);

Logical Operators are used for Control-Flow

One case where you explicitly want to use logical operators is for control-flow such as this:

$x === 5
    or die('$x must be 5.');

// Instead of
if ($x !== 5) {
    die('$x must be 5.');
}

Since die introduces problems of its own (for example, it makes code hard to test and prevents any kind of more sophisticated error handling), you probably do not want to use this in real-world code. Before PHP 8.0, logical operators also could not be combined with throw, because throw was a statement rather than an expression:

// A parse error before PHP 8.0; valid since PHP 8.0.
$x === 5
    or throw new RuntimeException('$x must be 5.');

These limitations lead to logical operators rarely being of use in PHP code.
            $this->throwException('Resource ' . $resource . ' is not a directory or file.');
        }

        if (true === is_file($resource)) {
            $this->inputFiles[] = realpath($resource);

            return;
        }

        // It's a directory
        $resource = realpath($resource);

        if (false === $resource) {
            return;
        }

        $iterator = dir($resource);

        if (false === $iterator) {
            return;
        }

        while (false !== ($entry = $iterator->read())) {
            if ($entry === '.' or $entry === '..' or  $entry === '.svn') {
Comprehensibility Best Practice
Using logical operators such as or instead of || is generally not recommended.
                continue;
            }

            $path = $resource . DIRECTORY_SEPARATOR . $entry;

            if (false === is_readable($path)) {
                continue;
            }

            if (true === is_dir($path)) {
                $this->scan($path);
                continue;
            }

            if (true === is_file($path)) {
                $info = pathinfo($path);

                if (false === isset($this->extractors[$info['extension']])) {
                    continue;
                }

                $this->inputFiles[] = realpath($path);
            }
        }

        $iterator->close();
    }

    /**
     * Extracts gettext keys from multiple input files using multiple extraction adapters.
     *
     * @param array $inputFiles Array, defining a set of files.
     *
     * @return array All gettext keys of all input files.
     */
    protected function extract($inputFiles)
    {
        foreach ($inputFiles as $inputFile) {
            if (false === file_exists($inputFile)) {
                $this->throwException('Invalid input file specified: ' . $inputFile);
            }

            if (false === is_readable($inputFile)) {
                $this->throwException('Input file is not readable: ' . $inputFile);
            }

            $this->log('Extracting data from file ' . $inputFile);

            // Check file extension
            $fileExtension = pathinfo($inputFile, PATHINFO_EXTENSION);

            foreach ($this->extractors as $extension => $extractor) {
                // check, if the extractor handles a file extension like this
                if ($fileExtension !== $extension) {
                    continue;
                }

                $this->log('Processing file ' . $inputFile);

                foreach ($extractor as $extractorName) {
                    $extractor     = $this->getExtractor($extractorName);
                    $extractorData = $extractor->extract($inputFile);

                    $this->log(' Extractor ' . $extractorName . ' applied.');

                    // do not merge if incomming array is empty
                    if (false === empty($extractorData)) {
                        $this->data = array_merge_recursive($this->data, $extractorData);
                    }
                }
            }
        }

        $this->log('Data exported successfully');

        return $this->data;
    }

    /**
     * Factory Method - Gets an instance of a Koch_Gettext_Extractor.
     *
     * @param string $extractor
     *
     * @return object Extractor Object implementing Koch\Localization\Gettext\Extractor_Interface
     */
    public function getExtractor($extractor)
    {
        // build classname
        $extractor = ucfirst($extractor);

        // attach namespace
        $class = 'Koch\Localization\Gettext\Extractors\\' . ucfirst($extractor);

        // was loaded before?
        if ($this->extractors[$extractor] !== null) {
            return $this->extractors[$extractor];
        } else {
            // /framework/Koch/Localization/Adapter/Gettext/Extractors/*NAME*.php
            $file = __DIR__ . '/Adapter/Gettext/Extractors/' . $extractor . '.php';

            if (true === is_file($file)) {
                include_once $file;
            } else {
                $this->throwException('Extractor file ' . $file . ' not found.');
            }

            if (false === class_exists($class)) {
                $this->throwException('File loaded, but Class ' . $extractor . ' not inside.');
            }
        }

        $this->extractors[$extractor] = new $class();

        $this->log('Extractor ' . $extractor . ' loaded.');

        return $this->extractors[$extractor];
    }

    /**
     * Assigns an extractor to an extension.
     *
     * @param string $extension
     * @param string $extractor
     *
     * @return null|Extractor
     */
    public function setExtractor($extension, $extractor)
    {
        // not already set
        if (false === isset($this->extractor[$extension]) and
Bug
The property extractor does not seem to exist. Did you mean extractors?

An attempt at access to an undefined property has been detected. This may either be a typographical error or the property has been renamed but there are still references to its old name.

If you really want to allow access to undefined properties, you can define magic methods to allow access. See the php core documentation on Overloading.

Comprehensibility Best Practice
Using logical operators such as and instead of && is generally not recommended.
            false === in_array($extractor, $this->extractor[$extension], true)) {
Bug
The property extractor does not seem to exist. Did you mean extractors?
            $this->extractor[$extension][] = $extractor;
Bug
The property extractor does not seem to exist. Did you mean extractors?
        } else { // already set

            return $this;
        }
    }

    /**
     * Removes all extractor settings.
     *
     * @return Extractor
     */
    public function removeAllExtractors()
    {
        $this->extractor = [];
Bug
The property extractor does not seem to exist. Did you mean extractors?

        return $this;
    }

    /**
     * Saves extracted data into gettext file.
     *
     * @param string $file
     * @param array  $data
Documentation
Should the type for parameter $data not be array|null? Also, consider making the array more specific, something like array<String>, or String[].

This check looks for @param annotations where the type inferred by our type inference engine differs from the declared type.

It makes a suggestion as to what type it considers more descriptive. In addition it looks for parameters that have the generic type array and suggests a stricter type like array<String>.

Most often this is a case of a parameter that can be null in addition to its declared types.
     *
     * @return Extractor
     */
    public function save($file, $data = null)
    {
        if ($data === null) {
            $data = $this->data;
        }

        // get dirname and check if dirs exist, else create it
        $dir = dirname($file);
        if (false === is_dir($dir) and false === @mkdir($dir, 0777, true)) {
Comprehensibility Best Practice
Using logical operators such as and instead of && is generally not recommended.
            $this->throwException('ERROR: make directory failed!');
        }

        // check file permissions on output file
        if (true === is_file($file) and false === is_writable($file)) {
Comprehensibility Best Practice
Using logical operators such as and instead of && is generally not recommended.
            $this->throwException('ERROR: Output file is not writable!');
        }

        // write data formatted to file
        file_put_contents($file, $this->formatData($data));

        $this->log('Output file ' . $file . ' created.');

        return $this;
    }

    /**
     * Returns a fileheader for a gettext portable object file.
     *
     * @param bool $return_as_string True, returns a string (default) and false returns an array.
     *
     * @return mixed Array or String. Returns string by default.
Documentation
Consider making the return type a bit more specific; maybe use string|string[].

This check looks for the generic type array as a return type and suggests a more specific type. This type is inferred from the actual code.
     */
    public static function getPOFileHeader($return_as_string = true)
Coding Style
$return_as_string does not seem to conform to the naming convention (^[a-z][a-zA-Z0-9]*$).

This check examines a number of code elements and verifies that they conform to the given naming conventions.

You can set conventions for local variables, abstract classes, utility classes, constant, properties, methods, parameters, interfaces, classes, exceptions and special methods.
    {
        $output   = [];
        $output[] = '// Gettext Portable Object Translation File.';
        $output[] = '#';
        $output[] = '// Koch Framework';
        $output[] = '// Copyright © Jens-André Koch 2005 - onwards.';
        $output[] = '// The file is distributed under the GNU/GPL v2 or any later version.';
        $output[] = '#';
        $output[] = 'msgid ""';
        $output[] = 'msgstr ""';
        $output[] = '"Project-Id-Version: Koch Framework ' . APPLICATION_VERSION . '\n"';
        $output[] = '"POT-Creation-Date: ' . date('Y-m-d H:iO') . '\n"';
        $output[] = '"PO-Revision-Date: ' . date('Y-m-d H:iO') . '\n"';
        $output[] = '"Content-Type: text/plain; charset=UTF-8\n"';
        // @todo http://trac.clansuite.com/ticket/224 - fetch plural form from locale description array
        $output[] = '"Plural-Forms: nplurals=2; plural=(n != 1);\n"';
        $output[] = '';

        if ($return_as_string) {
Coding Style
$return_as_string does not seem to conform to the naming convention (^[a-z][a-zA-Z0-9]*$).
            return implode("\n", $output);
        } else { // return array

            return $output;
        }
    }

    /**
     * Formats fetched data to gettext portable object syntax.
     *
     * @param array $data
     *
     * @return string
     */
    protected function formatData($data)
    {
        $pluralMatchRegexp = '#\%([0-9]+\$)*d#';

        $output = [];
Unused Code
$output is not used, you could remove the assignment.

This check looks for variable assignments that are either overwritten by other assignments or where the variable is not used subsequently.

$myVar = 'Value';
$higher = false;

if (rand(1, 6) > 3) {
    $higher = true;
} else {
    $higher = false;
}

Both the $myVar assignment in line 1 and the $higher assignment in line 2 are dead. The first because $myVar is never used and the second because $higher is always overwritten on every possible execution path.
        $output = self::getPOFileHeader(false);

        ksort($data);

        foreach ($data as $key => $files) {
            ksort($files);

            $slashed_key = self::addSlashes($key);
Coding Style
$slashed_key does not seem to conform to the naming convention (^[a-z][a-zA-Z0-9]*$).

            foreach ($files as $file) {
                $output[] = '#: ' . $file; // = reference
            }

            $output[] = 'msgid "' . $slashed_key . '"';
Coding Style
$slashed_key does not seem to conform to the naming convention (^[a-z][a-zA-Z0-9]*$).

            // check for plural
            if (0 < preg_match($pluralMatchRegexp, $key)) {
                $output[] = 'msgid_plural "' . $slashed_key . '"';
Coding Style
$slashed_key does not seem to conform to the naming convention (^[a-z][a-zA-Z0-9]*$).
                $output[] = 'msgstr[0] "' . $slashed_key . '"';
Coding Style
$slashed_key does not seem to conform to the naming convention (^[a-z][a-zA-Z0-9]*$).
                $output[] = 'msgstr[1] "' . $slashed_key . '"';
Coding Style
$slashed_key does not seem to conform to the naming convention (^[a-z][a-zA-Z0-9]*$).
            } else { // no plural
                $output[] = 'msgstr "' . $slashed_key . '"';
Coding Style
$slashed_key does not seem to conform to the naming convention (^[a-z][a-zA-Z0-9]*$).
            }

            $output[] = '';
        }

        return implode("\n", $output);
    }

    /**
     * Escapes a string without breaking gettext syntax.
     *
     * @param string $string
     *
     * @return string
     */
    public static function addSlashes($string)
    {
        return addcslashes($string, self::ESCAPE_CHARS);
    }
}
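
The undefined-property issues reported for setExtractor() and removeAllExtractors() point at $this->extractor, while the declared property is $extractors. One possible fix is sketched below, under stated assumptions: it assumes the intent is "register the extractor unless it is already registered for that extension", which differs from the original condition, and it always returns $this. The class name ExtractorRegistrySketch is hypothetical and models only the extractor registry, not the whole class.

```php
<?php

// Hypothetical stand-in for the reviewed class; only the registry is modelled.
class ExtractorRegistrySketch
{
    /** @var array<string, string[]> file extension => extractor names */
    public $extractors = [
        'php' => ['PHP'],
        'tpl' => ['PHP', 'Template'],
    ];

    // Registers $extractor for $extension unless it is already present.
    // Assumption: this is the intended behaviour of setExtractor().
    public function setExtractor(string $extension, string $extractor): self
    {
        $registered = $this->extractors[$extension] ?? [];

        if (!in_array($extractor, $registered, true)) {
            $this->extractors[$extension][] = $extractor;
        }

        return $this;
    }

    // Clears the declared property rather than an undefined one.
    public function removeAllExtractors(): self
    {
        $this->extractors = [];

        return $this;
    }
}

$registry = new ExtractorRegistrySketch();
$registry->setExtractor('js', 'JavaScript')->setExtractor('php', 'PHP');
var_dump($registry->extractors['js']);
```

Returning $this from both branches keeps the fluent interface consistent, which also removes the need for the null|Extractor return annotation.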