Completed
Push — feature/5.0.1 ( fcaa59 )
by Raúl
01:20
created

Parser::parseProperty()   C

Complexity

Conditions 7
Paths 12

Size

Total Lines 27
Code Lines 18

Duplication

Lines 0
Ratio 0 %

Importance

Changes 0
Metric Value
c 0
b 0
f 0
dl 0
loc 27
rs 6.7272
cc 7
eloc 18
nc 12
nop 2
<?php

namespace Sepia\PoParser;

use Sepia\PoParser\Catalog\EntryFactory;
use Sepia\PoParser\Exception\ParseException;
use Sepia\PoParser\SourceHandler\FileSystem;
use Sepia\PoParser\SourceHandler\SourceHandler;
use Sepia\PoParser\SourceHandler\StringSource;

/**
 *    Copyright (c) 2012 Raúl Ferràs [email protected]
 *    All rights reserved.
 *
 *    Redistribution and use in source and binary forms, with or without
 *    modification, are permitted provided that the following conditions
 *    are met:
 *    1. Redistributions of source code must retain the above copyright
 *       notice, this list of conditions and the following disclaimer.
 *    2. Redistributions in binary form must reproduce the above copyright
 *       notice, this list of conditions and the following disclaimer in the
 *       documentation and/or other materials provided with the distribution.
 *    3. Neither the name of copyright holders nor the names of its
 *       contributors may be used to endorse or promote products derived
 *       from this software without specific prior written permission.
 *
 *    THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
 *    ''AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
 *    TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
 *    PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL COPYRIGHT HOLDERS OR CONTRIBUTORS
 *    BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
 *    CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
 *    SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
 *    INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
 *    CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
 *    ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
 *    POSSIBILITY OF SUCH DAMAGE.
 *
 * https://github.com/raulferras/PHP-po-parser
 *
 * Class to parse .po file and extract its strings.
 *
 * @version 5.0
 */
class Parser
{
    /** @var SourceHandler */
    protected $sourceHandler;

    /** @var int */
    protected $lineNumber;

    /** @var string */
    protected $property;

    /**
     * Reads and parses a string
     *
     * @param string $string po content
     *
     * @throws \Exception
     * @return Parser
     */
    public static function parseString($string)
    {
        $parser = new Parser(new StringSource($string));
        $parser->parse();

        return $parser;
    }

    /**
     * Reads and parses a file
     *
     * @param string $filePath
     *
     * @throws \Exception
     * @return Catalog
     */
    public static function parseFile($filePath)
    {
        $parser = new Parser(new FileSystem($filePath));

        return $parser->parse();
    }

    public function __construct(SourceHandler $sourceHandler)
    {
        $this->sourceHandler = $sourceHandler;
    }

    /**
     * Reads and parses strings of a .po file.
     *
     * @throws \Exception
     * @throws \InvalidArgumentException
     * @throws ParseException
     * @return Catalog
     */
    public function parse()
    {
        $catalog = new Catalog();
        $this->lineNumber = 0;
        $entry = array();
        $this->mode = null;     // current mode
Bug: The property mode does not exist. Did you maybe forget to declare it?

In PHP it is possible to write to properties without declaring them. For example, the following is perfectly valid PHP code:

class MyClass { }

$x = new MyClass();
$x->foo = true;

Generally, it is a good practice to explicitly declare properties to avoid accidental typos and provide IDE auto-completion:

class MyClass {
    public $foo;
}

$x = new MyClass();
$x->foo = true;
        $this->property = null; // current property

        // Flags
        $headersFound = false;

        // A new entry has been just inserted.

        // Used to remember last key in a multiline previous entry.
        $lastPreviousKey = null;
Unused Code: $lastPreviousKey is not used, you could remove the assignment.

This check looks for variable assignments that are either overwritten by other assignments or where the variable is not used subsequently.

$myVar = 'Value';
$higher = false;

if (rand(1, 6) > 3) {
    $higher = true;
} else {
    $higher = false;
}

Both the $myVar assignment in line 1 and the $higher assignment in line 2 are dead. The first because $myVar is never used and the second because $higher is always overwritten for every possible time line.

        $state = null;
Unused Code: $state is not used, you could remove the assignment.

        while (!$this->sourceHandler->ended()) {
Coding Style: Blank line found at start of control structure

            $line = trim($this->sourceHandler->getNextLine());

            if ($this->shouldIgnoreLine($line, $entry)) {
                $this->lineNumber++;
                continue;
            }

            if ($this->shouldCloseEntry($line, $entry)) {
                if (!$headersFound && $this->isHeader($entry)) {
                    $headersFound = true;
                    $catalog->addHeaders(array_filter(explode('\\n', $entry['msgstr'])));
                } else {
                    $catalog->addEntry(EntryFactory::createFromArray($entry));
                }

                $entry = array();
                $this->mode = null;
                $this->property = null;

                if (empty($line)) {
                    $this->lineNumber++;
                    continue;
                }
            }

            $firstChar = strlen($line) > 0 ? $line[0] : '';

            switch ($firstChar) {
                case '#':
                    $entry = $this->parseComment($line, $entry);
                    break;

                case 'm':
                    $entry = $this->parseProperty($line, $entry);
                    break;

                case '"':
                    $entry = $this->parseMultiline($line, $entry);
                    break;
            }

            $this->lineNumber++;
            continue;

            $split = preg_split('/\s+/ ', $line, 2);
Unused Code: $split = preg_split('/\\s+/ ', $line, 2); does not seem to be reachable.

This check looks for unreachable code. It uses sophisticated control flow analysis techniques to find statements which will never be executed.

Unreachable code is most often the result of return, die or exit statements that have been added for debug purposes.

function fx() {
    try {
        doSomething();
        return true;
    }
    catch (\Exception $e) {
        return false;
    }

    return false;
}

In the above example, the last return false will never be executed, because a return statement has already been met in every possible execution path.

            $lineKey = $split[0];

            if (empty($line) && count($entry) === 0) {
                $lineNumber++;
                continue;
            }

            // If a blank line is found, or a new msgid when already got one
            if ($line === '' || ($lineKey === 'msgid' && isset($entry['msgid']))) {
                // Two consecutive blank lines
                if ($justNewEntry) {
                    $lineNumber++;
                    continue;
                }

                if ($firstLine) {
                    $firstLine = false;
Unused Code: $firstLine is not used, you could remove the assignment.
                    if (self::isHeader($entry)) {
                        $catalog->addHeaders(array_filter(explode('\\n', $entry['msgstr'])));
                    } else {
                        $catalog->addEntry(EntryFactory::createFromArray($entry));
                    }
                } else {
                    // A new entry is found!
                    $catalog->addEntry(EntryFactory::createFromArray($entry));
                }

                $entry = array();
                $state = null;
                $justNewEntry = true;
Unused Code: $justNewEntry is not used, you could remove the assignment.
                $lastPreviousKey = null;
                if ($line === '') {
                    $lineNumber++;
                    continue;
                }
            }

            $justNewEntry = false; // ?
Unused Code: $justNewEntry is not used, you could remove the assignment.
            $data = isset($split[1]) ? $split[1] : null;

            switch ($lineKey) {
                // Flagged translation
                case '#,':
                    $entry['flags'] = preg_split('/,\s*/', $data);
                    break;

                // # Translator comments
                case '#':
                    $entry['tcomment'] = !isset($entry['tcomment']) ? array() : $entry['tcomment'];
                    $entry['tcomment'][] = $data;
                    break;

                // #. Comments extracted from source code
                case '#.':
                    $entry['ccomment'] = !isset($entry['ccomment']) ? array() : $entry['ccomment'];
                    $entry['ccomment'][] = $data;
                    break;

                // Reference
                case '#:':
                    $entry['reference'][] = addslashes($data);
                    break;

                case '#|':      // #| Previous untranslated string
                case '#~':      // #~ Old entry
                case '#~|':     // #~| Previous-Old untranslated string. Reported by @Cellard
                    switch ($lineKey) {
                        case '#|':
                            $lineKey = 'previous';
                            break;
                        case '#~':
                            $lineKey = 'obsolete';
                            break;
                        case '#~|':
                            $lineKey = 'previous-obsolete';
                            break;
                    }

                    $tmpParts = explode(' ', $data);
                    $tmpKey = $tmpParts[0];

                    if (!in_array($tmpKey, array('msgid', 'msgid_plural', 'msgstr', 'msgctxt'), true)) {
                        // If there is a multi-line previous string we must remember
                        // what key was first line.
                        $tmpKey = $lastPreviousKey;
                        $str = $data;
                    } else {
                        $str = implode(' ', array_slice($tmpParts, 1));
                    }

                    //array('obsolete' => true, 'msgid' => '', 'msgstr' => '');

                    if (strpos($lineKey, 'obsolete') !== false) {
                        $entry['obsolete'] = true;
                        switch ($tmpKey) {
                            case 'msgid':
                                if (!isset($entry['msgid'])) {
                                    $entry['msgid'] = '';
                                }
                                $entry['msgid'] .= trim($str, '"');
                                $lastPreviousKey = $tmpKey;
Unused Code: $lastPreviousKey is not used, you could remove the assignment.
                                break;

                            case 'msgstr':
                                if (!isset($entry['msgstr'])) {
                                    $entry['msgstr'] = '';
                                }
                                $entry['msgstr'] .= trim($str, '"');
                                $lastPreviousKey = $tmpKey;
Unused Code: $lastPreviousKey is not used, you could remove the assignment.
                                break;

                            case 'msgctxt':
                                $entry['msgctxt'] = trim($str, '"');
                                $lastPreviousKey = $tmpKey;
Unused Code: $lastPreviousKey is not used, you could remove the assignment.
                                break;

                            default:
                                break;
                        }
                    } else {
                        $entry[$lineKey] = isset($entry[$lineKey]) ? $entry[$lineKey] : array(
                            'msgid' => '',
                            'msgstr' => '',
                        );
                    }

                    if ($lineKey !== 'obsolete') {
                        switch ($tmpKey) {
                            case 'msgid':
                            case 'msgid_plural':
                            case 'msgstr':
                                if (!isset($entry[$lineKey][$tmpKey])) {
                                    $entry[$lineKey][$tmpKey] = '';
                                }
                                $entry[$lineKey][$tmpKey] .= trim($str, '"');
                                $lastPreviousKey = $tmpKey;
Unused Code: $lastPreviousKey is not used, you could remove the assignment.
                                break;

                            default:
                                $entry[$lineKey][$tmpKey] = $str;
                                break;
                        }
                    }
                    break;

                // context
                // Allows disambiguations of different messages that have same msgid.
                // Example:
                //
                // #: tools/observinglist.cpp:700
                // msgctxt "First letter in 'Scope'"
                // msgid "S"
                // msgstr ""
                //
                // #: skycomponents/horizoncomponent.cpp:429
                // msgctxt "South"
                // msgid "S"
                // msgstr ""
                case 'msgctxt':
                    // untranslated-string
                case 'msgid':
                    // untranslated-string-plural
                case 'msgid_plural':
                    $state = $lineKey;
                    if (!isset($entry[$state])) {
                        $entry[$state] = '';
                    }

                    $entry[$state] .= trim($data, '"');
                    break;
                // translated-string
                case 'msgstr':
                    $state = 'msgstr';
                    $entry[$state] = trim($data, '"');
                    break;

                default:
                    if (strpos($lineKey, 'msgstr[') !== false) {
                        // translated-string-case-n
                        $state = $lineKey;
                        $entry[$state] = trim($data, '"');
                    } else {
                        // "multiline" lines
                        switch ($state) {
                            case 'msgctxt':
                            case 'msgid':
                            case 'msgid_plural':
                            case (strpos($state, 'msgstr[') !== false):
                                if (!isset($entry[$state])) {
                                    $entry[$state] = '';
                                }

                                if (is_string($entry[$state])) {
                                    // Convert it to array
                                    //$entry[$state] = array($entry[$state]);
                                    $entry[$state] = trim($entry[$state], '"');
                                }
                                $entry[$state] .= trim($line, '"');
                                break;

                            case 'msgstr':
                                // Special fix where msgid is ""
                                $entry['msgstr'] .= trim($line, '"');
                                /*if ($entry['msgid'] === "\"\"") {
                                    $entry['msgstr'].= trim($line, '"');
                                } else {
                                    $entry['msgstr'].= $line;
                                }*/
                                break;

                            default:
                                throw new \Exception(
                                    'Parser: Parse error! Unknown key "'.$lineKey.'" on line '.($lineNumber + 1)
                                );
                        }
                    }
                    break;
            }

            $lineNumber++;
        }
        $this->sourceHandler->close();

        // add final entry
        if (count($entry)) {
            $catalog->addEntry(EntryFactory::createFromArray($entry));
        }

        return $catalog;
    }

    protected function shouldIgnoreLine($line, array $entry)
    {
        return empty($line) && count($entry) === 0;
    }

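    protected function shouldCloseEntry($line, array $entry)
    {
        // The property name is the first whitespace-delimited token of the line.
        $tokens = $this->getProperty($line);
        $lineKey = $tokens[0];

        // Close on a blank line, or on a new msgid when the entry already has one.
        return ($line === '' || ($lineKey === 'msgid' && isset($entry['msgid'])));
    }
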
    /**
     * Checks whether an entry is the header entry: its msgid must be empty
     * and its msgstr must contain every standard header field.
     *
     * @param array $entry
     *
     * @return bool
     */
    protected function isHeader(array $entry)
    {
        if (empty($entry) || !isset($entry['msgstr'])) {
            return false;
        }

        if (!isset($entry['msgid']) || !empty($entry['msgid'])) {
            return false;
        }

        $headerKeys = array(
            'Project-Id-Version:' => false,
            'Report-Msgid-Bugs-To:' => false,
            'POT-Creation-Date:' => false,
            'PO-Revision-Date:' => false,
            'Last-Translator:' => false,
            'Language-Team:' => false,
            'MIME-Version:' => false,
            'Content-Type:' => false,
            'Content-Transfer-Encoding:' => false,
            'Plural-Forms:' => false,
        );
        $count = count($headerKeys);
        $keys = array_keys($headerKeys);

        $headerItems = 0;
        $lines = explode("\\n", $entry['msgstr']);

        foreach ($lines as $str) {
            $tokens = explode(':', $str);
            $tokens[0] = trim($tokens[0], '"').':';

            if (in_array($tokens[0], $keys, true)) {
                $headerItems++;
                unset($headerKeys[$tokens[0]]);
                $keys = array_keys($headerKeys);
            }
        }

        return $headerItems === $count;
    }

    /**
     * Splits a line into its property name and the remainder of the line.
     *
     * @param string $line
     *
     * @return array
     */
    protected function getProperty($line)
    {
        $tokens = preg_split('/\s+/', $line, 2);

        return $tokens;
    }

    /**
     * @param string $line
     * @param array  $entry
     *
     * @return array
     * @throws ParseException
     */
    private function parseProperty($line, array $entry)
    {
        // A bare key such as `msgid` yields a single token; pad so $value is always defined.
        list($key, $value) = array_pad($this->getProperty($line), 2, '');

        if (!isset($entry[$key])) {
            $entry[$key] = '';
        }

        switch (true) {
            case $key === 'msgctxt':
            case $key === 'msgid':
            case $key === 'msgid_plural':
            case $key === 'msgstr':
                $entry[$key] .= trim($value, '"');
                // Remember the property so quoted continuation lines append to it.
                $this->property = $key;
                break;

            case strpos($key, 'msgstr[') !== false:
                // Plural translation, e.g. msgstr[0].
                $entry[$key] .= trim($value, '"');
                break;

            default:
                throw new ParseException(sprintf('Could not parse %s at line %d', $key, $this->lineNumber));
        }

        return $entry;
    }

    /**
     * @param string $line
     * @param array  $entry
     *
     * @return array
     * @throws ParseException
     */
    private function parseMultiline($line, $entry)
    {
        switch ($this->property) {
            case 'msgctxt':
            case 'msgid':
            case 'msgid_plural':
            case 'msgstr':
                $entry[$this->property] .= trim($line, '"');
                break;

            default:
                throw new ParseException(
                    sprintf('Error parsing property %s as multiline.', $this->property)
                );
        }

        return $entry;
    }

    /**
     * @param string $line
     * @param array  $entry
     *
     * @return array
     */
    private function parseComment($line, $entry)
Unused Code: The parameter $line is not used and could be removed.

This check looks for parameters that have been defined for a function or method, but which are not used in the method body.

    {
        return $entry;
    }
}
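
The way getProperty() and parseProperty() cooperate can be illustrated outside the class. The following is a minimal standalone sketch, not part of the library: splitPropertyLine() is a hypothetical helper that splits a line on its first run of whitespace and strips the surrounding double quotes from the value, which is the approach the two methods above take.

```php
<?php

/**
 * Hypothetical helper (not part of Sepia\PoParser): splits a PO line such as
 * `msgid "Hello"` into its key and its unquoted value.
 */
function splitPropertyLine($line)
{
    // Split on the first run of whitespace only, so spaces inside the value survive.
    $tokens = preg_split('/\s+/', trim($line), 2);

    $key = $tokens[0];
    // A bare key with no value yields a single token.
    $value = isset($tokens[1]) ? trim($tokens[1], '"') : '';

    return array($key, $value);
}

list($key, $value) = splitPropertyLine('msgid "Hello world"');
echo $key, ' => ', $value, PHP_EOL;

list($pluralKey, $pluralValue) = splitPropertyLine('msgstr[1] "Dos archivos"');
echo $pluralKey, ' => ', $pluralValue, PHP_EOL;
```

Because trim() removes every leading and trailing double quote, a value that ends in an escaped quote loses that quote as well. A stricter parser would remove exactly one quote from each end.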
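
The header check in isHeader() can also be tried in isolation. This is a simplified sketch under stated assumptions: looksLikeHeader() is a hypothetical function, the list of required field names is passed in by the caller rather than fixed, and the msgstr is assumed to be already concatenated with literal backslash-n separators, as the parser stores it.

```php
<?php

/**
 * Hypothetical helper (not part of Sepia\PoParser): reports whether an entry
 * looks like a PO header, given the header field names that must be present.
 */
function looksLikeHeader(array $entry, array $requiredFields)
{
    // The header entry is the one whose msgid is the empty string.
    if (!isset($entry['msgid'], $entry['msgstr']) || $entry['msgid'] !== '') {
        return false;
    }

    $found = array();
    // Single quotes: the delimiter is the two characters backslash and "n",
    // which is how the concatenated msgstr separates header fields.
    foreach (explode('\n', $entry['msgstr']) as $field) {
        $parts = explode(':', $field, 2);
        $name = trim($parts[0], '" ');
        if (in_array($name, $requiredFields, true)) {
            $found[$name] = true;
        }
    }

    return count($found) === count($requiredFields);
}

$header = array(
    'msgid' => '',
    'msgstr' => 'Project-Id-Version: demo\nContent-Type: text/plain; charset=UTF-8\n',
);
$message = array('msgid' => 'Hello', 'msgstr' => 'Hola');

var_dump(looksLikeHeader($header, array('Project-Id-Version', 'Content-Type')));
var_dump(looksLikeHeader($message, array('Project-Id-Version', 'Content-Type')));
```

Collecting the matched names as array keys means a field that appears twice is counted only once, so a duplicated field cannot stand in for a missing one.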