Completed
Pull Request — master (#579)
by Richard
07:15
created

MessageFormatter::tokenizePattern()   C

Complexity

Conditions 12
Paths 37

Size

Total Lines 40
Code Lines 28

Duplication

Lines 0
Ratio 0 %

Code Coverage

Tests 28
CRAP Score 12

Importance

Changes 0
Metric Value
cc 12
eloc 28
nc 37
nop 1
dl 0
loc 40
ccs 28
cts 28
cp 1
crap 12
rs 5.1612
c 0
b 0
f 0
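
The CRAP score above follows from the other metrics. As a minimal sketch, assuming the commonly cited CRAP formula (complexity squared, times the uncovered fraction cubed, plus complexity), and reading `cc` as cyclomatic complexity and `cp` as the covered fraction (both readings are assumptions about this table's abbreviations), a fully covered method with complexity 12 scores exactly 12. The function name `crapScore` is hypothetical.

```php
<?php
/**
 * Hypothetical helper illustrating the CRAP formula:
 *   CRAP = complexity^2 * (1 - coverage)^3 + complexity
 *
 * @param int   $complexity cyclomatic complexity of the method
 * @param float $coverage   covered fraction between 0.0 and 1.0
 */
function crapScore(int $complexity, float $coverage): float
{
    return $complexity ** 2 * (1 - $coverage) ** 3 + $complexity;
}

// With the values reported for tokenizePattern(): cc = 12, cp = 1
echo crapScore(12, 1.0), PHP_EOL;
```

With full coverage the first term vanishes, so the score can only drop further by lowering the complexity itself.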

How to fix   Complexity   

Long Method

Small methods make your code easier to understand, in particular if combined with a good name. Besides, if your method is small, finding a good name is usually much easier.

For example, if you find yourself adding comments to a method's body, this is usually a good sign to extract the commented part to a new method, and use the comment as a starting point when coming up with a good name for this new method.
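
As an illustration of that advice, the `// replace named arguments` comment inside `parse()` in the listing further down marks a step that could become its own method. The sketch below is an assumption about how that extraction might look, not code from the pull request; the name `replaceNamedWithNumeric` is hypothetical and is derived from the comment, and it is written as a plain function so it can run on its own.

```php
<?php
/**
 * Hypothetical extracted step: rewrite named placeholders as numeric ones.
 *
 * $tokens mixes plain strings with arrays holding the comma-separated
 * parts of a placeholder, as produced by a tokenizer.
 *
 * @return array [rewritten pattern, numeric index => original name]
 */
function replaceNamedWithNumeric(array $tokens): array
{
    $map = [];
    foreach ($tokens as $i => $token) {
        if (!is_array($token)) {
            continue; // plain text is kept unchanged
        }
        $param = trim($token[0]);
        if (!isset($map[$param])) {
            $map[$param] = count($map); // first occurrence gets the next index
        }
        $token[0] = $map[$param];
        $tokens[$i] = '{' . implode(',', $token) . '}';
    }

    return [implode('', $tokens), array_flip($map)];
}

// Usage: the caller shrinks to a single, self-describing line.
[$pattern, $names] = replaceNamedWithNumeric(
    ['Hi ', ['name'], ', you are ', ['age', 'number'], '.']
);
```

The comment that used to explain the block now lives on as the function name, which is the point of the refactoring.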

Commonly applied refactorings include Extract Method.

<?php
/**
 * This code is derived from code in the Yii framework which stipulates this notice.
 *
 * The Yii framework is free software. It is released under the terms of the following BSD License.
 *
 * Copyright © 2008-2018 by Yii Software LLC, All rights reserved.
 *
 * Redistribution and use in source and binary forms, with or without modification, are permitted provided that the
 * following conditions are met:
 *
 * - Redistributions of source code must retain the above copyright notice, this list of
 *   conditions and the following disclaimer.
 * - Redistributions in binary form must reproduce the above copyright notice, this list of
 *   conditions and the following disclaimer in the documentation and/or other materials provided
 *   with the distribution.
 * - Neither the name of Yii Software LLC nor the names of its contributors may be used to endorse
 *   or promote products derived from this software without specific prior written permission.
 *
 * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS
 * OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY
 * AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
 * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
 * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
 * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
 * WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY
 * WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 *
 * @link http://www.yiiframework.com/
 * @copyright Copyright (c) 2008 Yii Software LLC
 * @license http://www.yiiframework.com/license/
 */

namespace Xoops\Locale;

use Xoops\Core\Exception\NotSupportedException;

/**
 * MessageFormatter allows formatting messages via [ICU message format](http://userguide.icu-project.org/formatparse/messages).
 *
 * This class enhances the message formatter class provided by the PHP intl extension.
 *
 * The following enhancements are provided:
 *
 * - It accepts named arguments and mixed numeric and named arguments.
 * - Issues no error when an insufficient number of arguments have been provided. Instead, the placeholders will not be
 *   substituted.
 * - Fixes PHP 5.5 weird placeholder replacement in case no arguments are provided at all (https://bugs.php.net/bug.php?id=65920).
 * - Offers limited support for message formatting in case PHP intl extension is not installed.
 *   However it is highly recommended that you install [PHP intl extension](http://php.net/manual/en/book.intl.php) if you want
 *   to use MessageFormatter features.
 *
 *   The fallback implementation only supports the following message formats:
 *   - plural formatting for english ('one' and 'other' selectors)
 *   - select format
 *   - simple parameters
 *   - integer number parameters
 *
 *   The fallback implementation does NOT support the ['apostrophe-friendly' syntax](http://www.php.net/manual/en/messageformatter.formatmessage.php).
 *   Also messages that are working with the fallback implementation are not necessarily compatible with the
 *   PHP intl MessageFormatter so do not rely on the fallback if you are able to install intl extension somehow.
 *
 * @property string $errorCode Code of the last error. This property is read-only.
 * @property string $errorMessage Description of the last error. This property is read-only.
 *
 * @author Alexander Makarov <[email protected]>
 * @author Carsten Brandt <[email protected]>
 * @since 2.0
 */
class MessageFormatter
{
    private $_errorCode = 0;
    private $_errorMessage = '';


    /**
     * Get the error code from the last operation.
     * @link http://php.net/manual/en/messageformatter.geterrorcode.php
     * @return string Code of the last error.
     */
    public function getErrorCode()
    {
        return $this->_errorCode;
    }

    /**
     * Get the error text from the last operation.
     * @link http://php.net/manual/en/messageformatter.geterrormessage.php
     * @return string Description of the last error.
     */
    public function getErrorMessage()
    {
        return $this->_errorMessage;
    }

    /**
     * Formats a message via [ICU message format](http://userguide.icu-project.org/formatparse/messages).
     *
     * It uses the PHP intl extension's [MessageFormatter](http://www.php.net/manual/en/class.messageformatter.php)
     * and works around some issues.
     * If PHP intl is not installed a fallback will be used that supports a subset of the ICU message format.
     *
     * @param string $pattern The pattern string to insert parameters into.
     * @param array $params The array of name value pairs to insert into the format string.
     * @param string $language The locale to use for formatting locale-dependent parts
     * @return string|false The formatted pattern string or `false` if an error occurred
     */
    public function format($pattern, $params, $language)
    {
        $this->_errorCode = 0;
        $this->_errorMessage = '';

        if ($params === []) {
            return $pattern;
        }

        if (!class_exists('MessageFormatter', false)) {
            return $this->fallbackFormat($pattern, $params, $language);
        }

        // replace named arguments (https://github.com/yiisoft/yii2/issues/9678)
        $newParams = [];
        $pattern = $this->replaceNamedArguments($pattern, $params, $newParams);
        $params = $newParams;

        try {
            $formatter = new \MessageFormatter($language, $pattern);

            if ($formatter === null) {
                // formatter may be null in PHP 5.x
                $this->_errorCode = intl_get_error_code();
                $this->_errorMessage = 'Message pattern is invalid: ' . intl_get_error_message();
                return false;
            }
        } catch (\IntlException $e) {
            // IntlException is thrown since PHP 7
            $this->_errorCode = $e->getCode();
            $this->_errorMessage = 'Message pattern is invalid: ' . $e->getMessage();
            return false;
        } catch (\Exception $e) {
            // Exception is thrown by HHVM
            $this->_errorCode = $e->getCode();
            $this->_errorMessage = 'Message pattern is invalid: ' . $e->getMessage();
            return false;
        }

        $result = $formatter->format($params);

        if ($result === false) {
            $this->_errorCode = $formatter->getErrorCode();
            $this->_errorMessage = $formatter->getErrorMessage();
            return false;
        }

        return $result;
    }

    /**
     * Parses an input string according to an [ICU message format](http://userguide.icu-project.org/formatparse/messages) pattern.
     *
     * It uses the PHP intl extension's [MessageFormatter::parse()](http://www.php.net/manual/en/messageformatter.parsemessage.php)
     * and adds support for named arguments.
     * Usage of this method requires PHP intl extension to be installed.
     *
     * @param string $pattern The pattern to use for parsing the message.
     * @param string $message The message to parse, conforming to the pattern.
     * @param string $language The locale to use for formatting locale-dependent parts
     * @return array|bool An array containing items extracted, or `FALSE` on error.
     * @throws \Xoops\Core\Exception\NotSupportedException when PHP intl extension is not installed.
     */
    public function parse($pattern, $message, $language)
    {
        $this->_errorCode = 0;
        $this->_errorMessage = '';

        if (!class_exists('MessageFormatter', false)) {
            throw new NotSupportedException('You have to install PHP intl extension to use this feature.');
        }

        // replace named arguments
        if (($tokens = self::tokenizePattern($pattern)) === false) {
            $this->_errorCode = -1;
            $this->_errorMessage = 'Message pattern is invalid.';

            return false;
        }
        $map = [];
        foreach ($tokens as $i => $token) {
            if (is_array($token)) {
                $param = trim($token[0]);
                if (!isset($map[$param])) {
                    $map[$param] = count($map);
                }
                $token[0] = $map[$param];
                $tokens[$i] = '{' . implode(',', $token) . '}';
            }
        }
        $pattern = implode('', $tokens);
        $map = array_flip($map);

        $formatter = new \MessageFormatter($language, $pattern);
        if ($formatter === null) {
            $this->_errorCode = -1;
            $this->_errorMessage = 'Message pattern is invalid.';

            return false;
        }
        $result = $formatter->parse($message);
        if ($result === false) {
            $this->_errorCode = $formatter->getErrorCode();
            $this->_errorMessage = $formatter->getErrorMessage();

            return false;
        }

        $values = [];
        foreach ($result as $key => $value) {
            $values[$map[$key]] = $value;
        }

        return $values;
    }

    /**
     * Replace named placeholders with numeric placeholders and quote unused.
     *
     * @param string $pattern The pattern string to replace things into.
     * @param array $givenParams The array of values to insert into the format string.
     * @param array $resultingParams Modified array of parameters.
     * @param array $map
     * @return string The pattern string with placeholders replaced.
     */
    private function replaceNamedArguments($pattern, $givenParams, &$resultingParams = [], &$map = [])
    {
        if (($tokens = self::tokenizePattern($pattern)) === false) {
            return false;
            // [Scrutinizer] Bug (Best Practice): The expression `return false` returns the type false,
            // which is incompatible with the documented return type string.
        }
        foreach ($tokens as $i => $token) {
            if (!is_array($token)) {
                continue;
            }
            $param = trim($token[0]);
            if (array_key_exists($param, $givenParams)) {
                // if param is given, replace it with a number
                if (!isset($map[$param])) {
                    $map[$param] = count($map);
                    // make sure only used params are passed to format method
                    $resultingParams[$map[$param]] = $givenParams[$param];
                }
                $token[0] = $map[$param];
                $quote = '';
            } else {
                // quote unused token
                $quote = "'";
            }
            $type = isset($token[1]) ? trim($token[1]) : 'none';
            // replace plural and select format recursively
            if ($type === 'plural' || $type === 'select') {
                if (!isset($token[2])) {
                    return false;
                    // [Scrutinizer] Bug (Best Practice): The expression `return false` returns the type false,
                    // which is incompatible with the documented return type string.
                }
                if (($subtokens = self::tokenizePattern($token[2])) === false) {
                    return false;
                    // [Scrutinizer] Bug (Best Practice): The expression `return false` returns the type false,
                    // which is incompatible with the documented return type string.
                }
                $c = count($subtokens);
                // [Scrutinizer] Bug: It seems like $subtokens can also be of type true; however, parameter $var of
                // count() only seems to accept Countable|array. Maybe add an additional type check? If this is a
                // false positive, the issue can also be ignored via the ignore-type annotation:
                //     $c = count(/** @scrutinizer ignore-type */ $subtokens);
                for ($k = 0; $k + 1 < $c; $k++) {
                    if (is_array($subtokens[$k]) || !is_array($subtokens[++$k])) {
                        return false;
                        // [Scrutinizer] Bug (Best Practice): The expression `return false` returns the type false,
                        // which is incompatible with the documented return type string.
                    }
                    $subpattern = $this->replaceNamedArguments(implode(',', $subtokens[$k]), $givenParams, $resultingParams, $map);
                    $subtokens[$k] = $quote . '{' . $quote . $subpattern . $quote . '}' . $quote;
                }
                $token[2] = implode('', $subtokens);
                // [Scrutinizer] Bug: It seems like $subtokens can also be of type true; however, parameter $pieces of
                // implode() only seems to accept array. Maybe add an additional type check? If this is a
                // false positive, the issue can also be ignored via the ignore-type annotation:
                //     $token[2] = implode('', /** @scrutinizer ignore-type */ $subtokens);
            }
            $tokens[$i] = $quote . '{' . $quote . implode(',', $token) . $quote . '}' . $quote;
        }

        return implode('', $tokens);
    }

    /**
     * Fallback implementation for MessageFormatter::formatMessage.
     * @param string $pattern The pattern string to insert things into.
     * @param array $args The array of values to insert into the format string
     * @param string $locale The locale to use for formatting locale-dependent parts
     * @return false|string The formatted pattern string or `false` if an error occurred
     */
    protected function fallbackFormat($pattern, $args, $locale)
    {
        if (($tokens = self::tokenizePattern($pattern)) === false) {
            $this->_errorCode = -1;
            $this->_errorMessage = 'Message pattern is invalid.';

            return false;
        }
        foreach ($tokens as $i => $token) {
            if (is_array($token)) {
                if (($tokens[$i] = $this->parseToken($token, $args, $locale)) === false) {
                    $this->_errorCode = -1;
                    $this->_errorMessage = 'Message pattern is invalid.';

                    return false;
                }
            }
        }

        return implode('', $tokens);
    }

    /**
     * Tokenizes a pattern by separating normal text from replaceable patterns.
     * @param string $pattern pattern to tokenize
     * @return array|bool array of tokens or false on failure
     */
    private static function tokenizePattern($pattern)
    {
        $charset = \XoopsLocale::getCharset();
        $depth = 1;
        if (($start = $pos = mb_strpos($pattern, '{', 0, $charset)) === false) {
            return [$pattern];
        }
        $tokens = [mb_substr($pattern, 0, $pos, $charset)];
        while (true) {
            $open = mb_strpos($pattern, '{', $pos + 1, $charset);
            $close = mb_strpos($pattern, '}', $pos + 1, $charset);
            if ($open === false && $close === false) {
                break;
            }
            if ($open === false) {
                $open = mb_strlen($pattern, $charset);
            }
            if ($close > $open) {
                $depth++;
                $pos = $open;
            } else {
                $depth--;
                $pos = $close;
            }
            if ($depth === 0) {
                $tokens[] = explode(',', mb_substr($pattern, $start + 1, $pos - $start - 1, $charset), 3);
                $start = $pos + 1;
                $tokens[] = mb_substr($pattern, $start, $open - $start, $charset);
                $start = $open;
            }

            if ($depth !== 0 && ($open === false || $close === false)) {
                break;
            }
        }
        if ($depth !== 0) {
            return false;
        }

        return $tokens;
    }

    /**
     * Parses a token.
     * @param array $token the token to parse
     * @param array $args arguments to replace
     * @param string $locale the locale
     * @return bool|string parsed token or false on failure
     * @throws \Xoops\Core\Exception\NotSupportedException when unsupported formatting is used.
     */
    private function parseToken($token, $args, $locale)
    {
        // parsing pattern based on ICU grammar:
        // http://icu-project.org/apiref/icu4c/classMessageFormat.html#details
        $charset = \XoopsLocale::getCharset();
        $param = trim($token[0]);
        if (isset($args[$param])) {
            $arg = $args[$param];
        } else {
            return '{' . implode(',', $token) . '}';
        }
        $type = isset($token[1]) ? trim($token[1]) : 'none';
        switch ($type) {
            case 'date':
            case 'time':
            case 'spellout':
            case 'ordinal':
            case 'duration':
            case 'choice':
            case 'selectordinal':
                throw new NotSupportedException("Message format '$type' is not supported. You have to install PHP intl extension to use this feature.");
            case 'number':
                $format = isset($token[2]) ? trim($token[2]) : null;
                if (is_numeric($arg) && ($format === null || $format === 'integer')) {
                    $number = number_format($arg);
                    if ($format === null && ($pos = strpos($arg, '.')) !== false) {
                        // add decimals with unknown length
                        $number .= '.' . substr($arg, $pos + 1);
                    }

                    return $number;
                }
                throw new NotSupportedException("Message format 'number' is only supported for integer values. You have to install PHP intl extension to use this feature.");
            case 'none':
                return $arg;
            case 'select':
                /* http://icu-project.org/apiref/icu4c/classicu_1_1SelectFormat.html
                selectStyle = (selector '{' message '}')+
                */
                if (!isset($token[2])) {
                    return false;
                }
                $select = self::tokenizePattern($token[2]);
                $c = count($select);
                // [Scrutinizer] Bug: It seems like $select can also be of type boolean; however, parameter $var of
                // count() only seems to accept Countable|array. Maybe add an additional type check? If this is a
                // false positive, the issue can also be ignored via the ignore-type annotation:
                //     $c = count(/** @scrutinizer ignore-type */ $select);
                $message = false;
                for ($i = 0; $i + 1 < $c; $i++) {
                    if (is_array($select[$i]) || !is_array($select[$i + 1])) {
                        return false;
                    }
                    $selector = trim($select[$i++]);
                    if ($message === false && $selector === 'other' || $selector == $arg) {
                        // [Scrutinizer] Consider adding parentheses for clarity. Current Interpretation: {currentAssign},
                        // Probably Intended Meaning: {alternativeAssign}
                        $message = implode(',', $select[$i]);
                    }
                }
                if ($message !== false) {
                    return $this->fallbackFormat($message, $args, $locale);
                }
                break;
            case 'plural':
                /* http://icu-project.org/apiref/icu4c/classicu_1_1PluralFormat.html
                pluralStyle = [offsetValue] (selector '{' message '}')+
                offsetValue = "offset:" number
                selector = explicitValue | keyword
                explicitValue = '=' number  // adjacent, no white space in between
                keyword = [^[[:Pattern_Syntax:][:Pattern_White_Space:]]]+
                message: see MessageFormat
                */
                if (!isset($token[2])) {
                    return false;
                }
                $plural = self::tokenizePattern($token[2]);
                $c = count($plural);
                $message = false;
                $offset = 0;
                for ($i = 0; $i + 1 < $c; $i++) {
                    if (is_array($plural[$i]) || !is_array($plural[$i + 1])) {
                        return false;
                    }
                    $selector = trim($plural[$i++]);

                    if ($i == 1 && strncmp($selector, 'offset:', 7) === 0) {
                        $offset = (int) trim(mb_substr($selector, 7, ($pos = mb_strpos(str_replace(["\n", "\r", "\t"], ' ', $selector), ' ', 7, $charset)) - 7, $charset));
                        $selector = trim(mb_substr($selector, $pos + 1, mb_strlen($selector, $charset), $charset));
                    }
                    if ($message === false && $selector === 'other' ||
                        $selector[0] === '=' && (int) mb_substr($selector, 1, mb_strlen($selector, $charset), $charset) === $arg ||
                        $selector === 'one' && $arg - $offset == 1
                    ) {
                        $message = implode(',', str_replace('#', $arg - $offset, $plural[$i]));
                    }
                }
                if ($message !== false) {
                    return $this->fallbackFormat($message, $args, $locale);
                }
                break;
        }

        return false;
    }
}
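
One way to act on the Long Method advice for `tokenizePattern()` is to pull the brace matching into its own function. The sketch below is not the project's code and is not behavior-identical to it: `findClosingBrace()` and `tokenizePatternSketch()` are hypothetical names, the sketch uses byte-oriented string functions on the assumption that the pattern is UTF-8 (or another ASCII-compatible encoding) instead of the `mb_*` calls with `XoopsLocale::getCharset()`, and it treats a stray `}` outside a placeholder as plain text rather than rejecting the pattern.

```php
<?php
/**
 * Hypothetical helper: return the offset of the '}' that closes the '{'
 * at offset $open, or null when the braces never balance.
 */
function findClosingBrace(string $pattern, int $open): ?int
{
    $depth = 0;
    $length = strlen($pattern);
    for ($i = $open; $i < $length; $i++) {
        if ($pattern[$i] === '{') {
            $depth++;
        } elseif ($pattern[$i] === '}' && --$depth === 0) {
            return $i;
        }
    }

    return null;
}

/**
 * Split a pattern into plain-text strings and placeholder arrays.
 * Returns false when a '{' is never closed.
 */
function tokenizePatternSketch(string $pattern)
{
    $tokens = [];
    $offset = 0;
    $length = strlen($pattern);
    while ($offset < $length && ($open = strpos($pattern, '{', $offset)) !== false) {
        $close = findClosingBrace($pattern, $open);
        if ($close === null) {
            return false;
        }
        // text before the placeholder, then its (at most three) comma-separated parts
        $tokens[] = (string) substr($pattern, $offset, $open - $offset);
        $tokens[] = explode(',', (string) substr($pattern, $open + 1, $close - $open - 1), 3);
        $offset = $close + 1;
    }
    // trailing text after the last placeholder (possibly empty)
    $tokens[] = (string) substr($pattern, $offset);

    return $tokens;
}

// Usage
var_dump(tokenizePatternSketch('Hello {name}!'));
```

Each function now carries only a few conditions, so the nesting logic can be tested separately from the splitting logic.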