Passed
Pull Request — master (#39)
by Alexander
07:04
created

StringHelper::byteSubstring()   A

Complexity: Conditions 1, Paths 1
Size: Total Lines 3, Code Lines 1
Duplication: Lines 0, Ratio 0 %
Code Coverage: Tests 2, CRAP Score 1
Importance: Changes 0

Metric  Value
cc      1
eloc    1
nc      1
nop     3
dl      0
loc     3
ccs     2
cts     2
cp      1
crap    1
rs      10
c       0
b       0
f       0
<?php

declare(strict_types=1);

namespace Yiisoft\Strings;

use function array_slice;
use function htmlspecialchars;
use function mb_strlen;
use function mb_strtolower;
use function mb_strtoupper;
use function mb_substr;

/**
 * Provides static methods allowing you to deal with strings more efficiently.
 */
final class StringHelper
{
    /**
     * Returns the number of bytes in the given string.
     * This method ensures the string is treated as a byte array even if `mbstring.func_overload` is turned on
     * by using {@see mb_strlen()}.
     * @param string|null $input The string being measured for length.
     * @return int The number of bytes in the given string.
     */
    public static function byteLength(?string $input): int
    {
        return mb_strlen((string)$input, '8bit');
    }

    /**
     * Returns the portion of string specified by the start and length parameters.
     * This method ensures the string is treated as a byte array by using `mb_substr()`.
     * @param string $input The input string. Must be one character or longer.
     * @param int $start The starting position.
     * @param int|null $length The desired portion length. If not specified or `null`, there will be
     * no limit on length, i.e. the output will be until the end of the string.
     * @return string The extracted part of the string, or an empty string if nothing could be extracted.
     * @see http://www.php.net/manual/en/function.substr.php
     */
    public static function byteSubstring(string $input, int $start, ?int $length = null): string
    {
        return mb_substr($input, $start, $length ?? mb_strlen($input, '8bit'), '8bit');
    }

    /**
     * Returns the trailing name component of a path.
     * This method is similar to the PHP function `basename()` except that it will
     * treat both \ and / as directory separators, independent of the operating system.
     * This method was mainly created to work on PHP namespaces. When working with real
     * file paths, PHP's `basename()` should work fine for you.
     * Note: this method is not aware of the actual filesystem, or path components such as "..".
     *
     * @param string $path A path string.
     * @param string $suffix If the name component ends in suffix this will also be cut off.
     * @return string The trailing name component of the given path.
     * @see http://www.php.net/manual/en/function.basename.php
     */
    public static function baseName(string $path, string $suffix = ''): string
    {
        $length = mb_strlen($suffix);
        if ($length > 0 && mb_substr($path, -$length) === $suffix) {
            $path = mb_substr($path, 0, -$length);
        }
        $path = rtrim(str_replace('\\', '/', $path), '/\\');
        $position = mb_strrpos($path, '/');
        if ($position !== false) {
            return mb_substr($path, $position + 1);
        }

        return $path;
    }

    /**
     * Returns parent directory's path.
     * This method is similar to `dirname()` except that it will treat
     * both \ and / as directory separators, independent of the operating system.
     *
     * @param string $path A path string.
     * @return string The parent directory's path.
     * @see http://www.php.net/manual/en/function.dirname.php
     */
    public static function directoryName(string $path): string
    {
        $position = mb_strrpos(str_replace('\\', '/', $path), '/');
        if ($position !== false) {
            return mb_substr($path, 0, $position);
        }

        return '';
    }

    /**
     * Checks if the given string starts with the specified substring.
     * Binary and multibyte safe.
     *
     * @param string $input Input string.
     * @param string|null $with Part to search for at the start of $input.
     * @return bool Returns true if first input starts with second input, false otherwise.
     */
    public static function startsWith(string $input, ?string $with): bool
    {
        $bytes = static::byteLength($with);
        if ($bytes === 0) {
            return true;
        }

        return strncmp($input, $with, $bytes) === 0;
    }

    /**
     * Checks if the given string starts with the specified substring, ignoring case.
     * Binary and multibyte safe.
     *
     * @param string $input Input string.
     * @param string|null $with Part to search for at the start of $input.
     * @return bool Returns true if first input starts with second input, false otherwise.
     */
    public static function startsWithIgnoringCase(string $input, ?string $with): bool
    {
        $bytes = static::byteLength($with);
        if ($bytes === 0) {
            return true;
        }

        // Scrutinizer: $with can also be of type null, while parameter $string of lowercase() only accepts string.
        // The early return above already handles null; a false positive can be ignored via the ignore-type annotation.
        return static::lowercase(static::substring($input, 0, $bytes, '8bit')) === static::lowercase($with);
    }

    /**
     * Checks if the given string ends with the specified substring.
     * Binary and multibyte safe.
     *
     * @param string $input Input string to check.
     * @param string|null $with Part to search for at the end of $input.
     * @return bool Returns true if first input ends with second input, false otherwise.
     */
    public static function endsWith(string $input, ?string $with): bool
    {
        $bytes = static::byteLength($with);
        if ($bytes === 0) {
            return true;
        }

        // Warning check, see http://php.net/manual/en/function.substr-compare.php#refsect1-function.substr-compare-returnvalues
        if (static::byteLength($input) < $bytes) {
            return false;
        }

        return substr_compare($input, $with, -$bytes, $bytes) === 0;
    }

    /**
     * Checks if the given string ends with the specified substring, ignoring case.
     * Binary and multibyte safe.
     *
     * @param string $input Input string to check.
     * @param string|null $with Part to search for at the end of $input.
     * @return bool Returns true if first input ends with second input, false otherwise.
     */
    public static function endsWithIgnoringCase(string $input, ?string $with): bool
    {
        $bytes = static::byteLength($with);
        if ($bytes === 0) {
            return true;
        }

        // Scrutinizer: $with can also be of type null, while parameter $string of lowercase() only accepts string.
        // The early return above already handles null; a false positive can be ignored via the ignore-type annotation.
        return static::lowercase(mb_substr($input, -$bytes, mb_strlen($input, '8bit'), '8bit')) === static::lowercase($with);
    }

    /**
     * Truncates a string from the beginning to the number of characters specified.
     *
     * @param string $input String to process.
     * @param int $length Maximum length of the truncated string including trim marker.
     * @param string $trimMarker String to prepend to the truncated string.
     * @param string $encoding The encoding to use, defaults to "UTF-8".
     * @return string The truncated string.
     */
    public static function truncateBegin(string $input, int $length, string $trimMarker = '…', string $encoding = 'UTF-8'): string
    {
        $inputLength = mb_strlen($input, $encoding);

        if ($inputLength <= $length) {
            return $input;
        }

        $trimMarkerLength = mb_strlen($trimMarker, $encoding);
        return self::replaceSubstring($input, $trimMarker, 0, -$length + $trimMarkerLength, $encoding);
    }

    /**
     * Truncates a string in the middle, keeping its start and end.
     * `StringHelper::truncateMiddle('Hello world number 2', 8)` produces "Hell…r 2".
     *
     * @param string $input The string to truncate.
     * @param int $length Maximum length of the truncated string including trim marker.
     * @param string $trimMarker String to insert in the middle of the truncated string.
     * @param string $encoding The encoding to use, defaults to "UTF-8".
     * @return string The truncated string.
     */
    public static function truncateMiddle(string $input, int $length, string $trimMarker = '…', string $encoding = 'UTF-8'): string
    {
        $inputLength = mb_strlen($input, $encoding);

        if ($inputLength <= $length) {
            return $input;
        }

        $trimMarkerLength = mb_strlen($trimMarker, $encoding);
        $start = (int)ceil(($length - $trimMarkerLength) / 2);
        $end = $length - $start - $trimMarkerLength;

        return self::replaceSubstring($input, $trimMarker, $start, -$end, $encoding);
    }

    /**
     * Truncates a string from the end to the number of characters specified.
     *
     * @param string $input The string to truncate.
     * @param int $length Maximum length of the truncated string including trim marker.
     * @param string $trimMarker String to append to the end of truncated string.
     * @param string $encoding The encoding to use, defaults to "UTF-8".
     * @return string The truncated string.
     */
    public static function truncateEnd(string $input, int $length, string $trimMarker = '…', string $encoding = 'UTF-8'): string
    {
        $inputLength = mb_strlen($input, $encoding);

        if ($inputLength <= $length) {
            return $input;
        }

        $trimMarkerLength = mb_strlen($trimMarker, $encoding);
        return rtrim(mb_substr($input, 0, $length - $trimMarkerLength, $encoding)) . $trimMarker;
    }

    /**
     * Truncates a string to the number of words specified.
     *
     * @param string $input The string to truncate.
     * @param int $count How many words from original string to include into truncated string.
     * @param string $trimMarker String to append to the end of truncated string.
     * @return string The truncated string.
     */
    public static function truncateWords(string $input, int $count, string $trimMarker = '…'): string
    {
        $words = preg_split('/(\s+)/u', trim($input), -1, PREG_SPLIT_DELIM_CAPTURE);
        // Scrutinizer: $words can also be of type false, while count() and array_slice() only accept arrays.
        // Consider an additional type check; a false positive can be ignored via the ignore-type annotation.
        if (count($words) / 2 > $count) {
            return implode('', array_slice($words, 0, ($count * 2) - 1)) . $trimMarker;
        }

        return $input;
    }

    /**
     * Counts words in a string.
     *
     * @param string $input
     * @return int
     */
    public static function countWords(string $input): int
    {
        // Scrutinizer: preg_split() can also return false, while count() only accepts Countable|array.
        // Consider an additional type check; a false positive can be ignored via the ignore-type annotation.
        return count(preg_split('/\s+/u', $input, -1, PREG_SPLIT_NO_EMPTY));
    }

    /**
     * Explodes string into array, optionally trims values and skips empty ones.
     *
     * @param string $input String to be exploded.
     * @param string $delimiter Delimiter. Default is ','.
     * @param mixed $trim Whether to trim each element. Can be:
     *   - boolean - to trim normally;
     *   - string - custom characters to trim. Will be passed as a second argument to `trim()` function.
     *   - callable - will be called for each value instead of trim. Takes the only argument - value.
     * @param bool $skipEmpty Whether to skip empty strings between delimiters. Default is false.
     * @return array
     */
    public static function explode(string $input, string $delimiter = ',', $trim = true, bool $skipEmpty = false): array
    {
        $result = explode($delimiter, $input);
        if ($trim !== false) {
            if ($trim === true) {
                $trim = 'trim';
            } elseif (!\is_callable($trim)) {
                $trim = static function ($v) use ($trim) {
                    return trim($v, $trim);
                };
            }
            $result = array_map($trim, $result);
        }
        if ($skipEmpty) {
            // Wrapped with array_values to make array keys sequential after empty values removing
            $result = array_values(array_filter($result, static function ($value) {
                return $value !== '';
            }));
        }

        return $result;
    }

    /**
     * Encodes string into "Base 64 Encoding with URL and Filename Safe Alphabet" (RFC 4648).
     *
     * > Note: Base 64 padding `=` may be at the end of the returned string.
     * > `=` is not transparent to URL encoding.
     *
     * @see https://tools.ietf.org/html/rfc4648#page-7
     * @param string $input The string to encode.
     * @return string Encoded string.
     */
    public static function base64UrlEncode(string $input): string
    {
        return strtr(base64_encode($input), '+/', '-_');
    }

    /**
     * Decodes "Base 64 Encoding with URL and Filename Safe Alphabet" (RFC 4648).
     *
     * @see https://tools.ietf.org/html/rfc4648#page-7
     * @param string $input Encoded string.
     * @return string Decoded string.
     */
    public static function base64UrlDecode(string $input): string
    {
        return base64_decode(strtr($input, '-_', '+/'));
    }

    /**
     * Returns the string representation of a number, with commas replaced by dots if the decimal point
     * of the current locale is a comma.
     * @param int|float|string $value
     * @return string
     */
    public static function normalizeNumber($value): string
    {
        $value = (string)$value;

        $localeInfo = localeconv();
        $decimalSeparator = $localeInfo['decimal_point'] ?? null;

        if ($decimalSeparator !== null && $decimalSeparator !== '.') {
            $value = str_replace($decimalSeparator, '.', $value);
        }

        return $value;
    }

    /**
     * Safely casts a float to string independent of the current locale.
     *
     * The decimal separator will always be `.`.
     * @param float|int $number A floating point number or integer.
     * @return string The string representation of the number.
     */
    public static function floatToString($number): string
    {
        // . and , are the only decimal separators known in ICU data,
        // so it's safe to call str_replace here
        return str_replace(',', '.', (string) $number);
    }

    /**
     * Make a string's first character uppercase.
     * This method provides a unicode-safe implementation of built-in PHP function `ucfirst()`.
     *
     * @param string $string The string to be processed.
     * @param string $encoding The encoding to use, defaults to "UTF-8".
     * @return string
     * @see https://php.net/manual/en/function.ucfirst.php
     */
    public static function uppercaseFirstCharacter(string $string, string $encoding = 'UTF-8'): string
    {
        $firstCharacter = static::substring($string, 0, 1, $encoding);
        $rest = static::substring($string, 1, null, $encoding);

        return static::uppercase($firstCharacter, $encoding) . $rest;
    }

    /**
     * Uppercase the first character of each word in a string.
     * This method provides a unicode-safe implementation of built-in PHP function `ucwords()`.
     *
     * @param string $string The string to be processed.
     * @param string $encoding The encoding to use, defaults to "UTF-8".
     * @see https://php.net/manual/en/function.ucwords.php
     * @return string
     */
    public static function uppercaseFirstCharacterInEachWord(string $string, string $encoding = 'UTF-8'): string
    {
        $words = preg_split("/\s/u", $string, -1, PREG_SPLIT_NO_EMPTY);

        // Scrutinizer: $words can also be of type false, while array_map() only accepts arrays.
        // Consider an additional type check; a false positive can be ignored via the ignore-type annotation.
        $wordsWithUppercaseFirstCharacter = array_map(static function ($word) use ($encoding) {
            return static::uppercaseFirstCharacter($word, $encoding);
        }, $words);

        return implode(' ', $wordsWithUppercaseFirstCharacter);
    }

    /**
     * Get string length.
     *
     * @param string $string String to calculate length for.
     * @param string $encoding The encoding to use, defaults to "UTF-8".
     * @see https://php.net/manual/en/function.mb-strlen.php
     * @return int
     */
    public static function length(string $string, string $encoding = 'UTF-8'): int
    {
        return mb_strlen($string, $encoding);
    }

    /**
     * Get part of string.
     *
     * @param string $string To get substring from.
     * @param int $start Character to start at.
     * @param int|null $length Number of characters to get.
     * @param string $encoding The encoding to use, defaults to "UTF-8".
     * @see https://php.net/manual/en/function.mb-substr.php
     * @return string
     */
    public static function substring(string $string, int $start, ?int $length = null, string $encoding = 'UTF-8'): string
    {
        return mb_substr($string, $start, $length, $encoding);
    }

    /**
     * Make a string lowercase.
     *
     * @param string $string String to process.
     * @param string $encoding The encoding to use, defaults to "UTF-8".
     * @see https://php.net/manual/en/function.mb-strtolower.php
     * @return string
     */
    public static function lowercase(string $string, string $encoding = 'UTF-8'): string
    {
        return mb_strtolower($string, $encoding);
    }

    /**
     * Make a string uppercase.
     *
     * @param string $string String to process.
     * @param string $encoding The encoding to use, defaults to "UTF-8".
     * @see https://php.net/manual/en/function.mb-strtoupper.php
     * @return string
     */
    public static function uppercase(string $string, string $encoding = 'UTF-8'): string
    {
        return mb_strtoupper($string, $encoding);
    }

    /**
     * Replace text within a portion of a string.
     *
     * @param string $string The input string.
     * @param string $replacement The replacement string.
     * @param int $start Position to begin replacing substring at.
     * If start is non-negative, the replacing will begin at the start'th offset into string.
     * If start is negative, the replacing will begin at the start'th character from the end of string.
     * @param int|null $length Length of the substring to be replaced.
     * If given and is positive, it represents the length of the portion of string which is to be replaced.
     * If it is negative, it represents the number of characters from the end of string at which to stop replacing.
     * If it is not given, then it will default to the length of the string; i.e. end the replacing at the end of string.
     * If length is zero then this function will have the effect of inserting replacement into string at the given start offset.
     * @param string $encoding The encoding to use, defaults to "UTF-8".
     * @return string
     */
    public static function replaceSubstring(string $string, string $replacement, int $start, ?int $length = null, string $encoding = 'UTF-8'): string
    {
        $stringLength = mb_strlen($string, $encoding);

        if ($start < 0) {
            $start = \max(0, $stringLength + $start);
        } elseif ($start > $stringLength) {
            $start = $stringLength;
        }

        if ($length !== null && $length < 0) {
            $length = \max(0, $stringLength - $start + $length);
        } elseif ($length === null || $length > $stringLength) {
            $length = $stringLength;
        }

        if (($start + $length) > $stringLength) {
            $length = $stringLength - $start;
        }

        return mb_substr($string, 0, $start, $encoding) . $replacement . mb_substr($string, $start + $length, $stringLength - $start - $length, $encoding);
    }

    /**
     * Convert special characters to HTML entities.
     *
     * @param string $string String to process.
     * @param int $flags A bitmask of one or more flags.
     * @param bool $doubleEncode If set to false, method will not encode existing HTML entities.
     * @param string|null $encoding The encoding to use, defaults to `ini_get('default_charset')`.
     * @return string
     * @see https://php.net/manual/en/function.htmlspecialchars.php
     */
    public static function htmlspecialchars(string $string, int $flags, bool $doubleEncode = true, ?string $encoding = null): string
    {
        return $encoding === null && $doubleEncode
            ? htmlspecialchars($string, $flags)
            : htmlspecialchars($string, $flags, $encoding ?: ini_get('default_charset'), $doubleEncode);
    }

    /**
     * Converts a list of words into a sentence.
     *
     * Special treatment is done for the last few words. For example,
     *
     * ```php
     * $words = ['Spain', 'France'];
     * echo StringHelper::sentence($words);
     * // output: Spain and France
     *
     * $words = ['Spain', 'France', 'Italy'];
     * echo StringHelper::sentence($words);
     * // output: Spain, France and Italy
     *
     * $words = ['Spain', 'France', 'Italy'];
     * echo StringHelper::sentence($words, ' & ');
     * // output: Spain, France & Italy
     * ```
     *
     * @param array $words The words to be converted into a string.
     * @param string $twoWordsConnector The string connecting words when there are only two. Defaults to " and ".
     * @param string|null $lastWordConnector The string connecting the last two words. If this is null, it will
     * take the value of `$twoWordsConnector`.
     * @param string $connector The string connecting words other than those connected by
     * $lastWordConnector and $twoWordsConnector.
     * @return string The generated sentence.
     */
    public static function sentence(array $words, string $twoWordsConnector = ' and ', ?string $lastWordConnector = null, string $connector = ', '): ?string
    {
        if ($lastWordConnector === null) {
            $lastWordConnector = $twoWordsConnector;
        }
        switch (count($words)) {
            case 0:
                return '';
            case 1:
                return reset($words);
            case 2:
                return implode($twoWordsConnector, $words);
            default:
                return implode($connector, \array_slice($words, 0, -1)) . $lastWordConnector . end($words);
        }
    }
}
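
The inspection notes on `truncateWords()`, `countWords()` and `uppercaseFirstCharacterInEachWord()` all stem from `preg_split()` being able to return `false`. One possible way to satisfy the "additional type check" is sketched below. The function `splitWords()` is a hypothetical helper introduced only for illustration; it is not part of `StringHelper`, and throwing an exception is an assumption about the desired failure mode, not the project's chosen fix.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper (not part of StringHelper): splits a string on
 * whitespace and never hands `false` on to the caller.
 *
 * @return string[]
 */
function splitWords(string $input): array
{
    $words = preg_split('/\s+/u', $input, -1, PREG_SPLIT_NO_EMPTY);

    if ($words === false) {
        // preg_split() signals failure by returning false; surface that
        // explicitly, together with the PCRE error code.
        throw new InvalidArgumentException(
            'Unable to split input, PCRE error code ' . preg_last_error()
        );
    }

    return $words;
}
```

With such a guard in place, `count(splitWords($input))` always receives an array, so the static-analysis warning no longer applies and no `ignore-type` annotation is needed.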
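
The docblock of `base64UrlEncode()` notes that the `=` padding may remain at the end of the result and is not transparent to URL encoding. A minimal sketch of a padding-free variant follows. Both function names are hypothetical and not part of `StringHelper`; the strict-mode decoding and the exception are assumptions made for the example.

```php
<?php

declare(strict_types=1);

// Hypothetical helpers for illustration; they are not part of StringHelper.

function base64UrlEncodeUnpadded(string $input): string
{
    // Same alphabet swap as base64UrlEncode(), then drop the trailing "=" padding.
    return rtrim(strtr(base64_encode($input), '+/', '-_'), '=');
}

function base64UrlDecodeUnpadded(string $input): string
{
    // Restore the padding so the length is a multiple of four again.
    $padded = $input . str_repeat('=', (4 - strlen($input) % 4) % 4);

    // In strict mode base64_decode() returns false for invalid input.
    $decoded = base64_decode(strtr($padded, '-_', '+/'), true);
    if ($decoded === false) {
        throw new InvalidArgumentException('Input is not valid URL-safe Base64.');
    }

    return $decoded;
}
```

For example, `base64UrlEncodeUnpadded("\xff\xfe")` yields `__4`, where plain `base64_encode()` would give `//4=`, and `base64UrlDecodeUnpadded('__4')` restores the original two bytes.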