Completed
Pull Request — master (#271)
by
unknown
03:48
created

Requests_IRI (rating: D)

Complexity

Total Complexity 221

Size/Duplication

Total Lines 1019
Duplicated Lines 10.79 %

Coupling/Cohesion

Components 1
Dependencies 2

Importance

Changes 0
Metric  Value
dl      110
loc     1019
rs      4.4375
c       0
b       0
f       0
wmc     221
lcom    1
cbo     2

26 Methods

Rating  Name                                       Duplication  Size  Complexity
A       IRI::__toString()                          0            3     1
B       IRI::__set()                               0            15    8
C       IRI::__get()                               0            39    12
A       IRI::__isset()                             0            3     2
A       IRI::__unset()                             0            5     2
A       IRI::__construct()                         0            3     1
C       IRI::absolutize()                          0            64    20
C       IRI::parse_iri()                           0            24    10
C       IRI::remove_dot_segments()                 8            52    14
D       IRI::replace_invalid_with_pct_encoding()   19           113   28
D       IRI::remove_iunreserved_percent_encoded()  34           113   36
D       IRI::scheme_normalization()                18           23    15
B       IRI::is_valid()                            0            18    11
C       IRI::set_iri()                             0            39    8
A       IRI::set_scheme()                          0            13    3
C       IRI::set_authority()                       0            51    10
A       IRI::set_userinfo()                        11           11    2
C       IRI::set_host()                            0            39    7
A       IRI::set_port()                            0            15    3
A       IRI::set_path()                            0            21    4
A       IRI::set_query()                           10           10    2
A       IRI::set_fragment()                        10           10    2
A       IRI::to_uri()                              0            20    4
B       IRI::get_iri()                             0            22    6
A       IRI::get_uri()                             0            3     1
B       IRI::get_iauthority()                      0            17    7


Duplicated Code

Duplicated code is one of the most pungent code smells. A common rule of thumb is to restructure code once it is duplicated in three or more places.
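As one illustration, the checks flagged in IRI::scheme_normalization() all follow the same pattern: if a part equals the default for the current scheme, reset it. The sketch below shows how such repeated checks could be folded into one loop. It is a standalone approximation, not the library's code: the function name reset_scheme_defaults() and the array-based signature are assumptions made for this example, and it does not reproduce the extra "empty path becomes /" step of the original method.

```php
<?php
/**
 * Hypothetical helper (not part of the library): reset every IRI part that
 * equals the default registered for its scheme.
 *
 * @param array $parts    Part name => current value
 * @param array $defaults Part name => default value for the scheme
 * @return array Parts with defaulted values reset
 */
function reset_scheme_defaults(array $parts, array $defaults)
{
    foreach ($defaults as $name => $default) {
        if (array_key_exists($name, $parts) && $parts[$name] === $default) {
            // In the class above, ipath is reset to '' and every other part to null.
            $parts[$name] = ($name === 'ipath') ? '' : null;
        }
    }
    return $parts;
}

$parts = array('scheme' => 'http', 'ihost' => 'example.com', 'port' => 80, 'ipath' => '/');
$result = reset_scheme_defaults($parts, array('port' => 80));
var_dump($result['port']); // prints NULL: port 80 is the http default
```

One loop over the part names replaces six near-identical if statements, so a new normalizable part only needs a new entry in the defaults table.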

Common duplication problems, and corresponding solutions are:

Complex Class

Tip: Before tackling complexity, eliminate any duplication first. This can often reduce the size of classes significantly.

Complex classes like Requests_IRI often do a lot of different things. To break such a class down, we need to identify a cohesive component within that class. A common approach to finding such a component is to look for fields and methods that share the same prefixes or suffixes. You can also look at the cohesion graph to spot any unconnected or weakly connected components.

Once you have determined the fields that belong together, you can apply the Extract Class refactoring. If the component makes sense as a sub-class, Extract Subclass is also a candidate, and is often faster.
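A possible Extract Class candidate here is the UTF-8 lead-byte classification, which appears in both replace_invalid_with_pct_encoding() and remove_iunreserved_percent_encoded(). The sketch below is only an illustration under stated assumptions: the class name Utf8LeadByte and the method classify() are hypothetical and do not exist in the library. Note also that replace_invalid_with_pct_encoding() never treats a one-byte sequence as valid at that point, so a caller there would have to handle the length-1 result differently.

```php
<?php
/**
 * Hypothetical extracted class; the name and API are illustrative only.
 */
final class Utf8LeadByte
{
    /**
     * Classify the first byte of a UTF-8 sequence.
     *
     * @param int $value Byte value (0-255)
     * @return array|null array(initial code point bits, sequence length),
     *                    or null if the byte cannot start a sequence
     */
    public static function classify($value)
    {
        if ($value <= 0x7F) {
            return array($value, 1);                 // One byte sequence
        }
        if (($value & 0xE0) === 0xC0) {
            return array(($value & 0x1F) << 6, 2);   // Two byte sequence
        }
        if (($value & 0xF0) === 0xE0) {
            return array(($value & 0x0F) << 12, 3);  // Three byte sequence
        }
        if (($value & 0xF8) === 0xF0) {
            return array(($value & 0x07) << 18, 4);  // Four byte sequence
        }
        return null;                                 // Invalid lead byte
    }
}

// 0xE2 starts a three-byte sequence; its low four bits become bits 12-15.
list($bits, $length) = Utf8LeadByte::classify(0xE2);
printf("bits=0x%X length=%d\n", $bits, $length); // prints "bits=0x2000 length=3"
```

Both callers could then derive $remaining as $length - 1 instead of repeating the bit masks.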

While breaking up the class, it is a good idea to analyze how other classes use Requests_IRI, and based on these observations, apply Extract Interface, too.
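A minimal sketch of Extract Interface, assuming callers only need to validate an IRI and read it back as a string. The interface name IriLike, the stand-in class StaticIri, and the function describe() are all hypothetical; only is_valid() and __toString() are taken from the public methods visible in the listing.

```php
<?php
// Hypothetical interface: the name and the choice of methods are assumptions.
interface IriLike
{
    public function is_valid();
    public function __toString();
}

// Stand-in implementation so the sketch runs on its own. The real class would
// add "implements IriLike" to its declaration instead.
final class StaticIri implements IriLike
{
    private $iri;

    public function __construct($iri)
    {
        $this->iri = (string) $iri;
    }

    public function is_valid()
    {
        // Placeholder rule for the sketch, not the library's validation logic.
        return $this->iri !== '';
    }

    public function __toString()
    {
        return $this->iri;
    }
}

// Callers type-hint against the interface rather than the concrete class.
function describe(IriLike $iri)
{
    return $iri->is_valid() ? (string) $iri : '(invalid)';
}

echo describe(new StaticIri('http://example.com/')), "\n";
```

Depending on the interface keeps callers unaffected if the concrete class is later split into smaller classes.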

<?php
namespace Rmccue\Requests;

use Rmccue\Requests\IPv6 as IPv6;

/**
 * IRI parser/serialiser/normaliser
 *
 * @package Rmccue\Requests
 * @subpackage Utilities
 */

/**
 * IRI parser/serialiser/normaliser
 *
 * Copyright (c) 2007-2010, Geoffrey Sneddon and Steve Minutillo.
 * All rights reserved.
 *
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions are met:
 *
 *  * Redistributions of source code must retain the above copyright notice,
 *       this list of conditions and the following disclaimer.
 *
 *  * Redistributions in binary form must reproduce the above copyright notice,
 *       this list of conditions and the following disclaimer in the documentation
 *       and/or other materials provided with the distribution.
 *
 *  * Neither the name of the SimplePie Team nor the names of its contributors
 *       may be used to endorse or promote products derived from this software
 *       without specific prior written permission.
 *
 * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
 * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
 * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
 * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDERS AND CONTRIBUTORS BE
 * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
 * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
 * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
 * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
 * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
 * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
 * POSSIBILITY OF SUCH DAMAGE.
 *
 * @package Rmccue\Requests
 * @subpackage Utilities
 * @author Geoffrey Sneddon
 * @author Steve Minutillo
 * @copyright 2007-2009 Geoffrey Sneddon and Steve Minutillo
 * @license http://www.opensource.org/licenses/bsd-license.php
 * @link http://hg.gsnedders.com/iri/
 *
 * @property string $iri IRI we're working with
 * @property-read string $uri IRI in URI form, {@see to_uri}
 * @property string $scheme Scheme part of the IRI
 * @property string $authority Authority part, formatted for a URI (userinfo + host + port)
 * @property string $iauthority Authority part of the IRI (userinfo + host + port)
 * @property string $userinfo Userinfo part, formatted for a URI (after '://' and before '@')
 * @property string $iuserinfo Userinfo part of the IRI (after '://' and before '@')
 * @property string $host Host part, formatted for a URI
 * @property string $ihost Host part of the IRI
 * @property string $port Port part of the IRI (after ':')
 * @property string $path Path part, formatted for a URI (after first '/')
 * @property string $ipath Path part of the IRI (after first '/')
 * @property string $query Query part, formatted for a URI (after '?')
 * @property string $iquery Query part of the IRI (after '?')
 * @property string $fragment Fragment, formatted for a URI (after '#')
 * @property string $ifragment Fragment part of the IRI (after '#')
 */
class IRI {
	/**
	 * Scheme
	 *
	 * @var string
	 */
	protected $scheme = null;

	/**
	 * User Information
	 *
	 * @var string
	 */
	protected $iuserinfo = null;

	/**
	 * ihost
	 *
	 * @var string
	 */
	protected $ihost = null;

	/**
	 * Port
	 *
	 * @var string
	 */
	protected $port = null;

	/**
	 * ipath
	 *
	 * @var string
	 */
	protected $ipath = '';

	/**
	 * iquery
	 *
	 * @var string
	 */
	protected $iquery = null;

	/**
	 * ifragment
	 *
	 * @var string
	 */
	protected $ifragment = null;

	/**
	 * Normalization database
	 *
	 * Each key is the scheme, each value is an array with each key as the IRI
	 * part and value as the default value for that part.
	 */
	protected $normalization = array(
		'acap' => array(
			'port' => 674
		),
		'dict' => array(
			'port' => 2628
		),
		'file' => array(
			'ihost' => 'localhost'
		),
		'http' => array(
			'port' => 80,
		),
		'https' => array(
			'port' => 443,
		),
	);

	/**
	 * Return the entire IRI when you try and read the object as a string
	 *
	 * @return string
	 */
	public function __toString() {
		return $this->get_iri();
	}

	/**
	 * Overload __set() to provide access via properties
	 *
	 * @param string $name Property name
	 * @param mixed $value Property value
	 */
	public function __set($name, $value) {
		if (method_exists($this, 'set_' . $name)) {
			call_user_func(array($this, 'set_' . $name), $value);
		}
		elseif (
			   $name === 'iauthority'
			|| $name === 'iuserinfo'
			|| $name === 'ihost'
			|| $name === 'ipath'
			|| $name === 'iquery'
			|| $name === 'ifragment'
		) {
			call_user_func(array($this, 'set_' . substr($name, 1)), $value);
		}
	}

	/**
	 * Overload __get() to provide access via properties
	 *
	 * @param string $name Property name
	 * @return mixed
	 */
	public function __get($name) {
		// isset() returns false for null, we don't want to do that
		// Also why we use array_key_exists below instead of isset()
		$props = get_object_vars($this);

		if (
			$name === 'iri' ||
			$name === 'uri' ||
			$name === 'iauthority' ||
			$name === 'authority'
		) {
			$method = 'get_' . $name;
			$return = $this->$method();
		}
		elseif (array_key_exists($name, $props)) {
			$return = $this->$name;
		}
		// host -> ihost
		elseif (($prop = 'i' . $name) && array_key_exists($prop, $props)) {
			$name = $prop;
			$return = $this->$prop;
		}
		// ischeme -> scheme
		elseif (($prop = substr($name, 1)) && array_key_exists($prop, $props)) {
			$name = $prop;
			$return = $this->$prop;
		}
		else {
			trigger_error('Undefined property: ' . get_class($this) . '::' . $name, E_USER_NOTICE);
			$return = null;
		}

		if ($return === null && isset($this->normalization[$this->scheme][$name])) {
			return $this->normalization[$this->scheme][$name];
		}
		else {
			return $return;
		}
	}

	/**
	 * Overload __isset() to provide access via properties
	 *
	 * @param string $name Property name
	 * @return bool
	 */
	public function __isset($name) {
		return (method_exists($this, 'get_' . $name) || isset($this->$name));
	}

	/**
	 * Overload __unset() to provide access via properties
	 *
	 * @param string $name Property name
	 */
	public function __unset($name) {
		if (method_exists($this, 'set_' . $name)) {
			call_user_func(array($this, 'set_' . $name), '');
		}
	}

	/**
	 * Create a new IRI object, from a specified string
	 *
	 * @param string|null $iri
	 */
	public function __construct($iri = null) {
		$this->set_iri($iri);
	}

	/**
	 * Create a new IRI object by resolving a relative IRI
	 *
	 * Returns false if $base is not absolute, otherwise an IRI.
	 *
	 * @param IRI|string $base (Absolute) Base IRI
	 * @param IRI|string $relative Relative IRI
	 * @return IRI|false
	 */
	public static function absolutize($base, $relative) {
		if (!($relative instanceof IRI)) {
			$relative = new IRI($relative);
		}
		if (!$relative->is_valid()) {
			return false;
		}
		elseif ($relative->scheme !== null) {
			return clone $relative;
		}

		if (!($base instanceof IRI)) {
			$base = new IRI($base);
		}
		if ($base->scheme === null || !$base->is_valid()) {
			return false;
		}

		if ($relative->get_iri() !== '') {
			if ($relative->iuserinfo !== null || $relative->ihost !== null || $relative->port !== null) {
				$target = clone $relative;
				$target->scheme = $base->scheme;
			}
			else {
				$target = new IRI;
				$target->scheme = $base->scheme;
				$target->iuserinfo = $base->iuserinfo;
				$target->ihost = $base->ihost;
				$target->port = $base->port;
				if ($relative->ipath !== '') {
					if ($relative->ipath[0] === '/') {
						$target->ipath = $relative->ipath;
					}
					elseif (($base->iuserinfo !== null || $base->ihost !== null || $base->port !== null) && $base->ipath === '') {
						$target->ipath = '/' . $relative->ipath;
					}
					elseif (($last_segment = strrpos($base->ipath, '/')) !== false) {
						$target->ipath = substr($base->ipath, 0, $last_segment + 1) . $relative->ipath;
					}
					else {
						$target->ipath = $relative->ipath;
					}
					$target->ipath = $target->remove_dot_segments($target->ipath);
					$target->iquery = $relative->iquery;
				}
				else {
					$target->ipath = $base->ipath;
					if ($relative->iquery !== null) {
						$target->iquery = $relative->iquery;
					}
					elseif ($base->iquery !== null) {
						$target->iquery = $base->iquery;
					}
				}
				$target->ifragment = $relative->ifragment;
			}
		}
		else {
			$target = clone $base;
			$target->ifragment = null;
		}
		$target->scheme_normalization();
		return $target;
	}

	/**
	 * Parse an IRI into scheme/authority/path/query/fragment segments
	 *
	 * @param string $iri
	 * @return array
	 */
	protected function parse_iri($iri) {
		$iri = trim($iri, "\x20\x09\x0A\x0C\x0D");
		$has_match = preg_match('/^((?P<scheme>[^:\/?#]+):)?(\/\/(?P<authority>[^\/?#]*))?(?P<path>[^?#]*)(\?(?P<query>[^#]*))?(#(?P<fragment>.*))?$/', $iri, $match);
		// Review issue (Coding Style): the line above exceeds the maximum limit of
		// 120 characters; it contains 160 characters. Overly long lines are hard to
		// read on any screen. Most code styles therefore impose a maximum limit on
		// the number of characters in a line.
		if (!$has_match) {
			throw new Exception('Cannot parse supplied IRI', 'iri.cannot_parse', $iri);
		}

		if ($match[1] === '') {
			$match['scheme'] = null;
		}
		if (!isset($match[3]) || $match[3] === '') {
			$match['authority'] = null;
		}
		if (!isset($match[5])) {
			$match['path'] = '';
		}
		if (!isset($match[6]) || $match[6] === '') {
			$match['query'] = null;
		}
		if (!isset($match[8]) || $match[8] === '') {
			$match['fragment'] = null;
		}
		return $match;
	}

	/**
	 * Remove dot segments from a path
	 *
	 * @param string $input
	 * @return string
	 */
	protected function remove_dot_segments($input) {
		$output = '';
		while (strpos($input, './') !== false || strpos($input, '/.') !== false || $input === '.' || $input === '..') {
			// A: If the input buffer begins with a prefix of "../" or "./",
			// then remove that prefix from the input buffer; otherwise,
			if (strpos($input, '../') === 0) {
				$input = substr($input, 3);
			}
			elseif (strpos($input, './') === 0) {
				$input = substr($input, 2);
			}
			// B: if the input buffer begins with a prefix of "/./" or "/.",
			// where "." is a complete path segment, then replace that prefix
			// with "/" in the input buffer; otherwise,
			elseif (strpos($input, '/./') === 0) {
				$input = substr($input, 2);
			}
			elseif ($input === '/.') {
				$input = '/';
			}
			// C: if the input buffer begins with a prefix of "/../" or "/..",
			// where ".." is a complete path segment, then replace that prefix
			// with "/" in the input buffer and remove the last segment and its
			// preceding "/" (if any) from the output buffer; otherwise,
			elseif (strpos($input, '/../') === 0) {
				$input = substr($input, 3);
				$output = substr_replace($output, '', strrpos($output, '/'));
			}
			elseif ($input === '/..') {
				$input = '/';
				$output = substr_replace($output, '', strrpos($output, '/'));
			}
			// D: if the input buffer consists only of "." or "..", then remove
			// that from the input buffer; otherwise,
			elseif ($input === '.' || $input === '..') {
				$input = '';
			}
			// E: move the first path segment in the input buffer to the end of
			// the output buffer, including the initial "/" character (if any)
			// and any subsequent characters up to, but not including, the next
			// "/" character or the end of the input buffer
			elseif (($pos = strpos($input, '/', 1)) !== false) {
				$output .= substr($input, 0, $pos);
				$input = substr_replace($input, '', 0, $pos);
			}
			else {
				$output .= $input;
				$input = '';
			}
		}
		return $output . $input;
	}

	/**
	 * Replace invalid character with percent encoding
	 *
	 * @param string $string Input string
	 * @param string $extra_chars Valid characters not in iunreserved or
	 *                            iprivate (this is ASCII-only)
	 * @param bool $iprivate Allow iprivate
	 * @return string
	 */
	protected function replace_invalid_with_pct_encoding($string, $extra_chars, $iprivate = false) {
		// Normalize as many pct-encoded sections as possible
		$string = preg_replace_callback('/(?:%[A-Fa-f0-9]{2})+/', array(&$this, 'remove_iunreserved_percent_encoded'), $string);
		// Review issue (Coding Style): the line above exceeds the maximum limit of
		// 120 characters; it contains 122 characters.

		// Replace invalid percent characters
		$string = preg_replace('/%(?![A-Fa-f0-9]{2})/', '%25', $string);

		// Add unreserved and % to $extra_chars (the latter is safe because all
		// pct-encoded sections are now valid).
		$extra_chars .= 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~%';

		// Now replace any bytes that aren't allowed with their pct-encoded versions
		$position = 0;
		$strlen = strlen($string);
		while (($position += strspn($string, $extra_chars, $position)) < $strlen) {
			$value = ord($string[$position]);

			// Start position
			$start = $position;

			// By default we are valid
			$valid = true;

			// No one byte sequences are valid due to the while.
			// Two byte sequence:
			if (($value & 0xE0) === 0xC0) {
				$character = ($value & 0x1F) << 6;
				$length = 2;
				$remaining = 1;
			}
			// Three byte sequence:
			elseif (($value & 0xF0) === 0xE0) {
				// Review issue (Duplication): This code seems to be duplicated across
				// your project. Duplicated code is one of the most pungent code smells.
				// If you need to duplicate the same code in three or more different
				// places, we strongly encourage you to look into extracting the code
				// into a single class or operation. You can also find more detailed
				// suggestions in the “Code” section of your repository.
				$character = ($value & 0x0F) << 12;
				$length = 3;
				$remaining = 2;
			}
			// Four byte sequence:
			elseif (($value & 0xF8) === 0xF0) {
				// Review issue (Duplication): This code seems to be duplicated across your project.
				$character = ($value & 0x07) << 18;
				$length = 4;
				$remaining = 3;
			}
			// Invalid byte:
			else {
				$valid = false;
				$length = 1;
				$remaining = 0;
			}

			if ($remaining) {
				if ($position + $length <= $strlen) {
					for ($position++; $remaining; $position++) {
						$value = ord($string[$position]);

						// Check that the byte is valid, then add it to the character:
						if (($value & 0xC0) === 0x80) {
							// Review issue (Duplication): This code seems to be duplicated across your project.
							$character |= ($value & 0x3F) << (--$remaining * 6);
						}
						// If it is invalid, count the sequence as invalid and reprocess the current byte:
						else {
							$valid = false;
							$position--;
							break;
						}
					}
				}
				else {
					$position = $strlen - 1;
					$valid = false;
				}
			}

			// Percent encode anything invalid or not in ucschar
			if (
				// Invalid sequences
				!$valid
				// Non-shortest form sequences are invalid
				|| $length > 1 && $character <= 0x7F
				|| $length > 2 && $character <= 0x7FF
				|| $length > 3 && $character <= 0xFFFF
				// Outside of range of ucschar codepoints
				// Noncharacters
				|| ($character & 0xFFFE) === 0xFFFE
				|| $character >= 0xFDD0 && $character <= 0xFDEF
				|| (
					// Everything else not in ucschar
					   $character > 0xD7FF && $character < 0xF900
					|| $character < 0xA0
					|| $character > 0xEFFFD
				)
				&& (
					// Everything not in iprivate, if it applies
					   !$iprivate
					|| $character < 0xE000
					|| $character > 0x10FFFD
				)
			) {
				// If we were a character, pretend we weren't, but rather an error.
				if ($valid) {
					$position--;
				}

				for ($j = $start; $j <= $position; $j++) {
					$string = substr_replace($string, sprintf('%%%02X', ord($string[$j])), $j, 1);
					$j += 2;
					$position += 2;
					$strlen += 2;
				}
			}
		}

		return $string;
	}

	/**
	 * Callback function for preg_replace_callback.
	 *
	 * Removes sequences of percent encoded bytes that represent UTF-8
	 * encoded characters in iunreserved
	 *
	 * @param array $match PCRE match
	 * @return string Replacement
	 */
	protected function remove_iunreserved_percent_encoded($match) {
		// As we just have valid percent encoded sequences we can just explode
		// and ignore the first member of the returned array (an empty string).
		$bytes = explode('%', $match[0]);

		// Initialize the new string (this is what will be returned) and that
		// there are no bytes remaining in the current sequence (unsurprising
		// at the first byte!).
		$string = '';
		$remaining = 0;

		// Loop over each and every byte, and set $value to its value
		for ($i = 1, $len = count($bytes); $i < $len; $i++) {
			$value = hexdec($bytes[$i]);

			// If we're the first byte of sequence:
			if (!$remaining) {
				// Start position
				$start = $i;

				// By default we are valid
				$valid = true;

				// One byte sequence:
				if ($value <= 0x7F) {
					$character = $value;
					$length = 1;
				}
				// Two byte sequence:
				elseif (($value & 0xE0) === 0xC0) {
					$character = ($value & 0x1F) << 6;
					$length = 2;
					$remaining = 1;
				}
				// Three byte sequence:
				elseif (($value & 0xF0) === 0xE0) {
					// Review issue (Duplication): This code seems to be duplicated across your project.
					$character = ($value & 0x0F) << 12;
					$length = 3;
					$remaining = 2;
				}
				// Four byte sequence:
				elseif (($value & 0xF8) === 0xF0) {
					// Review issue (Duplication): This code seems to be duplicated across your project.
					$character = ($value & 0x07) << 18;
					$length = 4;
					$remaining = 3;
				}
				// Invalid byte:
				else {
					$valid = false;
					$remaining = 0;
				}
			}
			// Continuation byte:
			else {
				// Check that the byte is valid, then add it to the character:
				if (($value & 0xC0) === 0x80) {
					// Review issue (Duplication): This code seems to be duplicated across your project.
					$remaining--;
					$character |= ($value & 0x3F) << ($remaining * 6);
				}
				// If it is invalid, count the sequence as invalid and reprocess the current byte as the start of a sequence:
				else {
					$valid = false;
					$remaining = 0;
					$i--;
				}
			}

			// If we've reached the end of the current byte sequence, append it to Unicode::$data
			if (!$remaining) {
				// Percent encode anything invalid or not in iunreserved
				if (
					// Invalid sequences
					!$valid
					// Non-shortest form sequences are invalid
					|| $length > 1 && $character <= 0x7F
					|| $length > 2 && $character <= 0x7FF
					|| $length > 3 && $character <= 0xFFFF
					// Outside of range of iunreserved codepoints
					|| $character < 0x2D
					|| $character > 0xEFFFD
					// Noncharacters
					|| ($character & 0xFFFE) === 0xFFFE
					|| $character >= 0xFDD0 && $character <= 0xFDEF
					// Everything else not in iunreserved (this is all BMP)
					|| $character === 0x2F
					|| $character > 0x39 && $character < 0x41
					|| $character > 0x5A && $character < 0x61
					|| $character > 0x7A && $character < 0x7E
					|| $character > 0x7E && $character < 0xA0
					|| $character > 0xD7FF && $character < 0xF900
				) {
					for ($j = $start; $j <= $i; $j++) {
						// Review issue (Duplication): This code seems to be duplicated across your project.
						$string .= '%' . strtoupper($bytes[$j]);
					}
				}
				else {
					for ($j = $start; $j <= $i; $j++) {
						// Review issue (Duplication): This code seems to be duplicated across your project.
						$string .= chr(hexdec($bytes[$j]));
					}
				}
			}
		}

		// If we have any bytes left over they are invalid (i.e., we are
		// mid-way through a multi-byte sequence)
		if ($remaining) {
			for ($j = $start; $j < $len; $j++) {
				$string .= '%' . strtoupper($bytes[$j]);
			}
		}

		return $string;
	}

	protected function scheme_normalization() {
		if (isset($this->normalization[$this->scheme]['iuserinfo']) && $this->iuserinfo === $this->normalization[$this->scheme]['iuserinfo']) {
			// Review issue (Coding Style): the line above exceeds the maximum limit of
			// 120 characters; it contains 137 characters.
			$this->iuserinfo = null;
		}
		if (isset($this->normalization[$this->scheme]['ihost']) && $this->ihost === $this->normalization[$this->scheme]['ihost']) {
			// Review issue (Coding Style): the line above exceeds the maximum limit of
			// 120 characters; it contains 125 characters.
			$this->ihost = null;
		}
		if (isset($this->normalization[$this->scheme]['port']) && $this->port === $this->normalization[$this->scheme]['port']) {
			// Review issue (Coding Style): the line above exceeds the maximum limit of
			// 120 characters; it contains 122 characters.
			$this->port = null;
		}
		if (isset($this->normalization[$this->scheme]['ipath']) && $this->ipath === $this->normalization[$this->scheme]['ipath']) {
			// Review issue (Coding Style): the line above exceeds the maximum limit of
			// 120 characters; it contains 125 characters.
			$this->ipath = '';
		}
		if (isset($this->ihost) && empty($this->ipath)) {
			$this->ipath = '/';
		}
		if (isset($this->normalization[$this->scheme]['iquery']) && $this->iquery === $this->normalization[$this->scheme]['iquery']) {
			// Review issue (Coding Style): the line above exceeds the maximum limit of
			// 120 characters; it contains 128 characters.
			$this->iquery = null;
		}
		if (isset($this->normalization[$this->scheme]['ifragment']) && $this->ifragment === $this->normalization[$this->scheme]['ifragment']) {
			// Review issue (Coding Style): the line above exceeds the maximum limit of
			// 120 characters; it contains 137 characters.
			$this->ifragment = null;
		}
	}

	/**
	 * Check if the object represents a valid IRI. This needs to be done on each
	 * call as some things change depending on another part of the IRI.
	 *
	 * @return bool
	 */
	public function is_valid() {
		$isauthority = $this->iuserinfo !== null || $this->ihost !== null || $this->port !== null;
		if ($this->ipath !== '' &&
			// Review issue (Unused Code): This if statement, and the following return
			// statement can be replaced with
			// return !($this->ipath !=...($this->ipath, '/'))));.
			(
				$isauthority && $this->ipath[0] !== '/' ||
				(
					$this->scheme === null &&
					!$isauthority &&
					strpos($this->ipath, ':') !== false &&
					(strpos($this->ipath, '/') === false ? true : strpos($this->ipath, ':') < strpos($this->ipath, '/'))
				)
			)
		) {
			return false;
		}

		return true;
	}

710
	/**
711
	 * Set the entire IRI. Returns true on success, false on failure (if there
712
	 * are any invalid characters).
713
	 *
714
	 * @param string $iri
715
	 * @return bool
716
	 */
717
	protected function set_iri($iri) {
718
		static $cache;
719
		if (!$cache) {
720
			$cache = array();
721
		}
722
723
		if ($iri === null) {
724
			return true;
725
		}
726
		if (isset($cache[$iri])) {
727
			list($this->scheme,
728
				 $this->iuserinfo,
729
				 $this->ihost,
730
				 $this->port,
731
				 $this->ipath,
732
				 $this->iquery,
733
				 $this->ifragment,
734
				 $return) = $cache[$iri];
735
			return $return;
736
		}
737
738
		$parsed = $this->parse_iri((string) $iri);
739
740
		$return = $this->set_scheme($parsed['scheme'])
741
			&& $this->set_authority($parsed['authority'])
742
			&& $this->set_path($parsed['path'])
743
			&& $this->set_query($parsed['query'])
744
			&& $this->set_fragment($parsed['fragment']);
745
746
		$cache[$iri] = array($this->scheme,
747
							 $this->iuserinfo,
748
							 $this->ihost,
749
							 $this->port,
750
							 $this->ipath,
751
							 $this->iquery,
752
							 $this->ifragment,
753
							 $return);
754
		return $return;
755
	}

	/**
	 * Set the scheme. Returns true on success, false on failure (if there are
	 * any invalid characters).
	 *
	 * @param string $scheme
	 * @return bool
	 */
	protected function set_scheme($scheme) {
		if ($scheme === null) {
			$this->scheme = null;
		}
		elseif (!preg_match('/^[A-Za-z][0-9A-Za-z+\-.]*$/', $scheme)) {
			$this->scheme = null;
			return false;
		}
		else {
			$this->scheme = strtolower($scheme);
		}
		return true;
	}

	/**
	 * Set the authority. Returns true on success, false on failure (if there are
	 * any invalid characters).
	 *
	 * @param string $authority
	 * @return bool
	 */
	protected function set_authority($authority) {
		static $cache;
		if (!$cache) {
			$cache = array();
		}

		if ($authority === null) {
			$this->iuserinfo = null;
			$this->ihost = null;
			$this->port = null;
			return true;
		}
		if (isset($cache[$authority])) {
			list($this->iuserinfo,
				 $this->ihost,
				 $this->port,
				 $return) = $cache[$authority];

			return $return;
		}

		$remaining = $authority;
		if (($iuserinfo_end = strrpos($remaining, '@')) !== false) {
			$iuserinfo = substr($remaining, 0, $iuserinfo_end);
			$remaining = substr($remaining, $iuserinfo_end + 1);
		}
		else {
			$iuserinfo = null;
		}
		if (($port_start = strpos($remaining, ':', strpos($remaining, ']'))) !== false) {
			$port = substr($remaining, $port_start + 1);
			if ($port === false || $port === '') {
				$port = null;
			}
			$remaining = substr($remaining, 0, $port_start);
		}
		else {
			$port = null;
		}

		$return = $this->set_userinfo($iuserinfo) &&
				  $this->set_host($remaining) &&
				  $this->set_port($port);

		$cache[$authority] = array($this->iuserinfo,
								   $this->ihost,
								   $this->port,
								   $return);

		return $return;
	}
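
The splitting step inside `set_authority()` can be easier to follow in isolation. The sketch below assumes a hypothetical free function, `split_authority`, that performs only the split and returns the three raw parts without validating or caching them. Unlike the listing, it passes an explicit integer offset to `strpos()` when no `]` is present.

```php
<?php
// Hypothetical helper: split "userinfo@host:port" into its raw parts.
// Returns array($iuserinfo, $host, $port); absent parts are null.
function split_authority($authority) {
	$iuserinfo = null;
	$port = null;
	$remaining = $authority;

	// Split on the last "@" so an "@" inside the userinfo stays in the userinfo.
	$at = strrpos($remaining, '@');
	if ($at !== false) {
		$iuserinfo = substr($remaining, 0, $at);
		$remaining = (string) substr($remaining, $at + 1);
	}

	// Search for ":" from the "]" onward so that the colons inside a
	// bracketed IPv6 literal are not mistaken for the port separator.
	$bracket = strpos($remaining, ']');
	$colon = strpos($remaining, ':', $bracket === false ? 0 : $bracket);
	if ($colon !== false) {
		$port = (string) substr($remaining, $colon + 1);
		if ($port === '') {
			// A trailing ":" with nothing after it means "no port".
			$port = null;
		}
		$remaining = substr($remaining, 0, $colon);
	}

	return array($iuserinfo, $remaining, $port);
}
```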

	/**
	 * Set the iuserinfo.
	 *
	 * @param string $iuserinfo
	 * @return bool
	 */
View Code Duplication
	protected function set_userinfo($iuserinfo) {
		if ($iuserinfo === null) {
			$this->iuserinfo = null;
		}
		else {
			$this->iuserinfo = $this->replace_invalid_with_pct_encoding($iuserinfo, '!$&\'()*+,;=:');
			$this->scheme_normalization();
		}

		return true;
	}

	/**
	 * Set the ihost. Returns true on success, false on failure (if there are
	 * any invalid characters).
	 *
	 * @param string $ihost
	 * @return bool
	 */
	protected function set_host($ihost) {
		if ($ihost === null) {
			$this->ihost = null;
			return true;
		}
		if (substr($ihost, 0, 1) === '[' && substr($ihost, -1) === ']') {
			if (IPv6::check_ipv6(substr($ihost, 1, -1))) {
				$this->ihost = '[' . IPv6::compress(substr($ihost, 1, -1)) . ']';
			}
			else {
				$this->ihost = null;
				return false;
			}
		}
		else {
			$ihost = $this->replace_invalid_with_pct_encoding($ihost, '!$&\'()*+,;=');

			// Lowercase, but ignore pct-encoded sections (as they should
			// remain uppercase). This must be done after the previous step
			// as that can add unescaped characters.
			$position = 0;
			$strlen = strlen($ihost);
			while (($position += strcspn($ihost, 'ABCDEFGHIJKLMNOPQRSTUVWXYZ%', $position)) < $strlen) {
				if ($ihost[$position] === '%') {
					$position += 3;
				}
				else {
					$ihost[$position] = strtolower($ihost[$position]);
					$position++;
				}
			}

			$this->ihost = $ihost;
		}

		$this->scheme_normalization();

		return true;
	}
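
The lowercasing loop in `set_host()` relies on `strcspn()` to jump straight to the next uppercase letter or `%`. A standalone sketch of that loop follows. The function name `lowercase_outside_pct_encoding` is hypothetical, and the input is assumed to be already well formed, meaning every `%` is followed by two hex digits.

```php
<?php
// Hypothetical helper: lowercase ASCII letters in a host while leaving
// percent-encoded triplets such as "%C3" untouched.
function lowercase_outside_pct_encoding($ihost) {
	$position = 0;
	$strlen = strlen($ihost);

	// strcspn() returns how many bytes can be skipped before the next
	// uppercase letter or "%" at or after $position.
	while (($position += strcspn($ihost, 'ABCDEFGHIJKLMNOPQRSTUVWXYZ%', $position)) < $strlen) {
		if ($ihost[$position] === '%') {
			// Skip the whole "%XX" triplet so its hex digits keep their case.
			$position += 3;
		}
		else {
			$ihost[$position] = strtolower($ihost[$position]);
			$position++;
		}
	}

	return $ihost;
}
```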

	/**
	 * Set the port. Returns true on success, false on failure (if there are
	 * any invalid characters).
	 *
	 * @param string $port
	 * @return bool
	 */
	protected function set_port($port) {
		if ($port === null) {
			$this->port = null;
			return true;
		}

		if (strspn($port, '0123456789') === strlen($port)) {
			$this->port = (int) $port;
			$this->scheme_normalization();
			return true;
		}

		$this->port = null;
		return false;
	}

	/**
	 * Set the ipath.
	 *
	 * @param string $ipath
	 * @return bool
	 */
	protected function set_path($ipath) {
		static $cache;
		if (!$cache) {
			$cache = array();
		}

		$ipath = (string) $ipath;

		if (isset($cache[$ipath])) {
			$this->ipath = $cache[$ipath][(int) ($this->scheme !== null)];
		}
		else {
			$valid = $this->replace_invalid_with_pct_encoding($ipath, '!$&\'()*+,;=@:/');
			$removed = $this->remove_dot_segments($valid);

			$cache[$ipath] = array($valid, $removed);
			$this->ipath = ($this->scheme !== null) ? $removed : $valid;
		}
		$this->scheme_normalization();
		return true;
	}

	/**
	 * Set the iquery.
	 *
	 * @param string $iquery
	 * @return bool
	 */
View Code Duplication
	protected function set_query($iquery) {
		if ($iquery === null) {
			$this->iquery = null;
		}
		else {
			$this->iquery = $this->replace_invalid_with_pct_encoding($iquery, '!$&\'()*+,;=:@/?', true);
			$this->scheme_normalization();
		}
		return true;
	}

	/**
	 * Set the ifragment.
	 *
	 * @param string $ifragment
	 * @return bool
	 */
View Code Duplication
	protected function set_fragment($ifragment) {
		if ($ifragment === null) {
			$this->ifragment = null;
		}
		else {
			$this->ifragment = $this->replace_invalid_with_pct_encoding($ifragment, '!$&\'()*+,;=:@/?');
			$this->scheme_normalization();
		}
		return true;
	}
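
`set_userinfo()`, `set_query()` and `set_fragment()` are each flagged for duplication, and they share one shape: keep null as null, otherwise run the value through an encoder. The sketch below shows one possible way to factor that out. `encode_or_null` is a hypothetical helper, and the encoder is passed in as a callable because `replace_invalid_with_pct_encoding()` is not reproduced here.

```php
<?php
// Hypothetical helper: apply $encoder to a component unless it is null.
// $encoder is any callable that takes a string and returns a string.
function encode_or_null($value, $encoder) {
	if ($value === null) {
		return null;
	}

	return call_user_func($encoder, $value);
}
```

Each setter would then reduce to one assignment plus the `scheme_normalization()` call for non-null values. Whether that is worth the extra indirection is a judgment call for the maintainers.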

	/**
	 * Convert an IRI to a URI (or parts thereof)
	 *
	 * @param string|bool $string IRI to convert (or false from {@see get_iri})
	 * @return string|false URI if IRI is valid, false otherwise.
	 */
	protected function to_uri($string) {
		if (!is_string($string)) {
			return false;
		}

		static $non_ascii;
		if (!$non_ascii) {
			$non_ascii = implode('', range("\x80", "\xFF"));
		}

		$position = 0;
		$strlen = strlen($string);
		while (($position += strcspn($string, $non_ascii, $position)) < $strlen) {
			$string = substr_replace($string, sprintf('%%%02X', ord($string[$position])), $position, 1);
			$position += 3;
			$strlen += 2;
		}

		return $string;
	}
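
`to_uri()` rewrites the string in place with `substr_replace()`. The same transformation can also be written as a single pass that builds a new string, which is a different technique from the listing but should give the same result for string input. A sketch, using the hypothetical name `pct_encode_non_ascii`:

```php
<?php
// Hypothetical helper: percent-encode every byte >= 0x80 and copy ASCII
// bytes through unchanged. Operates on bytes, not on characters.
function pct_encode_non_ascii($string) {
	$out = '';
	$strlen = strlen($string);

	for ($i = 0; $i < $strlen; $i++) {
		$byte = ord($string[$i]);
		// "%%" is a literal percent sign; "%02X" is two uppercase hex digits.
		$out .= ($byte >= 0x80) ? sprintf('%%%02X', $byte) : $string[$i];
	}

	return $out;
}
```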

	/**
	 * Get the complete IRI
	 *
	 * @return string|false The IRI, or false if the object is not a valid IRI.
	 */
	protected function get_iri() {
		if (!$this->is_valid()) {
			return false;
		}

		$iri = '';
		if ($this->scheme !== null) {
			$iri .= $this->scheme . ':';
		}
		if (($iauthority = $this->get_iauthority()) !== null) {
			$iri .= '//' . $iauthority;
		}
		$iri .= $this->ipath;
		if ($this->iquery !== null) {
			$iri .= '?' . $this->iquery;
		}
		if ($this->ifragment !== null) {
			$iri .= '#' . $this->ifragment;
		}

		return $iri;
	}

	/**
	 * Get the complete URI
	 *
	 * @return string|false The URI, or false if the object is not a valid IRI.
	 */
	protected function get_uri() {
		return $this->to_uri($this->get_iri());
	}

	/**
	 * Get the complete iauthority
	 *
	 * @return string|null The iauthority, or null if userinfo, host and port are all unset.
	 */
	protected function get_iauthority() {
		if ($this->iuserinfo === null && $this->ihost === null && $this->port === null) {
			return null;
		}

		$iauthority = '';
		if ($this->iuserinfo !== null) {
			$iauthority .= $this->iuserinfo . '@';
		}
		if ($this->ihost !== null) {
			$iauthority .= $this->ihost;
		}
		if ($this->port !== null) {
			$iauthority .= ':' . $this->port;
		}
		return $iauthority;
	}

	/**
	 * Get the complete authority
	 *
	 * @return string|null The authority as a URI component, or null if there is none.
	 */
	protected function get_authority() {
		$iauthority = $this->get_iauthority();
		if (is_string($iauthority)) {
			return $this->to_uri($iauthority);
		}
		else {
			return $iauthority;
		}
	}
}