Completed
Push — develop ( 1c6cec...2f013f )
by Lars
05:10
created

PublicSuffixListManager   B

Complexity

Total Complexity 53

Size/Duplication

Total Lines 438
Duplicated Lines 1.37 %

Coupling/Cohesion

Components 1
Dependencies 5

Test Coverage

Coverage 98.06%

Importance

Changes 0
Metric Value
dl 6
loc 438
ccs 101
cts 103
cp 0.9806
rs 7.4757
c 0
b 0
f 0
wmc 53
lcom 1
cbo 5

16 Methods

Rating   Name   Duplication   Size   Complexity  
A __construct() 0 10 2
A refreshPublicSuffixList() 0 16 4
A fetchListFromSource() 0 10 2
B parseListToArray() 0 32 4
B buildArray() 0 30 6
A writePhpCache() 0 6 1
A varExportToFile() 0 6 1
B getList() 0 23 6
A getListFromFile() 0 18 3
B convertListToArray() 0 25 3
A convertLineToArray() 0 11 2
A validateDomainAddition() 0 8 2
B isValidSection() 6 12 5
C write() 0 28 8
A getHttpAdapter() 0 12 3
A setHttpAdapter() 0 4 1


Duplicated Code

Duplicate code is one of the most pungent code smells. A common rule of thumb is to restructure code once it is duplicated in three or more places.
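As an illustration of that rule, the two marker checks that the listing flags inside isValidSection() differ only in the BEGIN/END keyword, so they could share one helper. The following is a minimal sketch, not the project's code: the function names `isSectionMarker` and `nextSectionStatus` are hypothetical and written as free functions so the example runs on its own.

```php
<?php

/**
 * Hypothetical helper: tells whether $line is the BEGIN or END marker
 * of a Public Suffix List section such as "ICANN" or "PRIVATE".
 */
function isSectionMarker($line, $keyword, $section)
{
    return 0 === strpos($line, '// ===' . $keyword . ' ' . $section . ' DOMAINS===');
}

/**
 * Same decision logic as isValidSection(), with the marker test kept
 * in a single place instead of being written out twice.
 */
function nextSectionStatus($previousStatus, $line, $section)
{
    // a BEGIN marker switches the section on
    if (!$previousStatus && isSectionMarker($line, 'BEGIN', $section)) {
        return true;
    }

    // an END marker switches it off again
    if ($previousStatus && isSectionMarker($line, 'END', $section)) {
        return false;
    }

    // any other line leaves the status unchanged
    return $previousStatus;
}
```

Whether such a small extraction is worth it is a judgment call; it mainly pays off if the marker format is needed elsewhere in the project.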

Common duplication problems, and corresponding solutions are:

Complex Class

Tip: Before tackling complexity, eliminate any duplication first. This can often reduce the size of classes significantly.

Complex classes like PublicSuffixListManager often do a lot of different things. To break such a class down, we need to identify a cohesive component within that class. A common approach to find such a component is to look for fields/methods that share the same prefixes or suffixes. You can also look at the cohesion graph to spot any unconnected or weakly connected components.

Once you have determined the fields that belong together, you can apply the Extract Class refactoring. If the component makes sense as a sub-class, Extract Subclass is also a candidate, and is often faster.

While breaking up the class, it is a good idea to analyze how other classes use PublicSuffixListManager, and based on these observations, apply Extract Interface, too.
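One way this could look for PublicSuffixListManager is to pull the file-writing responsibility out of the class. The sketch below is only an assumption about a possible split, not the project's actual design: `CacheWriterInterface`, `LockingFileCacheWriter` and `PhpArrayCache` are hypothetical names, and the generated cache file uses a plain `return` statement rather than the exact format the manager writes.

```php
<?php

/**
 * Hypothetical result of Extract Interface: callers only need
 * "store this string under this name".
 */
interface CacheWriterInterface
{
    /**
     * @param string $filename Name of the file inside the cache directory
     * @param string $data     Content to store
     *
     * @return int Number of bytes written
     */
    public function write($filename, $data);
}

/**
 * Hypothetical result of Extract Class: owns the cache directory and
 * the locking logic.
 */
class LockingFileCacheWriter implements CacheWriterInterface
{
    private $cacheDir;

    public function __construct($cacheDir)
    {
        $this->cacheDir = rtrim($cacheDir, '/\\');
    }

    public function write($filename, $data)
    {
        $filePath = $this->cacheDir . DIRECTORY_SEPARATOR . $filename;

        // 'c' opens for writing without truncating, so the old content
        // survives until the exclusive lock has been obtained
        $fp = @fopen($filePath, 'c');
        if (!$fp) {
            throw new \RuntimeException("Cannot open '$filePath'");
        }

        if (!flock($fp, LOCK_EX)) {
            fclose($fp);
            throw new \RuntimeException("Cannot lock '$filePath'");
        }

        ftruncate($fp, 0);
        $bytes = fwrite($fp, $data);
        fflush($fp);
        flock($fp, LOCK_UN);
        fclose($fp);

        if ($bytes === false) {
            throw new \RuntimeException("Cannot write to '$filePath'");
        }

        return $bytes;
    }
}

/**
 * Hypothetical consumer: depends on the interface only, so tests can
 * swap in an in-memory writer.
 */
class PhpArrayCache
{
    private $writer;

    public function __construct(CacheWriterInterface $writer)
    {
        $this->writer = $writer;
    }

    public function store($basename, array $input)
    {
        $code = '<?php' . PHP_EOL . 'return ' . var_export($input, true) . ';';

        return $this->writer->write($basename, $code);
    }
}
```

With a split like this, the manager would receive a `CacheWriterInterface` and no longer need to know about file locks.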

<?php

/**
 * PHP Domain Parser: Public Suffix List based URL parsing.
 *
 * @link      http://github.com/jeremykendall/php-domain-parser for the canonical source repository
 *
 * @copyright Copyright (c) 2014 Jeremy Kendall (http://about.me/jeremykendall)
 * @license   http://github.com/jeremykendall/php-domain-parser/blob/master/LICENSE MIT License
 */
namespace Pdp;

use Pdp\HttpAdapter\HttpAdapterInterface;

/**
 * Public Suffix List Manager.
 *
 * This class obtains, writes, caches, and returns text and PHP representations
 * of the Public Suffix List
 */
class PublicSuffixListManager
{
  const ALL_DOMAINS = 'ALL';

  const PDP_PSL_TEXT_FILE = 'public-suffix-list.txt';
  const PDP_PSL_PHP_FILE  = 'public-suffix-list.php';

  const ICANN_DOMAINS      = 'ICANN';
  const ICANN_PSL_PHP_FILE = 'icann-public-suffix-list.php';

  const PRIVATE_DOMAINS      = 'PRIVATE';
  const PRIVATE_PSL_PHP_FILE = 'private-public-suffix-list.php';

  /**
   * @var string Public Suffix List URL
   */
  protected $publicSuffixListUrl = 'https://publicsuffix.org/list/effective_tld_names.dat';

  /**
   * @var string Directory where text and php versions of list will be cached
   */
  protected $cacheDir;

  /**
   * @var array Map of list type (ALL, ICANN, PRIVATE) to its PHP cache filename
   */
  protected static $domainList = array(
      self::ALL_DOMAINS     => self::PDP_PSL_PHP_FILE,
      self::ICANN_DOMAINS   => self::ICANN_PSL_PHP_FILE,
      self::PRIVATE_DOMAINS => self::PRIVATE_PSL_PHP_FILE,
  );

  /**
   * @var \Pdp\HttpAdapter\HttpAdapterInterface Http adapter
   */
  protected $httpAdapter;

  /**
   * Public constructor.
   *
   * @param string $cacheDir Optional cache directory
   */
  public function __construct($cacheDir = null)
  {
    if (null === $cacheDir) {
      $cacheDir = realpath(
          dirname(dirname(__DIR__)) . DIRECTORY_SEPARATOR . 'data'
      );
    }

    $this->cacheDir = $cacheDir;
  }

  /**
   * Downloads Public Suffix List and writes text cache and PHP cache. If these files
   * already exist, they will be overwritten.
   */
  public function refreshPublicSuffixList()
  {
    $this->fetchListFromSource();
    $cacheFile = $this->cacheDir . '/' . self::PDP_PSL_TEXT_FILE;
    $publicSuffixListArray = $this->convertListToArray($cacheFile);
    foreach ($publicSuffixListArray as $domain => $data) {
      // do not empty existing PHP cache file if source TXT is empty
      if (
          is_array($data)
          &&
          !empty($data)
      ) {
        $this->varExportToFile(self::$domainList[$domain], $data);
      }
    }
  }

  /**
   * Obtain Public Suffix List from its online source and write to cache dir.
   *
   * @return int|bool Number of bytes that were written to the file, or 0 if the list could not be downloaded
   */
  public function fetchListFromSource()
  {
    $publicSuffixList = $this->getHttpAdapter()->getContent($this->publicSuffixListUrl);

    if ($publicSuffixList === false) {
      return 0;
    }

    return $this->write(self::PDP_PSL_TEXT_FILE, $publicSuffixList);
  }

  /**
   * Parses text representation of list to associative, multidimensional array.
   *
   * This method is based heavily on the code found in generateEffectiveTLDs.php
   *
   * @link https://github.com/usrflo/registered-domain-libs/blob/master/generateEffectiveTLDs.php
   * A copy of the Apache License, Version 2.0, is provided with this
   * distribution
   *
   * @param string $textFile Public Suffix List text filename
   *
   * @return array Associative, multidimensional array representation of the
   *               public suffix list
   *
   * @throws \Exception Throws \Exception if unable to read file
   */
  public function parseListToArray($textFile)
  {
    /** @noinspection PhpUsageOfSilenceOperatorInspection */
    $fp = @fopen($textFile, 'r');
    if (!$fp || !flock($fp, LOCK_SH)) {
      throw new \Exception("Cannot read '$textFile'");
    }

    $data = file(
        $textFile,
        FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES
    );

    flock($fp, LOCK_UN);
    fclose($fp);

    $data = array_filter(
        $data,
        function ($line) {
          return false === strpos($line, '//');
        }
    );

    $publicSuffixListArray = array();

    foreach ($data as $line) {
      $ruleParts = explode('.', $line);
      $this->buildArray($publicSuffixListArray, $ruleParts);
    }

    return $publicSuffixListArray;
  }

  /**
   * Recursive method to build the array representation of the Public Suffix List.
   *
   * This method is based heavily on the code found in generateEffectiveTLDs.php
   *
   * @link https://github.com/usrflo/registered-domain-libs/blob/master/generateEffectiveTLDs.php
   * A copy of the Apache License, Version 2.0, is provided with this
   * distribution
   *
   * @param array $publicSuffixListArray Initially an empty array, this eventually
   *                                     becomes the array representation of the Public Suffix List
   * @param array $ruleParts             One line (rule) from the Public Suffix List
   *                                     exploded on '.', or the remaining portion of that array during recursion
   */
  public function buildArray(array &$publicSuffixListArray, array $ruleParts)
  {
    $isDomain = true;

    $part = array_pop($ruleParts);

    // Adheres to canonicalization rule from the "Formal Algorithm" section
    // of https://publicsuffix.org/list/
    // "The domain and all rules must be canonicalized in the normal way
    // for hostnames - lower-case, Punycode (RFC 3492)."
    $punycode = new PunycodeWrapper();
    $part = $punycode->encode($part);

    if (strpos($part, '!') === 0) {
      $part = substr($part, 1);
      $isDomain = false;
    }

    if (!isset($publicSuffixListArray[$part])) {
      if ($isDomain) {
        $publicSuffixListArray[$part] = array();
      } else {
        $publicSuffixListArray[$part] = array('!' => '');
      }
    }

    if ($isDomain && count($ruleParts) > 0) {
      $this->buildArray($publicSuffixListArray[$part], $ruleParts);
    }
  }

  /**
   * Writes php array representation of the Public Suffix List to disk.
   *
   * @param array $publicSuffixList Array representation of the Public Suffix List
   *
   * @return int Number of bytes that were written to the file
   */
  public function writePhpCache(array $publicSuffixList)
  {
    $data = '<?php' . PHP_EOL . 'static $data = ' . var_export($publicSuffixList, true) . '; $result =& $data; unset($data); return $result;';

    return $this->write(self::PDP_PSL_PHP_FILE, $data);
  }

  /**
   * Writes php array representation to disk.
   *
   * @param string $basename file path
   * @param array  $input    input data
   *
   * @return int Number of bytes that were written to the file
   */
  protected function varExportToFile($basename, array $input)
  {
    $data = '<?php' . PHP_EOL . 'static $data = ' . var_export($input, true) . '; $result =& $data; unset($data); return $result;';

    return $this->write($basename, $data);
  }

  /**
   * Gets Public Suffix List.
   *
   * @param string $list            the Public Suffix List type
   * @param bool   $withStaticCache whether to return the instance already cached in memory, if any
   *
   * @return PublicSuffixList Instance of Public Suffix List
   *
   * @throws \Exception Throws \Exception if unable to read file
   */
  public function getList($list = self::ALL_DOMAINS, $withStaticCache = true)
  {
    // init
    static $LIST_STATIC = array();

    $cacheBasename = isset(self::$domainList[$list]) ? self::$domainList[$list] : self::PDP_PSL_PHP_FILE;
    $cacheFile = $this->cacheDir . '/' . $cacheBasename;
    $cacheKey = md5($cacheFile);

    if ($withStaticCache === true && isset($LIST_STATIC[$cacheKey])) {
      return $LIST_STATIC[$cacheKey];
    }

    if (!file_exists($cacheFile)) {
      $this->refreshPublicSuffixList();
    }

    if (!isset($LIST_STATIC[$cacheKey])) {
      $LIST_STATIC[$cacheKey] = new PublicSuffixList($cacheFile);
    }

    return $LIST_STATIC[$cacheKey];
  }

  /**
   * Retrieves public suffix list from file after obtaining a shared lock.
   *
   * @param string $phpFile
   *
   * @return PublicSuffixList Instance of Public Suffix List
   *
   * @throws \Exception Throws \Exception if unable to read file
   */
  public function getListFromFile($phpFile)
  {
    /** @noinspection PhpUsageOfSilenceOperatorInspection */
    $fp = @fopen($phpFile, 'r');
    if (!$fp || !flock($fp, LOCK_SH)) {
      throw new \Exception("Cannot read '$phpFile'");
    }

    /** @noinspection PhpIncludeInspection */
    $list = new PublicSuffixList(
        require $phpFile
    );

    flock($fp, LOCK_UN);
    fclose($fp);

    return $list;
  }

  /**
   * Parses text representation of list to associative, multidimensional array.
   *
   * @param string $textFile Public Suffix List text filename
   *
   * @return array Associative, multidimensional array representation of the
   *               public suffix list
   */
  protected function convertListToArray($textFile)
  {
    $addDomain = array(
        self::ICANN_DOMAINS   => false,
        self::PRIVATE_DOMAINS => false,
    );

    $publicSuffixListArray = array(
        self::ALL_DOMAINS     => array(),
        self::ICANN_DOMAINS   => array(),
        self::PRIVATE_DOMAINS => array(),
    );

    $data = new \SplFileObject($textFile);
    $data->setFlags(\SplFileObject::DROP_NEW_LINE | \SplFileObject::READ_AHEAD | \SplFileObject::SKIP_EMPTY);
    foreach ($data as $line) {
      $addDomain = $this->validateDomainAddition($line, $addDomain);

Bug: It seems like $line, defined by the enclosing foreach loop, can also be of type array; however, Pdp\PublicSuffixListMana...alidateDomainAddition() seems to accept only string. Maybe add an additional type check?

If a method or function can return values of several different types, and you cannot be sure that only one of them can reach this context, we recommend adding a type check:

/**
 * @return array|string
 */
function returnsDifferentValues($x) {
    if ($x) {
        return 'foo';
    }

    return array();
}

$x = returnsDifferentValues($y);
if (is_array($x)) {
    // $x is an array.
}

If this is a common case that PHP Analyzer should handle natively, please let us know by opening an issue.

      if (strstr($line, '//') !== false) {
        continue;
      }
      $publicSuffixListArray = $this->convertLineToArray($line, $publicSuffixListArray, $addDomain);

Bug: It seems like $line, defined by the enclosing foreach loop, can also be of type array; however, Pdp\PublicSuffixListManager::convertLineToArray() seems to accept only string. Maybe add an additional type check?

    }

    return $publicSuffixListArray;
  }

  /**
   * Convert a line from the Public Suffix list.
   *
   * @param string $textLine              Public Suffix List text line
   * @param array  $publicSuffixListArray Associative, multidimensional array representation of the
   *                                      public suffix list
   * @param array  $addDomain             Tell which section should be converted
   *
   * @return array Associative, multidimensional array representation of the
   *               public suffix list
   */
  protected function convertLineToArray($textLine, array $publicSuffixListArray, array $addDomain)
  {
    $ruleParts = explode('.', $textLine);
    $this->buildArray($publicSuffixListArray[self::ALL_DOMAINS], $ruleParts);
    $domainNames = array_keys(array_filter($addDomain));
    foreach ($domainNames as $domainName) {
      $this->buildArray($publicSuffixListArray[$domainName], $ruleParts);
    }

    return $publicSuffixListArray;
  }

  /**
   * Update the addition status for a given line against the domain list (ICANN and PRIVATE).
   *
   * @param string $line      the current file line
   * @param array  $addDomain the domain addition status
   *
   * @return array
   */
  protected function validateDomainAddition($line, array $addDomain)
  {
    foreach ($addDomain as $section => $status) {
      $addDomain[$section] = $this->isValidSection($status, $line, $section);
    }

    return $addDomain;
  }

  /**
   * Tell whether the line can be converted for a given domain.
   *
   * @param bool   $previousStatus the previous status
   * @param string $line           the current file line
   * @param string $section        the section to be considered
   *
   * @return bool
   */
  protected function isValidSection($previousStatus, $line, $section)
  {
    if (!$previousStatus && 0 === strpos($line, '// ===BEGIN ' . $section . ' DOMAINS===')) {

Duplication: This code seems to be duplicated across your project.

Duplicated code is one of the most pungent code smells. If you need to duplicate the same code in three or more different places, we strongly encourage you to look into extracting the code into a single class or operation.

You can also find more detailed suggestions in the “Code” section of your repository.

      return true;
    }

    if ($previousStatus && 0 === strpos($line, '// ===END ' . $section . ' DOMAINS===')) {

Duplication: This code seems to be duplicated across your project.

      return false;
    }

    return $previousStatus;
  }

  /**
   * Writes to file after obtaining an exclusive lock.
   *
   * @param string $filename Filename in cache dir where data will be written
   * @param mixed  $data     Data to write
   *
   * @return int Number of bytes that were written to the file
   *
   * @throws \Exception Throws \Exception if unable to write file
   */
  protected function write($filename, $data)
  {
    $data = trim($data);
    $filePath = $this->cacheDir . '/' . $filename;

    if (empty($data)) {
      throw new \Exception("No data to write into '{$filePath}'");
    }

    // open with 'c' and truncate file only after obtaining a lock
    /** @noinspection PhpUsageOfSilenceOperatorInspection */
    $fp = @fopen($filePath, 'c');
    $bytes = false;
    $result = $fp
              && flock($fp, LOCK_EX)
              && ftruncate($fp, 0)
              && ($bytes = fwrite($fp, $data)) !== false
              && fflush($fp);

    if (!$result) {
      $fp && fclose($fp);
      throw new \Exception("Cannot write to '$filePath'");
    }

    flock($fp, LOCK_UN);
    fclose($fp);

    // return the byte count promised by the docblock, not the boolean of the && chain
    return $bytes;
  }

  /**
   * Returns http adapter. Returns default http adapter if one is not set.
   *
   * @return \Pdp\HttpAdapter\HttpAdapterInterface Http adapter
   */
  public function getHttpAdapter()
  {
    if (!$this->httpAdapter instanceof HttpAdapterInterface) {
      if (extension_loaded('curl')) {
        $this->httpAdapter = new HttpAdapter\CurlHttpAdapter();
      } else {
        $this->httpAdapter = new HttpAdapter\PhpHttpAdapter();
      }
    }

    return $this->httpAdapter;
  }

  /**
   * Sets http adapter.
   *
   * @param \Pdp\HttpAdapter\HttpAdapterInterface $httpAdapter
   */
  public function setHttpAdapter(HttpAdapter\HttpAdapterInterface $httpAdapter)
  {
    $this->httpAdapter = $httpAdapter;
  }
}
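
The analyzer's note in convertListToArray() about `$line` possibly being an array could be addressed with an explicit guard inside the loop. The sketch below is a standalone illustration under that assumption: `readRuleLines` is a hypothetical free function, not part of the class, and it only collects the rule lines instead of building the nested array.

```php
<?php

/**
 * Hypothetical helper: reads a Public Suffix List text file and returns
 * its rule lines, skipping comments and anything that is not a string.
 */
function readRuleLines($textFile)
{
    $file = new \SplFileObject($textFile);
    $file->setFlags(
        \SplFileObject::DROP_NEW_LINE | \SplFileObject::READ_AHEAD | \SplFileObject::SKIP_EMPTY
    );

    $rules = array();
    foreach ($file as $line) {
        // SplFileObject yields arrays when READ_CSV is set; the guard
        // makes the string-only assumption explicit for the analyzer
        if (!is_string($line)) {
            continue;
        }

        // comment lines, including the section markers, are not rules
        if (strpos($line, '//') !== false) {
            continue;
        }

        $rules[] = $line;
    }

    return $rules;
}
```

In the class itself the same `is_string()` check would sit at the top of the foreach body, before validateDomainAddition() is called, so that section markers are still seen by that method.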