Completed
Push — master ( 2d7c5c...16dc59 )
by Dev
09:53
created

Indexable::isIndexable()   B

Complexity

Conditions 11
Paths 8

Size

Total Lines 39
Code Lines 17

Duplication

Lines 0
Ratio 0 %

Code Coverage

Tests 4
CRAP Score 67.9362

Importance

Changes 0
Metric Value
cc 11
eloc 17
nc 8
nop 2
dl 0
loc 39
ccs 4
cts 18
cp 0.2222
crap 67.9362
rs 7.3166
c 0
b 0
f 0
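
The CRAP score in the table can be reproduced from the complexity (cc) and coverage ratio (cp) values. The sketch below assumes the commonly cited formula CRAP = cc^2 * (1 - coverage)^3 + cc; the function name crapScore is hypothetical and not part of the inspected code.

```php
<?php

declare(strict_types=1);

// Assumed formula: CRAP(m) = cc^2 * (1 - coverage)^3 + cc,
// with coverage given as a ratio between 0 and 1.
function crapScore(int $complexity, float $coverage): float
{
    return $complexity ** 2 * (1.0 - $coverage) ** 3 + $complexity;
}

// Values taken from the metric table above: cc = 11, cp = 0.2222.
$score = crapScore(11, 0.2222);
printf("%.3f\n", $score);
```

With the rounded coverage ratio from the table, the result lands very close to the reported 67.9362. With full coverage the cubic term vanishes and the score falls back to the complexity itself.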

How to fix   Complexity   

Long Method

Small methods make your code easier to understand, in particular if combined with a good name. Besides, if your method is small, finding a good name is usually much easier.

For example, if you find yourself adding comments to a method's body, this is usually a good sign to extract the commented part to a new method, and use the comment as a starting point when coming up with a good name for this new method.
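
As a minimal sketch of that advice, the example below turns two commented sections of a method into small, named functions. All function names here are hypothetical and only loosely modelled on the status code checks in the listing further down; this is not the author's refactoring.

```php
<?php

declare(strict_types=1);

// Before: the method body is structured by comments.
function describeStatusBefore(int $statusCode): string
{
    // client error
    if ($statusCode >= 400 && $statusCode < 500) {
        return 'client error';
    }

    // server error
    if ($statusCode >= 500 && $statusCode < 600) {
        return 'server error';
    }

    return 'ok';
}

// After: each comment became the name of an extracted function.
function isClientError(int $statusCode): bool
{
    return $statusCode >= 400 && $statusCode < 500;
}

function isServerError(int $statusCode): bool
{
    return $statusCode >= 500 && $statusCode < 600;
}

function describeStatusAfter(int $statusCode): string
{
    if (isClientError($statusCode)) {
        return 'client error';
    }

    if (isServerError($statusCode)) {
        return 'server error';
    }

    return 'ok';
}
```

Both versions return the same result for any status code; the second one simply moves each range check behind a name that replaces the comment.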

Commonly applied refactorings include:

<?php

namespace PiedWeb\UrlHarvester;

use Spatie\Robots\RobotsTxt;
use Spatie\Robots\RobotsHeaders;
use PiedWeb\Curl\Request;
This use statement conflicts with another class in this namespace, PiedWeb\UrlHarvester\Request. Consider defining an alias.

Let's assume that you have a directory layout like this:

.
|-- OtherDir
|   |-- Bar.php
|   `-- Foo.php
`-- SomeDir
    `-- Foo.php

and let's assume the following content of Bar.php:

// Bar.php
namespace OtherDir;

use SomeDir\Foo; // This now conflicts with the class OtherDir\Foo

If both files OtherDir/Foo.php and SomeDir/Foo.php are loaded in the same runtime, you will see a PHP error such as the following:

PHP Fatal error:  Cannot use SomeDir\Foo as Foo because the name is already in use in OtherDir/Foo.php

However, as OtherDir/Foo.php does not necessarily have to be loaded and the error is only triggered if it is loaded before OtherDir/Bar.php, this problem might go unnoticed for a while. In order to prevent this error from surfacing, you must import the namespace with a different alias:

// Bar.php
namespace OtherDir;

use SomeDir\Foo as SomeDirFoo; // There is no conflict anymore.

class Indexable
{
    // https://stackoverflow.com/questions/1880148/how-to-get-name-of-the-constant
    const INDEXABLE = 0;
    const NOT_INDEXABLE_ROBOTS = 1;
    const NOT_INDEXABLE_HEADER = 2;
    const NOT_INDEXABLE_META = 3;
    const NOT_INDEXABLE_CANONICAL = 4;
    const NOT_INDEXABLE_4XX = 5;
    const NOT_INDEXABLE_5XX = 6;
    const NOT_INDEXABLE_NETWORK_ERROR = 7;
    const NOT_INDEXABLE_3XX = 8;
    const NOT_INDEXABLE_NOT_HTML = 9;

    /** @var Harvest */
    protected $harvest;

    /** @var string */
    protected $isIndexableFor;

    public function __construct(Harvest $harvest, string $isIndexableFor = 'googlebot')
    {
        $this->harvest        = $harvest;
        $this->isIndexableFor = $isIndexableFor;
    }

    public function robotsTxtAllows()
    {
        $url       = $this->harvest->getResponse()->getUrl();
        $robotsTxt = $this->harvest->getRobotsTxt();

        return '' === $robotsTxt ? true : $robotsTxt->allows($url, $this->isIndexableFor);
    }

    public function metaAllows()
    {
        $meta    = $this->harvest->getMeta($this->isIndexableFor);
        $generic = $this->harvest->getMeta('robots');

        return ! (false !== stripos($meta, 'noindex') || false !== stripos($generic, 'noindex'));
    }

    public function headersAllow()
    {
        $headers = explode(PHP_EOL, $this->harvest->getResponse()->getHeaders(false));

        return RobotsHeaders::create($headers)->mayIndex($this->isIndexableFor);
    }
58
    /**
59
     * @return int
60
     */
61 3
    public static function isIndexable(Harvest $harvest, string $isIndexableFor = 'googlebot')
62
    {
63 3
        $self = new self($harvest, $isIndexableFor);
64
65
        // robots
66 3
        if (!$self->robotsTxtAllows()) {
67 3
            return self::NOT_INDEXABLE_ROBOTS;
68
        }
69
70
        if ($self->headersAllow()) {
71
            return self::NOT_INDEXABLE_HEADER;
72
        }
73
74
        if ($self->metaAllows()) {
75
            return self::NOT_INDEXABLE_META;
76
        }
77
78
        // canonical
79
        if (!$harvest->isCanonicalCorrect()) {
80
            return self::NOT_INDEXABLE_CANONICAL;
81
        }
82
83
        $statusCode = $harvest->getResponse()->getStatusCode();
84
        // status 4XX
85
        if ($statusCode < 500 && $statusCode > 399) {
86
            return self::NOT_INDEXABLE_5XX;
87
        }
88
89
        // status 5XX
90
        if ($statusCode < 600 && $statusCode > 499) {
91
            return self::NOT_INDEXABLE_5XX;
92
        }
93
94
        // status 3XX
95
        if ($statusCode < 400 && $statusCode > 299) {
96
            return self::NOT_INDEXABLE_3XX;
97
        }
98
99
        return self::INDEXABLE;
100
    }
101
}
102
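
The comment above the constants links to a question about getting a constant's name, which matters here because isIndexable() returns a bare integer. One possible way to turn such a code back into a readable name is reflection. In the sketch below, IndexableCodes is a hypothetical stand-in carrying a few of the constants from the listing, and indexableCodeName() is an illustrative helper that does not exist in the inspected code.

```php
<?php

declare(strict_types=1);

// Hypothetical stand-in for the Indexable class, reduced to three constants.
final class IndexableCodes
{
    const INDEXABLE = 0;
    const NOT_INDEXABLE_ROBOTS = 1;
    const NOT_INDEXABLE_HEADER = 2;
}

// Hypothetical helper: returns the constant name for a code, or null if unknown.
function indexableCodeName(int $code): ?string
{
    // getConstants() returns an array of constant name => value.
    $constants = (new ReflectionClass(IndexableCodes::class))->getConstants();

    // Strict search, so the integer 0 does not match unrelated values.
    $name = array_search($code, $constants, true);

    return false === $name ? null : $name;
}

echo indexableCodeName(1), PHP_EOL;
```

This only works reliably while every constant in the class has a distinct value; if two constants shared a value, array_search would return the first match.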