Completed
Pull Request — master (#189)
by Mark
03:07
created

CrawlerDetect::checkAgent()   A

Complexity

Conditions 4
Paths 6

Size

Total Lines 21
Code Lines 11

Duplication

Lines 0
Ratio 0 %

Importance

Changes 0
Metric Value
dl 0
loc 21
rs 9.0534
c 0
b 0
f 0
cc 4
eloc 11
nc 6
nop 0
<?php

/*
 * This file is part of Crawler Detect - the web crawler detection library.
 *
 * (c) Mark Beech <[email protected]>
 *
 * This source file is subject to the MIT license that is bundled
 * with this source code in the file LICENSE.
 */

namespace Jaybizzle\CrawlerDetect;

use Jaybizzle\CrawlerDetect\Fixtures\Headers;
use Jaybizzle\CrawlerDetect\Fixtures\Crawlers;
use Jaybizzle\CrawlerDetect\Fixtures\Exclusions;
use Jaybizzle\CrawlerDetect\Fixtures\Ips;

class CrawlerDetect
{
    /**
     * The user agent.
     *
     * @var null
     */
    protected $userAgent = null;

    protected $checkAgent = null;

    protected $checkIp = null;

    /**
     * Headers that contain a user agent.
     *
     * @var array
     */
    protected $httpHeaders = array();

    /**
     * Store regex matches.
     *
     * @var array
     */
    protected $matches = array();

    /**
     * Crawlers object.
     *
     * @var \Jaybizzle\CrawlerDetect\Fixtures\Crawlers
     */
    protected $crawlers;

    /**
     * Exclusions object.
     *
     * @var \Jaybizzle\CrawlerDetect\Fixtures\Exclusions
     */
    protected $exclusions;

    /**
     * Headers object.
     *
     * @var \Jaybizzle\CrawlerDetect\Fixtures\Headers
     */
    protected $uaHttpHeaders;

    /**
     * The compiled regex string.
     *
     * @var string
     */
    protected $compiledRegex;

    /**
     * The compiled exclusions regex string.
     *
     * @var string
     */
    protected $compiledExclusions;

    /**
     * Class constructor.
     */
    public function __construct()
    {
        //
    }

    /**
     * Compile the regex patterns into one regex string.
     *
     * @param array
     *
     * @return string
     */
    public function compileRegex($patterns)
    {
        return '('.implode('|', $patterns).')';
    }

    /**
     * Set HTTP headers.
     *
     * @param array|null $httpHeaders
Bug introduced by
There is no parameter named $httpHeaders. Was it maybe removed?

This check looks for PHPDoc comments describing methods or function parameters that do not exist on the corresponding method or function.

Consider the following example. The parameter $italy is not defined by the method finale(...).

/**
 * @param array $germany
 * @param array $island
 * @param array $italy
 */
function finale($germany, $island) {
    return "2:1";
}

The most likely cause is that the parameter was removed, but the annotation was not.

     */
    public function setHttpHeaders()
Coding Style introduced by
setHttpHeaders uses the super-global variable $_SERVER which is generally not recommended.

Instead of using super-globals, we recommend explicitly injecting the dependencies of your class. This makes your code less dependent on global state and generally more testable:

// Bad
class Router
{
    public function generate($path)
    {
        return $_SERVER['HOST'].$path;
    }
}

// Better
class Router
{
    private $host;

    public function __construct($host)
    {
        $this->host = $host;
    }

    public function generate($path)
    {
        return $this->host.$path;
    }
}

class Controller
{
    public function myAction(Request $request)
    {
        // Instead of
        $page = isset($_GET['page']) ? intval($_GET['page']) : 1;

        // Better (assuming you use the Symfony2 request)
        $page = $request->query->get('page', 1);
    }
}
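
Applied to the method flagged here, the same idea could look roughly like the sketch below. It assumes a hypothetical free function `filterHttpHeaders()` (not part of the reviewed class) that receives the server array as an argument instead of reading `$_SERVER` itself.

```php
<?php

/**
 * Hypothetical helper for illustration only: keeps the HTTP_* entries
 * of a server array that the caller passes in explicitly.
 */
function filterHttpHeaders(array $server)
{
    $headers = array();

    foreach ($server as $key => $value) {
        // In PHP, request headers are the server vars that start with HTTP_.
        if (strpos($key, 'HTTP_') === 0) {
            $headers[$key] = $value;
        }
    }

    return $headers;
}

// The caller decides where the data comes from; a test can pass a plain array.
$headers = filterHttpHeaders(array(
    'HTTP_USER_AGENT' => 'ExampleBot/1.0',
    'REMOTE_ADDR'     => '127.0.0.1',
));
```

Production code would call `filterHttpHeaders($_SERVER)` at the edge of the application, so the filtering logic itself no longer depends on global state.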
    {
        $httpHeaders = $_SERVER;

        // Only save HTTP headers. In PHP land, that means
        // only _SERVER vars that start with HTTP_.
        foreach ($httpHeaders as $key => $value) {
            if (strpos($key, 'HTTP_') === 0) {
                $this->httpHeaders[$key] = $value;
            }
        }
    }

    /**
     * Return user agent headers.
     *
     * @return array
     */
    public function getUaHttpHeaders()
    {
        return $this->uaHttpHeaders->getAll();
    }

    /**
     * Set the user agent.
     *
     * @param string|null $userAgent
Bug introduced by
There is no parameter named $userAgent. Was it maybe removed?

     */
    public function setUserAgent()
    {
        foreach ($this->getUaHttpHeaders() as $altHeader) {
            if (false === empty($this->httpHeaders[$altHeader])) { // @todo: should use getHttpHeader(), but it would be slow.
                $this->userAgent .= $this->httpHeaders[$altHeader].' ';
            }
        }

        $this->userAgent = (! empty($this->userAgent) ? trim($this->userAgent) : null);
Documentation Bug introduced by
It seems like !empty($this->userAgent)...this->userAgent) : null can also be of type string. However, the property $userAgent is declared as type null. Maybe add an additional type check?

Our type inference engine has found a suspicious assignment of a value to a property. This check raises an issue when a value that can be of a mixed type is assigned to a property that is type hinted more strictly.

For example, imagine you have a variable $accountId that can either hold an Id object or false (if there is no account id yet). Your code now assigns that value to the id property of an instance of the Account class. This class holds a proper account, so the id value must no longer be false.

Either this assignment is in error or a type check should be added for that assignment.

class Id
{
    public $id;

    public function __construct($id)
    {
        $this->id = $id;
    }

}

class Account
{
    /** @var  Id $id */
    public $id;
}

$account_id = false;

if (starsAreRight()) {
    $account_id = new Id(42);
}

$account = new Account();
if ($account_id instanceof Id)
{
    $account->id = $account_id;
}
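
For the property flagged here, one possible fix is to widen the annotation rather than add a type check. The sketch below uses a hypothetical `AgentHolder` class (not the reviewed class) to show a `@var string|null` declaration together with a setter that only ever stores a non-empty string or `null`.

```php
<?php

class AgentHolder // hypothetical class, for illustration only
{
    /**
     * The user agent, or null when none has been set.
     *
     * @var string|null
     */
    protected $userAgent = null;

    /**
     * @param string|null $userAgent
     */
    public function setUserAgent($userAgent)
    {
        // Normalise empty or whitespace-only input to null so the
        // property matches its documented type.
        $trimmed = is_string($userAgent) ? trim($userAgent) : '';
        $this->userAgent = ($trimmed !== '') ? $trimmed : null;
    }

    /**
     * @return string|null
     */
    public function getUserAgent()
    {
        return $this->userAgent;
    }
}
```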
    }

    /**
     * Check the user agent.
     *
     * @return $this
     */
    public function agent($userAgent = null)
    {
        $this->checkAgent = true;

        $this->userAgent = $userAgent;

        $this->crawlers = new Crawlers();
        $this->exclusions = new Exclusions();
        $this->uaHttpHeaders = new Headers();

        $this->compiledRegex = $this->compileRegex($this->crawlers->getAll());
        $this->compiledExclusions = $this->compileRegex($this->exclusions->getAll());

        return $this;
    }

    /**
     * Check the IP address.
     *
     * @return $this
     */
    public function ip($userIp)
    {
        $this->checkIp = true;

        $this->userIp = $userIp;
Bug introduced by
The property userIp does not exist. Did you maybe forget to declare it?

In PHP it is possible to write to properties without declaring them. For example, the following is perfectly valid PHP code:

class MyClass { }

$x = new MyClass();
$x->foo = true;

Generally, it is good practice to explicitly declare properties to avoid accidental typos and to provide IDE auto-completion:

class MyClass {
    public $foo;
}

$x = new MyClass();
$x->foo = true;

        return $this;
    }

    public function checkAgent()
    {
        if (is_null($this->userAgent)) {
            $this->setHttpHeaders();
            $this->setUserAgent();
        }

        $agent = preg_replace('/'.$this->compiledExclusions.'/i', '', $this->userAgent);
Unused Code introduced by
$agent is not used, you could remove the assignment.

This check looks for variable assignments that are either overwritten by other assignments or where the variable is not used subsequently.

$myVar = 'Value';
$higher = false;

if (rand(1, 6) > 3) {
    $higher = true;
} else {
    $higher = false;
}

Both the $myVar assignment in line 1 and the $higher assignment in line 2 are dead. The first because $myVar is never used and the second because $higher is always overwritten for every possible time line.
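
In the method flagged here, removing the assignment is one option. Another, if the exclusions were meant to take effect, is to match against the stripped string. The sketch below assumes the latter intent and uses a hypothetical standalone function `matchesCrawler()` with made-up patterns, not the library's real pattern lists.

```php
<?php

/**
 * Hypothetical function mirroring the shape of checkAgent().
 * Both regex arguments are alternation groups such as "(bot|spider)".
 */
function matchesCrawler($userAgent, $crawlerRegex, $exclusionRegex)
{
    // Strip excluded fragments first, then match against what is left,
    // so the result of preg_replace() is actually used.
    $agent = trim((string) preg_replace('/'.$exclusionRegex.'/i', '', (string) $userAgent));

    if ($agent === '') {
        return false;
    }

    return preg_match('/'.$crawlerRegex.'/i', $agent) === 1;
}
```

With this shape, a user agent such as `Cubot phone` is not reported as a crawler when `(Cubot)` is an exclusion, even though it contains `bot`.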


        if (strlen(trim($this->userAgent)) == 0) {
            return false;
        }

        $result = preg_match('/'.$this->compiledRegex.'/i', trim($this->userAgent), $matches);

        if ($matches) {
Bug Best Practice introduced by
The expression $matches of type string[] is implicitly converted to a boolean; are you sure this is intended? If so, consider using ! empty($expr) instead to make it clear that you intend to check for an array without elements.

This check marks implicit conversions of arrays to boolean values in a comparison. While in PHP an empty array is considered to be equal (but not identical) to false, this is not always apparent.

Consider making the comparison explicit by using empty(..) or ! empty(...) instead.
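
A minimal sketch of the explicit form, using a hypothetical helper `firstMatch()` that is not part of the reviewed class. It relies on `preg_match()` returning 1 on a match, 0 on no match and `false` on error.

```php
<?php

/**
 * Hypothetical helper: returns the first regex match, or null when
 * nothing matched or the pattern failed.
 */
function firstMatch($pattern, $subject)
{
    $matches = array();
    $result = preg_match($pattern, $subject, $matches);

    // Compare the return value against 1 and test the array with
    // empty() instead of relying on an implicit array-to-bool cast.
    if ($result === 1 && ! empty($matches)) {
        return $matches[0];
    }

    return null;
}
```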

            $this->matches = $matches;
        }

        return (bool) $result;
    }

    public function checkIp()
    {
        $ips = new Ips();


        foreach ($ips->getAll() as $ip) {
            if($this->ipInRange($this->userIp, $ip) === true) {
                return true;
            }
        }

        return false;
    }

    /**
     * Check user agent string against the regex.
     *
     * @param string|null $userAgent
Bug introduced by
There is no parameter named $userAgent. Was it maybe removed?

     *
     * @return bool
     */
    public function isCrawler()
    {
        if($this->checkAgent && ! $this->checkIp) {
            return $this->checkAgent();
        }

        if($this->checkIp && ! $this->checkAgent) {
            return $this->checkIp();
        }

        if(is_null($this->checkIp) && is_null($this->checkAgent)) {
            return $this->checkAgent() && $this->checkIp();
        }
    }

    /**
     * Return the matches.
     *
     * @return string|null
     */
    public function getMatches()
    {
        return isset($this->matches[0]) ? $this->matches[0] : null;
    }


    public function ipInRange($ip, $range)
    {
      if (strpos($range, '/') !== false) {
        // $range is in IP/NETMASK format
        list($range, $netmask) = explode('/', $range, 2);

        if (strpos($netmask, '.') !== false) {
          // $netmask is a 255.255.0.0 format
          $netmask = str_replace('*', '0', $netmask);
          $netmask_dec = ip2long($netmask);
          return ( (ip2long($ip) & $netmask_dec) == (ip2long($range) & $netmask_dec) );
        } else {
          // $netmask is a CIDR size block
          // fix the range argument
          $x = explode('.', $range);
          while(count($x)<4) $x[] = '0';
          list($a,$b,$c,$d) = $x;
          $range = sprintf("%u.%u.%u.%u", empty($a)?'0':$a, empty($b)?'0':$b,empty($c)?'0':$c,empty($d)?'0':$d);
          $range_dec = ip2long($range);
          $ip_dec = ip2long($ip);

          # Strategy 1 - Create the netmask with 'netmask' 1s and then fill it to 32 with 0s
          #$netmask_dec = bindec(str_pad('', $netmask, '1') . str_pad('', 32-$netmask, '0'));
Unused Code Comprehensibility introduced by
58% of this comment could be valid code. Did you maybe forget this after debugging?

Sometimes obsolete code just ends up commented out instead of removed. In this case it is better to remove the code once you have checked you do not need it.

The code might also have been commented out for debugging purposes. In this case it is vital that someone uncomments it again or your project may behave in very unexpected ways in production.

This check looks for comments that seem to be mostly valid code and reports them.


          # Strategy 2 - Use math to create it
          $wildcard_dec = pow(2, (32-$netmask)) - 1;
          $netmask_dec = ~ $wildcard_dec;

          return (($ip_dec & $netmask_dec) == ($range_dec & $netmask_dec));
        }
      } else {
        // range might be 255.255.*.* or 1.2.3.0-1.2.3.255
Unused Code Comprehensibility introduced by
37% of this comment could be valid code. Did you maybe forget this after debugging?

        if (strpos($range, '*') !==false) { // a.b.*.* format
          // Just convert to A-B format by setting * to 0 for A and 255 for B
          $lower = str_replace('*', '0', $range);
          $upper = str_replace('*', '255', $range);
          $range = "$lower-$upper";
        }

        if (strpos($range, '-')!==false) { // A-B format
          list($lower, $upper) = explode('-', $range, 2);
          $lower_dec = (float)sprintf("%u",ip2long($lower));
          $upper_dec = (float)sprintf("%u",ip2long($upper));
          $ip_dec = (float)sprintf("%u",ip2long($ip));
          return ( ($ip_dec>=$lower_dec) && ($ip_dec<=$upper_dec) );
        }

        echo 'Range argument is not in 1.2.3.4/24 or 1.2.3.4/255.255.255.0 format';
        return false;
      }

    }

    // decbin32
    // In order to simplify working with IP addresses (in binary) and their
    // netmasks, it is easier to ensure that the binary strings are padded
    // with zeros out to 32 characters - IP addresses are 32 bit numbers
    public function decbin32 ($dec) {
      return str_pad(decbin($dec), 32, '0', STR_PAD_LEFT);
    }
}
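
The CIDR branch of `ipInRange()` above builds the netmask arithmetically ("Strategy 2"). As a worked illustration of that arithmetic, here is a simplified sketch under stated assumptions: a hypothetical function `ipv4InCidr()` that handles IPv4 only, expects a complete network address such as `192.168.1.0/24`, and runs on 64-bit PHP. It is not a drop-in replacement for the method.

```php
<?php

/**
 * Hypothetical, simplified variant of the CIDR branch (IPv4 only).
 */
function ipv4InCidr($ip, $cidr)
{
    if (strpos($cidr, '/') === false) {
        return false;
    }

    list($network, $bits) = explode('/', $cidr, 2);

    $ipDec = ip2long($ip);
    $networkDec = ip2long($network);
    $bits = (int) $bits;

    // Reject malformed input instead of comparing false values.
    if ($ipDec === false || $networkDec === false || $bits < 0 || $bits > 32) {
        return false;
    }

    // A /n mask has n leading one-bits: invert the (32 - n)-bit wildcard
    // and keep only the low 32 bits (assumes 64-bit integers).
    $wildcard = (1 << (32 - $bits)) - 1;
    $mask = ~$wildcard & 0xFFFFFFFF;

    return ($ipDec & $mask) === ($networkDec & $mask);
}
```

For `/24` the wildcard is `0x000000FF` and the mask is `0xFFFFFF00`, so two addresses are in the same range when their first three octets agree.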