Completed | Pull Request: master (#17) | by unknown | 05:46 | created

Consumer::loadUrl()   A

Complexity
  Conditions: 1
  Paths: 1

Size
  Total Lines: 7
  Code Lines: 3

Duplication
  Lines: 0
  Ratio: 0 %

Code Coverage
  Tests: 0
  CRAP Score: 2

Importance
  Changes: 0

Metric  Value
dl      0
loc     7
ccs     0
cts     3
cp      0
rs      9.4285
c       0
b       0
f       0
cc      1
eloc    3
nc      1
nop     1
crap    2
<?php

namespace Fusonic\OpenGraph;

use Fusonic\Linq\Linq;
use Fusonic\OpenGraph\Objects\ObjectBase;
use Fusonic\OpenGraph\Objects\Website;
use GuzzleHttp\Adapter\AdapterInterface;
use GuzzleHttp\Client;
use Symfony\Component\DomCrawler\Crawler;

/**
 * Consumer that extracts Open Graph data from either a URL or a HTML string.
 */
class Consumer
{
    private $client;

    /**
     * When enabled, crawler will read content of title and meta description if no
     * Open Graph data is provided by target page.
     *
     * @var bool
     */
    public $useFallbackMode = false;

    /**
     * When enabled, crawler will throw exceptions for some crawling errors like unexpected
     * Open Graph elements.
     *
     * @var bool
     */
    public $debug = false;

    /**
     * @param   AdapterInterface $adapter Guzzle adapter to use for making HTTP requests.
     * @param   array            $config  Optional Guzzle config overrides.
     */
    public function __construct(AdapterInterface $adapter = null, array $config = [])
    {
        $config = array_replace_recursive(['adapter' => $adapter], $config);

        $this->client = new Client($config);
    }

    /**
     * Fetches HTML content from the given URL and then crawls it for Open Graph data.
     *
     * @param   string  $url            URL to be crawled.
     *
     * @return  Website
     */
    public function loadUrl($url)
    {
        // Fetch HTTP content using Guzzle
        $response = $this->client->get($url);

        return $this->loadHtml($response->getBody()->__toString(), $url);
    }

    /**
     * Crawls the given HTML string for OpenGraph data.
     *
     * @param   string  $html           HTML string, usually whole content of crawled web resource.
     * @param   string  $fallbackUrl    URL to use when fallback mode is enabled.
     *
     * @return  ObjectBase
     */
    public function loadHtml($html, $fallbackUrl = null)
    {
        // Extract all data that can be found
        $page = $this->extractOpenGraphData($html);

        // Use the user's URL as fallback
        if ($this->useFallbackMode && $page->url === null) {
            $page->url = $fallbackUrl;
        }

        // Return result
        return $page;
    }

    private function extractOpenGraphData($content)
    {
        $crawler = new Crawler($content);

        $properties = [];
        foreach(['name', 'property'] as $t)
Coding Style: Expected 1 space after FOREACH keyword; 0 found
        {
            // Get all meta-tags starting with "og:"
            $ogMetaTags = $crawler->filter("meta[{$t}^='og:']");
            // Create clean property array
            $props = Linq::from($ogMetaTags)
                ->select(
                    function (\DOMElement $tag) use ($t) {
                        $name = strtolower(trim($tag->getAttribute($t)));
                        $value = trim($tag->getAttribute("content"));
                        return new Property($name, $value);
                    }
                )
                ->toArray();
            $properties = array_merge($properties, $props);

Coding Style: Blank line found at end of control structure
        }

        // Create new object of the correct type
        $typeProperty = Linq::from($properties)
            ->firstOrNull(
                function (Property $property) {
                    return $property->key === Property::TYPE;
                }
            );
        switch ($typeProperty !== null ? $typeProperty->value : null) {
            default:
Unused Code: default: $object = n...s\Website(); break; does not seem to be reachable.

This check looks for unreachable code. It uses sophisticated control flow analysis techniques to find statements which will never be executed.

Unreachable code is most often the result of return, die or exit statements that have been added for debug purposes.

function fx() {
    try {
        doSomething();
        return true;
    }
    catch (\Exception $e) {
        return false;
    }

    return false;
}

In the above example, the last return false will never be executed, because a return statement has already been met in every possible execution path.

                $object = new Website();
                break;
        }

        // Assign all properties to the object
        $object->assignProperties($properties, $this->debug);
Bug: The variable $object seems only to be defined at a later point. Did you maybe move this code here without moving the variable definition?

This error can happen if you refactor code and forget to move the variable initialization.

Let’s take a look at a simple example:

function someFunction() {
    $x = 5;
    echo $x;
}

The above code is perfectly fine. Now imagine that we re-order the statements:

function someFunction() {
    echo $x;
    $x = 5;
}

In that case, $x would be read before it is initialized. This was a very basic example, however the principle is the same for the found issue.
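
In the reviewed method, `$object` is assigned only inside the `default:` branch of the `switch`, which may be why the analyzer does not see a definition on every path. A minimal sketch of one way to make the assignment explicit is to return the object from a small factory function. `WebsiteStub` and `createObjectForType()` are hypothetical names used only for illustration; they are not part of the library.

```php
<?php
// Sketch only: WebsiteStub and createObjectForType() are hypothetical
// stand-ins; the real class is Fusonic\OpenGraph\Objects\Website.

class WebsiteStub
{
    public $title;
    public $description;
}

/**
 * Returns the object for an og:type value. Every value, including null,
 * currently maps to a website, just like the default-only switch.
 */
function createObjectForType(?string $type): WebsiteStub
{
    switch ($type) {
        // Cases for other types would go here.
        default:
            return new WebsiteStub();
    }
}

// The variable is assigned in a single, unconditional statement.
$object = createObjectForType(null);
$object->title = 'Example';
```

With this shape, every later read of `$object` follows one visible assignment, which should be easier for both readers and static analysis to follow.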

        // Fallback for title
        if ($this->useFallbackMode && !$object->title) {
Duplication: This code seems to be duplicated across your project.

Duplicated code is one of the most pungent code smells. If you need to duplicate the same code in three or more different places, we strongly encourage you to look into extracting the code into a single class or operation.

You can also find more detailed suggestions in the “Code” section of your repository.
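
The title and description fallbacks in this method both try a list of selectors in order and keep the first non-empty match, so one possible way to remove the duplication is a shared helper. The sketch below is an assumption-laden illustration, not the library's code: `firstNonEmptyText()` is a hypothetical helper, and PHP's bundled `DOMXPath` stands in for the Symfony Crawler so that the example runs without Composer packages.

```php
<?php
// Sketch only: firstNonEmptyText() is a hypothetical helper, and DOMXPath
// stands in for the Symfony Crawler used by Consumer.

/**
 * Returns the trimmed string value of the first XPath expression that
 * yields a non-empty result, or null when none of them does.
 */
function firstNonEmptyText(DOMXPath $xpath, array $expressions): ?string
{
    foreach ($expressions as $expression) {
        // string() converts the first matching node (document order) to text.
        $value = trim((string) $xpath->evaluate("string({$expression})"));
        if ($value !== '') {
            return $value;
        }
    }

    return null;
}

$html = '<html><head><meta name="description" content=" A short summary. "></head>'
      . '<body><h1>Fallback heading</h1><p>First paragraph.</p></body></html>';

libxml_use_internal_errors(true); // tolerate imperfect real-world markup
$document = new DOMDocument();
$document->loadHTML($html);
$xpath = new DOMXPath($document);

// Same lookup order as the reviewed code: title, then h1, then h2.
$title = firstNonEmptyText($xpath, ['//title', '//h1', '//h2']);

// Same lookup order as the reviewed code: both meta variants, then first <p>.
$description = firstNonEmptyText($xpath, [
    "//meta[@property='description']/@content",
    "//meta[@name='description']/@content",
    '//p',
]);
```

Note that this helper treats only the empty string as missing, whereas the reviewed code's `!$object->title` check also treats the string "0" as missing.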

            $titleElement = $crawler->filter("title")->first();
            if ($titleElement->count()) {
                $object->title = trim($titleElement->text());
            }

            if (!$object->title) {
                $titleElement = $crawler->filter("h1")->first();
                if ($titleElement->count()) {
                    $object->title = trim($titleElement->text());
                }
            }

            if (!$object->title) {
                $titleElement = $crawler->filter("h2")->first();
                if ($titleElement->count()) {
                    $object->title = trim($titleElement->text());
                }
            }
        }

        // Fallback for description
        if ($this->useFallbackMode && !$object->description) {
            $descriptionElement = $crawler->filter("meta[property='description']")->first();
            if ($descriptionElement->count()) {
                $object->description = trim($descriptionElement->attr("content"));
            }

            if (!$object->description) {
                $descriptionElement = $crawler->filter("meta[name='description']")->first();
                if ($descriptionElement->count()) {
                    $object->description = trim($descriptionElement->attr("content"));
                }
            }

            if (!$object->description) {
                $descriptionElement = $crawler->filter("p")->first();
                if ($descriptionElement->count()) {
                    $object->description = trim($descriptionElement->text());
                }
            }
        }

        return $object;
    }
}
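
The attribute-prefix matching at the start of `extractOpenGraphData()` can also be tried in isolation. The following is a minimal sketch under stated assumptions: `extractOgProperties()` is a hypothetical helper, plain arrays replace the `Property` class, and PHP's bundled DOM extension replaces Symfony's Crawler and Fusonic\Linq. It is meant to show the matching rule, not to reproduce the class.

```php
<?php
// Sketch only: extractOgProperties() is a hypothetical helper. It uses PHP's
// bundled DOM extension in place of Symfony's Crawler and Fusonic\Linq.

/**
 * Collects [key, value] pairs for every <meta> tag whose "name" or
 * "property" attribute starts with "og:".
 */
function extractOgProperties(string $html): array
{
    libxml_use_internal_errors(true); // tolerate imperfect real-world markup
    $document = new DOMDocument();
    $document->loadHTML($html);
    $xpath = new DOMXPath($document);

    $properties = [];
    foreach (['name', 'property'] as $attribute) {
        // XPath counterpart of the CSS selector meta[attr^='og:']
        foreach ($xpath->query("//meta[starts-with(@{$attribute}, 'og:')]") as $tag) {
            $properties[] = [
                strtolower(trim($tag->getAttribute($attribute))),
                trim($tag->getAttribute('content')),
            ];
        }
    }

    return $properties;
}

$html = '<html><head>'
      . '<meta property="og:type" content="website">'
      . '<meta name="og:Title" content=" Example ">'
      . '</head><body></body></html>';

$properties = extractOgProperties($html);
```

As in the reviewed loop, tags matched through `name` are collected before tags matched through `property`, keys are lowercased, and values are trimmed.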