WebsiteSource   A

Complexity

Total Complexity 9

Size/Duplication

Total Lines 63
Duplicated Lines 0 %

Importance

Changes 3
Bugs 1 Features 1
Metric Value
eloc 21
c 3
b 1
f 1
dl 0
loc 63
rs 10
wmc 9

3 Methods

Rating   Name   Duplication   Size   Complexity  
A getTextFromArticle() 0 22 5
A getDOMDocumentArticle() 0 8 2
A handle() 0 9 2
<?php

namespace Cion\TextToSpeech\Sources;

use Cion\TextToSpeech\Contracts\Source as SourceContract;
use DOMDocument;
use DOMNodeList;
use RecursiveIteratorIterator;

class WebsiteSource implements SourceContract
{
    /**
     * Extract the text from the source.
     *
     * @param  string $data URL (or path) of the page to read.
     * @return string
     */
    public function handle(string $data): string
    {
        $articles = $this->getDOMDocumentArticle($data);

        if ($articles === null) {
            return '';
        }

        return $this->getTextFromArticle($articles);
    }

    /**
     * Get the child nodes of the first <article> element.
     *
     * @param  string $url
     * @return DOMNodeList|null
     */
    protected function getDOMDocumentArticle(string $url)
    {
        $dom = new DOMDocument();
        @$dom->loadHTML(file_get_contents($url));
Security Best Practice
It seems like you do not handle an error condition for loadHTML(). This can introduce security issues, and is generally not recommended. (Ignorable by annotation.)

If this is a false positive, you can also ignore this issue in your code via the ignore-unhandled annotation:

        /** @scrutinizer ignore-unhandled */ @$dom->loadHTML(file_get_contents($url));

If you suppress an error, we recommend checking for the error condition explicitly:

// For example instead of
@mkdir($dir);

// Better use
if (@mkdir($dir) === false) {
    throw new \RuntimeException('The directory '.$dir.' could not be created.');
}
        $element = $dom->getElementsByTagName('article')->item(0);

        if ($element !== null) {
            return $element->childNodes;
        }

        return null;
    }

    /**
     * Get the text of every <p> element found in the article's DOM Node List.
     *
     * @param  DOMNodeList $articles
     * @return string
     */
    protected function getTextFromArticle(DOMNodeList $articles): string
    {
        $text = '';

        for ($i = 0; $i < $articles->length; $i++) {
            // Skip nodes that have no child nodes
            if ($articles->item($i)->childNodes === null) {
                continue;
            }

            $dit = new RecursiveIteratorIterator(
                new RecursiveDOMIterator($articles->item($i)),
                RecursiveIteratorIterator::SELF_FIRST
            );
            foreach ($dit as $node) {
                if ($node->nodeName === 'p') {
                    $text .= $node->textContent.' ';
                }
            }
        }

        return $text;
    }
}
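
The loadHTML() issue flagged in getDOMDocumentArticle() could be addressed by checking the return value explicitly rather than silencing it with @. The following is only a minimal sketch under stated assumptions: articleNodesFromHtml() is a hypothetical standalone helper, not part of the analyzed class, and it takes an already-fetched HTML string, so fetching the URL (and checking file_get_contents() for false) is left to the caller.

```php
<?php

// Hypothetical helper (an assumption for illustration, not the package's API):
// parses an HTML string and returns the child nodes of the first <article>,
// or null when the input is empty or contains no <article> element.
function articleNodesFromHtml(string $html): ?DOMNodeList
{
    if (trim($html) === '') {
        // loadHTML() rejects empty input, so bail out before calling it.
        return null;
    }

    $dom = new DOMDocument();

    // Collect libxml parse warnings internally instead of suppressing them with @.
    $previous = libxml_use_internal_errors(true);
    $loaded = $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Check the error condition explicitly, as the annotation recommends.
    if ($loaded === false) {
        throw new RuntimeException('The HTML could not be parsed.');
    }

    $article = $dom->getElementsByTagName('article')->item(0);

    return $article !== null ? $article->childNodes : null;
}

// Usage: one <p> child inside the first <article>.
$nodes = articleNodesFromHtml('<html><body><article><p>Hello</p></article></body></html>');
```

Restoring the previous libxml_use_internal_errors() setting keeps the helper from changing error handling for other code in the same request.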