Completed
Push — master ( 979ae6...ee01ad )
by Jeroen
05:12 queued 02:57
created

XmpMetadataExtractor::convertXmlNode()   C

Complexity

Conditions 17
Paths 50

Size

Total Lines 49
Code Lines 27

Duplication

Lines 0
Ratio 0 %

Importance

Changes 0
Metric Value
dl 0
loc 49
rs 5.1117
c 0
b 0
f 0
cc 17
eloc 27
nc 50
nop 1

How to fix   Complexity   

Long Method

Small methods make your code easier to understand, in particular if combined with a good name. Besides, if your method is small, finding a good name is usually much easier.

For example, if you find yourself adding comments to a method's body, this is usually a good sign to extract the commented part to a new method, and use the comment as a starting point when coming up with a good name for this new method.
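As a minimal sketch of that advice, the attribute loop in the listing could be pulled out from under its comment into a method of its own. The function name collectAttributes and the sample XML are hypothetical and only illustrate the idea; they are not part of the reviewed class.

```php
<?php

// Hypothetical helper: the loop that would otherwise sit inline under a
// comment such as "collect attributes" now carries that intent in its name.
function collectAttributes(DOMElement $node): array
{
    $attributes = [];

    // Iterating a DOMNamedNodeMap yields attribute name => DOMAttr pairs.
    foreach ($node->attributes as $name => $attribute) {
        $attributes[$name] = (string) $attribute->value;
    }

    return $attributes;
}

// Usage with an assumed sample document.
$doc = new DOMDocument();
$doc->loadXML('<item id="7" lang="en">text</item>');

$attributes = collectAttributes($doc->documentElement);
```

The calling method shrinks to a single descriptive call, which removes one loop and one nesting level from its body.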

Commonly applied refactorings include:

<?php

namespace JeroenDesloovere\XmpMetadataExtractor;

use DOMDocument;
use JeroenDesloovere\XmpMetadataExtractor\Exception\FileNotFoundException;
use SplFileInfo;

final class XmpMetadataExtractor
{
    protected const RDF_ALT = 'rdf:Alt';
    protected const RDF_BAG = 'rdf:Bag';
    protected const RDF_LI = 'rdf:li';
    protected const RDF_SEQ = 'rdf:Seq';
    protected const POSSIBLE_CONTAINERS = [
        self::RDF_ALT,
        self::RDF_BAG,
        self::RDF_SEQ,
    ];

    private function convertDomNode($node)
    {
        switch ($node->nodeType) {
            case XML_CDATA_SECTION_NODE:
            case XML_TEXT_NODE:
                return trim($node->textContent);

                break;
Unused Code: break is not strictly necessary here and could be removed.

A break statement is unreachable, and therefore unnecessary, when it directly follows a return statement:

switch ($x) {
    case 1:
        return 'foo';
        break; // This break is not necessary and can be left off.
}

If you would like to keep this construct to be consistent with other case statements, you can safely mark this issue as a false-positive.

            case XML_ELEMENT_NODE:
                return $this->convertXmlNode($node);

                break;
        }
    }

    private function convertXmlNode($node)
    {
        $output = [];

        for ($i = 0, $m = $node->childNodes->length; $i < $m; $i++) {
            $child = $node->childNodes->item($i);
            $v = $this->convertDomNode($child);

            if (isset($child->tagName)) {
                $t = $child->tagName;
                if (!isset($output[$t])) {
                    $output[$t] = array();
                }
                $output[$t][] = $v;
            } elseif ($v || $v === '0') {
                $output = (string)$v;
            }
        }

        // Has attributes but isn't an array
        if ($node->attributes->length && !is_array($output)) {
            // Change output into an array.
            $output = array('@content' => $output);
        }

        if (is_array($output)) {
            if ($node->attributes->length) {
                $a = array();
                foreach ($node->attributes as $attrName => $attrNode) {
                    $a[$attrName] = (string)$attrNode->value;
                }
                $output['@attributes'] = $a;
            }

            foreach ($output as $t => $v) {
                // We are combining arrays for rdf:Bag, rdf:Alt, rdf:Seq
                if (in_array($t, self::POSSIBLE_CONTAINERS)) {
                    if (!array_key_exists(self::RDF_LI, $v[0])) {
                        break;
                    }

                    $output = $v[0][self::RDF_LI];
                } elseif (is_array($v) && count($v) == 1 && $t != '@attributes') {
                    $output[$t] = $v[0];
                }
            }
        }

        return $output;
    }

    public function extractFromContent(string $content): array
    {
        try {
            $doc = new DOMDocument();
Bug: The call to DOMDocument::__construct() has too few arguments starting with version.

If this is a false positive, you can also ignore this issue in your code via the ignore-call annotation:

            $doc = /** @scrutinizer ignore-call */ new DOMDocument();

This check compares calls to functions or methods with their respective definitions. If the call has fewer arguments than are defined, it raises an issue.

If a function is defined several times with a different number of parameters, the check may pick up the wrong definition and report false positives. One codebase where this has been known to happen is WordPress. Please note the @ignore annotation hint above.
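For context, this finding looks like such a false positive: in PHP's DOM extension both constructor parameters of DOMDocument ($version and $encoding) have default values. The short sketch below shows the two call styles side by side; the variable names are illustrative only.

```php
<?php

// Both constructor arguments are optional, so this call is valid PHP.
$implicit = new DOMDocument();

// The explicit form passes the XML version and the document encoding.
$explicit = new DOMDocument('1.0', 'UTF-8');
```

Passing the arguments explicitly is one way to document the intended version and encoding, and it would presumably also satisfy the checker without an annotation.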

            $doc->loadXML($this->getXmpXmlString($content));

            $root = $doc->documentElement;
            $output = $this->convertDomNode($root);
            $output['@root'] = $root->tagName;

            return $output;
Bug Best Practice: The expression return $output could return a string, which is incompatible with the declared array return type. Consider adding a type check to rule this case out.
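One possible shape for such a type check is sketched below. The helper name ensureArray is hypothetical, and wrapping a scalar under the '@content' key is an assumption borrowed from how the listing treats text nodes that carry attributes; the project may prefer a different fallback.

```php
<?php

// Hypothetical guard: $converted stands for whatever convertDomNode() returned,
// which may be an array, a string, or null.
function ensureArray($converted): array
{
    if (is_array($converted)) {
        return $converted;
    }

    // Assumption: keep text-only results under '@content' so that the
    // declared array return type always holds.
    return ['@content' => (string) $converted];
}
```

In extractFromContent() the converted root could be passed through such a guard before the '@root' key is added.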
        } catch (\Exception $e) {
            return [];
        }
    }

    public function extractFromFile(string $file): array
    {
        try {
            $file = new SplFileInfo($file);
            $contents = file_get_contents($file->getPathname());
        } catch (\Exception $e) {
            throw new FileNotFoundException('The given File could not be found.');
        }

        return $this->extractFromContent($contents);
    }

    private function getXmpXmlString(string $content): string
    {
        $xmpDataStart = strpos($content, '<x:xmpmeta');
        $xmpDataEnd = strpos($content, '</x:xmpmeta>');
        $xmpLength = $xmpDataEnd - $xmpDataStart;

        return substr($content, $xmpDataStart, $xmpLength + 12);
    }
}