Passed
Pull Request — master (#690)

FullTextDocument   A

Complexity

Total Complexity 12

Size/Duplication

Total Lines 143
Duplicated Lines 0 %

Importance

Changes 1
Bugs 0 Features 0
Metric  Value
wmc     12
eloc    41
c       1
b       0
f       0
dl      0
loc     143
rs      10

3 Methods

Rating  Name                  Duplication  Size  Complexity
B       getFullTextFromXml()  0            52    10
A       getTextFormat()       0            4     1
A       _getHasFullText()     0            4     1
<?php

/**
 * (c) Kitodo. Key to digital objects e.V. <[email protected]>
 *
 * This file is part of the Kitodo and TYPO3 projects.
 *
 * @license GNU General Public License version 3 or later.
 * For the full copyright and license information, please read the
 * LICENSE.txt file that was distributed with this source code.
 */

namespace Kitodo\Dlf\Common\Document;

use Kitodo\Dlf\Common\FulltextInterface;
use Kitodo\Dlf\Common\Helper;
use TYPO3\CMS\Core\Configuration\ExtensionConfiguration;
use TYPO3\CMS\Core\Utility\GeneralUtility;

/**
 * Document class for the 'dlf' extension
 *
 * @author Beatrycze Volk <[email protected]>
 * @package TYPO3
 * @subpackage dlf
 * @access public
 * @property-read bool $hasFullText Are there any full text files available?
 * @property array $rawTextArray array containing raw text
 * @abstract
 */
abstract class FullTextDocument extends Document
{
    /**
     * The extension key
     *
     * @var string
     * @access public
     */
    public static $extKey = 'dlf';

    /**
     * Are there any fulltext files available? This also includes IIIF text annotations
     * with motivation 'painting' if Kitodo.Presentation is configured to store text
     * annotations as fulltext.
     *
     * @var bool
     * @access protected
     */
    protected $hasFullText = false;

    /**
     * This holds the documents' raw text pages with their corresponding
     * structMap//div's ID (METS) or Range / Manifest / Sequence ID (IIIF) as array key
     *
     * @var array
     * @access protected
     */
    protected $rawTextArray = [];

    /**
     * This extracts the OCR full text for a physical structure node / IIIF Manifest / Canvas. Text might be
     * given as ALTO for METS or as annotations or ALTO for IIIF resources.
     *
     * @access public
     *
     * @abstract
     *
     * @param string $id: The @ID attribute of the physical structure node (METS) or the @id property
     * of the Manifest / Range (IIIF)
     *
     * @return string The OCR full text
     */
    public abstract function getFullText($id);

    /**
     * Analyze the document if it contains any full text that needs to be indexed.
     *
     * @access protected
     *
     * @abstract
     */
    protected abstract function ensureHasFullTextIsSet();

    /**
     * This extracts the OCR full text for a physical structure node / IIIF Manifest / Canvas from an
     * XML full text representation (currently only ALTO). For IIIF manifests, ALTO documents have
     * to be given in the Canvas' / Manifest's "seeAlso" property.
     *
     * @param string $id: The @ID attribute of the physical structure node (METS) or the @id property
     * of the Manifest / Range (IIIF)
     *
     * @return string The OCR full text
     */
    protected function getFullTextFromXml($id)
    {
        $fullText = '';
        // Load available text formats, ...
        $this->loadFormats();
        // ... physical structure ...
        $this->_getPhysicalStructure();
        // ... and extension configuration.
        $extConf = GeneralUtility::makeInstance(ExtensionConfiguration::class)->get(self::$extKey);
        $fileGrpsFulltext = GeneralUtility::trimExplode(',', $extConf['fileGrpFulltext']);
        if (!empty($this->physicalStructureInfo[$id])) {
            while ($fileGrpFulltext = array_shift($fileGrpsFulltext)) {
                if (!empty($this->physicalStructureInfo[$id]['files'][$fileGrpFulltext])) {
                    // Get full text file.
                    $fileContent = GeneralUtility::getUrl($this->getFileLocation($this->physicalStructureInfo[$id]['files'][$fileGrpFulltext]));
                    if ($fileContent !== false) {
                        $textFormat = $this->getTextFormat($fileContent);
                    } else {
                        $this->logger->warning('Couldn\'t load full text file for structure node @ID "' . $id . '"');
1 ignored issue
Bug: The method warning() does not exist on TYPO3\CMS\Core\Log\LogManager. (Ignorable by annotation)

If this is a false positive, you can ignore this issue in your code via the ignore-call annotation:

                        $this->logger->/** @scrutinizer ignore-call */
                                       warning('Couldn\'t load full text file for structure node @ID "' . $id . '"');

This check looks for calls to methods that do not seem to exist on a given type. It looks for the method on the type itself as well as in inherited classes and implemented interfaces.

This is most likely a typographical error, or the method has been renamed.

                        return $fullText;
                    }
                    break;
                }
            }
        } else {
            $this->logger->warning('Invalid structure node @ID "' . $id . '"');
            return $fullText;
        }
        // Is this text format supported?
        // This part actually differs from previous version of indexed OCR
        if (!empty($fileContent) && !empty($this->formats[$textFormat])) {
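0 ignored issues
Comprehensibility / Best Practice: The variable $textFormat does not seem to be defined for all execution paths leading up to this point. It is only assigned when a full text file is found in one of the configured file groups; if no group has a file for the node, the condition above and the warning in the else branch below both read it while it is still unset.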
            $textMiniOcr = '';
            if (!empty($this->formats[$textFormat]['class'])) {
                $class = $this->formats[$textFormat]['class'];
                // Get the raw text from class.
                if (
                    class_exists($class)
                    && ($obj = GeneralUtility::makeInstance($class)) instanceof FulltextInterface
                ) {
                    // Load XML from file.
                    $ocrTextXml = Helper::getXmlFileAsString($fileContent);
                    $textMiniOcr = $obj->getTextAsMiniOcr($ocrTextXml);
                    $this->rawTextArray[$id] = $textMiniOcr;
                } else {
                    $this->logger->warning('Invalid class/method "' . $class . '->getRawText()" for text format "' . $textFormat . '"');
                }
            }
            $fullText = $textMiniOcr;
        } else {
            $this->logger->warning('Unsupported text format "' . $textFormat . '" in physical node with @ID "' . $id . '"');
        }
        return $fullText;
    }

    /**
     * Get format of the OCR full text
     *
     * @access private
     *
     * @param string $fileContent: content of the XML file
     *
     * @return string The format of the OCR full text
     */
    private function getTextFormat($fileContent)
    {
        // Get the root element's name as text format.
        return strtoupper(Helper::getXmlFileAsString($fileContent)->getName());
    }

    /**
     * This returns $this->hasFullText via __get()
     *
     * @access protected
     *
     * @return bool Are there any full text files available?
     */
    protected function _getHasFullText()
    {
        $this->ensureHasFullTextIsSet();
        return $this->hasFullText;
    }
}
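
The finding about $textFormat not being defined on all execution paths could be addressed by initialising both variables before the file lookup. What follows is only a minimal sketch under stated assumptions, not the project's code: loadFullTextFile(), $fetch and $detect are hypothetical stand-ins for the loop in getFullTextFromXml(), for GeneralUtility::getUrl() and for getTextFormat(). Unlike the reviewed method, the sketch signals a failed download through empty return values instead of logging and returning early.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical, simplified stand-in for the file lookup in getFullTextFromXml().
 *
 * @param array<string, string> $files      File group => file location for one structure node
 * @param string[]              $fileGroups Configured full text file groups, in priority order
 * @param callable              $fetch      Stand-in for GeneralUtility::getUrl(), returns string|false
 * @param callable              $detect     Stand-in for getTextFormat(), returns string
 *
 * @return array{0: string, 1: string} File content and text format, both '' if nothing was loaded
 */
function loadFullTextFile(array $files, array $fileGroups, callable $fetch, callable $detect): array
{
    // Initialise both variables so they are defined on every execution path.
    $fileContent = '';
    $textFormat = '';

    foreach ($fileGroups as $fileGroup) {
        if (empty($files[$fileGroup])) {
            continue; // No file in this group, try the next one.
        }
        $loaded = $fetch($files[$fileGroup]);
        if ($loaded !== false) {
            $fileContent = $loaded;
            $textFormat = $detect($loaded);
        }
        break; // Like the original loop, stop at the first group that has a file.
    }

    return [$fileContent, $textFormat];
}

// Usage with in-memory stand-ins instead of real HTTP requests and XML parsing.
$fetch = static fn (string $location) => $location === 'page1.xml' ? '<alto/>' : false;
$detect = static fn (string $xml): string => 'ALTO';

[$content, $format] = loadFullTextFile(['FULLTEXT' => 'page1.xml'], ['OCR', 'FULLTEXT'], $fetch, $detect);
[$noContent, $noFormat] = loadFullTextFile([], ['OCR', 'FULLTEXT'], $fetch, $detect);
```

With both values defaulting to an empty string, a later check such as !empty($this->formats[$textFormat]) and any log message that includes $textFormat can no longer read an unset variable.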
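
getTextFormat() takes the upper-cased name of the XML root element as the text format. As a rough illustration of that idea, assuming Helper::getXmlFileAsString() yields a SimpleXMLElement (its implementation is not shown on this page), the same lookup can be sketched with PHP's built-in simplexml_load_string(). The function name detectTextFormat() is hypothetical, and returning an empty string for malformed input is a choice made for this sketch only.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical stand-alone variant of getTextFormat(): returns the upper-cased
 * root element name of an XML string, or '' if the string is not well-formed XML.
 */
function detectTextFormat(string $xmlContent): string
{
    // Collect parser errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    $xml = simplexml_load_string($xmlContent);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if ($xml === false) {
        return '';
    }

    // getName() returns the element name without a namespace prefix.
    return strtoupper($xml->getName());
}

$format = detectTextFormat('<alto xmlns="http://www.loc.gov/standards/alto/ns-v2#"><Layout/></alto>');
$broken = detectTextFormat('not xml');
```

Guarding against a failed parse matters here because calling getName() on false would raise an error, whereas an empty format string simply falls through to the "unsupported text format" branch.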