Passed
Pull Request — master (#818)
by
unknown
12:19 queued 08:29
created

MetsDocument::getMetadataIds()   B

Complexity

Conditions 7
Paths 15

Size

Total Lines 37
Code Lines 21

Duplication

Lines 0
Ratio 0 %

Importance

Changes 0
Metric Value
cc 7
eloc 21
nc 15
nop 1
dl 0
loc 37
rs 8.6506
c 0
b 0
f 0
<?php

/**
 * (c) Kitodo. Key to digital objects e.V. <[email protected]>
 *
 * This file is part of the Kitodo and TYPO3 projects.
 *
 * @license GNU General Public License version 3 or later.
 * For the full copyright and license information, please read the
 * LICENSE.txt file that was distributed with this source code.
 */

namespace Kitodo\Dlf\Common;

use TYPO3\CMS\Core\Configuration\ExtensionConfiguration;
use TYPO3\CMS\Core\Database\ConnectionPool;
use TYPO3\CMS\Core\Database\Query\Restriction\HiddenRestriction;
use TYPO3\CMS\Core\Utility\GeneralUtility;
use Ubl\Iiif\Tools\IiifHelper;
use Ubl\Iiif\Services\AbstractImageService;
use TYPO3\CMS\Core\Log\LogManager;

/**
 * MetsDocument class for the 'dlf' extension.
 *
 * @author Sebastian Meyer <[email protected]>
 * @author Henrik Lochmann <[email protected]>
 * @package TYPO3
 * @subpackage dlf
 * @access public
 * @property int $cPid This holds the PID for the configuration
 * @property-read array $mdSec Associative array of METS metadata sections indexed by their IDs.
 * @property-read array $dmdSec Subset of `$mdSec` storing only the dmdSec entries; kept for compatibility.
 * @property-read array $fileGrps This holds the file ID -> USE concordance
 * @property-read bool $hasFulltext Are there any fulltext files available?
 * @property-read array $metadataArray This holds the documents' parsed metadata array
 * @property-read \SimpleXMLElement $mets This holds the XML file's METS part as \SimpleXMLElement object
 * @property-read int $numPages This holds the total number of pages
 * @property-read int $parentId This holds the UID of the parent document or zero if not multi-volumed
 * @property-read array $physicalStructure This holds the physical structure
 * @property-read array $physicalStructureInfo This holds the physical structure metadata
 * @property-read int $pid This holds the PID of the document or zero if not in database
 * @property-read bool $ready Is the document instantiated successfully?
 * @property-read string $recordId The METS file's / IIIF manifest's record identifier
 * @property-read int $rootId This holds the UID of the root document or zero if not multi-volumed
 * @property-read array $smLinks This holds the smLinks between logical and physical structMap
 * @property-read array $tableOfContents This holds the logical structure
 * @property-read string $thumbnail This holds the document's thumbnail location
 * @property-read string $toplevelId This holds the toplevel structure's @ID (METS) or the manifest's @id (IIIF)
 */
final class MetsDocument extends Doc
{
    /**
     * Subsections / tags that may occur within `<mets:amdSec>`.
     *
     * @link https://www.loc.gov/standards/mets/docs/mets.v1-9.html#amdSec
     * @link https://www.loc.gov/standards/mets/docs/mets.v1-9.html#mdSecType
     *
     * @var string[]
     */
    protected const ALLOWED_AMD_SEC = ['techMD', 'rightsMD', 'sourceMD', 'digiprovMD'];

    /**
     * This holds the whole XML file as string for serialization purposes
     * @see __sleep() / __wakeup()
     *
     * @var string
     * @access protected
     */
    protected $asXML = '';

    /**
     * This maps the ID of each amdSec to the IDs of its children (techMD etc.).
     * When an ADMID references an amdSec instead of techMD etc., this is used to iterate the child elements.
     *
     * @var string[]
     * @access protected
     */
    protected $amdSecChildIds = [];

    /**
     * Associative array of METS metadata sections indexed by their IDs.
     *
     * @var array
     * @access protected
     */
    protected $mdSec = [];

    /**
     * Are the METS file's metadata sections loaded?
     * @see MetsDocument::$mdSec
     *
     * @var bool
     * @access protected
     */
    protected $mdSecLoaded = false;

    /**
     * Subset of $mdSec storing only the dmdSec entries; kept for compatibility.
     *
     * @var array
     * @access protected
     */
    protected $dmdSec = [];

    /**
     * The extension key
     *
     * @var string
     * @access public
     */
    public static $extKey = 'dlf';

    /**
     * This holds the file ID -> USE concordance
     * @see _getFileGrps()
     *
     * @var array
     * @access protected
     */
    protected $fileGrps = [];

    /**
     * Are the image file groups loaded?
     * @see $fileGrps
     *
     * @var bool
     * @access protected
     */
    protected $fileGrpsLoaded = false;

    /**
     * This holds the XML file's METS part as \SimpleXMLElement object
     *
     * @var \SimpleXMLElement
     * @access protected
     */
    protected $mets;

    /**
     * This holds the whole XML file as \SimpleXMLElement object
     *
     * @var \SimpleXMLElement
     * @access protected
     */
    protected $xml;

    /**
     * This adds metadata from METS structural map to metadata array.
     *
     * @access public
     *
     * @param array &$metadata: The metadata array to extend
     * @param string $id: The "@ID" attribute of the logical structure node
     *
     * @return void
     */
    public function addMetadataFromMets(&$metadata, $id)
    {
        $details = $this->getLogicalStructure($id);
        if (!empty($details)) {
            $metadata['mets_order'][0] = $details['order'];
            $metadata['mets_label'][0] = $details['label'];
            $metadata['mets_orderlabel'][0] = $details['orderlabel'];
        }
    }

    /**
     *
     * {@inheritDoc}
     * @see \Kitodo\Dlf\Common\Doc::establishRecordId()
     */
    protected function establishRecordId($pid)
    {
        // Check for METS object @ID.
        if (!empty($this->mets['OBJID'])) {
            $this->recordId = (string) $this->mets['OBJID'];
        }
        // Get hook objects.
        $hookObjects = Helper::getHookObjects('Classes/Common/MetsDocument.php');
        // Apply hooks.
        foreach ($hookObjects as $hookObj) {
            if (method_exists($hookObj, 'construct_postProcessRecordId')) {
                $hookObj->construct_postProcessRecordId($this->xml, $this->recordId);
            }
        }
    }

    /**
     *
     * {@inheritDoc}
     * @see \Kitodo\Dlf\Common\Doc::getDownloadLocation()
     */
    public function getDownloadLocation($id)
    {
        $fileMimeType = $this->getFileMimeType($id);
        $fileLocation = $this->getFileLocation($id);
        if ($fileMimeType === 'application/vnd.kitodo.iiif') {
            // Append "info.json" unless it is already there; a trailing slash is the last character, i.e. at index strlen() - 1.
            $fileLocation = (strrpos($fileLocation, 'info.json') === strlen($fileLocation) - 9) ? $fileLocation : (strrpos($fileLocation, '/') === strlen($fileLocation) - 1 ? $fileLocation . 'info.json' : $fileLocation . '/info.json');
            $conf = GeneralUtility::makeInstance(ExtensionConfiguration::class)->get(self::$extKey);
            IiifHelper::setUrlReader(IiifUrlReader::getInstance());
            IiifHelper::setMaxThumbnailHeight($conf['iiifThumbnailHeight']);
            IiifHelper::setMaxThumbnailWidth($conf['iiifThumbnailWidth']);
            $service = IiifHelper::loadIiifResource($fileLocation);
            if ($service !== null && $service instanceof AbstractImageService) {
                return $service->getImageUrl();
            }
        } elseif ($fileMimeType === 'application/vnd.netfpx') {
            $baseURL = $fileLocation . (strpos($fileLocation, '?') === false ? '?' : '');
            // TODO CVT is an optional IIP server capability; in theory, capabilities should be determined in the object request with '&obj=IIP-server'
            return $baseURL . '&CVT=jpeg';
        }
        return $fileLocation;
    }

    /**
     * {@inheritDoc}
     * @see \Kitodo\Dlf\Common\Doc::getFileLocation()
     */
    public function getFileLocation($id)
    {
        $location = $this->mets->xpath('./mets:fileSec/mets:fileGrp/mets:file[@ID="' . $id . '"]/mets:FLocat[@LOCTYPE="URL"]');
        if (
            !empty($id)
            && !empty($location)
        ) {
            return (string) $location[0]->attributes('http://www.w3.org/1999/xlink')->href;
        } else {
            $this->logger->warning('There is no file node with @ID "' . $id . '"');
            return '';
        }
    }

    /**
     * {@inheritDoc}
     * @see \Kitodo\Dlf\Common\Doc::getFileMimeType()
     */
    public function getFileMimeType($id)
    {
        $mimetype = $this->mets->xpath('./mets:fileSec/mets:fileGrp/mets:file[@ID="' . $id . '"]/@MIMETYPE');
        if (
            !empty($id)
            && !empty($mimetype)
        ) {
            return (string) $mimetype[0];
        } else {
            $this->logger->warning('There is no file node with @ID "' . $id . '" or no MIME type specified');
            return '';
        }
    }

    /**
     * {@inheritDoc}
     * @see \Kitodo\Dlf\Common\Doc::getLogicalStructure()
     */
    public function getLogicalStructure($id, $recursive = false)
    {
        $details = [];
        // Is the requested logical unit already loaded?
        if (
            !$recursive
            && !empty($this->logicalUnits[$id])
        ) {
            // Yes. Return it.
            return $this->logicalUnits[$id];
        } elseif (!empty($id)) {
            // Get specified logical unit.
            $divs = $this->mets->xpath('./mets:structMap[@TYPE="LOGICAL"]//mets:div[@ID="' . $id . '"]');
        } else {
            // Get all logical units at top level.
            $divs = $this->mets->xpath('./mets:structMap[@TYPE="LOGICAL"]/mets:div');
        }
        if (!empty($divs)) {
            if (!$recursive) {
                // Get the details for the first xpath hit.
                $details = $this->getLogicalStructureInfo($divs[0]);
            } else {
                // Walk the logical structure recursively and fill the whole table of contents.
                foreach ($divs as $div) {
                    $this->tableOfContents[] = $this->getLogicalStructureInfo($div, $recursive);
                }
            }
        }
        return $details;
    }

    /**
     * This gets details about a logical structure element
     *
     * @access protected
     *
     * @param \SimpleXMLElement $structure: The logical structure node
     * @param bool $recursive: Whether to include the child elements
     *
     * @return array Array of the element's id, label, type and physical page indexes/mptr link
     */
    protected function getLogicalStructureInfo(\SimpleXMLElement $structure, $recursive = false)
    {
        // Get attributes (initialized first so the array exists even if the node has none).
        $attributes = [];
        foreach ($structure->attributes() as $attribute => $value) {
            $attributes[$attribute] = (string) $value;
        }
        // Load plugin configuration.
        $extConf = GeneralUtility::makeInstance(ExtensionConfiguration::class)->get(self::$extKey);
        // Extract identity information.
        $details = [];
        $details['id'] = $attributes['ID'];
        $details['dmdId'] = (isset($attributes['DMDID']) ? $attributes['DMDID'] : '');
        $details['admId'] = (isset($attributes['ADMID']) ? $attributes['ADMID'] : '');
        $details['order'] = (isset($attributes['ORDER']) ? $attributes['ORDER'] : '');
        $details['label'] = (isset($attributes['LABEL']) ? $attributes['LABEL'] : '');
        $details['orderlabel'] = (isset($attributes['ORDERLABEL']) ? $attributes['ORDERLABEL'] : '');
        $details['contentIds'] = (isset($attributes['CONTENTIDS']) ? $attributes['CONTENTIDS'] : '');
        $details['volume'] = '';
        // Set volume information only if no label is set and this is the toplevel structure element.
        if (
            empty($details['label'])
            && $details['id'] == $this->_getToplevelId()
        ) {
            $metadata = $this->getMetadata($details['id']);
            if (!empty($metadata['volume'][0])) {
                $details['volume'] = $metadata['volume'][0];
            }
        }
        $details['pagination'] = '';
        $details['type'] = $attributes['TYPE'];
        $details['thumbnailId'] = '';
        // Load smLinks.
        $this->_getSmLinks();
        // Load physical structure.
        $this->_getPhysicalStructure();
        // Get the physical page or external file this structure element is pointing at.
        $details['points'] = '';
        // Is there a mptr node?
        if (count($structure->children('http://www.loc.gov/METS/')->mptr)) {
            // Yes. Get the file reference.
            $details['points'] = (string) $structure->children('http://www.loc.gov/METS/')->mptr[0]->attributes('http://www.w3.org/1999/xlink')->href;
        } elseif (
            !empty($this->physicalStructure)
            && array_key_exists($details['id'], $this->smLinks['l2p'])
        ) {
            // Link logical structure to the first corresponding physical page/track.
            $details['points'] = max(intval(array_search($this->smLinks['l2p'][$details['id']][0], $this->physicalStructure, true)), 1);
            $fileGrpsThumb = GeneralUtility::trimExplode(',', $extConf['fileGrpThumbs']);
            while ($fileGrpThumb = array_shift($fileGrpsThumb)) {
                if (!empty($this->physicalStructureInfo[$this->smLinks['l2p'][$details['id']][0]]['files'][$fileGrpThumb])) {
                    $details['thumbnailId'] = $this->physicalStructureInfo[$this->smLinks['l2p'][$details['id']][0]]['files'][$fileGrpThumb];
                    break;
                }
            }
            // Get page/track number of the first page/track related to this structure element.
            $details['pagination'] = $this->physicalStructureInfo[$this->smLinks['l2p'][$details['id']][0]]['orderlabel'];
            $details['videoChapter'] = $this->getTimecode($details, GeneralUtility::trimExplode(',', $extConf['fileGrpVideo']));
        } elseif ($details['id'] == $this->_getToplevelId()) {
            // Point to self if this is the toplevel structure.
            $details['points'] = 1;
            $fileGrpsThumb = GeneralUtility::trimExplode(',', $extConf['fileGrpThumbs']);
            while ($fileGrpThumb = array_shift($fileGrpsThumb)) {
                if (
                    !empty($this->physicalStructure)
                    && !empty($this->physicalStructureInfo[$this->physicalStructure[1]]['files'][$fileGrpThumb])
                ) {
                    $details['thumbnailId'] = $this->physicalStructureInfo[$this->physicalStructure[1]]['files'][$fileGrpThumb];
                    break;
                }
            }
        }
        // Get the files this structure element is pointing at.
        $details['files'] = [];
        $details['all_files'] = [];
        $fileUse = $this->_getFileGrps();
        // Get the file representations from fileSec node.
        foreach ($structure->children('http://www.loc.gov/METS/')->fptr as $fptr) {
            // Check if file has valid @USE attribute.
            if (!empty($fileUse[(string) $fptr->attributes()->FILEID])) {
Bug: The method attributes() does not exist on null. (Ignorable by annotation.)

If this is a false positive, you can also ignore this issue in your code via the ignore-call annotation:

            if (!empty($fileUse[(string) $fptr->/** @scrutinizer ignore-call */ attributes()->FILEID])) {

This check looks for calls to methods that do not seem to exist on a given type. It looks for the method on the type itself as well as in inherited classes or implemented interfaces.

This is most likely a typographical error or the method has been renamed.
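As an alternative to suppressing the warning, the attribute can be read once through a small null-checked helper. The following is only a sketch: `fileIdOf()` is a hypothetical function that is not part of `MetsDocument`, and the sample `<fptr>` element is made up for illustration.

```php
<?php
// Hypothetical helper (not part of MetsDocument): returns the FILEID of an
// <fptr> node, or an empty string if the attribute list is unavailable.
function fileIdOf(\SimpleXMLElement $fptr): string
{
    $attributes = $fptr->attributes();
    if ($attributes === null) {
        // attributes() is declared as nullable, which is what the analyzer flags.
        return '';
    }
    // A missing FILEID casts to an empty string.
    return (string) $attributes->FILEID;
}

// Made-up sample node for illustration.
$fptr = new \SimpleXMLElement('<fptr FILEID="FILE_0001_THUMBS"/>');
echo fileIdOf($fptr), PHP_EOL;
```

Reading the ID once also avoids repeating the `$fptr->attributes()->FILEID` lookup on each of the following lines.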
                $details['files'][$fileUse[(string) $fptr->attributes()->FILEID]] = (string) $fptr->attributes()->FILEID;
                $details['all_files'][$fileUse[(string) $fptr->attributes()->FILEID]][] = (string) $fptr->attributes()->FILEID;
            }
        }
        // Keep for later usage.
        $this->logicalUnits[$details['id']] = $details;
        // Walk the structure recursively? And are there any children of the current element?
        if (
            $recursive
            && count($structure->children('http://www.loc.gov/METS/')->div)
        ) {
            $details['children'] = [];
            foreach ($structure->children('http://www.loc.gov/METS/')->div as $child) {
                // Repeat for all children.
                $details['children'][] = $this->getLogicalStructureInfo($child, true);
Bug: It seems like $child can also be of type null; however, parameter $structure of Kitodo\Dlf\Common\MetsDo...tLogicalStructureInfo() only seems to accept SimpleXMLElement. Maybe add an additional type check? (Ignorable by annotation.)

If this is a false positive, you can also ignore this issue in your code via the ignore-type annotation:

                $details['children'][] = $this->getLogicalStructureInfo(/** @scrutinizer ignore-type */ $child, true);
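The additional type check suggested by the analyzer could look roughly like the sketch below. `collectDivIds()` is a hypothetical stand-alone function rather than the class method, and the nested `<mets:div>` sample is invented for illustration.

```php
<?php
// Hypothetical stand-alone walker (not the MetsDocument method): collects the
// @ID of a logical <mets:div> and of all nested <mets:div> children.
function collectDivIds(\SimpleXMLElement $div): array
{
    $ids = [(string) $div['ID']];
    foreach ($div->children('http://www.loc.gov/METS/')->div as $child) {
        // Explicit type check instead of an ignore-type annotation.
        if (!($child instanceof \SimpleXMLElement)) {
            continue;
        }
        $ids = array_merge($ids, collectDivIds($child));
    }
    return $ids;
}

// Invented sample structure for illustration.
$root = new \SimpleXMLElement(
    '<mets:div xmlns:mets="http://www.loc.gov/METS/" ID="LOG_0000">'
    . '<mets:div ID="LOG_0001"/>'
    . '<mets:div ID="LOG_0002"><mets:div ID="LOG_0003"/></mets:div>'
    . '</mets:div>'
);
echo implode(', ', collectDivIds($root)), PHP_EOL;
// prints "LOG_0000, LOG_0001, LOG_0002, LOG_0003"
```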
            }
        }
        return $details;
    }

    /**
     * Get the first time-based file area linked to a logical structure element.
     *
     * @access protected
     *
     * @param array $logInfo: The logical structure details (as built by getLogicalStructureInfo())
     * @param array $fileGrps: The file groups to search, in order of preference
     *
     * @return array|null Array with 'fileId' and 'timecode' (in seconds), or null if none is found
     */
    protected function getTimecode($logInfo, $fileGrps)
    {
        foreach ($fileGrps as $fileGrp) {
            $physInfo = $this->physicalStructureInfo[$this->smLinks['l2p'][$logInfo['id']][0]];

            $fileId = $physInfo['all_files'][$fileGrp][0] ?? '';
            if (empty($fileId)) {
                continue;
            }

            $fileArea = $physInfo['fileInfos'][$fileId]['area'] ?? '';
            if (empty($fileArea) || $fileArea['betype'] !== 'TIME') {
                continue;
            }

            return [
                'fileId' => $fileId,
                'timecode' => Helper::timecodeToSeconds($fileArea['begin']),
            ];
        }
        // No file group provides a time-based area.
        return null;
    }

    /**
     * {@inheritDoc}
     * @see \Kitodo\Dlf\Common\Doc::getMetadata()
     */
    public function getMetadata($id, $cPid = 0)
    {
        // Make sure $cPid is a non-negative integer.
        $cPid = max(intval($cPid), 0);
        // If $cPid is not given, try to get it elsewhere.
        if (
            !$cPid
            && ($this->cPid || $this->pid)
        ) {
            // Retain current PID.
            $cPid = ($this->cPid ? $this->cPid : $this->pid);
        } elseif (!$cPid) {
            $this->logger->warning('Invalid PID ' . $cPid . ' for metadata definitions');
            return [];
        }
        // Get metadata from parsed metadata array if available.
        if (
            !empty($this->metadataArray[$id])
            && $this->metadataArray[0] == $cPid
        ) {
            return $this->metadataArray[$id];
        }
        // Initialize metadata array with empty values.
        $metadata = [
            'title' => [],
            'title_sorting' => [],
            'author' => [],
            'place' => [],
            'year' => [],
            'prod_id' => [],
            'record_id' => [],
            'opac_id' => [],
            'union_id' => [],
            'urn' => [],
            'purl' => [],
            'type' => [],
            'volume' => [],
            'volume_sorting' => [],
            'license' => [],
            'terms' => [],
            'restrictions' => [],
            'out_of_print' => [],
            'rights_info' => [],
            'collection' => [],
            'owner' => [],
            'mets_label' => [],
            'mets_orderlabel' => [],
            'document_format' => ['METS'],
        ];
        $mdIds = $this->getMetadataIds($id);
Bug: Are you sure the assignment to $mdIds is correct? $this->getMetadataIds($id), targeting Kitodo\Dlf\Common\MetsDocument::getMetadataIds(), seems to always return null.

This check looks for function or method calls that always return null and whose return value is assigned to a variable.

class A
{
    function getObject()
    {
        return null;
    }
}

$a = new A();
$object = $a->getObject();

The method getObject() can return nothing but null, so it makes no sense to assign that value to a variable.

The reason is most likely that a function or method is incomplete or has been reduced for debug purposes.
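In this listing the docblock of getMetadataIds() declares `@return void` while the caller iterates over the result, which may be what triggers the warning. A minimal sketch of a method-like function whose declared return type matches such usage is shown below. `splitMetadataIds()` is hypothetical and does not reproduce the body of getMetadataIds(); it only relies on DMDID and ADMID being space-separated ID lists in METS.

```php
<?php
/**
 * Hypothetical helper (not the actual getMetadataIds() implementation).
 *
 * @param string $dmdIds Space-separated DMDID references
 * @param string $admIds Space-separated ADMID references
 *
 * @return string[] All referenced section IDs, descriptive ones first
 */
function splitMetadataIds(string $dmdIds, string $admIds): array
{
    $ids = preg_split('/\s+/', trim($dmdIds . ' ' . $admIds), -1, PREG_SPLIT_NO_EMPTY);
    // preg_split() returns false on failure; normalize that to an empty list.
    return $ids === false ? [] : $ids;
}

// Made-up IDs for illustration.
foreach (splitMetadataIds('DMDLOG_0001', 'AMD_0001 TECHMD_0001') as $mdId) {
    echo $mdId, PHP_EOL;
}
```

With a docblock and return type that state an array, both the assignment and the later `foreach` over the result are consistent for a static analyzer.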
        if (empty($mdIds)) {
            // There is no metadata section for this structure node.
            return [];
        }
        // Associative array used as set of available section types (dmdSec, techMD, ...)
        $hasMetadataSection = [];
        // Load available metadata formats and metadata sections.
        $this->loadFormats();
        $this->_getMdSec();
        // Get the structure's type.
        if (!empty($this->logicalUnits[$id])) {
            $metadata['type'] = [$this->logicalUnits[$id]['type']];
        } else {
            $struct = $this->mets->xpath('./mets:structMap[@TYPE="LOGICAL"]//mets:div[@ID="' . $id . '"]/@TYPE');
            if (!empty($struct)) {
                $metadata['type'] = [(string) $struct[0]];
            }
        }
        foreach ($mdIds as $dmdId) {
Bug: The expression $mdIds of type void is not traversable.
            $mdSectionType = $this->mdSec[$dmdId]['section'];

            // To preserve behavior of previous Kitodo versions, extract metadata only from first supported dmdSec
            // However, we want to extract, for example, all techMD sections (VIDEOMD, AUDIOMD)
            if ($mdSectionType === 'dmdSec' && isset($hasMetadataSection['dmdSec'])) {
                continue;
            }

            // Is this metadata format supported?
            if (!empty($this->formats[$this->mdSec[$dmdId]['type']])) {
                if (!empty($this->formats[$this->mdSec[$dmdId]['type']]['class'])) {
                    $class = $this->formats[$this->mdSec[$dmdId]['type']]['class'];
                    // Get the metadata from class.
                    if (
                        class_exists($class)
                        && ($obj = GeneralUtility::makeInstance($class)) instanceof MetadataInterface
                    ) {
                        $obj->extractMetadata($this->mdSec[$dmdId]['xml'], $metadata);
                    } else {
                        $this->logger->warning('Invalid class/method "' . $class . '->extractMetadata()" for metadata format "' . $this->mdSec[$dmdId]['type'] . '"');
                    }
                }
            } else {
                $this->logger->notice('Unsupported metadata format "' . $this->mdSec[$dmdId]['type'] . '" in ' . $mdSectionType . ' with @ID "' . $dmdId . '"');
                // Continue searching for supported metadata with next @DMDID.
                continue;
            }
            // Get the additional metadata from database.
            $queryBuilder = GeneralUtility::makeInstance(ConnectionPool::class)
                ->getQueryBuilderForTable('tx_dlf_metadata');
            // Get hidden records, too.
            $queryBuilder
                ->getRestrictions()
                ->removeByType(HiddenRestriction::class);
            // Get all metadata with configured xpath and applicable format first.
            $resultWithFormat = $queryBuilder
                ->select(
                    'tx_dlf_metadata.index_name AS index_name',
                    'tx_dlf_metadataformat_joins.xpath AS xpath',
                    'tx_dlf_metadataformat_joins.xpath_sorting AS xpath_sorting',
                    'tx_dlf_metadata.is_sortable AS is_sortable',
                    'tx_dlf_metadata.default_value AS default_value',
                    'tx_dlf_metadata.format AS format'
                )
                ->from('tx_dlf_metadata')
                ->innerJoin(
                    'tx_dlf_metadata',
                    'tx_dlf_metadataformat',
                    'tx_dlf_metadataformat_joins',
                    $queryBuilder->expr()->eq(
                        'tx_dlf_metadataformat_joins.parent_id',
                        'tx_dlf_metadata.uid'
                    )
                )
                ->innerJoin(
                    'tx_dlf_metadataformat_joins',
                    'tx_dlf_formats',
                    'tx_dlf_formats_joins',
                    $queryBuilder->expr()->eq(
                        'tx_dlf_formats_joins.uid',
                        'tx_dlf_metadataformat_joins.encoded'
                    )
                )
                ->where(
                    $queryBuilder->expr()->eq('tx_dlf_metadata.pid', intval($cPid)),
                    $queryBuilder->expr()->eq('tx_dlf_metadata.l18n_parent', 0),
                    $queryBuilder->expr()->eq('tx_dlf_metadataformat_joins.pid', intval($cPid)),
                    $queryBuilder->expr()->eq('tx_dlf_formats_joins.type', $queryBuilder->createNamedParameter($this->mdSec[$dmdId]['type']))
                )
                ->execute();
            // Get all metadata without a format, but with a default value next.
            $queryBuilder = GeneralUtility::makeInstance(ConnectionPool::class)
                ->getQueryBuilderForTable('tx_dlf_metadata');
            // Get hidden records, too.
            $queryBuilder
                ->getRestrictions()
                ->removeByType(HiddenRestriction::class);
            $resultWithoutFormat = $queryBuilder
                ->select(
                    'tx_dlf_metadata.index_name AS index_name',
                    'tx_dlf_metadata.is_sortable AS is_sortable',
                    'tx_dlf_metadata.default_value AS default_value',
                    'tx_dlf_metadata.format AS format'
                )
                ->from('tx_dlf_metadata')
                ->where(
                    $queryBuilder->expr()->eq('tx_dlf_metadata.pid', intval($cPid)),
                    $queryBuilder->expr()->eq('tx_dlf_metadata.l18n_parent', 0),
                    $queryBuilder->expr()->eq('tx_dlf_metadata.format', 0),
                    $queryBuilder->expr()->neq('tx_dlf_metadata.default_value', $queryBuilder->createNamedParameter(''))
                )
                ->execute();
            // Merge both result sets.
            $allResults = array_merge($resultWithFormat->fetchAll(), $resultWithoutFormat->fetchAll());
            // We need a \DOMDocument here, because SimpleXML doesn't support XPath functions properly.
            $domNode = dom_import_simplexml($this->mdSec[$dmdId]['xml']);
            $domXPath = new \DOMXPath($domNode->ownerDocument);
Bug: It seems like $domNode->ownerDocument can also be of type null; however, parameter $document of DOMXPath::__construct() only seems to accept DOMDocument. Maybe add an additional type check? (Ignorable by annotation.)

If this is a false positive, you can also ignore this issue in your code via the ignore-type annotation:

            $domXPath = new \DOMXPath(/** @scrutinizer ignore-type */ $domNode->ownerDocument);
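One way to add the suggested type check is to test the owner document before constructing the XPath object. The sketch below is illustrative only: `evaluateFirstValue()` is a hypothetical function, the sample XML is invented, and the result handling merely mirrors the node-list versus scalar distinction made in the loop that follows in the listing.

```php
<?php
// Hypothetical helper (not part of MetsDocument): evaluates an XPath expression
// relative to $xml and returns the first result as a trimmed string.
function evaluateFirstValue(\SimpleXMLElement $xml, string $expression): string
{
    $domNode = dom_import_simplexml($xml);
    $document = $domNode instanceof \DOMElement ? $domNode->ownerDocument : null;
    // Explicit type check, so DOMXPath never receives null.
    if (!($document instanceof \DOMDocument)) {
        return '';
    }
    $domXPath = new \DOMXPath($document);
    $values = $domXPath->evaluate($expression, $domNode);
    if ($values instanceof \DOMNodeList) {
        // Location paths yield a node list; take the first node, if any.
        return $values->length > 0 ? trim((string) $values->item(0)->nodeValue) : '';
    }
    // XPath functions such as string() or count() yield a scalar.
    return trim((string) $values);
}

// Invented sample record for illustration.
$xml = new \SimpleXMLElement('<mods><titleInfo><title> Example title </title></titleInfo></mods>');
echo evaluateFirstValue($xml, './titleInfo/title'), PHP_EOL;
echo evaluateFirstValue($xml, 'string(./titleInfo/title)'), PHP_EOL;
```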
588
            $this->registerNamespaces($domXPath);
589
            // OK, now make the XPath queries.
590
            foreach ($allResults as $resArray) {
591
                // Set metadata field's value(s).
592
                if (
593
                    $resArray['format'] > 0
                    && !empty($resArray['xpath'])
                    && ($values = $domXPath->evaluate($resArray['xpath'], $domNode))
                ) {
                    if (
                        $values instanceof \DOMNodeList
                        && $values->length > 0
                    ) {
                        $metadata[$resArray['index_name']] = [];
                        foreach ($values as $value) {
                            $metadata[$resArray['index_name']][] = trim((string) $value->nodeValue);
                        }
                    } elseif (!($values instanceof \DOMNodeList)) {
                        $metadata[$resArray['index_name']] = [trim((string) $values)];
                    }
                }
                // Set default value if applicable.
                if (
                    empty($metadata[$resArray['index_name']][0])
                    && strlen($resArray['default_value']) > 0
                ) {
                    $metadata[$resArray['index_name']] = [$resArray['default_value']];
                }
                // Set sorting value if applicable.
                if (
                    !empty($metadata[$resArray['index_name']])
                    && $resArray['is_sortable']
                ) {
                    if (
                        $resArray['format'] > 0
                        && !empty($resArray['xpath_sorting'])
                        && ($values = $domXPath->evaluate($resArray['xpath_sorting'], $domNode))
                    ) {
                        if (
                            $values instanceof \DOMNodeList
                            && $values->length > 0
                        ) {
                            $metadata[$resArray['index_name'] . '_sorting'][0] = trim((string) $values->item(0)->nodeValue);
                        } elseif (!($values instanceof \DOMNodeList)) {
                            $metadata[$resArray['index_name'] . '_sorting'][0] = trim((string) $values);
                        }
                    }
                    if (empty($metadata[$resArray['index_name'] . '_sorting'][0])) {
                        $metadata[$resArray['index_name'] . '_sorting'][0] = $metadata[$resArray['index_name']][0];
                    }
                }
            }

            $hasMetadataSection[$mdSectionType] = true;
        }
        // Set title to empty string if not present.
        if (empty($metadata['title'][0])) {
            $metadata['title'][0] = '';
            $metadata['title_sorting'][0] = '';
        }
        if (isset($hasMetadataSection['dmdSec'])) {
            return $metadata;
        } else {
            $this->logger->warning('No supported descriptive metadata found for logical structure with @ID "' . $id . '"');
            return [];
        }
    }

    /**
     * Get IDs of (descriptive and administrative) metadata sections
     * referenced by logical structure node of given $id.
     *
     * @access protected
     *
     * @param string $id The "@ID" attribute of the logical structure node
     *
     * @return array IDs of the referenced metadata sections
     */
    protected function getMetadataIds($id)
    {
        // Load amdSecChildIds concordance
        $this->_getMdSec();

        // Get DMDID and ADMID of logical structure node
        if (!empty($this->logicalUnits[$id])) {
            $dmdIds = $this->logicalUnits[$id]['dmdId'] ?? '';
            $admIds = $this->logicalUnits[$id]['admId'] ?? '';
        } else {
            // xpath() may return an empty array (or false), so check the result before indexing it.
            $mdSec = $this->mets->xpath('./mets:structMap[@TYPE="LOGICAL"]//mets:div[@ID="' . $id . '"]');
            if (!empty($mdSec)) {
                $dmdIds = (string) $mdSec[0]->attributes()->DMDID;
                $admIds = (string) $mdSec[0]->attributes()->ADMID;
            } else {
                $dmdIds = '';
                $admIds = '';
            }
        }

        // Handle multiple DMDIDs/ADMIDs
        $allMdIds = explode(' ', $dmdIds);

        foreach (explode(' ', $admIds) as $admId) {
            if (isset($this->mdSec[$admId])) {
                // $admId references an actual metadata section such as techMD
                $allMdIds[] = $admId;
            } elseif (isset($this->amdSecChildIds[$admId])) {
                // $admId references a <mets:amdSec> element. Resolve child elements.
                foreach ($this->amdSecChildIds[$admId] as $childId) {
                    $allMdIds[] = $childId;
                }
            }
        }

        // Drop empty IDs that result from splitting empty or padded attribute values.
        return array_filter($allMdIds, function ($element) {
            return !empty($element);
        });
    }

    /**
     * {@inheritDoc}
     * @see \Kitodo\Dlf\Common\Doc::getFullText()
     */
    public function getFullText($id)
    {
        $fullText = '';

        // Load fileGrps and check for full text files.
        $this->_getFileGrps();
        if ($this->hasFulltext) {
            $fullText = $this->getFullTextFromXml($id);
        }
        return $fullText;
    }

    /**
     * {@inheritDoc}
     * @see Doc::getStructureDepth()
     */
    public function getStructureDepth($logId)
    {
        $ancestors = $this->mets->xpath('./mets:structMap[@TYPE="LOGICAL"]//mets:div[@ID="' . $logId . '"]/ancestor::*');
        if (!empty($ancestors)) {
            return count($ancestors);
        } else {
            return 0;
        }
    }

    /**
     * {@inheritDoc}
     * @see \Kitodo\Dlf\Common\Doc::init()
     */
    protected function init($location)
    {
        $this->logger = GeneralUtility::makeInstance(LogManager::class)->getLogger(get_class($this));
        // Get METS node from XML file.
        $this->registerNamespaces($this->xml);
        $mets = $this->xml->xpath('//mets:mets');
        if (!empty($mets)) {
            $this->mets = $mets[0];
            // Register namespaces.
            $this->registerNamespaces($this->mets);
        } else {
            if (!empty($location)) {
                $this->logger->error('No METS part found in document with location "' . $location . '".');
            } elseif (!empty($this->recordId)) {
                $this->logger->error('No METS part found in document with recordId "' . $this->recordId . '".');
            } else {
                $this->logger->error('No METS part found in current document.');
            }
        }
    }

    /**
     * {@inheritDoc}
     * @see \Kitodo\Dlf\Common\Doc::loadLocation()
     */
    protected function loadLocation($location)
    {
        $fileResource = Helper::getUrl($location);
        if ($fileResource !== false) {
            $xml = Helper::getXmlFileAsString($fileResource);
            // Set some basic properties.
            if ($xml !== false) {
                $this->xml = $xml;
                return true;
            }
        }
        $this->logger->error('Could not load XML file from "' . $location . '"');
        return false;
    }

    /**
     * {@inheritDoc}
     * @see \Kitodo\Dlf\Common\Doc::ensureHasFulltextIsSet()
     */
    protected function ensureHasFulltextIsSet()
    {
        // Are the fileGrps already loaded?
        if (!$this->fileGrpsLoaded) {
            $this->_getFileGrps();
        }
    }

    /**
     * {@inheritDoc}
     * @see Doc::setPreloadedDocument()
     */
    protected function setPreloadedDocument($preloadedDocument)
    {

        if ($preloadedDocument instanceof \SimpleXMLElement) {
            $this->xml = $preloadedDocument;
            return true;
        }
        return false;
    }

    /**
     * {@inheritDoc}
     * @see Doc::getDocument()
     */
    protected function getDocument()
    {
        return $this->mets;
    }

    /**
     * This builds an array of the document's metadata sections
     *
     * @access protected
     *
     * @return array Array of metadata sections with their IDs as array key
     */
    protected function _getMdSec()
    {
        if (!$this->mdSecLoaded) {
            $this->loadFormats();

            foreach ($this->mets->xpath('./mets:dmdSec') as $dmdSecTag) {
                $dmdSec = $this->processMdSec($dmdSecTag);

                if ($dmdSec !== null) {
                    $this->mdSec[$dmdSec['id']] = $dmdSec;
                    $this->dmdSec[$dmdSec['id']] = $dmdSec;
                }
            }

            foreach ($this->mets->xpath('./mets:amdSec') as $amdSecTag) {
                $childIds = [];

                foreach ($amdSecTag->children('http://www.loc.gov/METS/') as $mdSecTag) {
                    if (!in_array($mdSecTag->getName(), self::ALLOWED_AMD_SEC)) {
                        continue;
                    }

                    // TODO: Should we check that the format may occur within this type (e.g., to ignore VIDEOMD within rightsMD)?
                    $mdSec = $this->processMdSec($mdSecTag);

                    if ($mdSec !== null) {
                        $this->mdSec[$mdSec['id']] = $mdSec;

                        $childIds[] = $mdSec['id'];
                    }
                }

                $amdSecId = (string) $amdSecTag->attributes()->ID;
                if (!empty($amdSecId)) {
                    $this->amdSecChildIds[$amdSecId] = $childIds;
                }
            }

            $this->mdSecLoaded = true;
        }
        return $this->mdSec;
    }
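
    /**
     * This builds the subset of `$mdSec` that stores only the dmdSec entries
     *
     * @access protected
     *
     * @return array Array of dmdSec metadata sections with their IDs as array key
     */
    protected function _getDmdSec()
    {
        $this->_getMdSec();
        return $this->dmdSec;
    }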

    /**
     * Processes an element of METS `mdSecType`.
     *
     * @access protected
     *
     * @param \SimpleXMLElement $element
     *
     * @return array|null The processed metadata section
     */
    protected function processMdSec($element)
    {
        $mdId = (string) $element->attributes()->ID;
        if (empty($mdId)) {
            return null;
        }

        // Stays null unless a supported metadata format is found below.
        $xml = null;

        $this->registerNamespaces($element);
        if ($type = $element->xpath('./mets:mdWrap[not(@MDTYPE="OTHER")]/@MDTYPE')) {
            if (!empty($this->formats[(string) $type[0]])) {
                $type = (string) $type[0];
                $xml = $element->xpath('./mets:mdWrap[@MDTYPE="' . $type . '"]/mets:xmlData/' . strtolower($type) . ':' . $this->formats[$type]['rootElement']);
            }
        } elseif ($type = $element->xpath('./mets:mdWrap[@MDTYPE="OTHER"]/@OTHERMDTYPE')) {
            if (!empty($this->formats[(string) $type[0]])) {
                $type = (string) $type[0];
                $xml = $element->xpath('./mets:mdWrap[@MDTYPE="OTHER"][@OTHERMDTYPE="' . $type . '"]/mets:xmlData/' . strtolower($type) . ':' . $this->formats[$type]['rootElement']);
            }
        }

        if (empty($xml)) {
            return null;
        }

        $this->registerNamespaces($xml[0]);

        return [
            'id' => $mdId,
            'section' => $element->getName(),
            'type' => $type,
            'xml' => $xml[0],
        ];
    }

    /**
     * This builds the file ID -> USE concordance
     *
     * @access protected
     *
     * @return array Array of file use groups with file IDs
     */
    protected function _getFileGrps()
    {
        if (!$this->fileGrpsLoaded) {
            // Get configured USE attributes.
            $extConf = GeneralUtility::makeInstance(ExtensionConfiguration::class)->get(self::$extKey);
            $useGrps = GeneralUtility::trimExplode(',', $extConf['fileGrpImages']);
            if (!empty($extConf['fileGrpThumbs'])) {
                $useGrps = array_merge($useGrps, GeneralUtility::trimExplode(',', $extConf['fileGrpThumbs']));
            }
            if (!empty($extConf['fileGrpDownload'])) {
                $useGrps = array_merge($useGrps, GeneralUtility::trimExplode(',', $extConf['fileGrpDownload']));
            }
            if (!empty($extConf['fileGrpFulltext'])) {
                $useGrps = array_merge($useGrps, GeneralUtility::trimExplode(',', $extConf['fileGrpFulltext']));
            }
            if (!empty($extConf['fileGrpAudio'])) {
                $useGrps = array_merge($useGrps, GeneralUtility::trimExplode(',', $extConf['fileGrpAudio']));
            }
            if (!empty($extConf['fileGrpVideo'])) {
                $useGrps = array_merge($useGrps, GeneralUtility::trimExplode(',', $extConf['fileGrpVideo']));
            }
            // Get all file groups.
            $fileGrps = $this->mets->xpath('./mets:fileSec/mets:fileGrp');
            if (!empty($fileGrps)) {
                // Build concordance for configured USE attributes.
                foreach ($fileGrps as $fileGrp) {
                    if (in_array((string) $fileGrp['USE'], $useGrps)) {
                        foreach ($fileGrp->children('http://www.loc.gov/METS/')->file as $file) {
                            $this->fileGrps[(string) $file->attributes()->ID] = (string) $fileGrp['USE'];
                        }
                    }
                }
            }
            // Are there any fulltext files available?
            if (
                !empty($extConf['fileGrpFulltext'])
                && array_intersect(GeneralUtility::trimExplode(',', $extConf['fileGrpFulltext']), $this->fileGrps) !== []
            ) {
                $this->hasFulltext = true;
            }
            $this->fileGrpsLoaded = true;
        }
        return $this->fileGrps;
    }

    /**
     * {@inheritDoc}
     * @see \Kitodo\Dlf\Common\Doc::prepareMetadataArray()
     */
    protected function prepareMetadataArray($cPid)
    {
        $ids = $this->mets->xpath('./mets:structMap[@TYPE="LOGICAL"]//mets:div[@DMDID]/@ID');
        // Get all logical structure nodes with metadata.
        if (!empty($ids)) {
            foreach ($ids as $id) {
                $this->metadataArray[(string) $id] = $this->getMetadata((string) $id, $cPid);
            }
        }
        // Set current PID for metadata definitions.
    }

    /**
     * This returns $this->mets via __get()
     *
     * @access protected
     *
     * @return \SimpleXMLElement The XML's METS part as \SimpleXMLElement object
     */
    protected function _getMets()
    {
        return $this->mets;
    }

    /**
     * {@inheritDoc}
     * @see \Kitodo\Dlf\Common\Doc::_getPhysicalStructure()
     */
    protected function _getPhysicalStructure()
    {
        // Is there no physical structure array yet?
        if (!$this->physicalStructureLoaded) {
            // Does the document have a structMap node of type "PHYSICAL"?
            $elementNodes = $this->mets->xpath('./mets:structMap[@TYPE="PHYSICAL"]/mets:div[@TYPE="physSequence"]/mets:div');
            if (!empty($elementNodes)) {
                // Get file groups.
                $fileUse = $this->_getFileGrps();
                // Get the physical sequence's metadata.
                $physNode = $this->mets->xpath('./mets:structMap[@TYPE="PHYSICAL"]/mets:div[@TYPE="physSequence"]');
                $physSeq[0] = (string) $physNode[0]['ID'];
                $this->physicalStructureInfo[$physSeq[0]]['id'] = (string) $physNode[0]['ID'];
                $this->physicalStructureInfo[$physSeq[0]]['dmdId'] = (isset($physNode[0]['DMDID']) ? (string) $physNode[0]['DMDID'] : '');
                $this->physicalStructureInfo[$physSeq[0]]['admId'] = (isset($physNode[0]['ADMID']) ? (string) $physNode[0]['ADMID'] : '');
                $this->physicalStructureInfo[$physSeq[0]]['order'] = (isset($physNode[0]['ORDER']) ? (string) $physNode[0]['ORDER'] : '');
                $this->physicalStructureInfo[$physSeq[0]]['label'] = (isset($physNode[0]['LABEL']) ? (string) $physNode[0]['LABEL'] : '');
                $this->physicalStructureInfo[$physSeq[0]]['orderlabel'] = (isset($physNode[0]['ORDERLABEL']) ? (string) $physNode[0]['ORDERLABEL'] : '');
                $this->physicalStructureInfo[$physSeq[0]]['type'] = (string) $physNode[0]['TYPE'];
                $this->physicalStructureInfo[$physSeq[0]]['contentIds'] = (isset($physNode[0]['CONTENTIDS']) ? (string) $physNode[0]['CONTENTIDS'] : '');
                // Get the file representations from fileSec node.
                foreach ($physNode[0]->children('http://www.loc.gov/METS/')->fptr as $fptr) {
                    // Check if file has valid @USE attribute.
                    if (!empty($fileUse[(string) $fptr->attributes()->FILEID])) {
                        $this->physicalStructureInfo[$physSeq[0]]['files'][$fileUse[(string) $fptr->attributes()->FILEID]] = (string) $fptr->attributes()->FILEID;
                        $this->physicalStructureInfo[$physSeq[0]]['all_files'][$fileUse[(string) $fptr->attributes()->FILEID]][] = (string) $fptr->attributes()->FILEID;
                    }
                }
                // Build the physical elements' array from the physical structMap node.
                foreach ($elementNodes as $elementNode) {
                    $elements[(int) $elementNode['ORDER']] = (string) $elementNode['ID'];
                    $this->physicalStructureInfo[$elements[(int) $elementNode['ORDER']]]['id'] = (string) $elementNode['ID'];
                    $this->physicalStructureInfo[$elements[(int) $elementNode['ORDER']]]['dmdId'] = (isset($elementNode['DMDID']) ? (string) $elementNode['DMDID'] : '');
                    $this->physicalStructureInfo[$elements[(int) $elementNode['ORDER']]]['admId'] = (isset($elementNode['ADMID']) ? (string) $elementNode['ADMID'] : '');
                    $this->physicalStructureInfo[$elements[(int) $elementNode['ORDER']]]['order'] = (isset($elementNode['ORDER']) ? (string) $elementNode['ORDER'] : '');
                    $this->physicalStructureInfo[$elements[(int) $elementNode['ORDER']]]['label'] = (isset($elementNode['LABEL']) ? (string) $elementNode['LABEL'] : '');
                    $this->physicalStructureInfo[$elements[(int) $elementNode['ORDER']]]['orderlabel'] = (isset($elementNode['ORDERLABEL']) ? (string) $elementNode['ORDERLABEL'] : '');
                    $this->physicalStructureInfo[$elements[(int) $elementNode['ORDER']]]['type'] = (string) $elementNode['TYPE'];
                    $this->physicalStructureInfo[$elements[(int) $elementNode['ORDER']]]['contentIds'] = (isset($elementNode['CONTENTIDS']) ? (string) $elementNode['CONTENTIDS'] : '');
                    // Get the file representations from fileSec node.
                    foreach ($elementNode->children('http://www.loc.gov/METS/')->fptr as $fptr) {
                        // Check if file has valid @USE attribute.
                        if (!empty($fileUse[(string) $fptr->attributes()->FILEID])) {
                            $this->physicalStructureInfo[$elements[(int) $elementNode['ORDER']]]['files'][$fileUse[(string) $fptr->attributes()->FILEID]] = (string) $fptr->attributes()->FILEID;
                            $this->physicalStructureInfo[$elements[(int) $elementNode['ORDER']]]['all_files'][$fileUse[(string) $fptr->attributes()->FILEID]][] = (string) $fptr->attributes()->FILEID;
                        } elseif ($area = $fptr->children('http://www.loc.gov/METS/')->area) {
                            $areaAttrs = $area->attributes();
                            $fileId = (string) $areaAttrs->FILEID;
                            $physInfo = &$this->physicalStructureInfo[$elements[(int) $elementNode['ORDER']]];

                            $physInfo['files'][$fileUse[$fileId]] = $fileId;
                            $physInfo['all_files'][$fileUse[$fileId]][] = $fileId;
                            $physInfo['fileInfos'][$fileId]['area'] = [
                                'begin' => (string) $areaAttrs->BEGIN,
                                'betype' => (string) $areaAttrs->BETYPE,
                                'extent' => (string) $areaAttrs->EXTENT,
                                'exttype' => (string) $areaAttrs->EXTTYPE,
                            ];
                        }
                    }
                }
                // Sort array by keys (= @ORDER).
                if (ksort($elements)) {
                    // Set total number of pages/tracks.
                    $this->numPages = count($elements);
                    // Merge and re-index the array to get nice numeric indexes.
                    $this->physicalStructure = array_merge($physSeq, $elements);
                }
            }
            $this->physicalStructureLoaded = true;
        }
        return $this->physicalStructure;
    }

    /**
     * {@inheritDoc}
     * @see \Kitodo\Dlf\Common\Doc::_getSmLinks()
     */
    protected function _getSmLinks()
    {
        if (!$this->smLinksLoaded) {
            $smLinks = $this->mets->xpath('./mets:structLink/mets:smLink');
            if (!empty($smLinks)) {
                foreach ($smLinks as $smLink) {
                    $this->smLinks['l2p'][(string) $smLink->attributes('http://www.w3.org/1999/xlink')->from][] = (string) $smLink->attributes('http://www.w3.org/1999/xlink')->to;
                    $this->smLinks['p2l'][(string) $smLink->attributes('http://www.w3.org/1999/xlink')->to][] = (string) $smLink->attributes('http://www.w3.org/1999/xlink')->from;
                }
            }
            $this->smLinksLoaded = true;
        }
        return $this->smLinks;
    }

    /**
     * {@inheritDoc}
     * @see \Kitodo\Dlf\Common\Doc::_getThumbnail()
     */
    protected function _getThumbnail($forceReload = false)
    {
        if (
            !$this->thumbnailLoaded
            || $forceReload
        ) {
            // Retain current PID.
            $cPid = ($this->cPid ? $this->cPid : $this->pid);
            if (!$cPid) {
                $this->logger->error('Invalid PID ' . $cPid . ' for structure definitions');
                $this->thumbnailLoaded = true;
                return $this->thumbnail;
            }
            // Load extension configuration.
            $extConf = GeneralUtility::makeInstance(ExtensionConfiguration::class)->get(self::$extKey);
            if (empty($extConf['fileGrpThumbs'])) {
                $this->logger->warning('No fileGrp for thumbnails specified');
                $this->thumbnailLoaded = true;
                return $this->thumbnail;
            }
            $strctId = $this->_getToplevelId();
            $metadata = $this->getTitledata($cPid);

            $queryBuilder = GeneralUtility::makeInstance(ConnectionPool::class)
                ->getQueryBuilderForTable('tx_dlf_structures');

            // Get structure element to get thumbnail from.
            $result = $queryBuilder
                ->select('tx_dlf_structures.thumbnail AS thumbnail')
                ->from('tx_dlf_structures')
                ->where(
                    $queryBuilder->expr()->eq('tx_dlf_structures.pid', intval($cPid)),
                    $queryBuilder->expr()->eq('tx_dlf_structures.index_name', $queryBuilder->expr()->literal($metadata['type'][0])),
                    Helper::whereExpression('tx_dlf_structures')
                )
                ->setMaxResults(1)
                ->execute();

            $allResults = $result->fetchAll();

            if (count($allResults) == 1) {
                $resArray = $allResults[0];
                // Get desired thumbnail structure if not the toplevel structure itself.
                if (!empty($resArray['thumbnail'])) {
                    $strctType = Helper::getIndexNameFromUid($resArray['thumbnail'], 'tx_dlf_structures', $cPid);
                    // Check if this document has a structure element of the desired type.
                    $strctIds = $this->mets->xpath('./mets:structMap[@TYPE="LOGICAL"]//mets:div[@TYPE="' . $strctType . '"]/@ID');
                    if (!empty($strctIds)) {
                        $strctId = (string) $strctIds[0];
                    }
                }
                // Load smLinks.
                $this->_getSmLinks();
                // Get thumbnail location.
                $fileGrpsThumb = GeneralUtility::trimExplode(',', $extConf['fileGrpThumbs']);
                while ($fileGrpThumb = array_shift($fileGrpsThumb)) {
                    if (
                        $this->_getPhysicalStructure()
                        && !empty($this->smLinks['l2p'][$strctId])
                        && !empty($this->physicalStructureInfo[$this->smLinks['l2p'][$strctId][0]]['files'][$fileGrpThumb])
                    ) {
                        $this->thumbnail = $this->getFileLocation($this->physicalStructureInfo[$this->smLinks['l2p'][$strctId][0]]['files'][$fileGrpThumb]);
                        break;
                    } elseif (!empty($this->physicalStructureInfo[$this->physicalStructure[1]]['files'][$fileGrpThumb])) {
                        $this->thumbnail = $this->getFileLocation($this->physicalStructureInfo[$this->physicalStructure[1]]['files'][$fileGrpThumb]);
                        break;
                    }
                }
            } else {
                $this->logger->error('No structure of type "' . $metadata['type'][0] . '" found in database');
            }
            $this->thumbnailLoaded = true;
        }
        return $this->thumbnail;
    }

    /**
     * {@inheritDoc}
     * @see \Kitodo\Dlf\Common\Doc::_getToplevelId()
     */
    protected function _getToplevelId()
    {
        if (empty($this->toplevelId)) {
            // Get all logical structure nodes with metadata, but without associated METS-Pointers.
            $divs = $this->mets->xpath('./mets:structMap[@TYPE="LOGICAL"]//mets:div[@DMDID and not(./mets:mptr)]');
            if (!empty($divs)) {
                // Load smLinks.
                $this->_getSmLinks();
                foreach ($divs as $div) {
                    $id = (string) $div['ID'];
                    // Are there physical structure nodes for this logical structure?
                    if (array_key_exists($id, $this->smLinks['l2p'])) {
                        // Yes. That's what we're looking for.
                        $this->toplevelId = $id;
                        break;
                    } elseif (empty($this->toplevelId)) {
                        // No. Remember this anyway, but keep looking for a better one.
                        $this->toplevelId = $id;
                    }
                }
            }
        }
        return $this->toplevelId;
    }

    /**
     * This magic method is executed prior to any serialization of the object
     * @see __wakeup()
     *
     * @access public
     *
     * @return array Properties to be serialized
     */
    public function __sleep()
    {
        // \SimpleXMLElement objects can't be serialized, thus save the XML as string for serialization
        $this->asXML = $this->xml->asXML();
        return ['uid', 'pid', 'recordId', 'parentId', 'asXML'];
    }

    /**
     * This magic method is used for setting a string value for the object
     *
     * @access public
     *
     * @return string String representing the METS object
     */
    public function __toString()
    {
        $xml = new \DOMDocument('1.0', 'utf-8');
        $xml->appendChild($xml->importNode(dom_import_simplexml($this->mets), true));
        $xml->formatOutput = true;
        return $xml->saveXML();
    }

    /**
     * This magic method is executed after the object is deserialized
     * @see __sleep()
     *
     * @access public
     *
     * @return void
     */
    public function __wakeup()
    {
        $xml = Helper::getXmlFileAsString($this->asXML);
        if ($xml !== false) {
            $this->asXML = '';
            $this->xml = $xml;
            // Rebuild the unserializable properties.
            $this->init('');
        } else {
            $this->logger = GeneralUtility::makeInstance(LogManager::class)->getLogger(static::class);
            $this->logger->error('Could not load XML after deserialization');
        }
    }
}