Passed
Pull Request — master (#818)
by
unknown
04:28
created

MetsDocument::getMetadataIds()   B

Complexity

Conditions 8
Paths 20

Size

Total Lines 41
Code Lines 25

Duplication

Lines 0
Ratio 0 %

Importance

Changes 0
Metric Value
cc 8
eloc 25
nc 20
nop 1
dl 0
loc 41
rs 8.4444
c 0
b 0
f 0
1
<?php
2
3
/**
4
 * (c) Kitodo. Key to digital objects e.V. <[email protected]>
5
 *
6
 * This file is part of the Kitodo and TYPO3 projects.
7
 *
8
 * @license GNU General Public License version 3 or later.
9
 * For the full copyright and license information, please read the
10
 * LICENSE.txt file that was distributed with this source code.
11
 */
12
13
namespace Kitodo\Dlf\Common;
14
15
use TYPO3\CMS\Core\Configuration\ExtensionConfiguration;
16
use TYPO3\CMS\Core\Database\ConnectionPool;
17
use TYPO3\CMS\Core\Database\Query\Restriction\HiddenRestriction;
18
use TYPO3\CMS\Core\Utility\GeneralUtility;
19
use Ubl\Iiif\Tools\IiifHelper;
20
use Ubl\Iiif\Services\AbstractImageService;
21
use TYPO3\CMS\Core\Log\LogManager;
22
23
/**
24
 * MetsDocument class for the 'dlf' extension.
25
 *
26
 * @author Sebastian Meyer <[email protected]>
27
 * @author Henrik Lochmann <[email protected]>
28
 * @package TYPO3
29
 * @subpackage dlf
30
 * @access public
31
 * @property int $cPid This holds the PID for the configuration
32
 * @property-read array $mdSec Associative array of METS metadata sections indexed by their IDs.
33
 * @property-read array $dmdSec Subset of `$mdSec` storing only the dmdSec entries; kept for compatibility.
34
 * @property-read array $fileGrps This holds the file ID -> USE concordance
35
 * @property-read array $fileInfos Additional information about files (e.g., ADMID), indexed by ID.
36
 * @property-read bool $hasFulltext Are there any fulltext files available?
37
 * @property-read array $metadataArray This holds the documents' parsed metadata array
38
 * @property-read \SimpleXMLElement $mets This holds the XML file's METS part as \SimpleXMLElement object
39
 * @property-read int $numPages This holds the total number of pages
40
 * @property-read int $parentId This holds the UID of the parent document or zero if not multi-volumed
41
 * @property-read array $physicalStructure This holds the physical structure
42
 * @property-read array $physicalStructureInfo This holds the physical structure metadata
43
 * @property-read int $pid This holds the PID of the document or zero if not in database
44
 * @property-read bool $ready Is the document instantiated successfully?
45
 * @property-read string $recordId The METS file's / IIIF manifest's record identifier
46
 * @property-read int $rootId This holds the UID of the root document or zero if not multi-volumed
47
 * @property-read array $smLinks This holds the smLinks between logical and physical structMap
48
 * @property-read array $tableOfContents This holds the logical structure
49
 * @property-read string $thumbnail This holds the document's thumbnail location
50
 * @property-read string $toplevelId This holds the toplevel structure's @ID (METS) or the manifest's @id (IIIF)
51
 */
52
final class MetsDocument extends Doc
53
{
54
    /**
55
     * Subsections / tags that may occur within `<mets:amdSec>`.
56
     *
57
     * @link https://www.loc.gov/standards/mets/docs/mets.v1-9.html#amdSec
58
     * @link https://www.loc.gov/standards/mets/docs/mets.v1-9.html#mdSecType
59
     *
60
     * @var string[]
61
     */
62
    protected const ALLOWED_AMD_SEC = ['techMD', 'rightsMD', 'sourceMD', 'digiprovMD'];
63
64
    /**
65
     * This holds the whole XML file as string for serialization purposes
66
     * @see __sleep() / __wakeup()
67
     *
68
     * @var string
69
     * @access protected
70
     */
71
    protected $asXML = '';
72
73
    /**
74
     * This maps the ID of each amdSec to the IDs of its children (techMD etc.).
75
     * When an ADMID references an amdSec instead of techMD etc., this is used to iterate the child elements.
76
     *
77
     * @var string[]
78
     * @access protected
79
     */
80
    protected $amdSecChildIds = [];
81
82
    /**
83
     * Associative array of METS metadata sections indexed by their IDs.
84
     *
85
     * @var array
86
     * @access protected
87
     */
88
    protected $mdSec = [];
89
90
    /**
91
     * Are the METS file's metadata sections loaded?
92
     * @see MetsDocument::$mdSec
93
     *
94
     * @var bool
95
     * @access protected
96
     */
97
    protected $mdSecLoaded = false;
98
99
    /**
100
     * Subset of $mdSec storing only the dmdSec entries; kept for compatibility.
101
     *
102
     * @var array
103
     * @access protected
104
     */
105
    protected $dmdSec = [];
106
107
    /**
108
     * The extension key
109
     *
110
     * @var	string
111
     * @access public
112
     */
113
    public static $extKey = 'dlf';
114
115
    /**
116
     * This holds the file ID -> USE concordance
117
     * @see _getFileGrps()
118
     *
119
     * @var array
120
     * @access protected
121
     */
122
    protected $fileGrps = [];
123
124
    /**
125
     * Are the image file groups loaded?
126
     * @see $fileGrps
127
     *
128
     * @var bool
129
     * @access protected
130
     */
131
    protected $fileGrpsLoaded = false;
132
133
    /**
134
     * Additional information about files (e.g., ADMID), indexed by ID.
135
     * TODO: Consider using this for `getFileMimeType()` and `getFileLocation()`.
136
     * @see _getFileInfos()
137
     *
138
     * @var array
139
     * @access protected
140
     */
141
    protected $fileInfos = [];
142
143
    /**
144
     * This holds the XML file's METS part as \SimpleXMLElement object
145
     *
146
     * @var \SimpleXMLElement
147
     * @access protected
148
     */
149
    protected $mets;
150
151
    /**
152
     * This holds the whole XML file as \SimpleXMLElement object
153
     *
154
     * @var \SimpleXMLElement
155
     * @access protected
156
     */
157
    protected $xml;
158
159
    /**
160
     * This adds metadata from METS structural map to metadata array.
161
     *
162
     * @access	public
163
     *
164
     * @param	array	&$metadata: The metadata array to extend
165
     * @param	string	$id: The "@ID" attribute of the logical structure node
166
     *
167
     * @return  void
168
     */
169
    public function addMetadataFromMets(&$metadata, $id)
170
    {
171
        $details = $this->getLogicalStructure($id);
172
        if (!empty($details)) {
173
            $metadata['mets_order'][0] = $details['order'];
174
            $metadata['mets_label'][0] = $details['label'];
175
            $metadata['mets_orderlabel'][0] = $details['orderlabel'];
176
        }
177
    }
178
179
    /**
180
     *
181
     * {@inheritDoc}
182
     * @see \Kitodo\Dlf\Common\Doc::establishRecordId()
183
     */
184
    protected function establishRecordId($pid)
185
    {
186
        // Check for METS object @ID.
187
        if (!empty($this->mets['OBJID'])) {
188
            $this->recordId = (string) $this->mets['OBJID'];
189
        }
190
        // Get hook objects.
191
        $hookObjects = Helper::getHookObjects('Classes/Common/MetsDocument.php');
192
        // Apply hooks.
193
        foreach ($hookObjects as $hookObj) {
194
            if (method_exists($hookObj, 'construct_postProcessRecordId')) {
195
                $hookObj->construct_postProcessRecordId($this->xml, $this->recordId);
196
            }
197
        }
198
    }
199
200
    /**
201
     *
202
     * {@inheritDoc}
203
     * @see \Kitodo\Dlf\Common\Doc::getDownloadLocation()
204
     */
205
    public function getDownloadLocation($id)
206
    {
207
        $fileMimeType = $this->getFileMimeType($id);
208
        $fileLocation = $this->getFileLocation($id);
209
        if ($fileMimeType === 'application/vnd.kitodo.iiif') {
210
            $fileLocation = (strrpos($fileLocation, 'info.json') === strlen($fileLocation) - 9) ? $fileLocation : (strrpos($fileLocation, '/') === strlen($fileLocation) - 1 ? $fileLocation . 'info.json' : $fileLocation . '/info.json');
211
            $conf = GeneralUtility::makeInstance(ExtensionConfiguration::class)->get(self::$extKey);
212
            IiifHelper::setUrlReader(IiifUrlReader::getInstance());
213
            IiifHelper::setMaxThumbnailHeight($conf['iiifThumbnailHeight']);
214
            IiifHelper::setMaxThumbnailWidth($conf['iiifThumbnailWidth']);
215
            $service = IiifHelper::loadIiifResource($fileLocation);
216
            if ($service !== null && $service instanceof AbstractImageService) {
217
                return $service->getImageUrl();
218
            }
219
        } elseif ($fileMimeType === 'application/vnd.netfpx') {
220
            $baseURL = $fileLocation . (strpos($fileLocation, '?') === false ? '?' : '');
221
            // TODO CVT is an optional IIP server capability; in theory, capabilities should be determined in the object request with '&obj=IIP-server'
222
            return $baseURL . '&CVT=jpeg';
223
        }
224
        return $fileLocation;
225
    }
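
The nested ternary in getDownloadLocation() normalizes a IIIF service location so that it ends in info.json. Below is a minimal standalone sketch of that normalization. The helper name toInfoJsonUrl() and the example URLs are assumptions for illustration and are not part of Kitodo. Unlike the ternary, this sketch strips any run of trailing slashes before appending the file name.

```php
<?php
// Hypothetical helper (not part of Kitodo): make a IIIF image service
// location end in "/info.json".
function toInfoJsonUrl(string $location): string
{
    // Already points at the info.json document: leave untouched.
    if (substr($location, -9) === 'info.json') {
        return $location;
    }
    // Drop trailing slashes, then append exactly one separator.
    return rtrim($location, '/') . '/info.json';
}

// Example inputs (made-up URLs).
$withoutSlash = toInfoJsonUrl('https://example.org/iiif/img1');
$withSlash = toInfoJsonUrl('https://example.org/iiif/img1/');
$alreadyDone = toInfoJsonUrl('https://example.org/iiif/img1/info.json');
```

All three calls yield the same location, which keeps the subsequent IiifHelper::loadIiifResource() call independent of how the METS file spells the URL.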
226
227
    /**
228
     * {@inheritDoc}
229
     * @see \Kitodo\Dlf\Common\Doc::getFileLocation()
230
     */
231
    public function getFileLocation($id)
232
    {
233
        $location = $this->mets->xpath('./mets:fileSec/mets:fileGrp/mets:file[@ID="' . $id . '"]/mets:FLocat[@LOCTYPE="URL"]');
234
        if (
235
            !empty($id)
236
            && !empty($location)
237
        ) {
238
            return (string) $location[0]->attributes('http://www.w3.org/1999/xlink')->href;
239
        } else {
240
            $this->logger->warning('There is no file node with @ID "' . $id . '"');
241
            return '';
242
        }
243
    }
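
getFileLocation() looks up a file node by @ID with XPath and reads its xlink:href attribute. Below is a minimal standalone sketch of that lookup on a made-up METS fragment. The function name fileLocation(), the sample XML, and the decision to check the ID before running the query are assumptions of this sketch, not the Kitodo implementation.

```php
<?php
// Hypothetical stand-in for getFileLocation(), working on any METS root node.
function fileLocation(\SimpleXMLElement $mets, string $id): string
{
    if ($id === '') {
        return '';
    }
    $mets->registerXPathNamespace('mets', 'http://www.loc.gov/METS/');
    $hits = $mets->xpath(
        './mets:fileSec/mets:fileGrp/mets:file[@ID="' . $id . '"]/mets:FLocat[@LOCTYPE="URL"]'
    );
    if (empty($hits)) {
        return '';
    }
    // xlink:href lives in the XLink namespace, so it has to be requested explicitly.
    return (string) $hits[0]->attributes('http://www.w3.org/1999/xlink')->href;
}

// Made-up sample document.
$mets = simplexml_load_string(
    '<mets:mets xmlns:mets="http://www.loc.gov/METS/" xmlns:xlink="http://www.w3.org/1999/xlink">'
    . '<mets:fileSec><mets:fileGrp USE="DEFAULT">'
    . '<mets:file ID="FILE_0001" MIMETYPE="image/jpeg">'
    . '<mets:FLocat LOCTYPE="URL" xlink:href="https://example.org/0001.jpg"/>'
    . '</mets:file></mets:fileGrp></mets:fileSec></mets:mets>'
);

$found = fileLocation($mets, 'FILE_0001');
$missing = fileLocation($mets, 'FILE_9999');
```

Because the ID is concatenated into the XPath expression, the sketch (like the original) assumes IDs that contain no double quotes.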
244
245
    /**
246
     * {@inheritDoc}
247
     * @see \Kitodo\Dlf\Common\Doc::getFileMimeType()
248
     */
249
    public function getFileMimeType($id)
250
    {
251
        $mimetype = $this->mets->xpath('./mets:fileSec/mets:fileGrp/mets:file[@ID="' . $id . '"]/@MIMETYPE');
252
        if (
253
            !empty($id)
254
            && !empty($mimetype)
255
        ) {
256
            return (string) $mimetype[0];
257
        } else {
258
            $this->logger->warning('There is no file node with @ID "' . $id . '" or no MIME type specified');
259
            return '';
260
        }
261
    }
262
263
    /**
264
     * {@inheritDoc}
265
     * @see \Kitodo\Dlf\Common\Doc::getLogicalStructure()
266
     */
267
    public function getLogicalStructure($id, $recursive = false)
268
    {
269
        $details = [];
270
        // Is the requested logical unit already loaded?
271
        if (
272
            !$recursive
273
            && !empty($this->logicalUnits[$id])
274
        ) {
275
            // Yes. Return it.
276
            return $this->logicalUnits[$id];
277
        } elseif (!empty($id)) {
278
            // Get specified logical unit.
279
            $divs = $this->mets->xpath('./mets:structMap[@TYPE="LOGICAL"]//mets:div[@ID="' . $id . '"]');
280
        } else {
281
            // Get all logical units at top level.
282
            $divs = $this->mets->xpath('./mets:structMap[@TYPE="LOGICAL"]/mets:div');
283
        }
284
        if (!empty($divs)) {
285
            if (!$recursive) {
286
                // Get the details for the first xpath hit.
287
                $details = $this->getLogicalStructureInfo($divs[0]);
288
            } else {
289
                // Walk the logical structure recursively and fill the whole table of contents.
290
                foreach ($divs as $div) {
291
                    $this->tableOfContents[] = $this->getLogicalStructureInfo($div, $recursive);
292
                }
293
            }
294
        }
295
        return $details;
296
    }
297
298
    /**
299
     * This gets details about a logical structure element
300
     *
301
     * @access protected
302
     *
303
     * @param \SimpleXMLElement $structure: The logical structure node
304
     * @param bool $recursive: Whether to include the child elements
305
     *
306
     * @return array Array of the element's id, label, type and physical page indexes/mptr link
307
     */
308
    protected function getLogicalStructureInfo(\SimpleXMLElement $structure, $recursive = false)
309
    {
310
        $attributes = []; // Get attributes (initialized first so the array exists even if the node has none).
311
        foreach ($structure->attributes() as $attribute => $value) {
312
            $attributes[$attribute] = (string) $value;
313
        }
314
        // Load plugin configuration.
315
        $extConf = GeneralUtility::makeInstance(ExtensionConfiguration::class)->get(self::$extKey);
316
        // Extract identity information.
317
        $details = [];
318
        $details['id'] = $attributes['ID'];
319
        $details['dmdId'] = (isset($attributes['DMDID']) ? $attributes['DMDID'] : '');
320
        $details['admId'] = (isset($attributes['ADMID']) ? $attributes['ADMID'] : '');
321
        $details['order'] = (isset($attributes['ORDER']) ? $attributes['ORDER'] : '');
322
        $details['label'] = (isset($attributes['LABEL']) ? $attributes['LABEL'] : '');
323
        $details['orderlabel'] = (isset($attributes['ORDERLABEL']) ? $attributes['ORDERLABEL'] : '');
324
        $details['contentIds'] = (isset($attributes['CONTENTIDS']) ? $attributes['CONTENTIDS'] : '');
325
        $details['volume'] = '';
326
        // Set volume information only if no label is set and this is the toplevel structure element.
327
        if (
328
            empty($details['label'])
329
            && $details['id'] == $this->_getToplevelId()
330
        ) {
331
            $metadata = $this->getMetadata($details['id']);
332
            if (!empty($metadata['volume'][0])) {
333
                $details['volume'] = $metadata['volume'][0];
334
            }
335
        }
336
        $details['pagination'] = '';
337
        $details['type'] = $attributes['TYPE'];
338
        $details['thumbnailId'] = '';
339
        // Load smLinks.
340
        $this->_getSmLinks();
341
        // Load physical structure.
342
        $this->_getPhysicalStructure();
343
        // Get the physical page or external file this structure element is pointing at.
344
        $details['points'] = '';
345
        // Is there a mptr node?
346
        if (count($structure->children('http://www.loc.gov/METS/')->mptr)) {
347
            // Yes. Get the file reference.
348
            $details['points'] = (string) $structure->children('http://www.loc.gov/METS/')->mptr[0]->attributes('http://www.w3.org/1999/xlink')->href;
349
        } elseif (
350
            !empty($this->physicalStructure)
351
            && array_key_exists($details['id'], $this->smLinks['l2p'])
352
        ) {
353
            // Link logical structure to the first corresponding physical page/track.
354
            $details['points'] = max(intval(array_search($this->smLinks['l2p'][$details['id']][0], $this->physicalStructure, true)), 1);
355
            $fileGrpsThumb = GeneralUtility::trimExplode(',', $extConf['fileGrpThumbs']);
356
            while ($fileGrpThumb = array_shift($fileGrpsThumb)) {
357
                if (!empty($this->physicalStructureInfo[$this->smLinks['l2p'][$details['id']][0]]['files'][$fileGrpThumb])) {
358
                    $details['thumbnailId'] = $this->physicalStructureInfo[$this->smLinks['l2p'][$details['id']][0]]['files'][$fileGrpThumb];
359
                    break;
360
                }
361
            }
362
            // Get page/track number of the first page/track related to this structure element.
363
            $details['pagination'] = $this->physicalStructureInfo[$this->smLinks['l2p'][$details['id']][0]]['orderlabel'];
364
            $details['videoChapter'] = $this->getTimecode($details, GeneralUtility::trimExplode(',', $extConf['fileGrpVideo']));
365
        } elseif ($details['id'] == $this->_getToplevelId()) {
366
            // Point to self if this is the toplevel structure.
367
            $details['points'] = 1;
368
            $fileGrpsThumb = GeneralUtility::trimExplode(',', $extConf['fileGrpThumbs']);
369
            while ($fileGrpThumb = array_shift($fileGrpsThumb)) {
370
                if (
371
                    !empty($this->physicalStructure)
372
                    && !empty($this->physicalStructureInfo[$this->physicalStructure[1]]['files'][$fileGrpThumb])
373
                ) {
374
                    $details['thumbnailId'] = $this->physicalStructureInfo[$this->physicalStructure[1]]['files'][$fileGrpThumb];
375
                    break;
376
                }
377
            }
378
        }
379
        // Get the files this structure element is pointing at.
380
        $details['files'] = [];
381
        $details['all_files'] = [];
382
        $fileUse = $this->_getFileGrps();
383
        // Get the file representations from fileSec node.
384
        foreach ($structure->children('http://www.loc.gov/METS/')->fptr as $fptr) {
385
            // Check if file has valid @USE attribute.
386
            if (!empty($fileUse[(string) $fptr->attributes()->FILEID])) {
Issue (bug): The method attributes() does not exist on null. (Ignorable by annotation.)

If this is a false positive, you can also ignore this issue in your code via the ignore-call annotation:

386
            if (!empty($fileUse[(string) $fptr->/** @scrutinizer ignore-call */ attributes()->FILEID])) {

This check looks for calls to methods that do not seem to exist on a given type. It looks for the method on the type itself as well as in inherited classes or implemented interfaces.

This is most likely a typographical error, or the method has been renamed.
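
The warning stems from SimpleXMLElement::attributes() being declared as nullable: per the PHP manual it returns null when called on an object that already represents an attribute. Below is a minimal sketch of an explicit guard as an alternative to the ignore annotation. The helper name fileIdOf() and the sample XML are assumptions for illustration only.

```php
<?php
// Hypothetical helper (not part of Kitodo): returns the FILEID of an fptr
// node, or '' when there is no attribute list or no FILEID attribute.
function fileIdOf(\SimpleXMLElement $fptr): string
{
    $attributes = $fptr->attributes();
    if ($attributes === null) {
        return '';
    }
    // A missing attribute casts to the empty string.
    return (string) $attributes->FILEID;
}

// Made-up sample: one fptr with a FILEID, one without.
$div = simplexml_load_string(
    '<mets:div xmlns:mets="http://www.loc.gov/METS/" ID="LOG_0001">'
    . '<mets:fptr FILEID="FILE_0001"/>'
    . '<mets:fptr/>'
    . '</mets:div>'
);

$fileIds = [];
foreach ($div->children('http://www.loc.gov/METS/')->fptr as $fptr) {
    $fileIds[] = fileIdOf($fptr);
}
```

Reading the attribute once into a local variable would also avoid the repeated $fptr->attributes()->FILEID lookups in the loop body.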
387
                $details['files'][$fileUse[(string) $fptr->attributes()->FILEID]] = (string) $fptr->attributes()->FILEID;
388
                $details['all_files'][$fileUse[(string) $fptr->attributes()->FILEID]][] = (string) $fptr->attributes()->FILEID;
389
            }
390
        }
391
        // Keep for later usage.
392
        $this->logicalUnits[$details['id']] = $details;
393
        // Walk the structure recursively? And are there any children of the current element?
394
        if (
395
            $recursive
396
            && count($structure->children('http://www.loc.gov/METS/')->div)
397
        ) {
398
            $details['children'] = [];
399
            foreach ($structure->children('http://www.loc.gov/METS/')->div as $child) {
400
                // Repeat for all children.
401
                $details['children'][] = $this->getLogicalStructureInfo($child, true);
Issue (bug): It seems that $child can also be of type null; however, parameter $structure of Kitodo\Dlf\Common\MetsDo...tLogicalStructureInfo() only seems to accept SimpleXMLElement. Maybe add an additional type check? (Ignorable by annotation.)

If this is a false positive, you can also ignore this issue in your code via the ignore-type annotation:

401
                $details['children'][] = $this->getLogicalStructureInfo(/** @scrutinizer ignore-type */ $child, true);
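
One way to follow the suggestion is an explicit instanceof check before recursing. The sketch below shows the pattern on a simplified stand-in: collectDivIds() and the sample XML are hypothetical and only collect IDs, whereas the real method builds a full details array.

```php
<?php
// Hypothetical stand-in for the recursive walk: collects the @ID of a div
// and of all nested child divs, in document order.
function collectDivIds(\SimpleXMLElement $structure): array
{
    $ids = [(string) $structure->attributes()->ID];
    foreach ($structure->children('http://www.loc.gov/METS/')->div as $child) {
        // Explicit type check, as suggested by the analyzer.
        if (!$child instanceof \SimpleXMLElement) {
            continue;
        }
        $ids = array_merge($ids, collectDivIds($child));
    }
    return $ids;
}

// Made-up logical structure with two levels of nesting.
$root = simplexml_load_string(
    '<mets:div xmlns:mets="http://www.loc.gov/METS/" ID="LOG_0000">'
    . '<mets:div ID="LOG_0001"><mets:div ID="LOG_0002"/></mets:div>'
    . '<mets:div ID="LOG_0003"/>'
    . '</mets:div>'
);

$ids = collectDivIds($root);
```

In practice the check never fails while iterating child elements, so it mainly serves to make the type visible to static analysis.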
402
            }
403
        }
404
        return $details;
405
    }
406
407
    /**
408
     * Get timecode and file IDs that link to first matching fileGrp/USE.
409
     *
410
     * Returns either `null` or an array with the following keys:
411
     * - `fileIds`: Array of linked file IDs
412
     * - `fileIdsJoin`: String where all `fileIds` are joined using ','.
413
     *    This is for convenience when passing `fileIds` in a Fluid template or similar.
414
     * - `timecode`: Time code specified in first matching `<mets:area>`
415
     *
416
     * @param array $logInfo
417
     * @param array $fileGrps
418
     * @return ?array
419
     */
420
    protected function getTimecode($logInfo, $fileGrps)
421
    {
422
        foreach ($fileGrps as $fileGrp) {
423
            $physInfo = $this->physicalStructureInfo[$this->smLinks['l2p'][$logInfo['id']][0]];
424
            $fileIds = $physInfo['all_files'][$fileGrp] ?? [];
425
426
            $chapter = null;
427
428
            foreach ($fileIds as $fileId) {
429
                $fileArea = $physInfo['fileInfos'][$fileId]['area'] ?? '';
430
                if (empty($fileArea) || $fileArea['betype'] !== 'TIME') {
431
                    continue;
432
                }
433
434
                if ($chapter === null) {
435
                    $chapter = [
436
                        'fileIds' => [],
437
                        'timecode' => Helper::timecodeToSeconds($fileArea['begin']),
438
                    ];
439
                }
440
441
                $chapter['fileIds'][] = $fileId;
442
            }
443
444
            if ($chapter !== null) {
445
                $chapter['fileIdsJoin'] = implode(',', $chapter['fileIds']);
446
                return $chapter;
447
            }
448
        }
449
    }
450
451
    /**
452
     * {@inheritDoc}
453
     * @see \Kitodo\Dlf\Common\Doc::getMetadata()
454
     */
455
    public function getMetadata($id, $cPid = 0)
456
    {
457
        // Make sure $cPid is a non-negative integer.
458
        $cPid = max(intval($cPid), 0);
459
        // If $cPid is not given, try to get it elsewhere.
460
        if (
461
            !$cPid
462
            && ($this->cPid || $this->pid)
463
        ) {
464
            // Retain current PID.
465
            $cPid = ($this->cPid ? $this->cPid : $this->pid);
466
        } elseif (!$cPid) {
467
            $this->logger->warning('Invalid PID ' . $cPid . ' for metadata definitions');
468
            return [];
469
        }
470
        // Get metadata from parsed metadata array if available.
471
        if (
472
            !empty($this->metadataArray[$id])
473
            && $this->metadataArray[0] == $cPid
474
        ) {
475
            return $this->metadataArray[$id];
476
        }
477
        // Initialize metadata array with empty values.
478
        $metadata = [
479
            'title' => [],
480
            'title_sorting' => [],
481
            'author' => [],
482
            'place' => [],
483
            'year' => [],
484
            'prod_id' => [],
485
            'record_id' => [],
486
            'opac_id' => [],
487
            'union_id' => [],
488
            'urn' => [],
489
            'purl' => [],
490
            'type' => [],
491
            'volume' => [],
492
            'volume_sorting' => [],
493
            'license' => [],
494
            'terms' => [],
495
            'restrictions' => [],
496
            'out_of_print' => [],
497
            'rights_info' => [],
498
            'collection' => [],
499
            'owner' => [],
500
            'mets_label' => [],
501
            'mets_orderlabel' => [],
502
            'document_format' => ['METS'],
503
        ];
504
        $mdIds = $this->getMetadataIds($id);
Issue (bug): Are you sure the assignment to $mdIds is correct? $this->getMetadataIds($id), which targets Kitodo\Dlf\Common\MetsDocument::getMetadataIds(), seems to always return null.

This check looks for function or method calls that always return null and whose return value is assigned to a variable.

class A
{
    function getObject()
    {
        return null;
    }
}

$a = new A();
$object = $a->getObject();

The method getObject() can return nothing but null, so it makes no sense to assign that value to a variable.

The most likely reason is that a function or method is incomplete or has been reduced for debugging purposes.
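
The body of getMetadataIds() is not shown in this listing, so the following is only a sketch of the general remedy: declare an array return type and return an array on every path, including early exits. The function metadataIdsFor(), its parameters, and the assumption that DMDID/ADMID hold whitespace-separated section IDs are hypothetical.

```php
<?php
// Hypothetical sketch, not the actual MetsDocument::getMetadataIds():
// every path returns an array, so callers can iterate without a null check.
function metadataIdsFor(array $logicalUnits, string $id): array
{
    if (empty($logicalUnits[$id])) {
        return [];
    }
    $unit = $logicalUnits[$id];
    // Assumption: both attributes hold whitespace-separated section IDs.
    $raw = trim(($unit['dmdId'] ?? '') . ' ' . ($unit['admId'] ?? ''));
    if ($raw === '') {
        return [];
    }
    return preg_split('/\s+/', $raw) ?: [];
}

// Made-up logical units, shaped like the $details arrays built above.
$units = [
    'LOG_0001' => ['dmdId' => 'DMDLOG_0001', 'admId' => 'AMD TECH_01'],
    'LOG_0002' => ['dmdId' => '', 'admId' => ''],
];

$ids = metadataIdsFor($units, 'LOG_0001');
$none = metadataIdsFor($units, 'LOG_0002');
```

With a declared array return type, both the empty($mdIds) check and the later foreach over $mdIds in getMetadata() operate on a well-defined value.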
505
        if (empty($mdIds)) {
506
            // There is no metadata section for this structure node.
507
            return [];
508
        }
509
        // Associative array used as set of available section types (dmdSec, techMD, ...)
510
        $hasMetadataSection = [];
511
        // Load available metadata formats and metadata sections.
512
        $this->loadFormats();
513
        $this->_getMdSec();
514
        // Get the structure's type.
515
        if (!empty($this->logicalUnits[$id])) {
516
            $metadata['type'] = [$this->logicalUnits[$id]['type']];
517
        } else {
518
            $struct = $this->mets->xpath('./mets:structMap[@TYPE="LOGICAL"]//mets:div[@ID="' . $id . '"]/@TYPE');
519
            if (!empty($struct)) {
520
                $metadata['type'] = [(string) $struct[0]];
521
            }
522
        }
523
        foreach ($mdIds as $dmdId) {
Issue (bug): The expression $mdIds of type void is not traversable.
524
            $mdSectionType = $this->mdSec[$dmdId]['section'];
525
526
            // To preserve behavior of previous Kitodo versions, extract metadata only from first supported dmdSec
527
            // However, we want to extract, for example, all techMD sections (VIDEOMD, AUDIOMD)
528
            if ($mdSectionType === 'dmdSec' && isset($hasMetadataSection['dmdSec'])) {
529
                continue;
530
            }
531
532
            // Is this metadata format supported?
533
            if (!empty($this->formats[$this->mdSec[$dmdId]['type']])) {
534
                if (!empty($this->formats[$this->mdSec[$dmdId]['type']]['class'])) {
535
                    $class = $this->formats[$this->mdSec[$dmdId]['type']]['class'];
536
                    // Get the metadata from class.
537
                    if (
538
                        class_exists($class)
539
                        && ($obj = GeneralUtility::makeInstance($class)) instanceof MetadataInterface
540
                    ) {
541
                        $obj->extractMetadata($this->mdSec[$dmdId]['xml'], $metadata);
542
                    } else {
543
                        $this->logger->warning('Invalid class/method "' . $class . '->extractMetadata()" for metadata format "' . $this->mdSec[$dmdId]['type'] . '"');
544
                    }
545
                }
546
            } else {
547
                $this->logger->notice('Unsupported metadata format "' . $this->mdSec[$dmdId]['type'] . '" in ' . $mdSectionType . ' with @ID "' . $dmdId . '"');
548
                // Continue searching for supported metadata with next @DMDID.
549
                continue;
550
            }
551
            // Get the additional metadata from database.
552
            $queryBuilder = GeneralUtility::makeInstance(ConnectionPool::class)
553
                ->getQueryBuilderForTable('tx_dlf_metadata');
554
            // Get hidden records, too.
555
            $queryBuilder
556
                ->getRestrictions()
557
                ->removeByType(HiddenRestriction::class);
558
            // Get all metadata with configured xpath and applicable format first.
559
            $resultWithFormat = $queryBuilder
560
                ->select(
561
                    'tx_dlf_metadata.index_name AS index_name',
562
                    'tx_dlf_metadataformat_joins.xpath AS xpath',
563
                    'tx_dlf_metadataformat_joins.xpath_sorting AS xpath_sorting',
564
                    'tx_dlf_metadata.is_sortable AS is_sortable',
565
                    'tx_dlf_metadata.default_value AS default_value',
566
                    'tx_dlf_metadata.format AS format'
567
                )
568
                ->from('tx_dlf_metadata')
569
                ->innerJoin(
570
                    'tx_dlf_metadata',
571
                    'tx_dlf_metadataformat',
572
                    'tx_dlf_metadataformat_joins',
573
                    $queryBuilder->expr()->eq(
574
                        'tx_dlf_metadataformat_joins.parent_id',
575
                        'tx_dlf_metadata.uid'
576
                    )
577
                )
578
                ->innerJoin(
579
                    'tx_dlf_metadataformat_joins',
580
                    'tx_dlf_formats',
581
                    'tx_dlf_formats_joins',
582
                    $queryBuilder->expr()->eq(
583
                        'tx_dlf_formats_joins.uid',
584
                        'tx_dlf_metadataformat_joins.encoded'
585
                    )
586
                )
587
                ->where(
588
                    $queryBuilder->expr()->eq('tx_dlf_metadata.pid', intval($cPid)),
589
                    $queryBuilder->expr()->eq('tx_dlf_metadata.l18n_parent', 0),
590
                    $queryBuilder->expr()->eq('tx_dlf_metadataformat_joins.pid', intval($cPid)),
591
                    $queryBuilder->expr()->eq('tx_dlf_formats_joins.type', $queryBuilder->createNamedParameter($this->mdSec[$dmdId]['type']))
592
                )
593
                ->execute();
594
            // Get all metadata without a format, but with a default value next.
595
            $queryBuilder = GeneralUtility::makeInstance(ConnectionPool::class)
596
                ->getQueryBuilderForTable('tx_dlf_metadata');
597
            // Get hidden records, too.
598
            $queryBuilder
599
                ->getRestrictions()
600
                ->removeByType(HiddenRestriction::class);
601
            $resultWithoutFormat = $queryBuilder
602
                ->select(
603
                    'tx_dlf_metadata.index_name AS index_name',
604
                    'tx_dlf_metadata.is_sortable AS is_sortable',
605
                    'tx_dlf_metadata.default_value AS default_value',
606
                    'tx_dlf_metadata.format AS format'
607
                )
608
                ->from('tx_dlf_metadata')
609
                ->where(
610
                    $queryBuilder->expr()->eq('tx_dlf_metadata.pid', intval($cPid)),
611
                    $queryBuilder->expr()->eq('tx_dlf_metadata.l18n_parent', 0),
612
                    $queryBuilder->expr()->eq('tx_dlf_metadata.format', 0),
613
                    $queryBuilder->expr()->neq('tx_dlf_metadata.default_value', $queryBuilder->createNamedParameter(''))
614
                )
615
                ->execute();
616
            // Merge both result sets.
617
            $allResults = array_merge($resultWithFormat->fetchAll(), $resultWithoutFormat->fetchAll());
618
            // We need a \DOMDocument here, because SimpleXML doesn't support XPath functions properly.
619
            $domNode = dom_import_simplexml($this->mdSec[$dmdId]['xml']);
620
            $domXPath = new \DOMXPath($domNode->ownerDocument);
Issue (bug): It seems that $domNode->ownerDocument can also be of type null; however, parameter $document of DOMXPath::__construct() only seems to accept DOMDocument. Maybe add an additional type check? (Ignorable by annotation.)

If this is a false positive, you can also ignore this issue in your code via the ignore-type annotation:

620
            $domXPath = new \DOMXPath(/** @scrutinizer ignore-type */ $domNode->ownerDocument);
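
A minimal sketch of the suggested type check, which also illustrates why the code switches to DOMXPath: evaluate() can return a scalar for XPath function calls, whereas a plain node query returns a DOMNodeList. The MODS sample fragment and the variable names are assumptions for illustration; the real code registers its namespaces via registerNamespaces().

```php
<?php
// Made-up metadata section standing in for $this->mdSec[$dmdId]['xml'].
$section = simplexml_load_string(
    '<mods:mods xmlns:mods="http://www.loc.gov/mods/v3">'
    . '<mods:titleInfo><mods:title> A Title </mods:title></mods:titleInfo>'
    . '</mods:mods>'
);

$domNode = dom_import_simplexml($section);
$document = $domNode->ownerDocument;
// Explicit type check instead of the ignore-type annotation.
if (!$document instanceof \DOMDocument) {
    throw new \RuntimeException('Metadata section is not attached to a document.');
}

$domXPath = new \DOMXPath($document);
$domXPath->registerNamespace('mods', 'http://www.loc.gov/mods/v3');

// XPath function calls yield scalars (string, float) rather than node lists.
$title = $domXPath->evaluate('normalize-space(./mods:titleInfo/mods:title)', $domNode);
$count = $domXPath->evaluate('count(./mods:titleInfo)', $domNode);
// A plain location path yields a DOMNodeList.
$nodes = $domXPath->evaluate('./mods:titleInfo/mods:title', $domNode);
```

This is why the loop that follows distinguishes between DOMNodeList results and scalar results before writing values into $metadata.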
621
            $this->registerNamespaces($domXPath);
622
            // OK, now make the XPath queries.
623
            foreach ($allResults as $resArray) {
624
                // Set metadata field's value(s).
625
                if (
626
                    $resArray['format'] > 0
627
                    && !empty($resArray['xpath'])
628
                    && ($values = $domXPath->evaluate($resArray['xpath'], $domNode))
629
                ) {
630
                    if (
631
                        $values instanceof \DOMNodeList
632
                        && $values->length > 0
633
                    ) {
634
                        $metadata[$resArray['index_name']] = [];
635
                        foreach ($values as $value) {
636
                            $metadata[$resArray['index_name']][] = trim((string) $value->nodeValue);
637
                        }
638
                    } elseif (!($values instanceof \DOMNodeList)) {
639
                        $metadata[$resArray['index_name']] = [trim((string) $values)];
640
                    }
641
                }
642
                // Set default value if applicable.
643
                if (
644
                    empty($metadata[$resArray['index_name']][0])
645
                    && strlen($resArray['default_value']) > 0
646
                ) {
647
                    $metadata[$resArray['index_name']] = [$resArray['default_value']];
648
                }
649
                // Set sorting value if applicable.
650
                if (
651
                    !empty($metadata[$resArray['index_name']])
652
                    && $resArray['is_sortable']
653
                ) {
654
                    if (
655
                        $resArray['format'] > 0
656
                        && !empty($resArray['xpath_sorting'])
657
                        && ($values = $domXPath->evaluate($resArray['xpath_sorting'], $domNode))
658
                    ) {
659
                        if (
660
                            $values instanceof \DOMNodeList
661
                            && $values->length > 0
662
                        ) {
663
                            $metadata[$resArray['index_name'] . '_sorting'][0] = trim((string) $values->item(0)->nodeValue);
664
                        } elseif (!($values instanceof \DOMNodeList)) {
665
                            $metadata[$resArray['index_name'] . '_sorting'][0] = trim((string) $values);
666
                        }
667
                    }
668
                    if (empty($metadata[$resArray['index_name'] . '_sorting'][0])) {
669
                        $metadata[$resArray['index_name'] . '_sorting'][0] = $metadata[$resArray['index_name']][0];
670
                    }
671
                }
672
            }
673
674
            $hasMetadataSection[$mdSectionType] = true;
675
        }
676
        // Set title to empty string if not present.
677
        if (empty($metadata['title'][0])) {
678
            $metadata['title'][0] = '';
679
            $metadata['title_sorting'][0] = '';
680
        }
681
        // Files are not expected to reference a dmdSec
682
        if (isset($this->fileInfos[$id]) || isset($hasMetadataSection['dmdSec'])) {
683
            return $metadata;
684
        } else {
685
            $this->logger->warning('No supported descriptive metadata found for logical structure with @ID "' . $id . '"');
686
            return [];
687
        }
688
    }
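The default-value and sorting fallbacks at the end of the loop above can be read in isolation. A minimal sketch, assuming a hypothetical helper `applyDefaultAndSorting()` whose `$field` keys mirror the `tx_dlf_metadata` columns used in the listing:

```php
<?php
// Hypothetical helper for illustration; not part of MetsDocument.
function applyDefaultAndSorting(array $metadata, array $field): array
{
    $name = $field['index_name'];
    // Fall back to the configured default when no value was extracted.
    if (empty($metadata[$name][0]) && strlen((string) $field['default_value']) > 0) {
        $metadata[$name] = [$field['default_value']];
    }
    // Sortable fields get a "<index_name>_sorting" entry that defaults to the first value.
    if (!empty($metadata[$name]) && !empty($field['is_sortable'])) {
        if (empty($metadata[$name . '_sorting'][0])) {
            $metadata[$name . '_sorting'][0] = $metadata[$name][0];
        }
    }
    return $metadata;
}

$result = applyDefaultAndSorting(
    ['title' => ['Faust']],
    ['index_name' => 'title', 'default_value' => '', 'is_sortable' => 1]
);
$fallback = applyDefaultAndSorting(
    [],
    ['index_name' => 'language', 'default_value' => 'und', 'is_sortable' => 0]
);
```

The sketch leaves out the `xpath_sorting` evaluation; it only shows what happens when that step produced nothing.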
689
690
    /**
691
     * Get IDs of (descriptive and administrative) metadata sections
692
     * referenced by node of given $id. The $id may refer to either
693
     * a logical structure node or to a file.
694
     *
695
     * @access protected
696
     * @param string $id: The "@ID" attribute of the logical structure node or file node
697
     * @return void
698
     */
699
    protected function getMetadataIds($id)
700
    {
701
        // Load amdSecChildIds concordance
702
        $this->_getMdSec();
703
        $this->_getFileInfos();
704
705
        // Get DMDID and ADMID of logical structure node
706
        if (!empty($this->logicalUnits[$id])) {
707
            $dmdIds = $this->logicalUnits[$id]['dmdId'] ?? '';
708
            $admIds = $this->logicalUnits[$id]['admId'] ?? '';
709
        } else {
710
            $mdSec = $this->mets->xpath('./mets:structMap[@TYPE="LOGICAL"]//mets:div[@ID="' . $id . '"]')[0];
711
            if ($mdSec) {
Issue: $mdSec is of type SimpleXMLElement, thus it always evaluates to true.
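One way to make the condition meaningful is to test the XPath result before indexing into it, since `[0]` on an empty result is not guarded in the listing. A minimal sketch, assuming a hypothetical helper `findLogicalDivIds()` and a hand-written METS fragment:

```php
<?php
// Hypothetical helper for illustration; not part of MetsDocument.
function findLogicalDivIds(\SimpleXMLElement $mets, string $id): ?array
{
    $mets->registerXPathNamespace('mets', 'http://www.loc.gov/METS/');
    $divs = $mets->xpath('./mets:structMap[@TYPE="LOGICAL"]//mets:div[@ID="' . $id . '"]');
    // xpath() yields an empty array when nothing matches, so check before indexing.
    if (empty($divs)) {
        return null;
    }
    return [
        'dmdId' => (string) $divs[0]->attributes()->DMDID,
        'admId' => (string) $divs[0]->attributes()->ADMID,
    ];
}

$mets = simplexml_load_string(
    '<mets:mets xmlns:mets="http://www.loc.gov/METS/">'
    . '<mets:structMap TYPE="LOGICAL">'
    . '<mets:div ID="LOG_0001" DMDID="DMD_0001" ADMID="AMD_0001"/>'
    . '</mets:structMap>'
    . '</mets:mets>'
);
$found = findLogicalDivIds($mets, 'LOG_0001');
$missing = findLogicalDivIds($mets, 'LOG_9999');
```

With this shape the fallback to `fileInfos` becomes reachable whenever no logical `mets:div` matches.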
712
                $dmdIds = (string) $mdSec->attributes()->DMDID;
713
                $admIds = (string) $mdSec->attributes()->ADMID;
714
            } elseif (isset($this->fileInfos[$id])) {
715
                $dmdIds = $this->fileInfos[$id]['dmdId'];
716
                $admIds = $this->fileInfos[$id]['admId'];
717
            } else {
718
                $dmdIds = '';
719
                $admIds = '';
720
            }
721
        }
722
723
        // Handle multiple DMDIDs/ADMIDs
724
        $allMdIds = explode(' ', $dmdIds);
725
726
        foreach (explode(' ', $admIds) as $admId) {
727
            if (isset($this->mdSec[$admId])) {
728
                // $admId references an actual metadata section such as techMD
729
                $allMdIds[] = $admId;
730
            } elseif (isset($this->amdSecChildIds[$admId])) {
731
                // $admId references a <mets:amdSec> element. Resolve child elements.
732
                foreach ($this->amdSecChildIds[$admId] as $childId) {
Bug: The expression $this->amdSecChildIds[$admId] of type string is not traversable.
733
                    $allMdIds[] = $childId;
734
                }
735
            }
736
        }
737
738
        return array_filter($allMdIds, function ($element) {
Bug, Best Practice: The expression return array_filter($all...ion(...) { /* ... */ }) returns the type array which is incompatible with the documented return type void.
739
            return !empty($element);
740
        });
741
    }
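The ID resolution performed by getMetadataIds() can be sketched as a pure function with an explicit array return type. `resolveMetadataIds()` and its parameters are hypothetical stand-ins for the class properties; the final `array_values()` call is an addition that the listing does not make.

```php
<?php
// Hypothetical helper for illustration; not part of MetsDocument.
function resolveMetadataIds(string $dmdIds, string $admIds, array $mdSec, array $amdSecChildIds): array
{
    // DMDID and ADMID are space-separated ID lists.
    $allMdIds = explode(' ', $dmdIds);
    foreach (explode(' ', $admIds) as $admId) {
        if (isset($mdSec[$admId])) {
            // The ID references a metadata section such as techMD directly.
            $allMdIds[] = $admId;
        } elseif (isset($amdSecChildIds[$admId])) {
            // The ID references a <mets:amdSec>; resolve it to its children.
            foreach ((array) $amdSecChildIds[$admId] as $childId) {
                $allMdIds[] = $childId;
            }
        }
    }
    // Drop empty IDs and reindex the result.
    return array_values(array_filter($allMdIds, static function ($element) {
        return !empty($element);
    }));
}

$ids = resolveMetadataIds(
    'DMD_1 DMD_2',
    'TECH_1 AMD_1',
    ['DMD_1' => [], 'TECH_1' => []],
    ['AMD_1' => ['RIGHTS_1', 'DIGIPROV_1']]
);
$none = resolveMetadataIds('', '', [], []);
```

Without `array_values()` the filtered array keeps its original keys, which only matters to callers that rely on consecutive indexes.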
742
    /**
     * {@inheritDoc}
     * @see \Kitodo\Dlf\Common\Doc::getFullText()
     */
    public function getFullText($id)
    {
        $fullText = '';

        // Load fileGrps and check for full text files.
        $this->_getFileGrps();
        if ($this->hasFulltext) {
            $fullText = $this->getFullTextFromXml($id);
        }
        return $fullText;
    }
758
759
    /**
760
     * {@inheritDoc}
761
     * @see Doc::getStructureDepth()
762
     */
763
    public function getStructureDepth($logId)
764
    {
765
        $ancestors = $this->mets->xpath('./mets:structMap[@TYPE="LOGICAL"]//mets:div[@ID="' . $logId . '"]/ancestor::*');
766
        if (!empty($ancestors)) {
767
            return count($ancestors);
768
        } else {
769
            return 0;
770
        }
771
    }
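Because the query uses `ancestor::*`, the returned depth also counts the `mets:structMap` and `mets:mets` elements, not only enclosing `mets:div` nodes. A minimal sketch that shows this on a hand-written fragment; `structureDepth()` is a hypothetical free-standing variant of the method above.

```php
<?php
// Hypothetical helper for illustration; not part of MetsDocument.
function structureDepth(\SimpleXMLElement $mets, string $logId): int
{
    $mets->registerXPathNamespace('mets', 'http://www.loc.gov/METS/');
    $ancestors = $mets->xpath(
        './mets:structMap[@TYPE="LOGICAL"]//mets:div[@ID="' . $logId . '"]/ancestor::*'
    );
    return empty($ancestors) ? 0 : count($ancestors);
}

$mets = simplexml_load_string(
    '<mets:mets xmlns:mets="http://www.loc.gov/METS/">'
    . '<mets:structMap TYPE="LOGICAL">'
    . '<mets:div ID="LOG_0000"><mets:div ID="LOG_0001"/></mets:div>'
    . '</mets:structMap>'
    . '</mets:mets>'
);
$topDepth = structureDepth($mets, 'LOG_0000');     // ancestors: structMap, mets
$childDepth = structureDepth($mets, 'LOG_0001');   // ancestors: div, structMap, mets
$missingDepth = structureDepth($mets, 'LOG_9999'); // no such node
```

A top-level division therefore has depth 2 in this fragment, and an unknown ID yields 0.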
772
773
    /**
774
     * {@inheritDoc}
775
     * @see \Kitodo\Dlf\Common\Doc::init()
776
     */
777
    protected function init($location)
778
    {
779
        $this->logger = GeneralUtility::makeInstance(LogManager::class)->getLogger(get_class($this));
780
        // Get METS node from XML file.
781
        $this->registerNamespaces($this->xml);
782
        $mets = $this->xml->xpath('//mets:mets');
783
        if (!empty($mets)) {
784
            $this->mets = $mets[0];
785
            // Register namespaces.
786
            $this->registerNamespaces($this->mets);
787
        } else {
788
            if (!empty($location)) {
789
                $this->logger->error('No METS part found in document with location "' . $location . '".');
790
            } elseif (!empty($this->recordId)) {
791
                $this->logger->error('No METS part found in document with recordId "' . $this->recordId . '".');
792
            } else {
793
                $this->logger->error('No METS part found in current document.');
794
            }
795
        }
796
    }
797
798
    /**
799
     * {@inheritDoc}
800
     * @see \Kitodo\Dlf\Common\Doc::loadLocation()
801
     */
802
    protected function loadLocation($location)
803
    {
804
        $fileResource = Helper::getUrl($location);
805
        if ($fileResource !== false) {
806
            $xml = Helper::getXmlFileAsString($fileResource);
807
            // Set some basic properties.
808
            if ($xml !== false) {
809
                $this->xml = $xml;
810
                return true;
811
            }
812
        }
813
        $this->logger->error('Could not load XML file from "' . $location . '"');
814
        return false;
815
    }
816
817
    /**
818
     * {@inheritDoc}
819
     * @see \Kitodo\Dlf\Common\Doc::ensureHasFulltextIsSet()
820
     */
821
    protected function ensureHasFulltextIsSet()
822
    {
823
        // Are the fileGrps already loaded?
824
        if (!$this->fileGrpsLoaded) {
825
            $this->_getFileGrps();
826
        }
827
    }
828
829
    /**
830
     * {@inheritDoc}
831
     * @see Doc::setPreloadedDocument()
832
     */
833
    protected function setPreloadedDocument($preloadedDocument)
834
    {
835
836
        if ($preloadedDocument instanceof \SimpleXMLElement) {
837
            $this->xml = $preloadedDocument;
838
            return true;
839
        }
840
        return false;
841
    }
842
    /**
     * {@inheritDoc}
     * @see Doc::getDocument()
     */
    protected function getDocument()
    {
        return $this->mets;
    }
851
852
    /**
853
     * This builds an array of the document's metadata sections
854
     *
855
     * @access protected
856
     *
857
     * @return array Array of metadata sections with their IDs as array key
858
     */
859
    protected function _getMdSec()
860
    {
861
        if (!$this->mdSecLoaded) {
862
            $this->loadFormats();
863
864
            foreach ($this->mets->xpath('./mets:dmdSec') as $dmdSecTag) {
865
                $dmdSec = $this->processMdSec($dmdSecTag);
866
867
                if ($dmdSec !== null) {
868
                    $this->mdSec[$dmdSec['id']] = $dmdSec;
Bug: The property mdSec is declared read-only in Kitodo\Dlf\Common\MetsDocument.
869
                    $this->dmdSec[$dmdSec['id']] = $dmdSec;
870
                }
871
            }
872
873
            foreach ($this->mets->xpath('./mets:amdSec') as $amdSecTag) {
874
                $childIds = [];
875
876
                foreach ($amdSecTag->children('http://www.loc.gov/METS/') as $mdSecTag) {
877
                    if (!in_array($mdSecTag->getName(), self::ALLOWED_AMD_SEC)) {
878
                        continue;
879
                    }
880
881
                    // TODO: Should we check that the format may occur within this type (e.g., to ignore VIDEOMD within rightsMD)?
882
                    $mdSec = $this->processMdSec($mdSecTag);
883
884
                    if ($mdSec !== null) {
885
                        $this->mdSec[$mdSec['id']] = $mdSec;
886
887
                        $childIds[] = $mdSec['id'];
888
                    }
889
                }
890
891
                $amdSecId = (string) $amdSecTag->attributes()->ID;
892
                if (!empty($amdSecId)) {
893
                    $this->amdSecChildIds[$amdSecId] = $childIds;
894
                }
895
            }
896
897
            $this->mdSecLoaded = true;
898
        }
899
        return $this->mdSec;
900
    }
901
902
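    /**
     * This builds the array of the document's dmdSec entries (a subset of $mdSec)
     *
     * @access protected
     *
     * @return array Array of dmdSec entries with their IDs as array key
     */
    protected function _getDmdSec()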
903
    {
904
        $this->_getMdSec();
905
        return $this->dmdSec;
906
    }
907
908
    /**
909
     * Processes an element of METS `mdSecType`.
910
     *
911
     * @access protected
912
     *
913
     * @param \SimpleXMLElement $element
914
     *
915
     * @return array|null The processed metadata section
916
     */
917
    protected function processMdSec($element)
918
    {
919
        $mdId = (string) $element->attributes()->ID;
920
        if (empty($mdId)) {
921
            return null;
922
        }
923
924
        $this->registerNamespaces($element);
925
        if ($type = $element->xpath('./mets:mdWrap[not(@MDTYPE="OTHER")]/@MDTYPE')) {
926
            if (!empty($this->formats[(string) $type[0]])) {
927
                $type = (string) $type[0];
928
                $xml = $element->xpath('./mets:mdWrap[@MDTYPE="' . $type . '"]/mets:xmlData/' . strtolower($type) . ':' . $this->formats[$type]['rootElement']);
929
            }
930
        } elseif ($type = $element->xpath('./mets:mdWrap[@MDTYPE="OTHER"]/@OTHERMDTYPE')) {
931
            if (!empty($this->formats[(string) $type[0]])) {
932
                $type = (string) $type[0];
933
                $xml = $element->xpath('./mets:mdWrap[@MDTYPE="OTHER"][@OTHERMDTYPE="' . $type . '"]/mets:xmlData/' . strtolower($type) . ':' . $this->formats[$type]['rootElement']);
934
            }
935
        }
936
937
        if (empty($xml)) {
938
            return null;
939
        }
940
941
        $this->registerNamespaces($xml[0]);
942
943
        return [
944
            'id' => $mdId,
945
            'section' => $element->getName(),
946
            'type' => $type,
947
            'xml' => $xml[0],
Comprehensibility, Best Practice: The variable $xml does not seem to be defined for all execution paths leading up to this point.
948
        ];
949
    }
950
951
    /**
952
     * This builds the file ID -> USE concordance
953
     *
954
     * @access protected
955
     *
956
     * @return array Array of file use groups with file IDs
957
     */
958
    protected function _getFileGrps()
959
    {
960
        if (!$this->fileGrpsLoaded) {
961
            // Get configured USE attributes.
962
            $extConf = GeneralUtility::makeInstance(ExtensionConfiguration::class)->get(self::$extKey);
963
            $useGrps = GeneralUtility::trimExplode(',', $extConf['fileGrpImages']);
964
            if (!empty($extConf['fileGrpThumbs'])) {
965
                $useGrps = array_merge($useGrps, GeneralUtility::trimExplode(',', $extConf['fileGrpThumbs']));
966
            }
967
            if (!empty($extConf['fileGrpDownload'])) {
968
                $useGrps = array_merge($useGrps, GeneralUtility::trimExplode(',', $extConf['fileGrpDownload']));
969
            }
970
            if (!empty($extConf['fileGrpFulltext'])) {
971
                $useGrps = array_merge($useGrps, GeneralUtility::trimExplode(',', $extConf['fileGrpFulltext']));
972
            }
973
            if (!empty($extConf['fileGrpAudio'])) {
974
                $useGrps = array_merge($useGrps, GeneralUtility::trimExplode(',', $extConf['fileGrpAudio']));
975
            }
976
            if (!empty($extConf['fileGrpVideo'])) {
977
                $useGrps = array_merge($useGrps, GeneralUtility::trimExplode(',', $extConf['fileGrpVideo']));
978
            }
979
            if (!empty($extConf['fileGrpWaveform'])) {
980
                $useGrps = array_merge($useGrps, GeneralUtility::trimExplode(',', $extConf['fileGrpWaveform']));
981
            }
982
            // Get all file groups.
983
            $fileGrps = $this->mets->xpath('./mets:fileSec/mets:fileGrp');
984
            if (!empty($fileGrps)) {
985
                // Build concordance for configured USE attributes.
986
                foreach ($fileGrps as $fileGrp) {
987
                    if (in_array((string) $fileGrp['USE'], $useGrps)) {
988
                        foreach ($fileGrp->children('http://www.loc.gov/METS/')->file as $file) {
989
                            $fileId = (string) $file->attributes()->ID;
Bug: The method attributes() does not exist on null. (Ignorable by annotation.)

If this is a false positive, you can also ignore this issue in your code via the ignore-call annotation:

                            $fileId = (string) $file->/** @scrutinizer ignore-call */ attributes()->ID;

This check looks for calls to methods that do not seem to exist on a given type. It looks for the method on the type itself as well as in inherited classes or implemented interfaces. Most likely this is a typographical error or the method has been renamed.
990
                            $this->fileGrps[$fileId] = (string) $fileGrp['USE'];
991
                            $this->fileInfos[$fileId] = [
Bug: The property fileInfos is declared read-only in Kitodo\Dlf\Common\MetsDocument.
992
                                'fileGrp' => (string) $fileGrp['USE'],
993
                                'admId' => (string) $file->attributes()->ADMID,
994
                                'dmdId' => (string) $file->attributes()->DMDID,
995
                            ];
996
                        }
997
                    }
998
                }
999
            }
1000
            // Are there any fulltext files available?
1001
            if (
1002
                !empty($extConf['fileGrpFulltext'])
1003
                && array_intersect(GeneralUtility::trimExplode(',', $extConf['fileGrpFulltext']), $this->fileGrps) !== []
1004
            ) {
1005
                $this->hasFulltext = true;
1006
            }
1007
            $this->fileGrpsLoaded = true;
1008
        }
1009
        return $this->fileGrps;
1010
    }
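The file ID to USE concordance and the full-text check can be sketched without the TYPO3 configuration layer. In the sketch below, `buildFileGrpConcordance()`, the `$useGrps` list and the METS fragment are hypothetical; the real method reads the USE values from the extension configuration.

```php
<?php
// Hypothetical helper for illustration; not part of MetsDocument.
function buildFileGrpConcordance(\SimpleXMLElement $mets, array $useGrps): array
{
    $mets->registerXPathNamespace('mets', 'http://www.loc.gov/METS/');
    $concordance = [];
    $fileGrps = $mets->xpath('./mets:fileSec/mets:fileGrp');
    foreach ($fileGrps ?: [] as $fileGrp) {
        $use = (string) $fileGrp['USE'];
        // Only file groups with a configured USE attribute are considered.
        if (!in_array($use, $useGrps, true)) {
            continue;
        }
        foreach ($fileGrp->children('http://www.loc.gov/METS/')->file as $file) {
            // attributes() is used as in the listing, because ID has no namespace.
            $concordance[(string) $file->attributes()->ID] = $use;
        }
    }
    return $concordance;
}

$mets = simplexml_load_string(
    '<mets:mets xmlns:mets="http://www.loc.gov/METS/"><mets:fileSec>'
    . '<mets:fileGrp USE="DEFAULT"><mets:file ID="FILE_0001_DEFAULT"/></mets:fileGrp>'
    . '<mets:fileGrp USE="FULLTEXT"><mets:file ID="FILE_0001_FULLTEXT"/></mets:fileGrp>'
    . '<mets:fileGrp USE="ORIGINAL"><mets:file ID="FILE_0001_ORIGINAL"/></mets:fileGrp>'
    . '</mets:fileSec></mets:mets>'
);
$fileGrps = buildFileGrpConcordance($mets, ['DEFAULT', 'FULLTEXT']);
// Full text is available if any configured full-text USE occurs in the concordance.
$hasFulltext = array_intersect(['FULLTEXT'], $fileGrps) !== [];
```

Files in unconfigured groups, such as the ORIGINAL group in this fragment, never enter the concordance.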
1011
    /**
     * This builds the file ID -> file info concordance (fileGrp, admId, dmdId)
     *
     * @access protected
     * @return array
     */
    protected function _getFileInfos()
    {
        $this->_getFileGrps();
        return $this->fileInfos;
    }
1022
1023
    /**
1024
     * {@inheritDoc}
1025
     * @see \Kitodo\Dlf\Common\Doc::prepareMetadataArray()
1026
     */
1027
    protected function prepareMetadataArray($cPid)
1028
    {
1029
        $ids = $this->mets->xpath('./mets:structMap[@TYPE="LOGICAL"]//mets:div[@DMDID]/@ID');
1030
        // Get all logical structure nodes with metadata.
1031
        if (!empty($ids)) {
1032
            foreach ($ids as $id) {
1033
                $this->metadataArray[(string) $id] = $this->getMetadata((string) $id, $cPid);
1034
            }
1035
        }
1036
        // Set current PID for metadata definitions.
1037
    }
1038
    /**
     * This returns $this->mets via __get()
     *
     * @access protected
     *
     * @return \SimpleXMLElement The XML's METS part as \SimpleXMLElement object
     */
    protected function _getMets()
    {
        return $this->mets;
    }
1050
1051
    /**
1052
     * {@inheritDoc}
1053
     * @see \Kitodo\Dlf\Common\Doc::_getPhysicalStructure()
1054
     */
1055
    protected function _getPhysicalStructure()
1056
    {
1057
        // Is there no physical structure array yet?
1058
        if (!$this->physicalStructureLoaded) {
1059
            // Does the document have a structMap node of type "PHYSICAL"?
1060
            $elementNodes = $this->mets->xpath('./mets:structMap[@TYPE="PHYSICAL"]/mets:div[@TYPE="physSequence"]/mets:div');
1061
            if (!empty($elementNodes)) {
1062
                // Get file groups.
1063
                $fileUse = $this->_getFileGrps();
1064
                // Get the physical sequence's metadata.
1065
                $physNode = $this->mets->xpath('./mets:structMap[@TYPE="PHYSICAL"]/mets:div[@TYPE="physSequence"]');
1066
                $physSeq[0] = (string) $physNode[0]['ID'];
1067
                $this->physicalStructureInfo[$physSeq[0]]['id'] = (string) $physNode[0]['ID'];
1068
                $this->physicalStructureInfo[$physSeq[0]]['dmdId'] = (isset($physNode[0]['DMDID']) ? (string) $physNode[0]['DMDID'] : '');
1069
                $this->physicalStructureInfo[$physSeq[0]]['admId'] = (isset($physNode[0]['ADMID']) ? (string) $physNode[0]['ADMID'] : '');
1070
                $this->physicalStructureInfo[$physSeq[0]]['order'] = (isset($physNode[0]['ORDER']) ? (string) $physNode[0]['ORDER'] : '');
1071
                $this->physicalStructureInfo[$physSeq[0]]['label'] = (isset($physNode[0]['LABEL']) ? (string) $physNode[0]['LABEL'] : '');
1072
                $this->physicalStructureInfo[$physSeq[0]]['orderlabel'] = (isset($physNode[0]['ORDERLABEL']) ? (string) $physNode[0]['ORDERLABEL'] : '');
1073
                $this->physicalStructureInfo[$physSeq[0]]['type'] = (string) $physNode[0]['TYPE'];
1074
                $this->physicalStructureInfo[$physSeq[0]]['contentIds'] = (isset($physNode[0]['CONTENTIDS']) ? (string) $physNode[0]['CONTENTIDS'] : '');
1075
                // Get the file representations from fileSec node.
1076
                foreach ($physNode[0]->children('http://www.loc.gov/METS/')->fptr as $fptr) {
1077
                    // Check if file has valid @USE attribute.
1078
                    if (!empty($fileUse[(string) $fptr->attributes()->FILEID])) {
1079
                        $this->physicalStructureInfo[$physSeq[0]]['files'][$fileUse[(string) $fptr->attributes()->FILEID]] = (string) $fptr->attributes()->FILEID;
1080
                        $this->physicalStructureInfo[$physSeq[0]]['all_files'][$fileUse[(string) $fptr->attributes()->FILEID]][] = (string) $fptr->attributes()->FILEID;
1081
                    }
1082
                }
1083
                // Build the physical elements' array from the physical structMap node.
1084
                foreach ($elementNodes as $elementNode) {
1085
                    $elements[(int) $elementNode['ORDER']] = (string) $elementNode['ID'];
1086
                    $this->physicalStructureInfo[$elements[(int) $elementNode['ORDER']]]['id'] = (string) $elementNode['ID'];
1087
                    $this->physicalStructureInfo[$elements[(int) $elementNode['ORDER']]]['dmdId'] = (isset($elementNode['DMDID']) ? (string) $elementNode['DMDID'] : '');
1088
                    $this->physicalStructureInfo[$elements[(int) $elementNode['ORDER']]]['admId'] = (isset($elementNode['ADMID']) ? (string) $elementNode['ADMID'] : '');
1089
                    $this->physicalStructureInfo[$elements[(int) $elementNode['ORDER']]]['order'] = (isset($elementNode['ORDER']) ? (string) $elementNode['ORDER'] : '');
1090
                    $this->physicalStructureInfo[$elements[(int) $elementNode['ORDER']]]['label'] = (isset($elementNode['LABEL']) ? (string) $elementNode['LABEL'] : '');
1091
                    $this->physicalStructureInfo[$elements[(int) $elementNode['ORDER']]]['orderlabel'] = (isset($elementNode['ORDERLABEL']) ? (string) $elementNode['ORDERLABEL'] : '');
1092
                    $this->physicalStructureInfo[$elements[(int) $elementNode['ORDER']]]['type'] = (string) $elementNode['TYPE'];
1093
                    $this->physicalStructureInfo[$elements[(int) $elementNode['ORDER']]]['contentIds'] = (isset($elementNode['CONTENTIDS']) ? (string) $elementNode['CONTENTIDS'] : '');
1094
                    // Get the file representations from fileSec node.
1095
                    foreach ($elementNode->children('http://www.loc.gov/METS/')->fptr as $fptr) {
1096
                        // Check if file has valid @USE attribute.
1097
                        if (!empty($fileUse[(string) $fptr->attributes()->FILEID])) {
1098
                            $this->physicalStructureInfo[$elements[(int) $elementNode['ORDER']]]['files'][$fileUse[(string) $fptr->attributes()->FILEID]] = (string) $fptr->attributes()->FILEID;
1099
                            // List all files of the fileGrp that are referenced on the page, not only the last one
1100
                            $this->physicalStructureInfo[$elements[(int) $elementNode['ORDER']]]['all_files'][$fileUse[(string) $fptr->attributes()->FILEID]][] = (string) $fptr->attributes()->FILEID;
1101
                        } elseif ($area = $fptr->children('http://www.loc.gov/METS/')->area) {
1102
                            $areaAttrs = $area->attributes();
1103
                            $fileId = (string) $areaAttrs->FILEID;
1104
                            $physInfo = &$this->physicalStructureInfo[$elements[(int) $elementNode['ORDER']]];
1105
1106
                            $physInfo['files'][$fileUse[$fileId]] = $fileId;
1107
                            $physInfo['all_files'][$fileUse[$fileId]][] = $fileId;
1108
                            // Info about how the file is referenced/used on the page
1109
                            $physInfo['fileInfos'][$fileId]['area'] = [
1110
                                'begin' => (string) $areaAttrs->BEGIN,
1111
                                'betype' => (string) $areaAttrs->BETYPE,
1112
                                'extent' => (string) $areaAttrs->EXTENT,
1113
                                'exttype' => (string) $areaAttrs->EXTTYPE,
1114
                            ];
1115
                        }
1116
                    }
1117
                }
1118
                // Sort array by keys (= @ORDER).
1119
                if (ksort($elements)) {
1120
                    // Set total number of pages/tracks.
1121
                    $this->numPages = count($elements);
1122
                    // Merge and re-index the array to get nice numeric indexes.
1123
                    $this->physicalStructure = array_merge($physSeq, $elements);
1124
                }
1125
            }
1126
            $this->physicalStructureLoaded = true;
1127
        }
1128
        return $this->physicalStructure;
1129
    }
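The final ordering step relies on `ksort()` followed by `array_merge()`, which renumbers integer keys so that index 0 holds the physical sequence and the pages follow from index 1. A minimal sketch with a hypothetical helper `orderPhysicalStructure()` and made-up IDs:

```php
<?php
// Hypothetical helper for illustration; not part of MetsDocument.
function orderPhysicalStructure(string $physSeqId, array $pages): array
{
    $elements = [];
    foreach ($pages as $page) {
        // Key each element by its @ORDER value, as the listing does.
        $elements[(int) $page['order']] = $page['id'];
    }
    // Sort by @ORDER, then let array_merge() renumber the integer keys from 0.
    ksort($elements);
    return array_merge([$physSeqId], $elements);
}

$structure = orderPhysicalStructure('PHYS_0000', [
    ['id' => 'PHYS_0002', 'order' => '2'],
    ['id' => 'PHYS_0001', 'order' => '1'],
]);
```

One consequence of keying by `(int) @ORDER` is that elements sharing the same ORDER value, or lacking one, would overwrite each other; whether that can occur depends on the input documents.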
1130
1131
    /**
1132
     * {@inheritDoc}
1133
     * @see \Kitodo\Dlf\Common\Doc::_getSmLinks()
1134
     */
1135
    protected function _getSmLinks()
1136
    {
1137
        if (!$this->smLinksLoaded) {
1138
            $smLinks = $this->mets->xpath('./mets:structLink/mets:smLink');
1139
            if (!empty($smLinks)) {
1140
                foreach ($smLinks as $smLink) {
1141
                    $this->smLinks['l2p'][(string) $smLink->attributes('http://www.w3.org/1999/xlink')->from][] = (string) $smLink->attributes('http://www.w3.org/1999/xlink')->to;
1142
                    $this->smLinks['p2l'][(string) $smLink->attributes('http://www.w3.org/1999/xlink')->to][] = (string) $smLink->attributes('http://www.w3.org/1999/xlink')->from;
1143
                }
1144
            }
1145
            $this->smLinksLoaded = true;
1146
        }
1147
        return $this->smLinks;
1148
    }
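The two lookup tables built from `mets:smLink` map logical IDs to physical IDs (`l2p`) and back (`p2l`). A minimal sketch on a hand-written fragment; `buildSmLinks()` is a hypothetical free-standing variant of the method above.

```php
<?php
// Hypothetical helper for illustration; not part of MetsDocument.
function buildSmLinks(\SimpleXMLElement $mets): array
{
    $mets->registerXPathNamespace('mets', 'http://www.loc.gov/METS/');
    $links = ['l2p' => [], 'p2l' => []];
    foreach ($mets->xpath('./mets:structLink/mets:smLink') ?: [] as $smLink) {
        // xlink:from and xlink:to live in the XLink namespace.
        $xlink = $smLink->attributes('http://www.w3.org/1999/xlink');
        $from = (string) $xlink->from;
        $to = (string) $xlink->to;
        $links['l2p'][$from][] = $to;
        $links['p2l'][$to][] = $from;
    }
    return $links;
}

$mets = simplexml_load_string(
    '<mets:mets xmlns:mets="http://www.loc.gov/METS/" xmlns:xlink="http://www.w3.org/1999/xlink">'
    . '<mets:structLink>'
    . '<mets:smLink xlink:from="LOG_0000" xlink:to="PHYS_0001"/>'
    . '<mets:smLink xlink:from="LOG_0000" xlink:to="PHYS_0002"/>'
    . '<mets:smLink xlink:from="LOG_0001" xlink:to="PHYS_0002"/>'
    . '</mets:structLink></mets:mets>'
);
$smLinks = buildSmLinks($mets);
```

Reading the XLink attributes once per link also avoids the four repeated `attributes()` calls in the listing.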
1149
1150
    /**
1151
     * {@inheritDoc}
1152
     * @see \Kitodo\Dlf\Common\Doc::_getThumbnail()
1153
     */
1154
    protected function _getThumbnail($forceReload = false)
1155
    {
1156
        if (
1157
            !$this->thumbnailLoaded
1158
            || $forceReload
1159
        ) {
1160
            // Retain current PID.
1161
            $cPid = ($this->cPid ? $this->cPid : $this->pid);
1162
            if (!$cPid) {
1163
                $this->logger->error('Invalid PID ' . $cPid . ' for structure definitions');
1164
                $this->thumbnailLoaded = true;
1165
                return $this->thumbnail;
1166
            }
1167
            // Load extension configuration.
1168
            $extConf = GeneralUtility::makeInstance(ExtensionConfiguration::class)->get(self::$extKey);
1169
            if (empty($extConf['fileGrpThumbs'])) {
1170
                $this->logger->warning('No fileGrp for thumbnails specified');
1171
                $this->thumbnailLoaded = true;
1172
                return $this->thumbnail;
1173
            }
1174
            $strctId = $this->_getToplevelId();
1175
            $metadata = $this->getTitledata($cPid);
1176
1177
            $queryBuilder = GeneralUtility::makeInstance(ConnectionPool::class)
1178
                ->getQueryBuilderForTable('tx_dlf_structures');
1179
1180
            // Get structure element to get thumbnail from.
1181
            $result = $queryBuilder
1182
                ->select('tx_dlf_structures.thumbnail AS thumbnail')
1183
                ->from('tx_dlf_structures')
1184
                ->where(
1185
                    $queryBuilder->expr()->eq('tx_dlf_structures.pid', intval($cPid)),
1186
                    $queryBuilder->expr()->eq('tx_dlf_structures.index_name', $queryBuilder->expr()->literal($metadata['type'][0])),
1187
                    Helper::whereExpression('tx_dlf_structures')
1188
                )
1189
                ->setMaxResults(1)
1190
                ->execute();
1191
1192
            $allResults = $result->fetchAll();
1193
1194
            if (count($allResults) == 1) {
1195
                $resArray = $allResults[0];
1196
                // Get desired thumbnail structure if not the toplevel structure itself.
1197
                if (!empty($resArray['thumbnail'])) {
1198
                    $strctType = Helper::getIndexNameFromUid($resArray['thumbnail'], 'tx_dlf_structures', $cPid);
1199
                    // Check if this document has a structure element of the desired type.
1200
                    $strctIds = $this->mets->xpath('./mets:structMap[@TYPE="LOGICAL"]//mets:div[@TYPE="' . $strctType . '"]/@ID');
1201
                    if (!empty($strctIds)) {
1202
                        $strctId = (string) $strctIds[0];
1203
                    }
1204
                }
1205
                // Load smLinks.
1206
                $this->_getSmLinks();
1207
                // Get thumbnail location.
1208
                $fileGrpsThumb = GeneralUtility::trimExplode(',', $extConf['fileGrpThumbs']);
1209
                while ($fileGrpThumb = array_shift($fileGrpsThumb)) {
1210
                    if (
1211
                        $this->_getPhysicalStructure()
1212
                        && !empty($this->smLinks['l2p'][$strctId])
1213
                        && !empty($this->physicalStructureInfo[$this->smLinks['l2p'][$strctId][0]]['files'][$fileGrpThumb])
1214
                    ) {
1215
                        $this->thumbnail = $this->getFileLocation($this->physicalStructureInfo[$this->smLinks['l2p'][$strctId][0]]['files'][$fileGrpThumb]);
1216
                        break;
1217
                    } elseif (!empty($this->physicalStructureInfo[$this->physicalStructure[1]]['files'][$fileGrpThumb])) {
1218
                        $this->thumbnail = $this->getFileLocation($this->physicalStructureInfo[$this->physicalStructure[1]]['files'][$fileGrpThumb]);
1219
                        break;
1220
                    }
1221
                }
1222
            } else {
1223
                $this->logger->error('No structure of type "' . $metadata['type'][0] . '" found in database');
1224
            }
1225
            $this->thumbnailLoaded = true;
1226
        }
1227
        return $this->thumbnail;
1228
    }
1229
1230
    /**
1231
     * {@inheritDoc}
1232
     * @see \Kitodo\Dlf\Common\Doc::_getToplevelId()
1233
     */
1234
    protected function _getToplevelId()
1235
    {
1236
        if (empty($this->toplevelId)) {
1237
            // Get all logical structure nodes with metadata, but without associated METS-Pointers.
1238
            $divs = $this->mets->xpath('./mets:structMap[@TYPE="LOGICAL"]//mets:div[@DMDID and not(./mets:mptr)]');
1239
            if (!empty($divs)) {
1240
                // Load smLinks.
1241
                $this->_getSmLinks();
1242
                foreach ($divs as $div) {
1243
                    $id = (string) $div['ID'];
1244
                    // Are there physical structure nodes for this logical structure?
1245
                    if (array_key_exists($id, $this->smLinks['l2p'])) {
1246
                        // Yes. That's what we're looking for.
1247
                        $this->toplevelId = $id;
1248
                        break;
1249
                    } elseif (empty($this->toplevelId)) {
1250
                        // No. Remember this anyway, but keep looking for a better one.
1251
                        $this->toplevelId = $id;
1252
                    }
1253
                }
1254
            }
1255
        }
1256
        return $this->toplevelId;
1257
    }
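The selection rule in _getToplevelId() prefers the first candidate that has physical links and otherwise falls back to the first candidate overall. A minimal sketch of that rule, assuming a hypothetical helper `pickToplevelId()` that receives the candidate IDs and the `l2p` table as plain arrays:

```php
<?php
// Hypothetical helper for illustration; not part of MetsDocument.
function pickToplevelId(array $candidateIds, array $l2p): string
{
    $toplevelId = '';
    foreach ($candidateIds as $id) {
        if (array_key_exists($id, $l2p)) {
            // A logical node with physical links is the preferred match.
            return $id;
        }
        if ($toplevelId === '') {
            // Remember the first candidate in case nothing better turns up.
            $toplevelId = $id;
        }
    }
    return $toplevelId;
}

$linked = pickToplevelId(['LOG_0000', 'LOG_0001'], ['LOG_0001' => ['PHYS_0001']]);
$unlinked = pickToplevelId(['LOG_0000', 'LOG_0001'], []);
$empty = pickToplevelId([], []);
```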
1258
1259
    /**
1260
     * This magic method is executed prior to any serialization of the object
1261
     * @see __wakeup()
1262
     *
1263
     * @access public
1264
     *
1265
     * @return array Properties to be serialized
1266
     */
1267
    public function __sleep()
1268
    {
1269
        // \SimpleXMLElement objects can't be serialized, thus save the XML as string for serialization
1270
        $this->asXML = $this->xml->asXML();
0 ignored issues
show
Documentation Bug introduced by
It seems like $this->xml->asXML() can also be of type true. However, the property $asXML is declared as type string. Maybe add an additional type check?

Our type inference engine has found a suspicous assignment of a value to a property. This check raises an issue when a value that can be of a mixed type is assigned to a property that is type hinted more strictly.

For example, imagine you have a variable $accountId that can either hold an Id object or false (if there is no account id yet). Your code now assigns that value to the id property of an instance of the Account class. This class holds a proper account, so the id value must no longer be false.

Either this assignment is in error or a type check should be added for that assignment.

class Id
{
    public $id;

    public function __construct($id)
    {
        $this->id = $id;
    }

}

class Account
{
    /** @var  Id $id */
    public $id;
}

$account_id = false;

if (starsAreRight()) {
    $account_id = new Id(42);
}

$account = new Account();
if ($account_id instanceof Id)
{
    $account->id = $account_id;
}
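
Applied to the assignment flagged in __sleep(), the suggested type check could look roughly like the following. This is only a sketch: the helper name xmlToStringOrEmpty() is hypothetical and not part of the Kitodo code base, and falling back to an empty string is an assumption about what the caller would want.

```php
<?php

/**
 * Hypothetical helper (not part of Kitodo): serialize a SimpleXMLElement
 * and guarantee a string result, as the string-typed property expects.
 */
function xmlToStringOrEmpty(\SimpleXMLElement $xml): string
{
    // Called without a filename, asXML() returns a string on success
    // and false on failure, so the result is checked before use.
    $result = $xml->asXML();

    // Assumption: an empty string is an acceptable fallback value.
    return is_string($result) ? $result : '';
}

// Usage with a small stand-in document instead of a real METS file.
$xml = new \SimpleXMLElement('<mets><structMap TYPE="LOGICAL"/></mets>');
$asXML = xmlToStringOrEmpty($xml);
var_dump(is_string($asXML));
```

With a guard like this, the value assigned to $asXML always matches its declared type, at the cost of silently storing an empty string when serialization fails; logging the failure instead may be preferable.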
        return ['uid', 'pid', 'recordId', 'parentId', 'asXML'];
    }

    /**
     * This magic method is used for setting a string value for the object
     *
     * @access public
     *
     * @return string String representing the METS object
     */
    public function __toString()
    {
        $xml = new \DOMDocument('1.0', 'utf-8');
        $xml->appendChild($xml->importNode(dom_import_simplexml($this->mets), true));
        $xml->formatOutput = true;
        return $xml->saveXML();
    }

    /**
     * This magic method is executed after the object is deserialized
     * @see __sleep()
     *
     * @access public
     *
     * @return void
     */
    public function __wakeup()
    {
        $xml = Helper::getXmlFileAsString($this->asXML);
        if ($xml !== false) {
            $this->asXML = '';
            $this->xml = $xml;
            // Rebuild the unserializable properties.
            $this->init('');
        } else {
            $this->logger = GeneralUtility::makeInstance(LogManager::class)->getLogger(static::class);
            $this->logger->error('Could not load XML after deserialization');
        }
    }
}