Passed
Pull Request — master (#818)
by Sebastian
03:23
created

MetsDocument::getFileLocation()   A

Complexity

Conditions 3
Paths 2

Size

Total Lines 11
Code Lines 8

Duplication

Lines 0
Ratio 0 %

Importance

Changes 0
Metric Value
eloc 8
c 0
b 0
f 0
dl 0
loc 11
rs 10
cc 3
nc 2
nop 1
<?php

/**
 * (c) Kitodo. Key to digital objects e.V. <[email protected]>
 *
 * This file is part of the Kitodo and TYPO3 projects.
 *
 * @license GNU General Public License version 3 or later.
 * For the full copyright and license information, please read the
 * LICENSE.txt file that was distributed with this source code.
 */

namespace Kitodo\Dlf\Common;

use TYPO3\CMS\Core\Configuration\ExtensionConfiguration;
use TYPO3\CMS\Core\Database\ConnectionPool;
use TYPO3\CMS\Core\Database\Query\Restriction\HiddenRestriction;
use TYPO3\CMS\Core\Utility\GeneralUtility;
use Ubl\Iiif\Tools\IiifHelper;
use Ubl\Iiif\Services\AbstractImageService;
use TYPO3\CMS\Core\Log\LogManager;

/**
 * MetsDocument class for the 'dlf' extension.
 *
 * @author Sebastian Meyer <[email protected]>
 * @author Henrik Lochmann <[email protected]>
 * @package TYPO3
 * @subpackage dlf
 * @access public
 * @property int $cPid This holds the PID for the configuration
 * @property-read array $mdSec Associative array of METS metadata sections indexed by their IDs.
 * @property-read array $dmdSec Subset of `$mdSec` storing only the dmdSec entries; kept for compatibility.
 * @property-read array $fileGrps This holds the file ID -> USE concordance
 * @property-read array $fileInfos Additional information about files (e.g., ADMID), indexed by ID.
 * @property-read bool $hasFulltext Are there any fulltext files available?
 * @property-read array $metadataArray This holds the documents' parsed metadata array
 * @property-read \SimpleXMLElement $mets This holds the XML file's METS part as \SimpleXMLElement object
 * @property-read int $numPages This holds the total number of pages
 * @property-read int $parentId This holds the UID of the parent document or zero if not multi-volumed
 * @property-read array $physicalStructure This holds the physical structure
 * @property-read array $physicalStructureInfo This holds the physical structure metadata
 * @property-read int $pid This holds the PID of the document or zero if not in database
 * @property-read bool $ready Is the document instantiated successfully?
 * @property-read string $recordId The METS file's / IIIF manifest's record identifier
 * @property-read int $rootId This holds the UID of the root document or zero if not multi-volumed
 * @property-read array $smLinks This holds the smLinks between logical and physical structMap
 * @property-read array $tableOfContents This holds the logical structure
 * @property-read string $thumbnail This holds the document's thumbnail location
 * @property-read string $toplevelId This holds the toplevel structure's @ID (METS) or the manifest's @id (IIIF)
 * @property-read string $parentHref URL of the parent document (determined via mptr element), or empty string if none is available
 */
final class MetsDocument extends Doc
{
    /**
     * Subsections / tags that may occur within `<mets:amdSec>`.
     *
     * @link https://www.loc.gov/standards/mets/docs/mets.v1-9.html#amdSec
     * @link https://www.loc.gov/standards/mets/docs/mets.v1-9.html#mdSecType
     *
     * @var string[]
     */
    protected const ALLOWED_AMD_SEC = ['techMD', 'rightsMD', 'sourceMD', 'digiprovMD'];

    /**
     * This holds the whole XML file as string for serialization purposes
     * @see __sleep() / __wakeup()
     *
     * @var string
     * @access protected
     */
    protected $asXML = '';

    /**
     * This maps the ID of each amdSec to the IDs of its children (techMD etc.).
     * When an ADMID references an amdSec instead of techMD etc., this is used to iterate the child elements.
     *
     * @var string[]
     * @access protected
     */
    protected $amdSecChildIds = [];

    /**
     * Associative array of METS metadata sections indexed by their IDs.
     *
     * @var array
     * @access protected
     */
    protected $mdSec = [];

    /**
     * Are the METS file's metadata sections loaded?
     * @see MetsDocument::$mdSec
     *
     * @var bool
     * @access protected
     */
    protected $mdSecLoaded = false;

    /**
     * Subset of $mdSec storing only the dmdSec entries; kept for compatibility.
     *
     * @var array
     * @access protected
     */
    protected $dmdSec = [];

    /**
     * The extension key
     *
     * @var string
     * @access public
     */
    public static $extKey = 'dlf';

    /**
     * This holds the file ID -> USE concordance
     * @see _getFileGrps()
     *
     * @var array
     * @access protected
     */
    protected $fileGrps = [];

    /**
     * Are the image file groups loaded?
     * @see $fileGrps
     *
     * @var bool
     * @access protected
     */
    protected $fileGrpsLoaded = false;

    /**
     * Additional information about files (e.g., ADMID), indexed by ID.
     * TODO: Consider using this for `getFileMimeType()` and `getFileLocation()`.
     * @see _getFileInfos()
     *
     * @var array
     * @access protected
     */
    protected $fileInfos = [];

    /**
     * This holds the XML file's METS part as \SimpleXMLElement object
     *
     * @var \SimpleXMLElement
     * @access protected
     */
    protected $mets;

    /**
     * This holds the whole XML file as \SimpleXMLElement object
     *
     * @var \SimpleXMLElement
     * @access protected
     */
    protected $xml;

    /**
     * URL of the parent document (determined via mptr element),
     * or empty string if none is available
     *
     * @var string|null
     * @access protected
     */
    protected $parentHref;

    /**
     * This adds metadata from METS structural map to metadata array.
     *
     * @access public
     *
     * @param array &$metadata: The metadata array to extend
     * @param string $id: The "@ID" attribute of the logical structure node
     *
     * @return void
     */
    public function addMetadataFromMets(&$metadata, $id)
    {
        $details = $this->getLogicalStructure($id);
        if (!empty($details)) {
            $metadata['mets_order'][0] = $details['order'];
            $metadata['mets_label'][0] = $details['label'];
            $metadata['mets_orderlabel'][0] = $details['orderlabel'];
        }
    }

    /**
     *
     * {@inheritDoc}
     * @see \Kitodo\Dlf\Common\Doc::establishRecordId()
     */
    protected function establishRecordId($pid)
    {
        // Check for METS object @ID.
        if (!empty($this->mets['OBJID'])) {
            $this->recordId = (string) $this->mets['OBJID'];
        }
        // Get hook objects.
        $hookObjects = Helper::getHookObjects('Classes/Common/MetsDocument.php');
        // Apply hooks.
        foreach ($hookObjects as $hookObj) {
            if (method_exists($hookObj, 'construct_postProcessRecordId')) {
                $hookObj->construct_postProcessRecordId($this->xml, $this->recordId);
            }
        }
    }

    /**
     *
     * {@inheritDoc}
     * @see \Kitodo\Dlf\Common\Doc::getDownloadLocation()
     */
    public function getDownloadLocation($id)
    {
        $fileMimeType = $this->getFileMimeType($id);
        $fileLocation = $this->getFileLocation($id);
        if ($fileMimeType === 'application/vnd.kitodo.iiif') {
            // Make sure the location points at the image service's info.json.
            // A trailing slash sits at index strlen() - 1, so compare against that position.
            $fileLocation = (strrpos($fileLocation, 'info.json') === strlen($fileLocation) - 9) ? $fileLocation : (strrpos($fileLocation, '/') === strlen($fileLocation) - 1 ? $fileLocation . 'info.json' : $fileLocation . '/info.json');
            $conf = GeneralUtility::makeInstance(ExtensionConfiguration::class)->get(self::$extKey);
            IiifHelper::setUrlReader(IiifUrlReader::getInstance());
            IiifHelper::setMaxThumbnailHeight($conf['iiifThumbnailHeight']);
            IiifHelper::setMaxThumbnailWidth($conf['iiifThumbnailWidth']);
            $service = IiifHelper::loadIiifResource($fileLocation);
            if ($service !== null && $service instanceof AbstractImageService) {
                return $service->getImageUrl();
            }
        } elseif ($fileMimeType === 'application/vnd.netfpx') {
            $baseURL = $fileLocation . (strpos($fileLocation, '?') === false ? '?' : '');
            // TODO CVT is an optional IIP server capability; in theory, capabilities should be determined in the object request with '&obj=IIP-server'
            return $baseURL . '&CVT=jpeg';
        }
        return $fileLocation;
    }

    /**
     * {@inheritDoc}
     * @see \Kitodo\Dlf\Common\Doc::getFileLocation()
     */
    public function getFileLocation($id)
    {
        $location = $this->mets->xpath('./mets:fileSec/mets:fileGrp/mets:file[@ID="' . $id . '"]/mets:FLocat[@LOCTYPE="URL"]');
        if (
            !empty($id)
            && !empty($location)
        ) {
            return (string) $location[0]->attributes('http://www.w3.org/1999/xlink')->href;
        } else {
            $this->logger->warning('There is no file node with @ID "' . $id . '"');
            return '';
        }
    }

    /**
     * {@inheritDoc}
     * @see \Kitodo\Dlf\Common\Doc::getFileMimeType()
     */
    public function getFileMimeType($id)
    {
        $mimetype = $this->mets->xpath('./mets:fileSec/mets:fileGrp/mets:file[@ID="' . $id . '"]/@MIMETYPE');
        if (
            !empty($id)
            && !empty($mimetype)
        ) {
            return (string) $mimetype[0];
        } else {
            $this->logger->warning('There is no file node with @ID "' . $id . '" or no MIME type specified');
            return '';
        }
    }

    /**
     * {@inheritDoc}
     * @see \Kitodo\Dlf\Common\Doc::getLogicalStructure()
     */
    public function getLogicalStructure($id, $recursive = false)
    {
        $details = [];
        // Is the requested logical unit already loaded?
        if (
            !$recursive
            && !empty($this->logicalUnits[$id])
        ) {
            // Yes. Return it.
            return $this->logicalUnits[$id];
        } elseif (!empty($id)) {
            // Get specified logical unit.
            $divs = $this->mets->xpath('./mets:structMap[@TYPE="LOGICAL"]//mets:div[@ID="' . $id . '"]');
        } else {
            // Get all logical units at top level.
            $divs = $this->mets->xpath('./mets:structMap[@TYPE="LOGICAL"]/mets:div');
        }
        if (!empty($divs)) {
            if (!$recursive) {
                // Get the details for the first xpath hit.
                $details = $this->getLogicalStructureInfo($divs[0]);
            } else {
                // Walk the logical structure recursively and fill the whole table of contents.
                foreach ($divs as $div) {
                    $this->tableOfContents[] = $this->getLogicalStructureInfo($div, $recursive);
                }
            }
        }
        return $details;
    }

    /**
     * This gets details about a logical structure element
     *
     * @access protected
     *
     * @param \SimpleXMLElement $structure: The logical structure node
     * @param bool $recursive: Whether to include the child elements
     *
     * @return array Array of the element's id, label, type and physical page indexes/mptr link
     */
    protected function getLogicalStructureInfo(\SimpleXMLElement $structure, $recursive = false)
    {
        // Get attributes.
        $attributes = [];
        foreach ($structure->attributes() as $attribute => $value) {
            $attributes[$attribute] = (string) $value;
        }
        // Load plugin configuration.
        $extConf = GeneralUtility::makeInstance(ExtensionConfiguration::class)->get(self::$extKey);
        // Extract identity information.
        $details = [];
        $details['id'] = $attributes['ID'];
        $details['dmdId'] = (isset($attributes['DMDID']) ? $attributes['DMDID'] : '');
        $details['admId'] = (isset($attributes['ADMID']) ? $attributes['ADMID'] : '');
        $details['order'] = (isset($attributes['ORDER']) ? $attributes['ORDER'] : '');
        $details['label'] = (isset($attributes['LABEL']) ? $attributes['LABEL'] : '');
        $details['orderlabel'] = (isset($attributes['ORDERLABEL']) ? $attributes['ORDERLABEL'] : '');
        $details['contentIds'] = (isset($attributes['CONTENTIDS']) ? $attributes['CONTENTIDS'] : '');
        $details['volume'] = '';
        // Set volume information only if no label is set and this is the toplevel structure element.
        if (
            empty($details['label'])
            && $details['id'] == $this->_getToplevelId()
        ) {
            $metadata = $this->getMetadata($details['id']);
            if (!empty($metadata['volume'][0])) {
                $details['volume'] = $metadata['volume'][0];
            }
        }
        $details['pagination'] = '';
        $details['type'] = $attributes['TYPE'];
        $details['thumbnailId'] = '';
        // Load smLinks.
        $this->_getSmLinks();
        // Load physical structure.
        $this->_getPhysicalStructure();
        // Get the physical page or external file this structure element is pointing at.
        $details['points'] = '';
        // Is there a mptr node?
        if (count($structure->children('http://www.loc.gov/METS/')->mptr)) {
            // Yes. Get the file reference.
            $details['points'] = (string) $structure->children('http://www.loc.gov/METS/')->mptr[0]->attributes('http://www.w3.org/1999/xlink')->href;
        } elseif (
            !empty($this->physicalStructure)
            && array_key_exists($details['id'], $this->smLinks['l2p'])
        ) {
            // Link logical structure to the first corresponding physical page/track.
            $details['points'] = max(intval(array_search($this->smLinks['l2p'][$details['id']][0], $this->physicalStructure, true)), 1);
            $fileGrpsThumb = GeneralUtility::trimExplode(',', $extConf['fileGrpThumbs']);
            while ($fileGrpThumb = array_shift($fileGrpsThumb)) {
                if (!empty($this->physicalStructureInfo[$this->smLinks['l2p'][$details['id']][0]]['files'][$fileGrpThumb])) {
                    $details['thumbnailId'] = $this->physicalStructureInfo[$this->smLinks['l2p'][$details['id']][0]]['files'][$fileGrpThumb];
                    break;
                }
            }
            // Get page/track number of the first page/track related to this structure element.
            $details['pagination'] = $this->physicalStructureInfo[$this->smLinks['l2p'][$details['id']][0]]['orderlabel'];
            $details['videoChapter'] = $this->getTimecode($details, GeneralUtility::trimExplode(',', $extConf['fileGrpVideo']));
        } elseif ($details['id'] == $this->_getToplevelId()) {
            // Point to self if this is the toplevel structure.
            $details['points'] = 1;
            $fileGrpsThumb = GeneralUtility::trimExplode(',', $extConf['fileGrpThumbs']);
            while ($fileGrpThumb = array_shift($fileGrpsThumb)) {
                if (
                    !empty($this->physicalStructure)
                    && !empty($this->physicalStructureInfo[$this->physicalStructure[1]]['files'][$fileGrpThumb])
                ) {
                    $details['thumbnailId'] = $this->physicalStructureInfo[$this->physicalStructure[1]]['files'][$fileGrpThumb];
                    break;
                }
            }
        }
        // Get the files this structure element is pointing at.
        $details['files'] = [];
        $details['all_files'] = [];
        $fileUse = $this->_getFileGrps();
        // Get the file representations from fileSec node.
        foreach ($structure->children('http://www.loc.gov/METS/')->fptr as $fptr) {
            // Check if file has valid @USE attribute.
            if (!empty($fileUse[(string) $fptr->attributes()->FILEID])) {
                $details['files'][$fileUse[(string) $fptr->attributes()->FILEID]] = (string) $fptr->attributes()->FILEID;
                $details['all_files'][$fileUse[(string) $fptr->attributes()->FILEID]][] = (string) $fptr->attributes()->FILEID;
            }
        }
        // Keep for later usage.
        $this->logicalUnits[$details['id']] = $details;
        // Walk the structure recursively? And are there any children of the current element?
        if (
            $recursive
            && count($structure->children('http://www.loc.gov/METS/')->div)
        ) {
            $details['children'] = [];
            foreach ($structure->children('http://www.loc.gov/METS/')->div as $child) {
                // Repeat for all children.
                $details['children'][] = $this->getLogicalStructureInfo($child, true);
            }
        }
        return $details;
    }

    /**
     * Get timecode and file IDs that link to first matching fileGrp/USE.
     *
     * Returns either `null` or an array with the following keys:
     * - `fileIds`: Array of linked file IDs
     * - `fileIdsJoin`: String where all `fileIds` are joined using ','.
     *    This is for convenience when passing `fileIds` in a Fluid template or similar.
     * - `timecode`: Time code specified in first matching `<mets:area>`
     *
     * @param array $logInfo
     * @param array $fileGrps
     * @return ?array
     */
    protected function getTimecode($logInfo, $fileGrps)
    {
        foreach ($fileGrps as $fileGrp) {
            $physInfo = $this->physicalStructureInfo[$this->smLinks['l2p'][$logInfo['id']][0]];
            $fileIds = $physInfo['all_files'][$fileGrp] ?? [];

            $chapter = null;

            foreach ($fileIds as $fileId) {
                $fileArea = $physInfo['fileInfos'][$fileId]['area'] ?? '';
                if (empty($fileArea) || $fileArea['betype'] !== 'TIME') {
                    continue;
                }

                if ($chapter === null) {
                    $chapter = [
                        'fileIds' => [],
                        'timecode' => Helper::timecodeToSeconds($fileArea['begin']),
                    ];
                }

                $chapter['fileIds'][] = $fileId;
            }

            if ($chapter !== null) {
                $chapter['fileIdsJoin'] = implode(',', $chapter['fileIds']);
                return $chapter;
            }
        }
        // No file group contained a time-based area.
        return null;
    }

    /**
     * {@inheritDoc}
     * @see \Kitodo\Dlf\Common\Doc::getMetadata()
     */
    public function getMetadata($id, $cPid = 0)
    {
        // Make sure $cPid is a non-negative integer.
        $cPid = max(intval($cPid), 0);
        // If $cPid is not given, try to get it elsewhere.
        if (
            !$cPid
            && ($this->cPid || $this->pid)
        ) {
            // Retain current PID.
            $cPid = ($this->cPid ? $this->cPid : $this->pid);
        } elseif (!$cPid) {
            $this->logger->warning('Invalid PID ' . $cPid . ' for metadata definitions');
            return [];
        }
        // Get metadata from parsed metadata array if available.
        if (
            !empty($this->metadataArray[$id])
            && $this->metadataArray[0] == $cPid
        ) {
            return $this->metadataArray[$id];
        }
        // Initialize metadata array with empty values.
        $metadata = [
            'title' => [],
            'title_sorting' => [],
            'author' => [],
            'place' => [],
            'year' => [],
            'prod_id' => [],
            'record_id' => [],
            'opac_id' => [],
            'union_id' => [],
            'urn' => [],
            'purl' => [],
            'type' => [],
            'volume' => [],
            'volume_sorting' => [],
            'license' => [],
            'terms' => [],
            'restrictions' => [],
            'out_of_print' => [],
            'rights_info' => [],
            'collection' => [],
            'owner' => [],
            'mets_label' => [],
            'mets_orderlabel' => [],
            'document_format' => ['METS'],
        ];
        $mdIds = $this->getMetadataIds($id);
        if (empty($mdIds)) {
            // There is no metadata section for this structure node.
            return [];
        }
        // Associative array used as set of available section types (dmdSec, techMD, ...)
        $hasMetadataSection = [];
        // Load available metadata formats and metadata sections.
        $this->loadFormats();
        $this->_getMdSec();
        // Get the structure's type.
        if (!empty($this->logicalUnits[$id])) {
            $metadata['type'] = [$this->logicalUnits[$id]['type']];
        } else {
            $struct = $this->mets->xpath('./mets:structMap[@TYPE="LOGICAL"]//mets:div[@ID="' . $id . '"]/@TYPE');
            if (!empty($struct)) {
                $metadata['type'] = [(string) $struct[0]];
            }
        }
        foreach ($mdIds as $dmdId) {
            $mdSectionType = $this->mdSec[$dmdId]['section'];

            // To preserve behavior of previous Kitodo versions, extract metadata only from first supported dmdSec
            // However, we want to extract, for example, all techMD sections (VIDEOMD, AUDIOMD)
            if ($mdSectionType === 'dmdSec' && isset($hasMetadataSection['dmdSec'])) {
                continue;
            }

            // Is this metadata format supported?
            if (!empty($this->formats[$this->mdSec[$dmdId]['type']])) {
                if (!empty($this->formats[$this->mdSec[$dmdId]['type']]['class'])) {
                    $class = $this->formats[$this->mdSec[$dmdId]['type']]['class'];
                    // Get the metadata from class.
                    if (
                        class_exists($class)
                        && ($obj = GeneralUtility::makeInstance($class)) instanceof MetadataInterface
                    ) {
                        $obj->extractMetadata($this->mdSec[$dmdId]['xml'], $metadata);
                    } else {
                        $this->logger->warning('Invalid class/method "' . $class . '->extractMetadata()" for metadata format "' . $this->mdSec[$dmdId]['type'] . '"');
                    }
                }
            } else {
                $this->logger->notice('Unsupported metadata format "' . $this->mdSec[$dmdId]['type'] . '" in ' . $mdSectionType . ' with @ID "' . $dmdId . '"');
                // Continue searching for supported metadata with next @DMDID.
                continue;
            }
            // Get the additional metadata from database.
            $queryBuilder = GeneralUtility::makeInstance(ConnectionPool::class)
                ->getQueryBuilderForTable('tx_dlf_metadata');
            // Get hidden records, too.
            $queryBuilder
                ->getRestrictions()
                ->removeByType(HiddenRestriction::class);
            // Get all metadata with configured xpath and applicable format first.
            $resultWithFormat = $queryBuilder
                ->select(
                    'tx_dlf_metadata.index_name AS index_name',
                    'tx_dlf_metadataformat_joins.xpath AS xpath',
                    'tx_dlf_metadataformat_joins.xpath_sorting AS xpath_sorting',
                    'tx_dlf_metadata.is_sortable AS is_sortable',
                    'tx_dlf_metadata.default_value AS default_value',
                    'tx_dlf_metadata.format AS format'
                )
                ->from('tx_dlf_metadata')
                ->innerJoin(
                    'tx_dlf_metadata',
                    'tx_dlf_metadataformat',
                    'tx_dlf_metadataformat_joins',
                    $queryBuilder->expr()->eq(
                        'tx_dlf_metadataformat_joins.parent_id',
                        'tx_dlf_metadata.uid'
                    )
                )
                ->innerJoin(
                    'tx_dlf_metadataformat_joins',
                    'tx_dlf_formats',
                    'tx_dlf_formats_joins',
                    $queryBuilder->expr()->eq(
                        'tx_dlf_formats_joins.uid',
                        'tx_dlf_metadataformat_joins.encoded'
                    )
                )
                ->where(
                    $queryBuilder->expr()->eq('tx_dlf_metadata.pid', intval($cPid)),
                    $queryBuilder->expr()->eq('tx_dlf_metadata.l18n_parent', 0),
                    $queryBuilder->expr()->eq('tx_dlf_metadataformat_joins.pid', intval($cPid)),
                    $queryBuilder->expr()->eq('tx_dlf_formats_joins.type', $queryBuilder->createNamedParameter($this->mdSec[$dmdId]['type']))
                )
                ->execute();
            // Get all metadata without a format, but with a default value next.
            $queryBuilder = GeneralUtility::makeInstance(ConnectionPool::class)
                ->getQueryBuilderForTable('tx_dlf_metadata');
            // Get hidden records, too.
            $queryBuilder
                ->getRestrictions()
                ->removeByType(HiddenRestriction::class);
            $resultWithoutFormat = $queryBuilder
                ->select(
                    'tx_dlf_metadata.index_name AS index_name',
                    'tx_dlf_metadata.is_sortable AS is_sortable',
                    'tx_dlf_metadata.default_value AS default_value',
                    'tx_dlf_metadata.format AS format'
                )
                ->from('tx_dlf_metadata')
                ->where(
                    $queryBuilder->expr()->eq('tx_dlf_metadata.pid', intval($cPid)),
                    $queryBuilder->expr()->eq('tx_dlf_metadata.l18n_parent', 0),
                    $queryBuilder->expr()->eq('tx_dlf_metadata.format', 0),
                    $queryBuilder->expr()->neq('tx_dlf_metadata.default_value', $queryBuilder->createNamedParameter(''))
                )
                ->execute();
            // Merge both result sets.
            $allResults = array_merge($resultWithFormat->fetchAll(), $resultWithoutFormat->fetchAll());
            // We need a \DOMDocument here, because SimpleXML doesn't support XPath functions properly.
            $domNode = dom_import_simplexml($this->mdSec[$dmdId]['xml']);
            $domXPath = new \DOMXPath($domNode->ownerDocument);
            $this->registerNamespaces($domXPath);
            // OK, now make the XPath queries.
            foreach ($allResults as $resArray) {
                // Set metadata field's value(s).
                if (
                    $resArray['format'] > 0
                    && !empty($resArray['xpath'])
                    && ($values = $domXPath->evaluate($resArray['xpath'], $domNode))
                ) {
                    if (
                        $values instanceof \DOMNodeList
                        && $values->length > 0
                    ) {
                        $metadata[$resArray['index_name']] = [];
                        foreach ($values as $value) {
                            $metadata[$resArray['index_name']][] = trim((string) $value->nodeValue);
                        }
                    } elseif (!($values instanceof \DOMNodeList)) {
                        $metadata[$resArray['index_name']] = [trim((string) $values)];
                    }
                }
                // Set default value if applicable.
                if (
                    empty($metadata[$resArray['index_name']][0])
                    && strlen($resArray['default_value']) > 0
                ) {
                    $metadata[$resArray['index_name']] = [$resArray['default_value']];
                }
                // Set sorting value if applicable.
                if (
                    !empty($metadata[$resArray['index_name']])
                    && $resArray['is_sortable']
                ) {
                    if (
                        $resArray['format'] > 0
                        && !empty($resArray['xpath_sorting'])
                        && ($values = $domXPath->evaluate($resArray['xpath_sorting'], $domNode))
                    ) {
                        if (
                            $values instanceof \DOMNodeList
                            && $values->length > 0
                        ) {
                            $metadata[$resArray['index_name'] . '_sorting'][0] = trim((string) $values->item(0)->nodeValue);
                        } elseif (!($values instanceof \DOMNodeList)) {
                            $metadata[$resArray['index_name'] . '_sorting'][0] = trim((string) $values);
                        }
                    }
                    if (empty($metadata[$resArray['index_name'] . '_sorting'][0])) {
                        $metadata[$resArray['index_name'] . '_sorting'][0] = $metadata[$resArray['index_name']][0];
                    }
                }
            }

            $hasMetadataSection[$mdSectionType] = true;
        }
        // Set title to empty string if not present.
        if (empty($metadata['title'][0])) {
            $metadata['title'][0] = '';
            $metadata['title_sorting'][0] = '';
        }
        // Files are not expected to reference a dmdSec
        if (isset($this->fileInfos[$id]) || isset($hasMetadataSection['dmdSec'])) {
            return $metadata;
        } else {
            $this->logger->warning('No supported descriptive metadata found for logical structure with @ID "' . $id . '"');
            return [];
        }
    }

    /**
     * Get IDs of (descriptive and administrative) metadata sections
     * referenced by the node with the given $id. The $id may refer to either
     * a logical structure node or to a file.
     *
     * @access protected
     * @param string $id: The "@ID" attribute of the logical structure node or file node
     * @return array IDs of the referenced metadata sections
     */
    protected function getMetadataIds($id)
    {
        // Load amdSecChildIds concordance
        $this->_getMdSec();
        $this->_getFileInfos();

        // Get DMDID and ADMID of logical structure node
        if (!empty($this->logicalUnits[$id])) {
            $dmdIds = $this->logicalUnits[$id]['dmdId'] ?? '';
            $admIds = $this->logicalUnits[$id]['admId'] ?? '';
        } else {
            // xpath() yields an empty array when no node matches, so guard the index access.
            $mdSec = $this->mets->xpath('./mets:structMap[@TYPE="LOGICAL"]//mets:div[@ID="' . $id . '"]')[0] ?? null;
            if ($mdSec !== null) {
                $dmdIds = (string) $mdSec->attributes()->DMDID;
                $admIds = (string) $mdSec->attributes()->ADMID;
            } elseif (isset($this->fileInfos[$id])) {
                $dmdIds = $this->fileInfos[$id]['dmdId'];
                $admIds = $this->fileInfos[$id]['admId'];
            } else {
                $dmdIds = '';
                $admIds = '';
            }
        }

        // Handle multiple DMDIDs/ADMIDs
        $allMdIds = explode(' ', $dmdIds);

        foreach (explode(' ', $admIds) as $admId) {
            if (isset($this->mdSec[$admId])) {
                // $admId references an actual metadata section such as techMD
                $allMdIds[] = $admId;
            } elseif (isset($this->amdSecChildIds[$admId])) {
                // $admId references a <mets:amdSec> element. Resolve child elements.
                foreach ($this->amdSecChildIds[$admId] as $childId) {
                    $allMdIds[] = $childId;
                }
            }
        }

        return array_filter($allMdIds, function ($element) {
            return !empty($element);
        });
    }

    /**
     * {@inheritDoc}
     * @see \Kitodo\Dlf\Common\Doc::getFullText()
     */
    public function getFullText($id)
    {
        $fullText = '';

        // Load fileGrps and check for full text files.
        $this->_getFileGrps();
        if ($this->hasFulltext) {
            $fullText = $this->getFullTextFromXml($id);
        }
        return $fullText;
    }

    /**
     * {@inheritDoc}
     * @see Doc::getStructureDepth()
     */
    public function getStructureDepth($logId)
    {
        $ancestors = $this->mets->xpath('./mets:structMap[@TYPE="LOGICAL"]//mets:div[@ID="' . $logId . '"]/ancestor::*');
        if (!empty($ancestors)) {
            return count($ancestors);
        } else {
            return 0;
        }
    }

    /**
     * {@inheritDoc}
     * @see \Kitodo\Dlf\Common\Doc::init()
     */
    protected function init($location)
    {
        $this->logger = GeneralUtility::makeInstance(LogManager::class)->getLogger(get_class($this));
        // Get METS node from XML file.
        $this->registerNamespaces($this->xml);
        $mets = $this->xml->xpath('//mets:mets');
        if (!empty($mets)) {
            $this->mets = $mets[0];
            // Register namespaces.
            $this->registerNamespaces($this->mets);
        } else {
            if (!empty($location)) {
                $this->logger->error('No METS part found in document with location "' . $location . '".');
            } elseif (!empty($this->recordId)) {
                $this->logger->error('No METS part found in document with recordId "' . $this->recordId . '".');
            } else {
                $this->logger->error('No METS part found in current document.');
            }
        }
    }

    /**
     * {@inheritDoc}
     * @see \Kitodo\Dlf\Common\Doc::loadLocation()
     */
    protected function loadLocation($location)
    {
        $fileResource = Helper::getUrl($location);
        if ($fileResource !== false) {
            $xml = Helper::getXmlFileAsString($fileResource);
            // Set some basic properties.
            if ($xml !== false) {
                $this->xml = $xml;
                return true;
            }
        }
        $this->logger->error('Could not load XML file from "' . $location . '"');
        return false;
    }

    /**
     * {@inheritDoc}
     * @see \Kitodo\Dlf\Common\Doc::ensureHasFulltextIsSet()
     */
    protected function ensureHasFulltextIsSet()
    {
        // Are the fileGrps already loaded?
        if (!$this->fileGrpsLoaded) {
            $this->_getFileGrps();
        }
    }

    /**
     * {@inheritDoc}
     * @see Doc::setPreloadedDocument()
     */
    protected function setPreloadedDocument($preloadedDocument)
    {

        if ($preloadedDocument instanceof \SimpleXMLElement) {
            $this->xml = $preloadedDocument;
            return true;
        }
        return false;
    }

    /**
     * {@inheritDoc}
     * @see Doc::getDocument()
     */
    protected function getDocument()
    {
        return $this->mets;
    }

    /**
     * This builds an array of the document's metadata sections
     *
     * @access protected
     *
     * @return array Array of metadata sections with their IDs as array key
     */
    protected function _getMdSec()
    {
        if (!$this->mdSecLoaded) {
            $this->loadFormats();

            foreach ($this->mets->xpath('./mets:dmdSec') as $dmdSecTag) {
                $dmdSec = $this->processMdSec($dmdSecTag);

                if ($dmdSec !== null) {
                    $this->mdSec[$dmdSec['id']] = $dmdSec;
                    $this->dmdSec[$dmdSec['id']] = $dmdSec;
                }
            }

            foreach ($this->mets->xpath('./mets:amdSec') as $amdSecTag) {
                $childIds = [];

                foreach ($amdSecTag->children('http://www.loc.gov/METS/') as $mdSecTag) {
                    if (!in_array($mdSecTag->getName(), self::ALLOWED_AMD_SEC)) {
                        continue;
                    }

                    // TODO: Should we check that the format may occur within this type (e.g., to ignore VIDEOMD within rightsMD)?
                    $mdSec = $this->processMdSec($mdSecTag);

                    if ($mdSec !== null) {
                        $this->mdSec[$mdSec['id']] = $mdSec;

                        $childIds[] = $mdSec['id'];
                    }
                }

                $amdSecId = (string) $amdSecTag->attributes()->ID;
                if (!empty($amdSecId)) {
                    $this->amdSecChildIds[$amdSecId] = $childIds;
                }
            }

            $this->mdSecLoaded = true;
        }
        return $this->mdSec;
    }

    protected function _getDmdSec()
    {
        $this->_getMdSec();
        return $this->dmdSec;
    }

    /**
     * Processes an element of METS `mdSecType`.
     *
     * @access protected
     *
     * @param \SimpleXMLElement $element
     *
     * @return array|null The processed metadata section
     */
    protected function processMdSec($element)
    {
        $mdId = (string) $element->attributes()->ID;
        if (empty($mdId)) {
            return null;
        }

        // Stays null if no supported metadata format is found below.
        $xml = null;

        $this->registerNamespaces($element);
        if ($type = $element->xpath('./mets:mdWrap[not(@MDTYPE="OTHER")]/@MDTYPE')) {
            if (!empty($this->formats[(string) $type[0]])) {
                $type = (string) $type[0];
                $xml = $element->xpath('./mets:mdWrap[@MDTYPE="' . $type . '"]/mets:xmlData/' . strtolower($type) . ':' . $this->formats[$type]['rootElement']);
            }
        } elseif ($type = $element->xpath('./mets:mdWrap[@MDTYPE="OTHER"]/@OTHERMDTYPE')) {
            if (!empty($this->formats[(string) $type[0]])) {
                $type = (string) $type[0];
                $xml = $element->xpath('./mets:mdWrap[@MDTYPE="OTHER"][@OTHERMDTYPE="' . $type . '"]/mets:xmlData/' . strtolower($type) . ':' . $this->formats[$type]['rootElement']);
            }
        }

        if (empty($xml)) {
            return null;
        }

        $this->registerNamespaces($xml[0]);

        return [
            'id' => $mdId,
            'section' => $element->getName(),
            'type' => $type,
            'xml' => $xml[0],
        ];
    }

    /**
     * This builds the file ID -> USE concordance
     *
     * @access protected
     *
     * @return array Array of file use groups with file IDs
     */
    protected function _getFileGrps()
    {
        if (!$this->fileGrpsLoaded) {
            // Get configured USE attributes.
            $extConf = GeneralUtility::makeInstance(ExtensionConfiguration::class)->get(self::$extKey);
            $useGrps = GeneralUtility::trimExplode(',', $extConf['fileGrpImages']);
            if (!empty($extConf['fileGrpThumbs'])) {
                $useGrps = array_merge($useGrps, GeneralUtility::trimExplode(',', $extConf['fileGrpThumbs']));
            }
            if (!empty($extConf['fileGrpDownload'])) {
                $useGrps = array_merge($useGrps, GeneralUtility::trimExplode(',', $extConf['fileGrpDownload']));
            }
            if (!empty($extConf['fileGrpFulltext'])) {
                $useGrps = array_merge($useGrps, GeneralUtility::trimExplode(',', $extConf['fileGrpFulltext']));
            }
            if (!empty($extConf['fileGrpAudio'])) {
                $useGrps = array_merge($useGrps, GeneralUtility::trimExplode(',', $extConf['fileGrpAudio']));
            }
            if (!empty($extConf['fileGrpVideo'])) {
                $useGrps = array_merge($useGrps, GeneralUtility::trimExplode(',', $extConf['fileGrpVideo']));
            }
            if (!empty($extConf['fileGrpWaveform'])) {
                $useGrps = array_merge($useGrps, GeneralUtility::trimExplode(',', $extConf['fileGrpWaveform']));
            }
            // Get all file groups.
            $fileGrps = $this->mets->xpath('./mets:fileSec/mets:fileGrp');
            if (!empty($fileGrps)) {
                // Build concordance for configured USE attributes.
                foreach ($fileGrps as $fileGrp) {
                    if (in_array((string) $fileGrp['USE'], $useGrps)) {
                        foreach ($fileGrp->children('http://www.loc.gov/METS/')->file as $file) {
                            $fileId = (string) $file->attributes()->ID;
                            $this->fileGrps[$fileId] = (string) $fileGrp['USE'];
                            $this->fileInfos[$fileId] = [
                                'fileGrp' => (string) $fileGrp['USE'],
                                'admId' => (string) $file->attributes()->ADMID,
                                'dmdId' => (string) $file->attributes()->DMDID,
                            ];
                        }
                    }
                }
            }
            // Are there any fulltext files available?
            if (
                !empty($extConf['fileGrpFulltext'])
                && array_intersect(GeneralUtility::trimExplode(',', $extConf['fileGrpFulltext']), $this->fileGrps) !== []
            ) {
                $this->hasFulltext = true;
            }
            $this->fileGrpsLoaded = true;
        }
        return $this->fileGrps;
    }

    /**
     * Returns the file ID -> file info (fileGrp, ADMID, DMDID) concordance,
     * loading the file groups first if necessary.
     *
     * @access protected
     * @return array
     */
    protected function _getFileInfos()
    {
        $this->_getFileGrps();
        return $this->fileInfos;
    }

    /**
     * {@inheritDoc}
     * @see \Kitodo\Dlf\Common\Doc::prepareMetadataArray()
     */
    protected function prepareMetadataArray($cPid)
    {
        $ids = $this->mets->xpath('./mets:structMap[@TYPE="LOGICAL"]//mets:div[@DMDID]/@ID');
        // Get all logical structure nodes with metadata.
        if (!empty($ids)) {
            foreach ($ids as $id) {
                $this->metadataArray[(string) $id] = $this->getMetadata((string) $id, $cPid);
            }
        }
        // Set current PID for metadata definitions.
    }

    /**
     * This returns $this->mets via __get()
     *
     * @access protected
     *
     * @return \SimpleXMLElement The XML's METS part as \SimpleXMLElement object
     */
    protected function _getMets()
    {
        return $this->mets;
    }

    /**
     * {@inheritDoc}
     * @see \Kitodo\Dlf\Common\Doc::_getPhysicalStructure()
     */
    protected function _getPhysicalStructure()
    {
        // Is there no physical structure array yet?
        if (!$this->physicalStructureLoaded) {
            // Does the document have a structMap node of type "PHYSICAL"?
            $elementNodes = $this->mets->xpath('./mets:structMap[@TYPE="PHYSICAL"]/mets:div[@TYPE="physSequence"]/mets:div');
            if (!empty($elementNodes)) {
                // Get file groups.
                $fileUse = $this->_getFileGrps();
                // Get the physical sequence's metadata.
                $physNode = $this->mets->xpath('./mets:structMap[@TYPE="PHYSICAL"]/mets:div[@TYPE="physSequence"]');
                $physSeq[0] = (string) $physNode[0]['ID'];
                $this->physicalStructureInfo[$physSeq[0]]['id'] = (string) $physNode[0]['ID'];
                $this->physicalStructureInfo[$physSeq[0]]['dmdId'] = (isset($physNode[0]['DMDID']) ? (string) $physNode[0]['DMDID'] : '');
                $this->physicalStructureInfo[$physSeq[0]]['admId'] = (isset($physNode[0]['ADMID']) ? (string) $physNode[0]['ADMID'] : '');
                $this->physicalStructureInfo[$physSeq[0]]['order'] = (isset($physNode[0]['ORDER']) ? (string) $physNode[0]['ORDER'] : '');
                $this->physicalStructureInfo[$physSeq[0]]['label'] = (isset($physNode[0]['LABEL']) ? (string) $physNode[0]['LABEL'] : '');
                $this->physicalStructureInfo[$physSeq[0]]['orderlabel'] = (isset($physNode[0]['ORDERLABEL']) ? (string) $physNode[0]['ORDERLABEL'] : '');
                $this->physicalStructureInfo[$physSeq[0]]['type'] = (string) $physNode[0]['TYPE'];
                $this->physicalStructureInfo[$physSeq[0]]['contentIds'] = (isset($physNode[0]['CONTENTIDS']) ? (string) $physNode[0]['CONTENTIDS'] : '');
                // Get the file representations from fileSec node.
                foreach ($physNode[0]->children('http://www.loc.gov/METS/')->fptr as $fptr) {
                    // Check if file has valid @USE attribute.
                    if (!empty($fileUse[(string) $fptr->attributes()->FILEID])) {
                        $this->physicalStructureInfo[$physSeq[0]]['files'][$fileUse[(string) $fptr->attributes()->FILEID]] = (string) $fptr->attributes()->FILEID;
                        $this->physicalStructureInfo[$physSeq[0]]['all_files'][$fileUse[(string) $fptr->attributes()->FILEID]][] = (string) $fptr->attributes()->FILEID;
                    }
                }
                // Build the physical elements' array from the physical structMap node.
                foreach ($elementNodes as $elementNode) {
                    $elements[(int) $elementNode['ORDER']] = (string) $elementNode['ID'];
                    $this->physicalStructureInfo[$elements[(int) $elementNode['ORDER']]]['id'] = (string) $elementNode['ID'];
                    $this->physicalStructureInfo[$elements[(int) $elementNode['ORDER']]]['dmdId'] = (isset($elementNode['DMDID']) ? (string) $elementNode['DMDID'] : '');
                    $this->physicalStructureInfo[$elements[(int) $elementNode['ORDER']]]['admId'] = (isset($elementNode['ADMID']) ? (string) $elementNode['ADMID'] : '');
                    $this->physicalStructureInfo[$elements[(int) $elementNode['ORDER']]]['order'] = (isset($elementNode['ORDER']) ? (string) $elementNode['ORDER'] : '');
                    $this->physicalStructureInfo[$elements[(int) $elementNode['ORDER']]]['label'] = (isset($elementNode['LABEL']) ? (string) $elementNode['LABEL'] : '');
                    $this->physicalStructureInfo[$elements[(int) $elementNode['ORDER']]]['orderlabel'] = (isset($elementNode['ORDERLABEL']) ? (string) $elementNode['ORDERLABEL'] : '');
                    $this->physicalStructureInfo[$elements[(int) $elementNode['ORDER']]]['type'] = (string) $elementNode['TYPE'];
                    $this->physicalStructureInfo[$elements[(int) $elementNode['ORDER']]]['contentIds'] = (isset($elementNode['CONTENTIDS']) ? (string) $elementNode['CONTENTIDS'] : '');
                    // Get the file representations from fileSec node.
                    foreach ($elementNode->children('http://www.loc.gov/METS/')->fptr as $fptr) {
                        // Check if file has valid @USE attribute.
                        if (!empty($fileUse[(string) $fptr->attributes()->FILEID])) {
                            $this->physicalStructureInfo[$elements[(int) $elementNode['ORDER']]]['files'][$fileUse[(string) $fptr->attributes()->FILEID]] = (string) $fptr->attributes()->FILEID;
                            // List all files of the fileGrp that are referenced on the page, not only the last one
                            $this->physicalStructureInfo[$elements[(int) $elementNode['ORDER']]]['all_files'][$fileUse[(string) $fptr->attributes()->FILEID]][] = (string) $fptr->attributes()->FILEID;
                        } elseif ($area = $fptr->children('http://www.loc.gov/METS/')->area) {
                            $areaAttrs = $area->attributes();
                            $fileId = (string) $areaAttrs->FILEID;
                            $physInfo = &$this->physicalStructureInfo[$elements[(int) $elementNode['ORDER']]];

                            $physInfo['files'][$fileUse[$fileId]] = $fileId;
                            $physInfo['all_files'][$fileUse[$fileId]][] = $fileId;
                            // Info about how the file is referenced/used on the page
                            $physInfo['fileInfos'][$fileId]['area'] = [
                                'begin' => (string) $areaAttrs->BEGIN,
                                'betype' => (string) $areaAttrs->BETYPE,
                                'extent' => (string) $areaAttrs->EXTENT,
                                'exttype' => (string) $areaAttrs->EXTTYPE,
                            ];
                        }
                    }
                }
                // Sort array by keys (= @ORDER).
                if (ksort($elements)) {
                    // Set total number of pages/tracks.
                    $this->numPages = count($elements);
                    // Merge and re-index the array to get nice numeric indexes.
                    $this->physicalStructure = array_merge($physSeq, $elements);
                }
            }
            $this->physicalStructureLoaded = true;
        }
        return $this->physicalStructure;
    }

    /**
     * {@inheritDoc}
     * @see \Kitodo\Dlf\Common\Doc::_getSmLinks()
     */
    protected function _getSmLinks()
    {
        if (!$this->smLinksLoaded) {
            $smLinks = $this->mets->xpath('./mets:structLink/mets:smLink');
            if (!empty($smLinks)) {
                foreach ($smLinks as $smLink) {
                    $this->smLinks['l2p'][(string) $smLink->attributes('http://www.w3.org/1999/xlink')->from][] = (string) $smLink->attributes('http://www.w3.org/1999/xlink')->to;
                    $this->smLinks['p2l'][(string) $smLink->attributes('http://www.w3.org/1999/xlink')->to][] = (string) $smLink->attributes('http://www.w3.org/1999/xlink')->from;
                }
            }
            $this->smLinksLoaded = true;
        }
        return $this->smLinks;
    }

    /**
     * {@inheritDoc}
     * @see \Kitodo\Dlf\Common\Doc::_getThumbnail()
     */
    protected function _getThumbnail($forceReload = false)
    {
        if (
            !$this->thumbnailLoaded
            || $forceReload
        ) {
            // Retain current PID.
            $cPid = ($this->cPid ? $this->cPid : $this->pid);
            if (!$cPid) {
                $this->logger->error('Invalid PID ' . $cPid . ' for structure definitions');
                $this->thumbnailLoaded = true;
                return $this->thumbnail;
            }
            // Load extension configuration.
            $extConf = GeneralUtility::makeInstance(ExtensionConfiguration::class)->get(self::$extKey);
            if (empty($extConf['fileGrpThumbs'])) {
                $this->logger->warning('No fileGrp for thumbnails specified');
                $this->thumbnailLoaded = true;
                return $this->thumbnail;
            }
            $strctId = $this->_getToplevelId();
            $metadata = $this->getTitledata($cPid);

            $queryBuilder = GeneralUtility::makeInstance(ConnectionPool::class)
                ->getQueryBuilderForTable('tx_dlf_structures');

            // Get structure element to get thumbnail from.
            $result = $queryBuilder
                ->select('tx_dlf_structures.thumbnail AS thumbnail')
                ->from('tx_dlf_structures')
                ->where(
                    $queryBuilder->expr()->eq('tx_dlf_structures.pid', intval($cPid)),
                    $queryBuilder->expr()->eq('tx_dlf_structures.index_name', $queryBuilder->expr()->literal($metadata['type'][0])),
                    Helper::whereExpression('tx_dlf_structures')
                )
                ->setMaxResults(1)
                ->execute();

            $allResults = $result->fetchAll();

            if (count($allResults) == 1) {
                $resArray = $allResults[0];
                // Get desired thumbnail structure if not the toplevel structure itself.
                if (!empty($resArray['thumbnail'])) {
                    $strctType = Helper::getIndexNameFromUid($resArray['thumbnail'], 'tx_dlf_structures', $cPid);
                    // Check if this document has a structure element of the desired type.
                    $strctIds = $this->mets->xpath('./mets:structMap[@TYPE="LOGICAL"]//mets:div[@TYPE="' . $strctType . '"]/@ID');
                    if (!empty($strctIds)) {
                        $strctId = (string) $strctIds[0];
                    }
                }
                // Load smLinks.
1216
                $this->_getSmLinks();
1217
                // Get thumbnail location.
1218
                $fileGrpsThumb = GeneralUtility::trimExplode(',', $extConf['fileGrpThumbs']);
1219
                while ($fileGrpThumb = array_shift($fileGrpsThumb)) {
1220
                    if (
1221
                        $this->_getPhysicalStructure()
1222
                        && !empty($this->smLinks['l2p'][$strctId])
1223
                        && !empty($this->physicalStructureInfo[$this->smLinks['l2p'][$strctId][0]]['files'][$fileGrpThumb])
1224
                    ) {
1225
                        $this->thumbnail = $this->getFileLocation($this->physicalStructureInfo[$this->smLinks['l2p'][$strctId][0]]['files'][$fileGrpThumb]);
1226
                        break;
1227
                    } elseif (!empty($this->physicalStructureInfo[$this->physicalStructure[1]]['files'][$fileGrpThumb])) {
1228
                        $this->thumbnail = $this->getFileLocation($this->physicalStructureInfo[$this->physicalStructure[1]]['files'][$fileGrpThumb]);
1229
                        break;
1230
                    }
1231
                }
1232
            } else {
1233
                $this->logger->error('No structure of type "' . $metadata['type'][0] . '" found in database');
1234
            }
1235
            $this->thumbnailLoaded = true;
1236
        }
1237
        return $this->thumbnail;
1238
    }

    /**
     * {@inheritDoc}
     * @see \Kitodo\Dlf\Common\Doc::_getToplevelId()
     */
    protected function _getToplevelId()
    {
        if (empty($this->toplevelId)) {
            // Get all logical structure nodes with metadata, but without associated METS-Pointers.
            $divs = $this->mets->xpath('./mets:structMap[@TYPE="LOGICAL"]//mets:div[@DMDID and not(./mets:mptr)]');
            if (!empty($divs)) {
                // Load smLinks.
                $this->_getSmLinks();
                foreach ($divs as $div) {
                    $id = (string) $div['ID'];
                    // Are there physical structure nodes for this logical structure?
                    if (array_key_exists($id, $this->smLinks['l2p'])) {
                        // Yes. That's what we're looking for.
                        $this->toplevelId = $id;
                        break;
                    } elseif (empty($this->toplevelId)) {
                        // No. Remember this anyway, but keep looking for a better one.
                        $this->toplevelId = $id;
                    }
                }
            }
        }
        return $this->toplevelId;
    }
1268
1269
    /**
1270
     * Try to determine URL of parent document.
1271
     *
1272
     * @return string|null
1273
     */
1274
    public function _getParentHref()
1275
    {
1276
        if ($this->parentHref === null) {
1277
            $this->parentHref = '';
Bug: The property parentHref is declared read-only in Kitodo\Dlf\Common\MetsDocument.

            // Get the closest ancestor of the current document which has a MPTR child.
            $parentMptr = $this->mets->xpath('./mets:structMap[@TYPE="LOGICAL"]//mets:div[@ID="' . $this->toplevelId . '"]/ancestor::mets:div[./mets:mptr][1]/mets:mptr');
            if (!empty($parentMptr)) {
                $this->parentHref = (string) $parentMptr[0]->attributes('http://www.w3.org/1999/xlink')->href;
            }
        }

        return $this->parentHref;
    }

    /**
     * This magic method is executed prior to any serialization of the object
     * @see __wakeup()
     *
     * @access public
     *
     * @return array Properties to be serialized
     */
    public function __sleep()
    {
        // \SimpleXMLElement objects can't be serialized, thus save the XML as string for serialization
        $this->asXML = $this->xml->asXML();
Documentation Bug: It seems like $this->xml->asXML() can also be of type true. However, the property $asXML is declared as type string. Maybe add an additional type check?

Our type inference engine has found a suspicious assignment of a value to a property. This check raises an issue when a value that can be of a mixed type is assigned to a property that is type hinted more strictly.

For example, imagine you have a variable $account_id that can either hold an Id object or false (if there is no account id yet). Your code now assigns that value to the id property of an instance of the Account class. This class holds a proper account, so the id value must no longer be false.

Either this assignment is in error or a type check should be added for that assignment.

class Id
{
    public $id;

    public function __construct($id)
    {
        $this->id = $id;
    }
}

class Account
{
    /** @var Id $id */
    public $id;
}

$account_id = false;

if (starsAreRight()) {
    $account_id = new Id(42);
}

$account = new Account();
// Only assign the value when it really is an Id, never false.
if ($account_id instanceof Id) {
    $account->id = $account_id;
}
        return ['uid', 'pid', 'recordId', 'parentId', 'asXML'];
    }

    /**
     * This magic method returns a string representation of the object
     *
     * @access public
     *
     * @return string String representing the METS object
     */
    public function __toString()
    {
        $xml = new \DOMDocument('1.0', 'utf-8');
        $xml->appendChild($xml->importNode(dom_import_simplexml($this->mets), true));
        $xml->formatOutput = true;
        return $xml->saveXML();
    }

    /**
     * This magic method is executed after the object is deserialized
     * @see __sleep()
     *
     * @access public
     *
     * @return void
     */
    public function __wakeup()
    {
        $xml = Helper::getXmlFileAsString($this->asXML);
        if ($xml !== false) {
            $this->asXML = '';
            $this->xml = $xml;
            // Rebuild the unserializable properties.
            $this->init('');
        } else {
            $this->logger = GeneralUtility::makeInstance(LogManager::class)->getLogger(static::class);
            $this->logger->error('Could not load XML after deserialization');
        }
    }
}
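
The `__sleep()` / `__wakeup()` pair in the listing stores the XML as a string because `\SimpleXMLElement` objects cannot be serialized. The following is a minimal, self-contained sketch of that pattern, with the return value of `asXML()` checked before it is assigned to a string property. It is an illustration under stated assumptions and not the Kitodo implementation: the class `XmlHolder` and its method `rootName()` are hypothetical, and plain `simplexml_load_string()` stands in for the extension's own `Helper::getXmlFileAsString()`.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical stand-in for a class that wraps a \SimpleXMLElement.
 * None of these names exist in the Kitodo code base.
 */
final class XmlHolder
{
    /** @var \SimpleXMLElement|null Not serializable, rebuilt in __wakeup(). */
    private $xml;

    /** @var string String form of $xml, only populated while serializing. */
    private $asXML = '';

    public function __construct(string $source)
    {
        $this->xml = new \SimpleXMLElement($source);
    }

    public function __sleep(): array
    {
        // asXML() may return a bool instead of a string, so check the type
        // before assigning to the string property.
        $xml = $this->xml !== null ? $this->xml->asXML() : false;
        $this->asXML = is_string($xml) ? $xml : '';
        // Only the string copy is serialized, never the XML object itself.
        return ['asXML'];
    }

    public function __wakeup(): void
    {
        // Rebuild the XML object; fall back to null if parsing fails.
        $xml = $this->asXML !== '' ? @simplexml_load_string($this->asXML) : false;
        $this->xml = $xml !== false ? $xml : null;
        $this->asXML = '';
    }

    public function rootName(): ?string
    {
        return $this->xml !== null ? $this->xml->getName() : null;
    }
}

// Round trip: the XML survives serialization as a string and is re-parsed.
$copy = unserialize(serialize(new XmlHolder('<mets><div ID="LOG_0001"/></mets>')));
```

The explicit `!== false` comparison in `__wakeup()` is deliberate: a `\SimpleXMLElement` without child elements can evaluate to `false` in a boolean context, so a loose truthiness check could reject a valid document.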