MetsDocument::hasMetadataSection()   A

Complexity: Conditions 2, Paths 2
Size: Total Lines 3, Code Lines 1
Duplication: Lines 0, Ratio 0 %
Importance: Changes 2, Bugs 0, Features 0

Metric  Value
cc      2
eloc    1
c       2
b       0
f       0
nc      2
nop     3
dl      0
loc     3
rs      10
<?php

/**
 * (c) Kitodo. Key to digital objects e.V. <[email protected]>
 *
 * This file is part of the Kitodo and TYPO3 projects.
 *
 * @license GNU General Public License version 3 or later.
 * For the full copyright and license information, please read the
 * LICENSE.txt file that was distributed with this source code.
 */

namespace Kitodo\Dlf\Common;

use \DOMDocument;
use \DOMElement;
use \DOMNode;
use \DOMNodeList;
use \DOMXPath;
use \SimpleXMLElement;
use TYPO3\CMS\Core\Configuration\ExtensionConfiguration;
use TYPO3\CMS\Core\Database\ConnectionPool;
use TYPO3\CMS\Core\Database\Query\Restriction\HiddenRestriction;
use TYPO3\CMS\Core\Log\LogManager;
use TYPO3\CMS\Core\Utility\GeneralUtility;
use Ubl\Iiif\Tools\IiifHelper;
use Ubl\Iiif\Services\AbstractImageService;

Bug (reported separately for each of the imports \DOMDocument, \DOMElement, \DOMNode, \DOMNodeList, \DOMXPath and \SimpleXMLElement): The type was not found. Maybe you did not declare it correctly or list all dependencies?

The issue could also be caused by a filter entry in the build configuration. If the path has been excluded in your configuration, e.g. excluded_paths: ["lib/*"], you can move it to the dependency path list as follows:

filter:
    dependency_paths: ["lib/*"]

For further information see https://scrutinizer-ci.com/docs/tools/php/php-scrutinizer/#list-dependency-paths

/**
 * MetsDocument class for the 'dlf' extension.
 *
 * @package TYPO3
 * @subpackage dlf
 *
 * @access public
 *
 * @property int $cPid this holds the PID for the configuration
 * @property-read array $formats this holds the configuration for all supported metadata encodings
 * @property bool $formatsLoaded flag with information if the available metadata formats are loaded
 * @property-read bool $hasFulltext flag with information if there are any fulltext files available
 * @property array $lastSearchedPhysicalPage the last searched logical and physical page
 * @property array $logicalUnits this holds the logical units
 * @property-read array $metadataArray this holds the documents' parsed metadata array
 * @property bool $metadataArrayLoaded flag with information if the metadata array is loaded
 * @property-read int $numPages this holds the total number of pages
 * @property-read int $numMeasures this holds the total number of measures
 * @property-read int $parentId this holds the UID of the parent document or zero if not multi-volumed
 * @property-read array $physicalStructure this holds the physical structure
 * @property-read array $physicalStructureInfo this holds the physical structure metadata
 * @property-read array $musicalStructure this holds the musical structure
 * @property-read array $musicalStructureInfo this holds the musical structure metadata
 * @property bool $physicalStructureLoaded flag with information if the physical structure is loaded
 * @property-read int $pid this holds the PID of the document or zero if not in database
 * @property array $rawTextArray this holds the documents' raw text pages with their corresponding structMap//div's ID (METS) or Range / Manifest / Sequence ID (IIIF) as array key
 * @property-read bool $ready Is the document instantiated successfully?
 * @property-read string $recordId the METS file's / IIIF manifest's record identifier
 * @property-read int $rootId this holds the UID of the root document or zero if not multi-volumed
 * @property-read array $smLinks this holds the smLinks between logical and physical structMap
 * @property bool $smLinksLoaded flag with information if the smLinks are loaded
 * @property-read array $tableOfContents this holds the logical structure
 * @property bool $tableOfContentsLoaded flag with information if the table of contents is loaded
 * @property-read string $thumbnail this holds the document's thumbnail location
 * @property bool $thumbnailLoaded flag with information if the thumbnail is loaded
 * @property-read string $toplevelId this holds the toplevel structure's "@ID" (METS) or the manifest's "@id" (IIIF)
 * @property SimpleXMLElement $xml this holds the whole XML file as SimpleXMLElement object
 * @property-read array $mdSec associative array of METS metadata sections indexed by their IDs.
 * @property bool $mdSecLoaded flag with information if the array of METS metadata sections is loaded
 * @property-read array $dmdSec subset of `$mdSec` storing only the dmdSec entries; kept for compatibility.
 * @property-read array $fileGrps this holds the file ID -> USE concordance
 * @property bool $fileGrpsLoaded flag with information if file groups array is loaded
 * @property-read array $fileInfos additional information about files (e.g., ADMID), indexed by ID.
 * @property-read SimpleXMLElement $mets this holds the XML file's METS part as SimpleXMLElement object
 * @property-read string $parentHref URL of the parent document (determined via mptr element), or empty string if none is available
 */
final class MetsDocument extends AbstractDocument
{
    /**
     * @access protected
     * @var string[] Subsections / tags that may occur within `<mets:amdSec>`
     *
     * @link https://www.loc.gov/standards/mets/docs/mets.v1-9.html#amdSec
     * @link https://www.loc.gov/standards/mets/docs/mets.v1-9.html#mdSecType
     */
    protected const ALLOWED_AMD_SEC = ['techMD', 'rightsMD', 'sourceMD', 'digiprovMD'];

    /**
     * @access protected
     * @var string This holds the whole XML file as string for serialization purposes
     *
     * @see __sleep() / __wakeup()
     */
    protected string $asXML = '';

    /**
     * @access protected
     * @var array This maps the ID of each amdSec to the IDs of its children (techMD etc.). When an ADMID references an amdSec instead of techMD etc., this is used to iterate the child elements.
     */
    protected array $amdSecChildIds = [];

    /**
     * @access protected
     * @var array Associative array of METS metadata sections indexed by their IDs.
     */
    protected array $mdSec = [];

    /**
     * @access protected
     * @var bool Are the METS file's metadata sections loaded?
     *
     * @see MetsDocument::$mdSec
     */
    protected bool $mdSecLoaded = false;

    /**
     * @access protected
     * @var array Subset of $mdSec storing only the dmdSec entries; kept for compatibility.
     */
    protected array $dmdSec = [];

    /**
     * @access protected
     * @var array This holds the file ID -> USE concordance
     *
     * @see magicGetFileGrps()
     */
    protected array $fileGrps = [];

    /**
     * @access protected
     * @var bool Are the image file groups loaded?
     *
     * @see $fileGrps
     */
    protected bool $fileGrpsLoaded = false;

    /**
     * @access protected
     * @var SimpleXMLElement This holds the XML file's METS part as SimpleXMLElement object
     */
    protected SimpleXMLElement $mets;

    /**
     * @access protected
     * @var string URL of the parent document (determined via mptr element), or empty string if none is available
     */
    protected string $parentHref = '';

    /**
     * @access protected
     * @var array the extension settings
     */
    protected array $settings = [];

    /**
     * This holds the musical structure
     *
     * @var array
     * @access protected
     */
    protected array $musicalStructure = [];

    /**
     * This holds the musical structure metadata
     *
     * @var array
     * @access protected
     */
    protected array $musicalStructureInfo = [];

    /**
     * Is the musical structure loaded?
     * @see $musicalStructure
     *
     * @var bool
     * @access protected
     */
    protected bool $musicalStructureLoaded = false;

    /**
     * This holds the total number of measures
     *
     * @var int
     * @access protected
     */
    protected int $numMeasures;

    /**
     * This adds metadata from METS structural map to metadata array.
     *
     * @access public
     *
     * @param array &$metadata The metadata array to extend
     * @param string $id The "@ID" attribute of the logical structure node
     *
     * @return void
     */
    public function addMetadataFromMets(array &$metadata, string $id): void
    {
        $details = $this->getLogicalStructure($id);
        if (!empty($details)) {
            $metadata['mets_order'][0] = $details['order'];
            $metadata['mets_label'][0] = $details['label'];
            $metadata['mets_orderlabel'][0] = $details['orderlabel'];
        }
    }

    /**
     * @see AbstractDocument::establishRecordId()
     */
    protected function establishRecordId(int $pid): void
    {
        // Check for METS object @ID.
        if (!empty($this->mets['OBJID'])) {
            $this->recordId = (string) $this->mets['OBJID'];
        }
        // Get hook objects.
        $hookObjects = Helper::getHookObjects('Classes/Common/MetsDocument.php');
        // Apply hooks.
        foreach ($hookObjects as $hookObj) {
            if (method_exists($hookObj, 'postProcessRecordId')) {
                $hookObj->postProcessRecordId($this->xml, $this->recordId);
            }
        }
    }

    /**
     * @see AbstractDocument::getDownloadLocation()
     */
    public function getDownloadLocation(string $id): string
    {
        $file = $this->getFileInfo($id);
        if ($file['mimeType'] === 'application/vnd.kitodo.iiif') {
            // Append "info.json" unless the location already ends with it. The trailing-slash check compares
            // against the index of the last character (strlen - 1) so that no double slash is produced.
            $file['location'] = (strrpos($file['location'], 'info.json') === strlen($file['location']) - 9) ? $file['location'] : (strrpos($file['location'], '/') === strlen($file['location']) - 1 ? $file['location'] . 'info.json' : $file['location'] . '/info.json');
            $conf = GeneralUtility::makeInstance(ExtensionConfiguration::class)->get(self::$extKey, 'iiif');
            IiifHelper::setUrlReader(IiifUrlReader::getInstance());
            IiifHelper::setMaxThumbnailHeight($conf['thumbnailHeight']);
            IiifHelper::setMaxThumbnailWidth($conf['thumbnailWidth']);
            $service = IiifHelper::loadIiifResource($file['location']);
            if ($service instanceof AbstractImageService) {
                return $service->getImageUrl();
            }
        } elseif ($file['mimeType'] === 'application/vnd.netfpx') {
            $baseURL = $file['location'] . (strpos($file['location'], '?') === false ? '?' : '');
            // TODO CVT is an optional IIP server capability; in theory, capabilities should be determined in the object request with '&obj=IIP-server'
            return $baseURL . '&CVT=jpeg';
        }
        return $file['location'];
    }
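
Review note: the IIIF branch of getDownloadLocation() packs the "info.json" suffix handling into one nested ternary. The sketch below shows the same idea as a standalone function. The name ensureInfoJsonSuffix() and the example.org URLs are hypothetical and not part of the Kitodo code base, and unlike the original it strips every trailing slash, not just one.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper (not part of MetsDocument): make sure an IIIF image
 * service location ends in "info.json".
 */
function ensureInfoJsonSuffix(string $location): string
{
    $suffix = 'info.json';
    // Already points at the info.json document: return it unchanged.
    if (substr($location, -strlen($suffix)) === $suffix) {
        return $location;
    }
    // Drop trailing slashes first so that no "//info.json" is produced.
    return rtrim($location, '/') . '/' . $suffix;
}

echo ensureInfoJsonSuffix('https://example.org/iiif/image1'), PHP_EOL;
echo ensureInfoJsonSuffix('https://example.org/iiif/image1/'), PHP_EOL;
echo ensureInfoJsonSuffix('https://example.org/iiif/image1/info.json'), PHP_EOL;
```

All three calls should yield the same location, which makes the trailing-slash case easy to cover in a unit test.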

    /**
     * {@inheritDoc}
     * @see AbstractDocument::getFileInfo()
     */
    public function getFileInfo($id): ?array
    {
        $this->magicGetFileGrps();

        if (isset($this->fileInfos[$id]) && empty($this->fileInfos[$id]['location'])) {
            $this->fileInfos[$id]['location'] = $this->getFileLocation($id);
Bug: The property fileInfos is declared read-only in Kitodo\Dlf\Common\MetsDocument.
        }

        if (isset($this->fileInfos[$id]) && empty($this->fileInfos[$id]['mimeType'])) {
            $this->fileInfos[$id]['mimeType'] = $this->getFileMimeType($id);
        }

        return $this->fileInfos[$id] ?? null;
    }

    /**
     * @see AbstractDocument::getFileLocation()
     */
    public function getFileLocation(string $id): string
    {
        $location = $this->mets->xpath('./mets:fileSec/mets:fileGrp/mets:file[@ID="' . $id . '"]/mets:FLocat[@LOCTYPE="URL"]');
        if (
            !empty($id)
            && !empty($location)
        ) {
            return (string) $location[0]->attributes('http://www.w3.org/1999/xlink')->href;
        } else {
            $this->logger->warning('There is no file node with @ID "' . $id . '"');
            return '';
        }
    }
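
Review note: getFileLocation() relies on the "mets" prefix already being registered on $this->mets, which presumably happens elsewhere in the class (not shown in this listing). A minimal standalone sketch of the same lookup follows. The sample METS fragment, the file ID and the function name lookUpFileLocation() are made up for illustration.

```php
<?php

declare(strict_types=1);

// Hypothetical sample document; real METS files are much larger.
$metsXml = <<<'XML'
<mets:mets xmlns:mets="http://www.loc.gov/METS/" xmlns:xlink="http://www.w3.org/1999/xlink">
  <mets:fileSec>
    <mets:fileGrp USE="DEFAULT">
      <mets:file ID="FILE_0001" MIMETYPE="image/jpeg">
        <mets:FLocat LOCTYPE="URL" xlink:href="https://example.org/images/0001.jpg"/>
      </mets:file>
    </mets:fileGrp>
  </mets:fileSec>
</mets:mets>
XML;

/**
 * Hypothetical helper: return the xlink:href of the file with the given ID,
 * or an empty string if there is no matching FLocat element.
 */
function lookUpFileLocation(SimpleXMLElement $mets, string $id): string
{
    // The prefix must be registered before it can be used in XPath queries.
    $mets->registerXPathNamespace('mets', 'http://www.loc.gov/METS/');
    $hits = $mets->xpath('./mets:fileSec/mets:fileGrp/mets:file[@ID="' . $id . '"]/mets:FLocat[@LOCTYPE="URL"]');
    if (empty($hits)) {
        return '';
    }
    // Namespaced attributes are read via attributes() with the namespace URI.
    return (string) $hits[0]->attributes('http://www.w3.org/1999/xlink')->href;
}

$mets = simplexml_load_string($metsXml);
echo lookUpFileLocation($mets, 'FILE_0001'), PHP_EOL;
```

As in the original method, the ID is concatenated into the XPath expression unescaped, so the sketch assumes IDs that contain no double quotes.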

    /**
     * This gets the measure beginning of a page
     */
    public function getPageBeginning($pageId, $fileId)
    {
        $mets = $this->mets
            ->xpath(
                './mets:structMap[@TYPE="PHYSICAL"]' .
                '//mets:div[@ID="' . $pageId . '"]' .
                '/mets:fptr[@FILEID="' . $fileId . '"]' .
                '/mets:area/@BEGIN'
            );
        return empty($mets) ? '' : $mets[0]->__toString();
    }

    /**
     * {@inheritDoc}
     * @see AbstractDocument::getFileMimeType()
     */
    public function getFileMimeType(string $id): string
    {
        $mimetype = $this->mets->xpath('./mets:fileSec/mets:fileGrp/mets:file[@ID="' . $id . '"]/@MIMETYPE');
        if (
            !empty($id)
            && !empty($mimetype)
        ) {
            return (string) $mimetype[0];
        } else {
            $this->logger->warning('There is no file node with @ID "' . $id . '" or no MIME type specified');
            return '';
        }
    }

    /**
     * @see AbstractDocument::getLogicalStructure()
     */
    public function getLogicalStructure(string $id, bool $recursive = false): array
    {
        $details = [];
        // Is the requested logical unit already loaded?
        if (
            !$recursive
            && !empty($this->logicalUnits[$id])
        ) {
            // Yes. Return it.
            return $this->logicalUnits[$id];
        } elseif (!empty($id)) {
            // Get specified logical unit.
            $divs = $this->mets->xpath('./mets:structMap[@TYPE="LOGICAL"]//mets:div[@ID="' . $id . '"]');
        } else {
            // Get all logical units at top level.
            $divs = $this->mets->xpath('./mets:structMap[@TYPE="LOGICAL"]/mets:div');
        }
        if (!empty($divs)) {
            if (!$recursive) {
                // Get the details for the first xpath hit.
                $details = $this->getLogicalStructureInfo($divs[0]);
            } else {
                // Walk the logical structure recursively and fill the whole table of contents.
                foreach ($divs as $div) {
                    $this->tableOfContents[] = $this->getLogicalStructureInfo($div, $recursive);
                }
            }
        }
        return $details;
    }

    /**
     * This gets details about a logical structure element
     *
     * @access protected
     *
     * @param SimpleXMLElement $structure The logical structure node
     * @param bool $recursive Whether to include the child elements
     *
     * @return array Array of the element's id, label, type and physical page indexes/mptr link
     */
    protected function getLogicalStructureInfo(SimpleXMLElement $structure, bool $recursive = false): array
    {
        $attributes = $structure->attributes();

        // Extract identity information.
        $details = [
            'id' => (string) $attributes['ID'],
            'dmdId' => isset($attributes['DMDID']) ? (string) $attributes['DMDID'] : '',
            'admId' => isset($attributes['ADMID']) ? (string) $attributes['ADMID'] : '',
            'order' => isset($attributes['ORDER']) ? (string) $attributes['ORDER'] : '',
            'label' => isset($attributes['LABEL']) ? (string) $attributes['LABEL'] : '',
            'orderlabel' => isset($attributes['ORDERLABEL']) ? (string) $attributes['ORDERLABEL'] : '',
            'contentIds' => isset($attributes['CONTENTIDS']) ? (string) $attributes['CONTENTIDS'] : '',
            'volume' => '',
            'year' => '',
            'pagination' => '',
            'type' => isset($attributes['TYPE']) ? (string) $attributes['TYPE'] : '',
            'description' => '',
            'thumbnailId' => null,
            'files' => [],
        ];

        // Set volume and year information only if no label is set and this is the toplevel structure element.
        if (empty($details['label']) && empty($details['orderlabel'])) {
            $metadata = $this->getMetadata($details['id']);
            $details['volume'] = $metadata['volume'][0] ?? '';
            $details['year'] = $metadata['year'][0] ?? '';
        }

        // add description for 3D objects
        if ($details['type'] == 'object') {
            $metadata = $this->getMetadata($details['id']);
            $details['description'] = $metadata['description'][0] ?? '';
        }

        // Load smLinks.
        $this->magicGetSmLinks();
        // Load physical structure.
        $this->magicGetPhysicalStructure();

        $this->getPage($details, $structure->children('http://www.loc.gov/METS/')->mptr);
        $this->getFiles($details, $structure->children('http://www.loc.gov/METS/')->fptr);

        // Keep for later usage.
        $this->logicalUnits[$details['id']] = $details;
        // Walk the structure recursively? And are there any children of the current element?
        if (
            $recursive
            && count($structure->children('http://www.loc.gov/METS/')->div)
        ) {
            $details['children'] = [];
            foreach ($structure->children('http://www.loc.gov/METS/')->div as $child) {
                // Repeat for all children.
                $details['children'][] = $this->getLogicalStructureInfo($child, true);
            }
        }
        return $details;
    }

    /**
     * Get the files this structure element is pointing at.
     *
     * @param array $details passed as reference
     * @param ?SimpleXMLElement $filePointers
     *
     * @return void
     */
    private function getFiles(array &$details, ?SimpleXMLElement $filePointers): void
    {
        $fileUse = $this->magicGetFileGrps();
        // Get the file representations from fileSec node.
        foreach ($filePointers as $filePointer) {
            $fileId = (string) $filePointer->attributes()->FILEID;
            // Check if file has valid @USE attribute.
            if (!empty($fileUse[$fileId])) {
                $details['files'][$fileUse[$fileId]] = $fileId;
            }
        }
    }

    /**
     * Get the physical page or external file this structure element is pointing at.
     *
     * @access private
     *
     * @param array $details passed as reference
     * @param ?SimpleXMLElement $metsPointers
     *
     * @return void
     */
    private function getPage(array &$details, ?SimpleXMLElement $metsPointers): void
    {
        if (count($metsPointers)) {
Bug: It seems like $metsPointers can also be of type null; however, parameter $value of count() only seems to accept Countable|array. Maybe add an additional type check?

If this is a false positive, you can also ignore this issue in your code via the ignore-type annotation:

        if (count(/** @scrutinizer ignore-type */ $metsPointers)) {

            // Yes. Get the file reference.
            $details['points'] = (string) $metsPointers[0]->attributes('http://www.w3.org/1999/xlink')->href;
        } elseif (
            !empty($this->physicalStructure)
            && array_key_exists($details['id'], $this->smLinks['l2p'])
        ) {
            // Link logical structure to the first corresponding physical page/track.
            $details['points'] = max((int) array_search($this->smLinks['l2p'][$details['id']][0], $this->physicalStructure, true), 1);
            $details['thumbnailId'] = $this->getThumbnail();
            // Get page/track number of the first page/track related to this structure element.
            $details['pagination'] = $this->physicalStructureInfo[$this->smLinks['l2p'][$details['id']][0]]['orderlabel'];
        } elseif ($details['id'] == $this->magicGetToplevelId()) {
            // Point to self if this is the toplevel structure.
            $details['points'] = 1;
            $details['thumbnailId'] = $this->getThumbnail();
        }
        if ($details['thumbnailId'] === null) {
            unset($details['thumbnailId']);
        }
    }
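
Review note: as an alternative to suppressing the count() finding in getPage(), the nullable parameter could be checked explicitly. The sketch below shows one way to do that. The function name hasPointers() and the sample XML are hypothetical, and this is not necessarily how the project resolved the issue.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: true only if a (possibly null) SimpleXMLElement
 * selection contains at least one element.
 */
function hasPointers(?SimpleXMLElement $pointers): bool
{
    // Test for null first so that count() never receives a null argument.
    return $pointers !== null && count($pointers) > 0;
}

// Sample structure node with one mptr child and no fptr children.
$div = simplexml_load_string('<div><mptr href="parent.xml"/></div>');

var_dump(hasPointers($div->mptr));
var_dump(hasPointers($div->fptr));
var_dump(hasPointers(null));
```

The same guard would apply to the foreach over $filePointers in getFiles(), which also accepts null.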

    /**
     * Get thumbnail for logical structure info.
     *
     * @access private
     *
     * @param string $id empty if top level document, else passed the id of parent document
     *
     * @return ?string thumbnail or null if not found
     */
    private function getThumbnail(string $id = '')
    {
        // Load plugin configuration.
        $extConf = GeneralUtility::makeInstance(ExtensionConfiguration::class)->get(self::$extKey, 'files');
        $fileGrpsThumb = GeneralUtility::trimExplode(',', $extConf['fileGrpThumbs']);

        $thumbnail = null;

        while ($fileGrpThumb = array_shift($fileGrpsThumb)) {
            if (empty($id)) {
                $thumbnail = $this->physicalStructureInfo[$this->physicalStructure[1]]['files'][$fileGrpThumb] ?? null;
            } else {
                $parentId = $this->smLinks['l2p'][$id][0] ?? null;
                $thumbnail = $this->physicalStructureInfo[$parentId]['files'][$fileGrpThumb] ?? null;
            }

            if (!empty($thumbnail)) {
                break;
            }
        }
        return $thumbnail;
    }

    /**
     * @see AbstractDocument::getMetadata()
     */
    public function getMetadata(string $id, int $cPid = 0): array
    {
        $cPid = $this->ensureValidPid($cPid);

        if ($cPid == 0) {
            $this->logger->warning('Invalid PID for metadata definitions');
            return [];
        }

        $metadata = $this->getMetadataFromArray($id, $cPid);

        if (empty($metadata)) {
            return [];
        }

        $metadata = $this->processMetadataSections($id, $cPid, $metadata);

        if (!empty($metadata)) {
            $metadata = $this->setDefaultTitleAndDate($metadata);
        }

        return $metadata;
    }

    /**
     * Ensure that the PID is valid.
     *
     * @access private
     *
     * @param integer $cPid
     *
     * @return integer
     */
    private function ensureValidPid(int $cPid): int
    {
        $cPid = max($cPid, 0);
        if ($cPid == 0 && ($this->cPid || $this->pid)) {
            // Retain current PID.
            $cPid = $this->cPid ?: $this->pid;
        }
        return $cPid;
    }
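
Review note: the fallback order in ensureValidPid() is easier to see when the object state is passed in explicitly. The following sketch restates the logic as a pure function. The name resolvePid() and its parameter names are hypothetical and do not exist in the class.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical restatement of the PID fallback: use the requested PID if it
 * is positive, otherwise the configuration PID, otherwise the document PID.
 */
function resolvePid(int $requested, int $configPid, int $documentPid): int
{
    // Negative values are treated like "no PID given".
    $requested = max($requested, 0);
    if ($requested === 0 && ($configPid || $documentPid)) {
        // Prefer the configuration PID and fall back to the document PID.
        $requested = $configPid ?: $documentPid;
    }
    return $requested;
}

var_dump(resolvePid(3, 1, 2));
var_dump(resolvePid(0, 4, 7));
var_dump(resolvePid(-5, 0, 7));
var_dump(resolvePid(0, 0, 0));
```

A result of 0 means that no usable PID exists; getMetadata() logs a warning and returns an empty array in that case.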

    /**
     * Get metadata from array.
     *
     * @access private
     *
     * @param string $id
     * @param integer $cPid
     *
     * @return array
     */
    private function getMetadataFromArray(string $id, int $cPid): array
    {
        if (!empty($this->metadataArray[$id]) && $this->metadataArray[0] == $cPid) {
            return $this->metadataArray[$id];
        }
        return $this->initializeMetadata('METS');
    }

    /**
     * Process metadata sections.
     *
     * @access private
     *
     * @param string $id
     * @param integer $cPid
     * @param array $metadata
     *
     * @return array
     */
    private function processMetadataSections(string $id, int $cPid, array $metadata): array
    {
        $mdIds = $this->getMetadataIds($id);
        if (empty($mdIds)) {
            // There is no metadata section for this structure node.
            return [];
        }
        // Array used as set of available section types (dmdSec, techMD, ...)
        $metadataSections = [];
        // Load available metadata formats and metadata sections.
        $this->loadFormats();
        $this->magicGetMdSec();

        $metadata['type'] = $this->getLogicalUnitType($id);

        if (!empty($this->mdSec)) {
            foreach ($mdIds as $dmdId) {
                $mdSectionType = $this->mdSec[$dmdId]['section'];

                if ($this->hasMetadataSection($metadataSections, $mdSectionType, 'dmdSec')) {
                    continue;
                }

                if (!$this->extractAndProcessMetadata($dmdId, $mdSectionType, $metadata, $cPid, $metadataSections)) {
                    continue;
                }

                $metadataSections[] = $mdSectionType;
            }
        }

        // Files are not expected to reference a dmdSec
        if (isset($this->fileInfos[$id]) || in_array('dmdSec', $metadataSections)) {
            return $metadata;
        } else {
            $this->logger->warning('No supported descriptive metadata found for logical structure with @ID "' . $id . '"');
            return [];
        }
    }

    /**
     * @param array $allSubentries
     * @param string $parentIndex
     * @param DOMNode $parentNode
     * @return array|false
     */
    private function getSubentries($allSubentries, string $parentIndex, DOMNode $parentNode)
    {
        $domXPath = new DOMXPath($parentNode->ownerDocument);
        $this->registerNamespaces($domXPath);
        $theseSubentries = [];
        foreach ($allSubentries as $subentry) {
            if ($subentry['parent_index_name'] == $parentIndex) {
                $values = $domXPath->evaluate($subentry['xpath'], $parentNode);
                if (!empty($subentry['xpath']) && ($values)) {
                    $theseSubentries = array_merge($theseSubentries, $this->getSubentryValue($values, $subentry));
                }
                // Set default value if applicable.
                if (
                    empty($theseSubentries[$subentry['index_name']][0])
                    && strlen($subentry['default_value']) > 0
                ) {
                    $theseSubentries[$subentry['index_name']] = [$subentry['default_value']];
                }
            }
        }
        if (empty($theseSubentries)) {
            return false;
        }
        return $theseSubentries;
    }

    /**
     * @param $values
     * @param $subentry
     * @return array
     */
    private function getSubentryValue($values, $subentry)
    {
        $theseSubentries = [];
        if (
            ($values instanceof DOMNodeList
                && $values->length > 0) || is_string($values)
        ) {
            if (is_string($values)) {
                // if concat is used evaluate returns a string
                $theseSubentries[$subentry['index_name']][] = trim($values);
            } else {
                foreach ($values as $value) {
                    if (!empty(trim((string) $value->nodeValue))) {
                        $theseSubentries[$subentry['index_name']][] = trim((string) $value->nodeValue);
                    }
                }
            }
        } elseif (!($values instanceof DOMNodeList)) {
            $theseSubentries[$subentry['index_name']] = [trim((string) $values->nodeValue)];
        }
        return $theseSubentries;
    }
682
683
    /**
684
     * Get logical unit type.
685
     *
686
     * @access private
687
     *
688
     * @param string $id
689
     *
690
     * @return array
691
     */
    private function getLogicalUnitType(string $id): array
    {
        if (!empty($this->logicalUnits[$id])) {
            return [$this->logicalUnits[$id]['type']];
        } else {
            $struct = $this->mets->xpath('./mets:structMap[@TYPE="LOGICAL"]//mets:div[@ID="' . $id . '"]/@TYPE');
            if (!empty($struct)) {
                return [(string) $struct[0]];
            }
        }
        return [];
    }

    /**
     * Extract and process metadata.
     *
     * @access private
     *
     * @param string $dmdId
     * @param string $mdSectionType
     * @param array $metadata
     * @param integer $cPid
     * @param array $metadataSections
     *
     * @return boolean
     */
    private function extractAndProcessMetadata(string $dmdId, string $mdSectionType, array &$metadata, int $cPid, array $metadataSections): bool
    {
        if ($this->hasMetadataSection($metadataSections, $mdSectionType, 'dmdSec')) {
            return true;
        }

        $metadataExtracted = $this->extractMetadataIfTypeSupported($dmdId, $mdSectionType, $metadata);

        if (!$metadataExtracted) {
            return false;
        }

        $additionalMetadata = $this->getAdditionalMetadataFromDatabase($cPid, $dmdId);
        // We need a DOMDocument here, because SimpleXML doesn't support XPath functions properly.
        $domNode = dom_import_simplexml($this->mdSec[$dmdId]['xml']);
        $domXPath = new DOMXPath($domNode->ownerDocument);
        $this->registerNamespaces($domXPath);

        $this->processAdditionalMetadata($additionalMetadata, $domXPath, $domNode, $metadata);

        return true;
    }

    /**
     * Check if searched metadata section is stored in the array.
     *
     * @access private
     *
     * @param array $metadataSections
     * @param string $currentMetadataSection
     * @param string $searchedMetadataSection
     *
     * @return boolean
     */
    private function hasMetadataSection(array $metadataSections, string $currentMetadataSection, string $searchedMetadataSection): bool
    {
        return $currentMetadataSection === $searchedMetadataSection && in_array($searchedMetadataSection, $metadataSections);
    }

    /**
     * Process additional metadata.
     *
     * @access private
     *
     * @param array $additionalMetadata
     * @param DOMXPath $domXPath
     * @param DOMElement $domNode
     * @param array $metadata
     *
     * @return void
     */
    private function processAdditionalMetadata(array $additionalMetadata, DOMXPath $domXPath, DOMElement $domNode, array &$metadata): void
    {
        $subentries = [];
        if (isset($additionalMetadata['subentries'])) {
            $subentries = $additionalMetadata['subentries'];
            unset($additionalMetadata['subentries']);
        }
        foreach ($additionalMetadata as $resArray) {
            $this->setMetadataFieldValues($resArray, $domXPath, $domNode, $metadata, $subentries);
            $this->setDefaultMetadataValue($resArray, $metadata);
            $this->setSortableMetadataValue($resArray, $domXPath, $domNode, $metadata);
        }
    }

    /**
     * Set metadata field values.
     *
     * @access private
     *
     * @param array $resArray
     * @param DOMXPath $domXPath
     * @param DOMElement $domNode
     * @param array $metadata
     * @param array $subentryResults
     *
     * @return void
     */
    private function setMetadataFieldValues(array $resArray, DOMXPath $domXPath, DOMElement $domNode, array &$metadata, array $subentryResults): void
    {
        if ($resArray['format'] > 0 && !empty($resArray['xpath'])) {
            $values = $domXPath->evaluate($resArray['xpath'], $domNode);
            if ($values instanceof DOMNodeList && $values->length > 0) {
                $metadata[$resArray['index_name']] = [];
                foreach ($values as $value) {
                    $subentries = $this->getSubentries($subentryResults, $resArray['index_name'], $value);
                    if ($subentries) {
                        $metadata[$resArray['index_name']][] = $subentries;
                    } else {
                        $metadata[$resArray['index_name']][] = trim((string) $value->nodeValue);
                    }
                }
            } elseif (!($values instanceof DOMNodeList)) {
                $metadata[$resArray['index_name']] = [trim((string) $values)];
            }
        }
    }

    /**
     * Set default metadata value.
     *
     * @access private
     *
     * @param array $resArray
     * @param array $metadata
     *
     * @return void
     */
    private function setDefaultMetadataValue(array $resArray, array &$metadata): void
    {
        if (empty($metadata[$resArray['index_name']][0]) && strlen($resArray['default_value']) > 0) {
            $metadata[$resArray['index_name']] = [$resArray['default_value']];
        }
    }

    /**
     * Set sortable metadata value.
     *
     * @access private
     *
     * @param array $resArray
     * @param DOMXPath $domXPath
     * @param DOMElement $domNode
     * @param array $metadata
     *
     * @return void
     */
    private function setSortableMetadataValue(array $resArray, DOMXPath $domXPath, DOMElement $domNode, array &$metadata): void
    {
        $indexName = $resArray['index_name'];
        if (!empty($metadata[$indexName]) && $resArray['is_sortable']) {
            $currentMetadata = $metadata[$indexName][0];
            if ($resArray['format'] > 0 && !empty($resArray['xpath_sorting'])) {
                $values = $domXPath->evaluate($resArray['xpath_sorting'], $domNode);
                if ($values instanceof DOMNodeList && $values->length > 0) {
                    $metadata[$indexName . '_sorting'][0] = trim((string) $values->item(0)->nodeValue);
                } elseif (!($values instanceof DOMNodeList)) {
                    $metadata[$indexName . '_sorting'][0] = trim((string) $values);
                }
            }
            if (empty($metadata[$indexName . '_sorting'][0])) {
                if (is_array($currentMetadata)) {
                    $sortingValue = implode(',', array_column($currentMetadata, 0));
                    $metadata[$indexName . '_sorting'][0] = $sortingValue;
                } else {
                    $metadata[$indexName . '_sorting'][0] = $currentMetadata;
                }
            }
        }
    }

    /**
     * Set default title and date if these metadata are not set.
     *
     * @access private
     *
     * @param array $metadata
     *
     * @return array
     */
    private function setDefaultTitleAndDate(array $metadata): array
    {
        // Set title to empty string if not present.
        if (empty($metadata['title'][0])) {
            $metadata['title'][0] = '';
            $metadata['title_sorting'][0] = '';
        }

        // Set title_sorting to title as default.
        if (empty($metadata['title_sorting'][0])) {
            $metadata['title_sorting'][0] = $metadata['title'][0];
        }

        // Set date to empty string if not present.
        if (empty($metadata['date'][0])) {
            $metadata['date'][0] = '';
        }

        return $metadata;
    }

    /**
     * Extract metadata if metadata type is supported.
     *
     * @access private
     *
     * @param string $dmdId descriptive metadata id
     * @param string $mdSectionType metadata section type
     * @param array &$metadata
     *
     * @return bool true if extraction successful, false otherwise
     */
    private function extractMetadataIfTypeSupported(string $dmdId, string $mdSectionType, array &$metadata): bool
    {
        // Is this metadata format supported?
        if (!empty($this->formats[$this->mdSec[$dmdId]['type']])) {
            if (!empty($this->formats[$this->mdSec[$dmdId]['type']]['class'])) {
                $class = $this->formats[$this->mdSec[$dmdId]['type']]['class'];
                // Get the metadata from class.
                if (class_exists($class)) {
                    $obj = GeneralUtility::makeInstance($class);
                    if ($obj instanceof MetadataInterface) {
                        $obj->extractMetadata($this->mdSec[$dmdId]['xml'], $metadata, GeneralUtility::makeInstance(ExtensionConfiguration::class)->get(self::$extKey, 'general')['useExternalApisForMetadata']);
                        return true;
                    }
                } else {
                    $this->logger->warning('Invalid class/method "' . $class . '->extractMetadata()" for metadata format "' . $this->mdSec[$dmdId]['type'] . '"');
                }
            }
        } else {
            $this->logger->notice('Unsupported metadata format "' . $this->mdSec[$dmdId]['type'] . '" in ' . $mdSectionType . ' with @ID "' . $dmdId . '"');
        }
        return false;
    }

    /**
     * Get additional data from database.
     *
     * @access private
     *
     * @param int $cPid page id
     * @param string $dmdId descriptive metadata id
     *
     * @return array additional metadata data queried from database
     */
    private function getAdditionalMetadataFromDatabase(int $cPid, string $dmdId): array
    {
        $queryBuilder = GeneralUtility::makeInstance(ConnectionPool::class)
            ->getQueryBuilderForTable('tx_dlf_metadata');
        // Get hidden records, too.
        $queryBuilder
            ->getRestrictions()
            ->removeByType(HiddenRestriction::class);
        // Get all metadata with configured xpath and applicable format first.
        // Exclude metadata with subentries, we will fetch them later.
        $resultWithFormat = $queryBuilder
            ->select(
                'tx_dlf_metadata.index_name AS index_name',
                'tx_dlf_metadataformat_joins.xpath AS xpath',
                'tx_dlf_metadataformat_joins.xpath_sorting AS xpath_sorting',
                'tx_dlf_metadata.is_sortable AS is_sortable',
                'tx_dlf_metadata.default_value AS default_value',
                'tx_dlf_metadata.format AS format'
            )
            ->from('tx_dlf_metadata')
            ->innerJoin(
                'tx_dlf_metadata',
                'tx_dlf_metadataformat',
                'tx_dlf_metadataformat_joins',
                $queryBuilder->expr()->eq(
                    'tx_dlf_metadataformat_joins.parent_id',
                    'tx_dlf_metadata.uid'
                )
            )
            ->innerJoin(
                'tx_dlf_metadataformat_joins',
                'tx_dlf_formats',
                'tx_dlf_formats_joins',
                $queryBuilder->expr()->eq(
                    'tx_dlf_formats_joins.uid',
                    'tx_dlf_metadataformat_joins.encoded'
                )
            )
            ->where(
                $queryBuilder->expr()->eq('tx_dlf_metadata.pid', $cPid),
                $queryBuilder->expr()->eq('tx_dlf_metadata.l18n_parent', 0),
                $queryBuilder->expr()->eq('tx_dlf_metadataformat_joins.pid', $cPid),
                $queryBuilder->expr()->eq('tx_dlf_formats_joins.type', $queryBuilder->createNamedParameter($this->mdSec[$dmdId]['type']))
            )
            ->execute();
        // Get all metadata without a format, but with a default value next.
        $queryBuilder = GeneralUtility::makeInstance(ConnectionPool::class)
            ->getQueryBuilderForTable('tx_dlf_metadata');
        // Get hidden records, too.
        $queryBuilder
            ->getRestrictions()
            ->removeByType(HiddenRestriction::class);
        $resultWithoutFormat = $queryBuilder
            ->select(
                'tx_dlf_metadata.index_name AS index_name',
                'tx_dlf_metadata.is_sortable AS is_sortable',
                'tx_dlf_metadata.default_value AS default_value',
                'tx_dlf_metadata.format AS format'
            )
            ->from('tx_dlf_metadata')
            ->where(
                $queryBuilder->expr()->eq('tx_dlf_metadata.pid', $cPid),
                $queryBuilder->expr()->eq('tx_dlf_metadata.l18n_parent', 0),
                $queryBuilder->expr()->eq('tx_dlf_metadata.format', 0),
                $queryBuilder->expr()->neq('tx_dlf_metadata.default_value', $queryBuilder->createNamedParameter(''))
            )
            ->execute();
        // Merge both result sets.
        $allResults = array_merge($resultWithFormat->fetchAllAssociative(), $resultWithoutFormat->fetchAllAssociative());

        // Get subentries separately.
        $queryBuilder = GeneralUtility::makeInstance(ConnectionPool::class)
            ->getQueryBuilderForTable('tx_dlf_metadata');
        // Get hidden records, too.
        $queryBuilder
            ->getRestrictions()
            ->removeByType(HiddenRestriction::class);
        $subentries = $queryBuilder
            ->select(
                'tx_dlf_subentries_joins.index_name AS index_name',
                'tx_dlf_metadata.index_name AS parent_index_name',
                'tx_dlf_subentries_joins.xpath AS xpath',
                'tx_dlf_subentries_joins.default_value AS default_value'
            )
            ->from('tx_dlf_metadata')
            ->innerJoin(
                'tx_dlf_metadata',
                'tx_dlf_metadataformat',
                'tx_dlf_metadataformat_joins',
                $queryBuilder->expr()->eq(
                    'tx_dlf_metadataformat_joins.parent_id',
                    'tx_dlf_metadata.uid'
                )
            )
            ->innerJoin(
                'tx_dlf_metadataformat_joins',
                'tx_dlf_metadatasubentries',
                'tx_dlf_subentries_joins',
                $queryBuilder->expr()->eq(
                    'tx_dlf_subentries_joins.parent_id',
                    'tx_dlf_metadataformat_joins.uid'
                )
            )
            ->where(
                $queryBuilder->expr()->eq('tx_dlf_metadata.pid', (int) $cPid),
                $queryBuilder->expr()->gt('tx_dlf_metadataformat_joins.subentries', 0),
                $queryBuilder->expr()->eq('tx_dlf_subentries_joins.l18n_parent', 0),
                $queryBuilder->expr()->eq('tx_dlf_subentries_joins.pid', (int) $cPid)
            )
            ->orderBy('tx_dlf_subentries_joins.sorting')
            ->execute();
        // Use fetchAllAssociative() here as well, consistent with the result sets above.
        $subentriesResult = $subentries->fetchAllAssociative();

        return array_merge($allResults, ['subentries' => $subentriesResult]);
    }

    /**
     * Get IDs of (descriptive and administrative) metadata sections
     * referenced by node of given $id. The $id may refer to either
     * a logical structure node or to a file.
     *
     * @access protected
     *
     * @param string $id The "@ID" attribute of the logical structure node or file node
     *
     * @return array
     */
    protected function getMetadataIds(string $id): array
    {
        // Load amdSecChildIds concordance
        $this->magicGetMdSec();
        $fileInfo = $this->getFileInfo($id);

        // Get DMDID and ADMID of logical structure node
        if (!empty($this->logicalUnits[$id])) {
            $dmdIds = $this->logicalUnits[$id]['dmdId'] ?? '';
            $admIds = $this->logicalUnits[$id]['admId'] ?? '';
        } else {
            // The XPath query may match nothing, so do not assume index 0 exists.
            $mdSec = $this->mets->xpath('./mets:structMap[@TYPE="LOGICAL"]//mets:div[@ID="' . $id . '"]')[0] ?? null;
            if ($mdSec) {
                $dmdIds = (string) $mdSec->attributes()->DMDID;
                $admIds = (string) $mdSec->attributes()->ADMID;
            } elseif (isset($fileInfo)) {
                $dmdIds = $fileInfo['dmdId'];
                $admIds = $fileInfo['admId'];
            } else {
                $dmdIds = '';
                $admIds = '';
            }
        }

        // Handle multiple DMDIDs/ADMIDs
        $allMdIds = explode(' ', $dmdIds);

        foreach (explode(' ', $admIds) as $admId) {
            if (isset($this->mdSec[$admId])) {
                // $admId references an actual metadata section such as techMD
                $allMdIds[] = $admId;
            } elseif (isset($this->amdSecChildIds[$admId])) {
                // $admId references a <mets:amdSec> element. Resolve child elements.
                foreach ($this->amdSecChildIds[$admId] as $childId) {
                    $allMdIds[] = $childId;
                }
            }
        }

        return array_filter(
            $allMdIds,
            function ($element) {
                return !empty($element);
            }
        );
    }

    /**
     * @see AbstractDocument::getFullText()
     */
    public function getFullText(string $id): string
    {
        $fullText = '';

        // Load fileGrps and check for full text files.
        $this->magicGetFileGrps();
        if ($this->hasFulltext) {
            $fullText = $this->getFullTextFromXml($id);
        }
        return $fullText;
    }

    /**
     * @see AbstractDocument::getStructureDepth()
     */
    public function getStructureDepth(string $logId)
    {
        $ancestors = $this->mets->xpath('./mets:structMap[@TYPE="LOGICAL"]//mets:div[@ID="' . $logId . '"]/ancestor::*');
        if (!empty($ancestors)) {
            return count($ancestors);
        } else {
            return 0;
        }
    }

    /**
     * @see AbstractDocument::init()
     */
    protected function init(string $location, array $settings): void
    {
        $this->logger = GeneralUtility::makeInstance(LogManager::class)->getLogger(get_class($this));
        $this->settings = $settings;
        // Get METS node from XML file.
        $this->registerNamespaces($this->xml);
        $mets = $this->xml->xpath('//mets:mets');
        if (!empty($mets)) {
            $this->mets = $mets[0];
            // Register namespaces.
            $this->registerNamespaces($this->mets);
        } else {
            if (!empty($location)) {
                $this->logger->error('No METS part found in document with location "' . $location . '".');
            } elseif (!empty($this->recordId)) {
                $this->logger->error('No METS part found in document with recordId "' . $this->recordId . '".');
            } else {
                $this->logger->error('No METS part found in current document.');
            }
        }
    }

    /**
     * @see AbstractDocument::loadLocation()
     */
    protected function loadLocation(string $location): bool
    {
        $fileResource = Helper::getUrl($location);
        if ($fileResource !== false) {
            $xml = Helper::getXmlFileAsString($fileResource);
            // Set some basic properties.
            if ($xml !== false) {
                $this->xml = $xml;
                return true;
            }
        }
        $this->logger->error('Could not load XML file from "' . $location . '"');
        return false;
    }

    /**
     * @see AbstractDocument::ensureHasFulltextIsSet()
     */
    protected function ensureHasFulltextIsSet(): void
    {
        // Are the fileGrps already loaded?
        if (!$this->fileGrpsLoaded) {
            $this->magicGetFileGrps();
        }
    }

    /**
     * @see AbstractDocument::setPreloadedDocument()
     */
    protected function setPreloadedDocument($preloadedDocument): bool
    {
        if ($preloadedDocument instanceof SimpleXMLElement) {
            $this->xml = $preloadedDocument;
            return true;
        }
        return false;
    }

    /**
     * @see AbstractDocument::getDocument()
     */
    protected function getDocument(): SimpleXMLElement
    {
        return $this->mets;
    }

    /**
     * This builds an array of the document's metadata sections
     *
     * @access protected
     *
     * @return array Array of metadata sections with their IDs as array key
     */
    protected function magicGetMdSec(): array
    {
        if (!$this->mdSecLoaded) {
            $this->loadFormats();

            foreach ($this->mets->xpath('./mets:dmdSec') as $dmdSecTag) {
                $dmdSec = $this->processMdSec($dmdSecTag);

                if ($dmdSec !== null) {
                    $this->mdSec[$dmdSec['id']] = $dmdSec;
                    $this->dmdSec[$dmdSec['id']] = $dmdSec;
                }
            }

            foreach ($this->mets->xpath('./mets:amdSec') as $amdSecTag) {
                $childIds = [];

                foreach ($amdSecTag->children('http://www.loc.gov/METS/') as $mdSecTag) {
                    if (!in_array($mdSecTag->getName(), self::ALLOWED_AMD_SEC)) {
                        continue;
                    }

                    // TODO: Should we check that the format may occur within this type (e.g., to ignore VIDEOMD within rightsMD)?
                    $mdSec = $this->processMdSec($mdSecTag);

                    if ($mdSec !== null) {
                        $this->mdSec[$mdSec['id']] = $mdSec;

                        $childIds[] = $mdSec['id'];
                    }
                }

                $amdSecId = (string) $amdSecTag->attributes()->ID;
                if (!empty($amdSecId)) {
                    $this->amdSecChildIds[$amdSecId] = $childIds;
                }
            }

            $this->mdSecLoaded = true;
        }
        return $this->mdSec;
    }

    /**
     * Gets the document's metadata sections
     *
     * @access protected
     *
     * @return array Array of metadata sections with their IDs as array key
     */
    protected function magicGetDmdSec(): array
    {
        $this->magicGetMdSec();
        return $this->dmdSec;
    }

    /**
     * Processes an element of METS `mdSecType`.
     *
     * @access protected
     *
     * @param SimpleXMLElement $element
     *
     * @return array|null The processed metadata section
     */
    protected function processMdSec(SimpleXMLElement $element): ?array
    {
        $mdId = (string) $element->attributes()->ID;
        if (empty($mdId)) {
            return null;
        }

        $this->registerNamespaces($element);

        $type = '';
        // Initialize $xml so that it is defined on every execution path.
        $xml = [];
        $mdType = $element->xpath('./mets:mdWrap[not(@MDTYPE="OTHER")]/@MDTYPE');
        $otherMdType = $element->xpath('./mets:mdWrap[@MDTYPE="OTHER"]/@OTHERMDTYPE');

        if (!empty($mdType) && !empty($this->formats[(string) $mdType[0]])) {
            $type = (string) $mdType[0];
            $xml = $element->xpath('./mets:mdWrap[@MDTYPE="' . $type . '"]/mets:xmlData/' . strtolower($type) . ':' . $this->formats[$type]['rootElement']);
        } elseif (!empty($otherMdType) && !empty($this->formats[(string) $otherMdType[0]])) {
            $type = (string) $otherMdType[0];
            $xml = $element->xpath('./mets:mdWrap[@MDTYPE="OTHER"][@OTHERMDTYPE="' . $type . '"]/mets:xmlData/' . strtolower($type) . ':' . $this->formats[$type]['rootElement']);
        }

        if (empty($xml)) {
            return null;
        }

        $this->registerNamespaces($xml[0]);

        return [
            'id' => $mdId,
            'section' => $element->getName(),
            'type' => $type,
            'xml' => $xml[0],
        ];
    }

    /**
     * This builds the file ID -> USE concordance
     *
     * @access protected
     *
     * @return array Array of file use groups with file IDs
     */
    protected function magicGetFileGrps(): array
    {
        if (!$this->fileGrpsLoaded) {
            // Get configured USE attributes.
            $extConf = GeneralUtility::makeInstance(ExtensionConfiguration::class)->get(self::$extKey, 'files');
            $useGrps = GeneralUtility::trimExplode(',', $extConf['fileGrpImages']);

            $configKeys = [
                'fileGrpThumbs',
                'fileGrpDownload',
                'fileGrpFulltext',
                'fileGrpAudio',
                'fileGrpScore'
            ];

            foreach ($configKeys as $key) {
                if (!empty($extConf[$key])) {
                    $useGrps = array_merge($useGrps, GeneralUtility::trimExplode(',', $extConf[$key]));
                }
            }

            foreach ($useGrps as $useGrp) {
                // Perform XPath query for each configured USE attribute
                $fileGrps = $this->mets->xpath("./mets:fileSec/mets:fileGrp[@USE='$useGrp']");
                if (!empty($fileGrps)) {
                    foreach ($fileGrps as $fileGrp) {
                        foreach ($fileGrp->children('http://www.loc.gov/METS/')->file as $file) {
                            $fileId = (string) $file->attributes()->ID;
                            $this->fileGrps[$fileId] = $useGrp;
                            $this->fileInfos[$fileId] = [
                                'fileGrp' => $useGrp,
                                'admId' => (string) $file->attributes()->ADMID,
                                'dmdId' => (string) $file->attributes()->DMDID,
                            ];
                        }
                    }
                }
            }

            // Are there any fulltext files available?
            if (
                !empty($extConf['fileGrpFulltext'])
                && array_intersect(GeneralUtility::trimExplode(',', $extConf['fileGrpFulltext']), $this->fileGrps) !== []
            ) {
                $this->hasFulltext = true;
            }
            $this->fileGrpsLoaded = true;
        }
        return $this->fileGrps;
    }

    /**
     * @see AbstractDocument::prepareMetadataArray()
     */
    protected function prepareMetadataArray(int $cPid): void
    {
        // Get all logical structure nodes with metadata.
        $ids = $this->mets->xpath('./mets:structMap[@TYPE="LOGICAL"]//mets:div[@DMDID]/@ID');
        if (!empty($ids)) {
            foreach ($ids as $id) {
                $this->metadataArray[(string) $id] = $this->getMetadata((string) $id, $cPid);
            }
        }
        // Set current PID for metadata definitions.
    }

    /**
     * This returns $this->mets via __get()
     *
     * @access protected
     *
     * @return SimpleXMLElement The XML's METS part as SimpleXMLElement object
     */
    protected function magicGetMets(): SimpleXMLElement
    {
        return $this->mets;
    }

    /**
     * @see AbstractDocument::magicGetPhysicalStructure()
     */
1415
    protected function magicGetPhysicalStructure(): array
1416
    {
1417
        // Is there no physical structure array yet?
1418
        if (!$this->physicalStructureLoaded) {
1419
            // Does the document have a structMap node of type "PHYSICAL"?
1420
            $elementNodes = $this->mets->xpath('./mets:structMap[@TYPE="PHYSICAL"]/mets:div[@TYPE="physSequence"]/mets:div');
1421
            if (!empty($elementNodes)) {
1422
                // Get file groups.
1423
                $fileUse = $this->magicGetFileGrps();
1424
                // Get the physical sequence's metadata.
1425
                $physNode = $this->mets->xpath('./mets:structMap[@TYPE="PHYSICAL"]/mets:div[@TYPE="physSequence"]');
1426
                $firstNode = $physNode[0];
1427
                $id = (string) $firstNode['ID'];
1428
                $this->physicalStructureInfo[$id]['id'] = $id;
1429
                $this->physicalStructureInfo[$id]['dmdId'] = isset($firstNode['DMDID']) ? (string) $firstNode['DMDID'] : '';
1430
                $this->physicalStructureInfo[$id]['admId'] = isset($firstNode['ADMID']) ? (string) $firstNode['ADMID'] : '';
1431
                $this->physicalStructureInfo[$id]['order'] = isset($firstNode['ORDER']) ? (string) $firstNode['ORDER'] : '';
1432
                $this->physicalStructureInfo[$id]['label'] = isset($firstNode['LABEL']) ? (string) $firstNode['LABEL'] : '';
1433
                $this->physicalStructureInfo[$id]['orderlabel'] = isset($firstNode['ORDERLABEL']) ? (string) $firstNode['ORDERLABEL'] : '';
1434
                $this->physicalStructureInfo[$id]['type'] = (string) $firstNode['TYPE'];
1435
                $this->physicalStructureInfo[$id]['contentIds'] = isset($firstNode['CONTENTIDS']) ? (string) $firstNode['CONTENTIDS'] : '';
1436
1437
                $this->getFileRepresentation($id, $firstNode);
1438
1439
                $this->physicalStructure = $this->getPhysicalElements($elementNodes, $fileUse);
1440
            }
1441
            $this->physicalStructureLoaded = true;
1442
1443
        }
1444
1445
        return $this->physicalStructure;
1446
    }

    /**
     * Get the file representations from fileSec node.
     *
     * @access private
     *
     * @param string $id
     * @param SimpleXMLElement $physicalNode
     *
     * @return void
     */
    private function getFileRepresentation(string $id, SimpleXMLElement $physicalNode): void
    {
        // Get file groups.
        $fileUse = $this->magicGetFileGrps();

        foreach ($physicalNode->children('http://www.loc.gov/METS/')->fptr as $fptr) {
            $fileNode = $fptr->area ?? $fptr;
            $fileId = (string) $fileNode->attributes()->FILEID;

            // Check if file has valid @USE attribute.
            if (!empty($fileUse[$fileId])) {
                $this->physicalStructureInfo[$id]['files'][$fileUse[$fileId]] = $fileId;
            }
        }
    }

    /**
     * Build the physical elements' array from the physical structMap node.
     *
     * @access private
     *
     * @param array $elementNodes
     * @param array $fileUse
     *
     * @return array
     */
    private function getPhysicalElements(array $elementNodes, array $fileUse): array
    {
        $elements = [];
        $id = '';

        foreach ($elementNodes as $elementNode) {
            $id = (string) $elementNode['ID'];
            $order = (int) $elementNode['ORDER'];
            $elements[$order] = $id;
            $this->physicalStructureInfo[$elements[$order]]['id'] = $id;
            $this->physicalStructureInfo[$elements[$order]]['dmdId'] = isset($elementNode['DMDID']) ? (string) $elementNode['DMDID'] : '';
            $this->physicalStructureInfo[$elements[$order]]['admId'] = isset($elementNode['ADMID']) ? (string) $elementNode['ADMID'] : '';
            $this->physicalStructureInfo[$elements[$order]]['order'] = isset($elementNode['ORDER']) ? (string) $elementNode['ORDER'] : '';
            $this->physicalStructureInfo[$elements[$order]]['label'] = isset($elementNode['LABEL']) ? (string) $elementNode['LABEL'] : '';
            $this->physicalStructureInfo[$elements[$order]]['orderlabel'] = isset($elementNode['ORDERLABEL']) ? (string) $elementNode['ORDERLABEL'] : '';
            $this->physicalStructureInfo[$elements[$order]]['type'] = (string) $elementNode['TYPE'];
            $this->physicalStructureInfo[$elements[$order]]['contentIds'] = isset($elementNode['CONTENTIDS']) ? (string) $elementNode['CONTENTIDS'] : '';
            // Get the file representations from fileSec node.
            foreach ($elementNode->children('http://www.loc.gov/METS/')->fptr as $fptr) {
                $fileNode = $fptr->area ?? $fptr;
                $fileId = (string) $fileNode->attributes()->FILEID;

                // Check if file has valid @USE attribute.
                if (!empty($fileUse[$fileId])) {
                    $this->physicalStructureInfo[$elements[$order]]['files'][$fileUse[$fileId]] = $fileId;
                }
            }

            // Get track info with begin and extent time for later assignment to the musical structure.
            if ((string) $elementNode['TYPE'] === 'track') {
                foreach ($elementNode->children('http://www.loc.gov/METS/')->fptr as $fptr) {
                    if (isset($fptr->area) && ((string) $fptr->area->attributes()->BETYPE === 'TIME')) {
                        // Check if file has valid @USE attribute.
                        if (!empty($fileUse[(string) $fptr->area->attributes()->FILEID])) {
                            $this->physicalStructureInfo[$elements[$order]]['tracks'][$fileUse[(string) $fptr->area->attributes()->FILEID]] = [
                                'fileid'  => (string) $fptr->area->attributes()->FILEID,
                                'begin'   => (string) $fptr->area->attributes()->BEGIN,
                                'betype'  => (string) $fptr->area->attributes()->BETYPE,
                                'extent'  => (string) $fptr->area->attributes()->EXTENT,
                                'exttype' => (string) $fptr->area->attributes()->EXTTYPE,
                            ];
                        }
                    }
                }
            }
        }

        // Sort array by keys (= @ORDER).
        ksort($elements);
        // Set total number of pages/tracks.
        $this->numPages = count($elements);
        // Merge and re-index the array to get numeric indexes.
        array_unshift($elements, $id);

        return $elements;
    }

    /**
     * @see AbstractDocument::magicGetSmLinks()
     */
    protected function magicGetSmLinks(): array
    {
        if (!$this->smLinksLoaded) {
            $smLinks = $this->mets->xpath('./mets:structLink/mets:smLink');
            if (!empty($smLinks)) {
                foreach ($smLinks as $smLink) {
                    $this->smLinks['l2p'][(string) $smLink->attributes('http://www.w3.org/1999/xlink')->from][] = (string) $smLink->attributes('http://www.w3.org/1999/xlink')->to;
                    $this->smLinks['p2l'][(string) $smLink->attributes('http://www.w3.org/1999/xlink')->to][] = (string) $smLink->attributes('http://www.w3.org/1999/xlink')->from;
                }
            }
            $this->smLinksLoaded = true;
        }
        return $this->smLinks;
    }

    /**
     * @see AbstractDocument::magicGetThumbnail()
     */
    protected function magicGetThumbnail(bool $forceReload = false): string
    {
        if (
            !$this->thumbnailLoaded
            || $forceReload
        ) {
            // Retain current PID.
            $cPid = $this->cPid ?: $this->pid;
            if (!$cPid) {
                $this->logger->error('Invalid PID ' . $cPid . ' for structure definitions');
                $this->thumbnailLoaded = true;
                return $this->thumbnail;
            }
            // Load extension configuration.
            $extConf = GeneralUtility::makeInstance(ExtensionConfiguration::class)->get(self::$extKey, 'files');
            if (empty($extConf['fileGrpThumbs'])) {
                $this->logger->warning('No fileGrp for thumbnails specified');
                $this->thumbnailLoaded = true;
                return $this->thumbnail;
            }
            $strctId = $this->magicGetToplevelId();
            $metadata = $this->getToplevelMetadata($cPid);

            $queryBuilder = GeneralUtility::makeInstance(ConnectionPool::class)
                ->getQueryBuilderForTable('tx_dlf_structures');

            // Get structure element to get thumbnail from.
            $result = $queryBuilder
                ->select('tx_dlf_structures.thumbnail AS thumbnail')
                ->from('tx_dlf_structures')
                ->where(
                    $queryBuilder->expr()->eq('tx_dlf_structures.pid', $cPid),
                    $queryBuilder->expr()->eq('tx_dlf_structures.index_name', $queryBuilder->expr()->literal($metadata['type'][0])),
                    Helper::whereExpression('tx_dlf_structures')
                )
                ->setMaxResults(1)
                ->execute();

            $allResults = $result->fetchAllAssociative();

            if (count($allResults) == 1) {
                $resArray = $allResults[0];
                // Get desired thumbnail structure if not the toplevel structure itself.
                if (!empty($resArray['thumbnail'])) {
                    $strctType = Helper::getIndexNameFromUid($resArray['thumbnail'], 'tx_dlf_structures', $cPid);
                    // Check if this document has a structure element of the desired type.
                    $strctIds = $this->mets->xpath('./mets:structMap[@TYPE="LOGICAL"]//mets:div[@TYPE="' . $strctType . '"]/@ID');
                    if (!empty($strctIds)) {
                        $strctId = (string) $strctIds[0];
                    }
                }
                // Load smLinks.
                $this->magicGetSmLinks();
                // Get thumbnail location.
                $fileGrpsThumb = GeneralUtility::trimExplode(',', $extConf['fileGrpThumbs']);
                while ($fileGrpThumb = array_shift($fileGrpsThumb)) {
                    if (
                        $this->magicGetPhysicalStructure()
                        && !empty($this->smLinks['l2p'][$strctId])
                        && !empty($this->physicalStructureInfo[$this->smLinks['l2p'][$strctId][0]]['files'][$fileGrpThumb])
                    ) {
                        $this->thumbnail = $this->getFileLocation($this->physicalStructureInfo[$this->smLinks['l2p'][$strctId][0]]['files'][$fileGrpThumb]);
                        break;
                    } elseif (!empty($this->physicalStructureInfo[$this->physicalStructure[1]]['files'][$fileGrpThumb])) {
                        $this->thumbnail = $this->getFileLocation($this->physicalStructureInfo[$this->physicalStructure[1]]['files'][$fileGrpThumb]);
                        break;
                    }
                }
            } else {
                $this->logger->error('No structure of type "' . $metadata['type'][0] . '" found in database');
            }
            $this->thumbnailLoaded = true;
        }
        return $this->thumbnail;
    }

    /**
     * @see AbstractDocument::magicGetToplevelId()
     */
    protected function magicGetToplevelId(): string
    {
        if (empty($this->toplevelId)) {
            // Get all logical structure nodes with metadata, but without associated METS-Pointers.
            $divs = $this->mets->xpath('./mets:structMap[@TYPE="LOGICAL"]//mets:div[@DMDID and not(./mets:mptr)]');
            if (!empty($divs)) {
                // Load smLinks.
                $this->magicGetSmLinks();
                foreach ($divs as $div) {
                    $id = (string) $div['ID'];
                    // Are there physical structure nodes for this logical structure?
                    if (array_key_exists($id, $this->smLinks['l2p'])) {
                        // Yes. That's what we're looking for.
                        $this->toplevelId = $id;
                        break;
                    } elseif (empty($this->toplevelId)) {
                        // No. Remember this anyway, but keep looking for a better one.
                        $this->toplevelId = $id;
                    }
                }
            }
        }
        return $this->toplevelId;
    }

    /**
     * Try to determine URL of parent document.
     *
     * @access public
     *
     * @return string
     */
    public function magicGetParentHref(): string
    {
        if (empty($this->parentHref)) {
            // Get the closest ancestor of the current document which has a MPTR child.
            $parentMptr = $this->mets->xpath('./mets:structMap[@TYPE="LOGICAL"]//mets:div[@ID="' . $this->toplevelId . '"]/ancestor::mets:div[./mets:mptr][1]/mets:mptr');
            if (!empty($parentMptr)) {
                $this->parentHref = (string) $parentMptr[0]->attributes('http://www.w3.org/1999/xlink')->href;
            }
        }

        return $this->parentHref;
    }

    /**
     * This magic method is executed prior to any serialization of the object
     * @see __wakeup()
     *
     * @access public
     *
     * @return array Properties to be serialized
     */
    public function __sleep(): array
    {
        // SimpleXMLElement objects can't be serialized, thus save the XML as string for serialization
        $this->asXML = $this->xml->asXML();
        return ['pid', 'recordId', 'parentId', 'asXML'];
    }

    /**
     * This magic method returns a string representation of the object
     *
     * @access public
     *
     * @return string String representing the METS object
     */
    public function __toString(): string
    {
        $xml = new DOMDocument('1.0', 'utf-8');
        $xml->appendChild($xml->importNode(dom_import_simplexml($this->mets), true));
        $xml->formatOutput = true;
        return $xml->saveXML();
    }

    /**
     * This magic method is executed after the object is deserialized
     * @see __sleep()
     *
     * @access public
     *
     * @return void
     */
    public function __wakeup(): void
    {
        $xml = Helper::getXmlFileAsString($this->asXML);
        if ($xml !== false) {
            $this->asXML = '';
            $this->xml = $xml;
            // Rebuild the unserializable properties.
            $this->init('', $this->settings);
        } else {
            $this->logger = GeneralUtility::makeInstance(LogManager::class)->getLogger(static::class);
            $this->logger->error('Could not load XML after deserialization');
        }
    }

    /**
     * This builds an array of the document's musical structure
     *
     * @access protected
     *
     * @return array Array of musical elements' id, type, label and file representations ordered
     * by "@ORDER" attribute
     */
    protected function magicGetMusicalStructure(): array
    {
        // Is there no musical structure array yet?
        if (!$this->musicalStructureLoaded) {
            $this->numMeasures = 0;
            // Does the document have a structMap node of type "MUSICAL"?
            $elementNodes = $this->mets->xpath('./mets:structMap[@TYPE="MUSICAL"]/mets:div[@TYPE="measures"]/mets:div');
            if (!empty($elementNodes)) {
                $musicalSeq = [];
                // Get file groups.
                $fileUse = $this->magicGetFileGrps();

                // Get the musical sequence's metadata.
                $musicalNode = $this->mets->xpath('./mets:structMap[@TYPE="MUSICAL"]/mets:div[@TYPE="measures"]');
                $musicalSeq[0] = (string) $musicalNode[0]['ID'];
                $this->musicalStructureInfo[$musicalSeq[0]]['id'] = (string) $musicalNode[0]['ID'];
                $this->musicalStructureInfo[$musicalSeq[0]]['dmdId'] = (isset($musicalNode[0]['DMDID']) ? (string) $musicalNode[0]['DMDID'] : '');
                $this->musicalStructureInfo[$musicalSeq[0]]['order'] = (isset($musicalNode[0]['ORDER']) ? (string) $musicalNode[0]['ORDER'] : '');
                $this->musicalStructureInfo[$musicalSeq[0]]['label'] = (isset($musicalNode[0]['LABEL']) ? (string) $musicalNode[0]['LABEL'] : '');
                $this->musicalStructureInfo[$musicalSeq[0]]['orderlabel'] = (isset($musicalNode[0]['ORDERLABEL']) ? (string) $musicalNode[0]['ORDERLABEL'] : '');
                $this->musicalStructureInfo[$musicalSeq[0]]['type'] = (string) $musicalNode[0]['TYPE'];
                $this->musicalStructureInfo[$musicalSeq[0]]['contentIds'] = (isset($musicalNode[0]['CONTENTIDS']) ? (string) $musicalNode[0]['CONTENTIDS'] : '');
                // Get the file representations from fileSec node.
                // TODO: Do we need this for the measurement container element? Can it have any files?
                foreach ($musicalNode[0]->children('http://www.loc.gov/METS/')->fptr as $fptr) {
                    // Check if file has valid @USE attribute.
                    if (!empty($fileUse[(string) $fptr->attributes()->FILEID])) {
                        $this->musicalStructureInfo[$musicalSeq[0]]['files'][$fileUse[(string) $fptr->attributes()->FILEID]] = [
                            'fileid' => (string) $fptr->area->attributes()->FILEID,
                            'begin' => (string) $fptr->area->attributes()->BEGIN,
                            'end' => (string) $fptr->area->attributes()->END,
                            'type' => (string) $fptr->area->attributes()->BETYPE,
                            'shape' => (string) $fptr->area->attributes()->SHAPE,
                            'coords' => (string) $fptr->area->attributes()->COORDS
                        ];
                    }

                    if ((string) $fptr->area->attributes()->BETYPE === 'TIME') {
                        $this->musicalStructureInfo[$musicalSeq[0]]['begin'] = (string) $fptr->area->attributes()->BEGIN;
                        $this->musicalStructureInfo[$musicalSeq[0]]['end'] = (string) $fptr->area->attributes()->END;
                    }
                }

                $elements = [];

                // Build the musical elements' array from the musical structMap node.
                foreach ($elementNodes as $elementNode) {
                    $elements[(int) $elementNode['ORDER']] = (string) $elementNode['ID'];
                    $this->musicalStructureInfo[$elements[(int) $elementNode['ORDER']]]['id'] = (string) $elementNode['ID'];
                    $this->musicalStructureInfo[$elements[(int) $elementNode['ORDER']]]['dmdId'] = (isset($elementNode['DMDID']) ? (string) $elementNode['DMDID'] : '');
                    $this->musicalStructureInfo[$elements[(int) $elementNode['ORDER']]]['order'] = (isset($elementNode['ORDER']) ? (string) $elementNode['ORDER'] : '');
                    $this->musicalStructureInfo[$elements[(int) $elementNode['ORDER']]]['label'] = (isset($elementNode['LABEL']) ? (string) $elementNode['LABEL'] : '');
                    $this->musicalStructureInfo[$elements[(int) $elementNode['ORDER']]]['orderlabel'] = (isset($elementNode['ORDERLABEL']) ? (string) $elementNode['ORDERLABEL'] : '');
                    $this->musicalStructureInfo[$elements[(int) $elementNode['ORDER']]]['type'] = (string) $elementNode['TYPE'];
                    $this->musicalStructureInfo[$elements[(int) $elementNode['ORDER']]]['contentIds'] = (isset($elementNode['CONTENTIDS']) ? (string) $elementNode['CONTENTIDS'] : '');
                    // Get the file representations from fileSec node.

                    foreach ($elementNode->children('http://www.loc.gov/METS/')->fptr as $fptr) {
                        // Check if file has valid @USE attribute.
                        if (!empty($fileUse[(string) $fptr->area->attributes()->FILEID])) {
                            $this->musicalStructureInfo[$elements[(int) $elementNode['ORDER']]]['files'][$fileUse[(string) $fptr->area->attributes()->FILEID]] = [
                                'fileid' => (string) $fptr->area->attributes()->FILEID,
                                'begin' => (string) $fptr->area->attributes()->BEGIN,
                                'end' => (string) $fptr->area->attributes()->END,
                                'type' => (string) $fptr->area->attributes()->BETYPE,
                                'shape' => (string) $fptr->area->attributes()->SHAPE,
                                'coords' => (string) $fptr->area->attributes()->COORDS
                            ];
                        }

                        if ((string) $fptr->area->attributes()->BETYPE === 'TIME') {
                            $this->musicalStructureInfo[$elements[(int) $elementNode['ORDER']]]['begin'] = (string) $fptr->area->attributes()->BEGIN;
                            $this->musicalStructureInfo[$elements[(int) $elementNode['ORDER']]]['end'] = (string) $fptr->area->attributes()->END;
                        }
                    }
                }

                // Sort array by keys (= @ORDER).
                ksort($elements);
                // Set total number of measures.
                $this->numMeasures = count($elements);

                // Get the track/page info (begin and extent time).
                $this->musicalStructure = [];
                $measurePages = [];
                foreach ($this->magicGetPhysicalStructureInfo() as $physicalId => $page) {
                    // Skip physical elements without a file in the DEFAULT file group.
                    if (!empty($page['files']['DEFAULT'])) {
                        $measurePages[$physicalId] = $page['files']['DEFAULT'];
                    }
                }
                // Build final musicalStructure: assign pages to measures.
                foreach ($this->musicalStructureInfo as $measureId => $measureInfo) {
                    foreach ($measurePages as $physicalId => $file) {
                        // Measures without a DEFAULT file never match a page.
                        if (($measureInfo['files']['DEFAULT']['fileid'] ?? null) === $file) {
                            $this->musicalStructure[$measureInfo['order']] = [
                                'measureid' => $measureId,
                                'physicalid' => $physicalId,
                                'page' => array_search($physicalId, $this->physicalStructure)
                            ];
                        }
                    }
                }
            }
            $this->musicalStructureLoaded = true;
        }

        return $this->musicalStructure;
    }

    /**
     * This gives an array of the document's musical structure metadata
     *
     * @access protected
     *
     * @return array Array of elements' type, label and file representations ordered by "@ID" attribute
     */
    protected function magicGetMusicalStructureInfo(): array
    {
        // Is there no musical structure array yet?
        if (!$this->musicalStructureLoaded) {
            // Build musical structure array.
            $this->magicGetMusicalStructure();
        }
        return $this->musicalStructureInfo;
    }

    /**
     * This returns $this->numMeasures via __get()
     *
     * @access protected
     *
     * @return int The total number of measures
     */
    protected function magicGetNumMeasures(): int
    {
        $this->magicGetMusicalStructure();
        return $this->numMeasures;
    }
}