Passed
Pull Request — master (#650)
by Alexander
03:43 queued 51s
created

Document::getInstance()   F

Complexity

Conditions 28
Paths 1022

Size

Total Lines 152
Code Lines 94

Duplication

Lines 0
Ratio 0 %

Importance

Changes 0
Metric Value
eloc 94
dl 0
loc 152
rs 0
c 0
b 0
f 0
cc 28
nc 1022
nop 3

How to fix: Long Method, Complexity

Long Method

Small methods make your code easier to understand, in particular if combined with a good name. Besides, if your method is small, finding a good name is usually much easier.

For example, if you find yourself adding comments to a method's body, this is usually a good sign to extract the commented part to a new method, and use the comment as a starting point when coming up with a good name for this new method.

Commonly applied refactorings include:
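One refactoring commonly applied to long methods is Extract Method. The following is a minimal sketch of the idea, not code from the inspected class: all function names and the page-label scenario are hypothetical. It shows a function whose commented steps are each moved into a small, named function.

```php
<?php
// Hypothetical example: none of these names come from the inspected class.

/** Before: one function mixes sanitizing, lookup and formatting. */
function describePageLong($page, array $labels): string
{
    // Sanitize input.
    $page = max((int) $page, 0);
    // Look up the label.
    $label = $labels[$page] ?? 'unknown';
    // Format the result.
    return sprintf('Page %d (%s)', $page, $label);
}

/** After: each commented step became a small, named function. */
function sanitizePageNumber($page): int
{
    return max((int) $page, 0);
}

function findPageLabel(int $page, array $labels): string
{
    return $labels[$page] ?? 'unknown';
}

function describePage($page, array $labels): string
{
    $page = sanitizePageNumber($page);
    return sprintf('Page %d (%s)', $page, findPageLabel($page, $labels));
}
```

Both variants return the same string for the same input; the second one only moves each commented step behind a name that replaces the comment.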

<?php

/**
 * (c) Kitodo. Key to digital objects e.V. <[email protected]>
 *
 * This file is part of the Kitodo and TYPO3 projects.
 *
 * @license GNU General Public License version 3 or later.
 * For the full copyright and license information, please read the
 * LICENSE.txt file that was distributed with this source code.
 */

namespace Kitodo\Dlf\Common\Document;

use Kitodo\Dlf\Common\Helper;
use Kitodo\Dlf\Common\Indexer;
use TYPO3\CMS\Core\Configuration\ExtensionConfiguration;
use TYPO3\CMS\Core\Database\ConnectionPool;
use TYPO3\CMS\Core\Log\LogManager;
use TYPO3\CMS\Core\Utility\GeneralUtility;
use TYPO3\CMS\Core\Utility\MathUtility;
use Ubl\Iiif\Presentation\Common\Model\Resources\IiifResourceInterface;
use Ubl\Iiif\Tools\IiifHelper;

/**
 * Document class for the 'dlf' extension
 *
 * @author Sebastian Meyer <[email protected]>
 * @author Henrik Lochmann <[email protected]>
 * @package TYPO3
 * @subpackage dlf
 * @access public
 * @property int $cPid This holds the PID for the configuration
 * @property-read string $location This holds the document's location
 * @property-read array $metadataArray This holds the document's parsed metadata array
 * @property-read int $numPages This holds the total number of pages
 * @property-read int $parentId This holds the UID of the parent document or zero if not multi-volumed
 * @property-read array $physicalStructure This holds the physical structure
 * @property-read array $physicalStructureInfo This holds the physical structure metadata
 * @property-read int $pid This holds the PID of the document or zero if not in database
 * @property-read bool $ready Is the document instantiated successfully?
 * @property-read string $recordId The METS file's / IIIF manifest's record identifier
 * @property-read int $rootId This holds the UID of the root document or zero if not multi-volumed
 * @property-read array $smLinks This holds the smLinks between logical and physical structMap
 * @property-read array $tableOfContents This holds the logical structure
 * @property-read string $thumbnail This holds the document's thumbnail location
 * @property-read string $toplevelId This holds the toplevel structure's @ID (METS) or the manifest's @id (IIIF)
 * @property-read mixed $uid This holds the UID or the URL of the document
 * @abstract
 */
abstract class Document
{
    /**
     * This holds the logger
     *
     * @var LogManager
     * @access protected
     */
    protected $logger;

    /**
     * This holds the PID for the configuration
     *
     * @var int
     * @access protected
     */
    protected $cPid = 0;

    /**
     * The extension key
     *
     * @var string
     * @access public
     */
    public static $extKey = 'dlf';

    /**
     * This holds the configuration for all supported metadata encodings
     * @see loadFormats()
     *
     * @var array
     * @access protected
     */
    protected $formats = [
        'OAI' => [
            'rootElement' => 'OAI-PMH',
            'namespaceURI' => 'http://www.openarchives.org/OAI/2.0/',
        ],
        'METS' => [
            'rootElement' => 'mets',
            'namespaceURI' => 'http://www.loc.gov/METS/',
        ],
        'XLINK' => [
            'rootElement' => 'xlink',
            'namespaceURI' => 'http://www.w3.org/1999/xlink',
        ]
    ];

    /**
     * Are the available metadata formats loaded?
     * @see $formats
     *
     * @var bool
     * @access protected
     */
    protected $formatsLoaded = false;

    /**
     * Last searched logical and physical page
     *
     * @var array
     * @access protected
     */
    protected $lastSearchedPhysicalPage = ['logicalPage' => null, 'physicalPage' => null];

    /**
     * This holds the document's location
     *
     * @var string
     * @access protected
     */
    protected $location = '';

    /**
     * This holds the logical units
     *
     * @var array
     * @access protected
     */
    protected $logicalUnits = [];

    /**
     * This holds the document's parsed metadata array with their corresponding
     * structMap//div's ID (METS) or Range / Manifest / Sequence ID (IIIF) as array key
     *
     * @var array
     * @access protected
     */
    protected $metadataArray = [];

    /**
     * Is the metadata array loaded?
     * @see $metadataArray
     *
     * @var bool
     * @access protected
     */
    protected $metadataArrayLoaded = false;

    /**
     * This holds the total number of pages
     *
     * @var int
     * @access protected
     */
    protected $numPages = 0;

    /**
     * This holds the UID of the parent document or zero if not multi-volumed
     *
     * @var int
     * @access protected
     */
    protected $parentId = 0;

    /**
     * This holds the physical structure
     *
     * @var array
     * @access protected
     */
    protected $physicalStructure = [];

    /**
     * This holds the physical structure metadata
     *
     * @var array
     * @access protected
     */
    protected $physicalStructureInfo = [];

    /**
     * Is the physical structure loaded?
     * @see $physicalStructure
     *
     * @var bool
     * @access protected
     */
    protected $physicalStructureLoaded = false;

    /**
     * This holds the PID of the document or zero if not in database
     *
     * @var int
     * @access protected
     */
    protected $pid = 0;

    /**
     * Is the document instantiated successfully?
     *
     * @var bool
     * @access protected
     */
    protected $ready = false;

    /**
     * The METS file's / IIIF manifest's record identifier
     *
     * @var string
     * @access protected
     */
    protected $recordId;

    /**
     * This holds the singleton object of the document
     *
     * @var array (\Kitodo\Dlf\Common\Document\Document)
     * @static
     * @access protected
     */
    protected static $registry = [];

    /**
     * This holds the UID of the root document or zero if not multi-volumed
     *
     * @var int
     * @access protected
     */
    protected $rootId = 0;

    /**
     * Is the root id loaded?
     * @see $rootId
     *
     * @var bool
     * @access protected
     */
    protected $rootIdLoaded = false;

    /**
     * This holds the smLinks between logical and physical structMap
     *
     * @var array
     * @access protected
     */
    protected $smLinks = ['l2p' => [], 'p2l' => []];

    /**
     * Are the smLinks loaded?
     * @see $smLinks
     *
     * @var bool
     * @access protected
     */
    protected $smLinksLoaded = false;

    /**
     * This holds the logical structure
     *
     * @var array
     * @access protected
     */
    protected $tableOfContents = [];

    /**
     * Is the table of contents loaded?
     * @see $tableOfContents
     *
     * @var bool
     * @access protected
     */
    protected $tableOfContentsLoaded = false;

    /**
     * This holds the document's thumbnail location
     *
     * @var string
     * @access protected
     */
    protected $thumbnail = '';

    /**
     * Is the document's thumbnail location loaded?
     * @see $thumbnail
     *
     * @var bool
     * @access protected
     */
    protected $thumbnailLoaded = false;

    /**
     * This holds the toplevel structure's @ID (METS) or the manifest's @id (IIIF)
     *
     * @var string
     * @access protected
     */
    protected $toplevelId = '';

    /**
     * This holds the UID or the URL of the document
     *
     * @var mixed
     * @access protected
     */
    protected $uid = 0;

    /**
     * This holds the whole XML file as \SimpleXMLElement object
     *
     * @var \SimpleXMLElement
     * @access protected
     */
    protected $xml;

    /**
     * This clears the static registry to prevent memory exhaustion
     *
     * @access public
     *
     * @static
     *
     * @return void
     */
    public static function clearRegistry()
    {
        // Reset registry array.
        self::$registry = [];
    }

    /**
     * This is a singleton class, thus an instance must be created by this method
     *
     * @access public
     *
     * @static
     *
     * @param mixed $uid: The unique identifier of the document to parse, the URL of the XML file or the IRI of the IIIF resource
     * @param int $pid: If > 0, then only a document with this PID gets loaded
     * @param bool $forceReload: Force reloading the document instead of returning the cached instance
     *
     * @return \Kitodo\Dlf\Common\Document\Document Instance of this class, either MetsDocument or IiifManifest
     */
    public static function &getInstance($uid, $pid = 0, $forceReload = false)
    {
        // Sanitize input.
        $pid = max(intval($pid), 0);
        if (!$forceReload) {
            $regObj = Helper::digest($uid);
            if (
                is_object(self::$registry[$regObj])
                && self::$registry[$regObj] instanceof self
            ) {
                // Check if instance has given PID.
                if (
                    !$pid
                    || !self::$registry[$regObj]->pid
                    || $pid == self::$registry[$regObj]->pid
                ) {
                    // Return singleton instance if available.
                    return self::$registry[$regObj];
                }
            } else {
                // Check the user's session...
                $sessionData = Helper::loadFromSession(get_called_class());
                if (
                    is_object($sessionData[$regObj])
                    && $sessionData[$regObj] instanceof self
                ) {
                    // Check if instance has given PID.
                    if (
                        !$pid
                        || !$sessionData[$regObj]->pid
                        || $pid == $sessionData[$regObj]->pid
                    ) {
                        // ...and restore registry.
                        self::$registry[$regObj] = $sessionData[$regObj];
                        return self::$registry[$regObj];
                    }
                }
            }
        }
        // Create new instance depending on format (METS or IIIF) ...
        $instance = null;
        $documentFormat = null;
        $xml = null;
        $iiif = null;
        // Try to get document format from database
        if (MathUtility::canBeInterpretedAsInteger($uid)) {
            $queryBuilder = GeneralUtility::makeInstance(ConnectionPool::class)
                ->getQueryBuilderForTable('tx_dlf_documents');

            $queryBuilder
                ->select(
                    'tx_dlf_documents.location AS location',
                    'tx_dlf_documents.document_format AS document_format'
                )
                ->from('tx_dlf_documents');

            // Get UID of document with given record identifier.
            if ($pid) {
                $queryBuilder
                    ->where(
                        $queryBuilder->expr()->eq('tx_dlf_documents.uid', intval($uid)),
                        $queryBuilder->expr()->eq('tx_dlf_documents.pid', intval($pid)),
                        Helper::whereExpression('tx_dlf_documents')
                    );
            } else {
                $queryBuilder
                    ->where(
                        $queryBuilder->expr()->eq('tx_dlf_documents.uid', intval($uid)),
                        Helper::whereExpression('tx_dlf_documents')
                    );
            }

            $result = $queryBuilder
                ->setMaxResults(1)
                ->execute();

            if ($resArray = $result->fetch()) {
                $documentFormat = $resArray['document_format'];
            }
        } else {
            // Get document format from content of remote document
            // Cast to string for safety reasons.
            $location = (string) $uid;
            // Try to load a file from the url
            if (GeneralUtility::isValidUrl($location)) {
                // Load extension configuration
                $extConf = GeneralUtility::makeInstance(ExtensionConfiguration::class)->get(self::$extKey);
                // Set user-agent to identify self when fetching XML data.
                if (!empty($extConf['useragent'])) {
                    @ini_set('user_agent', $extConf['useragent']);
                }
                $content = GeneralUtility::getUrl($location);
                if ($content !== false) {
                    // TODO use single place to load xml
                    // Turn off libxml's error logging.
                    $libxmlErrors = libxml_use_internal_errors(true);
                    // Disables the functionality to allow external entities to be loaded when parsing the XML, must be kept
                    $previousValueOfEntityLoader = libxml_disable_entity_loader(true);
                    // Try to load XML from file.
                    $xml = simplexml_load_string($content);
                    // reset entity loader setting
                    libxml_disable_entity_loader($previousValueOfEntityLoader);
                    // Reset libxml's error logging.
                    libxml_use_internal_errors($libxmlErrors);
                    if ($xml !== false) {
                        /* @var $xml \SimpleXMLElement */
                        $xml->registerXPathNamespace('mets', 'http://www.loc.gov/METS/');
                        $xpathResult = $xml->xpath('//mets:mets');
                        $documentFormat = !empty($xpathResult) ? 'METS' : null;
                    } else {
                        // Try to load file as IIIF resource instead.
                        $contentAsJsonArray = json_decode($content, true);
                        if ($contentAsJsonArray !== null) {
                            // Load plugin configuration.
                            $conf = GeneralUtility::makeInstance(ExtensionConfiguration::class)->get(self::$extKey);
                            IiifHelper::setUrlReader(IiifUrlReader::getInstance());

Bug: The type Kitodo\Dlf\Common\Document\IiifUrlReader was not found. Maybe you did not declare it correctly or list all dependencies?

The issue could also be caused by a filter entry in the build configuration. If the path has been excluded in your configuration, e.g. excluded_paths: ["lib/*"], you can move it to the dependency path list as follows:

filter:
    dependency_paths: ["lib/*"]

For further information see https://scrutinizer-ci.com/docs/tools/php/php-scrutinizer/#list-dependency-paths

                            IiifHelper::setMaxThumbnailHeight($conf['iiifThumbnailHeight']);
                            IiifHelper::setMaxThumbnailWidth($conf['iiifThumbnailWidth']);
                            $iiif = IiifHelper::loadIiifResource($contentAsJsonArray);
                            if ($iiif instanceof IiifResourceInterface) {
                                $documentFormat = 'IIIF';
                            }
                        }
                    }
                }
            }
        }
        // Sanitize input.
        $pid = max(intval($pid), 0);
        if ($documentFormat == 'METS') {
            $instance = new MetsDocument($uid, $pid, $xml);
        } elseif ($documentFormat == 'IIIF') {
            $instance = new IiifManifest($uid, $pid, $iiif);
        }
        // Save instance to registry.
        if (
            $instance instanceof self
            && $instance->ready) {
            self::$registry[Helper::digest($instance->uid)] = $instance;
            if ($instance->uid != $instance->location) {
                self::$registry[Helper::digest($instance->location)] = $instance;
            }
            // Load extension configuration
            $extConf = GeneralUtility::makeInstance(ExtensionConfiguration::class)->get(self::$extKey);
            // Save registry to session if caching is enabled.
            if (!empty($extConf['caching'])) {
                Helper::saveToSession(self::$registry, get_class($instance));
            }
        }
        $instance->logger = GeneralUtility::makeInstance(LogManager::class)->getLogger(get_class($instance));

Bug: It seems like $instance can also be of type null; however, parameter $object of get_class() only seems to accept an object. Maybe add an additional type check?

If this is a false positive, you can also ignore this issue in your code via the ignore-type annotation:

        $instance->logger = GeneralUtility::makeInstance(LogManager::class)->getLogger(get_class(/** @scrutinizer ignore-type */ $instance));

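As an alternative to the ignore annotation, the additional type check can be sketched as follows. This is only an illustration under stated assumptions: SketchDocument and createWithLogger() are hypothetical stand-ins, the factory callable stands for whatever creates the instance, and a plain string replaces the real logger object.

```php
<?php
// Hypothetical stand-in; the real classes live in the inspected extension.
class SketchDocument
{
    public $logger;
}

/**
 * Attaches a logger name only if an instance was actually created.
 * $factory is an assumed callable that may return null.
 */
function createWithLogger(callable $factory): ?SketchDocument
{
    $instance = $factory();
    if ($instance instanceof SketchDocument) {
        // get_class() is only reached with a real object.
        $instance->logger = 'logger for ' . get_class($instance);
    }
    return $instance;
}
```

With the instanceof guard in place, a failed lookup simply yields null and get_class() is never called on it.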
        // Return new instance.
        return $instance;
    }
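
Much of the branching in getInstance() comes from guessing the format of a remote document. One way this step might be extracted is sketched below. It is an assumption-laden illustration, not the extension's code: detectDocumentFormat() is a hypothetical name, the 'JSON' result only means the content decoded as JSON (the IIIF check via IiifHelper is left out), and the libxml_disable_entity_loader() calls are omitted because that function is deprecated as of PHP 8.0.

```php
<?php
/**
 * Guesses the document format from raw content.
 * Returns 'METS', 'JSON' or null. Hypothetical helper for illustration.
 */
function detectDocumentFormat(string $content): ?string
{
    // Collect libxml errors internally instead of emitting warnings.
    $previous = libxml_use_internal_errors(true);
    $xml = simplexml_load_string($content);
    libxml_clear_errors();
    // Restore the previous error handling mode.
    libxml_use_internal_errors($previous);

    if ($xml !== false) {
        // Well-formed XML: it counts as METS only if a mets:mets element exists.
        $xml->registerXPathNamespace('mets', 'http://www.loc.gov/METS/');
        return !empty($xml->xpath('//mets:mets')) ? 'METS' : null;
    }
    // Not XML: fall back to JSON, which a IIIF manifest would be.
    return json_decode($content, true) !== null ? 'JSON' : null;
}
```

A caller could then branch once on the returned string, which keeps the nesting depth of the calling method low.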

    /**
     * Source document PHP object which is represented by a Document instance
     *
     * @access protected
     *
     * @abstract
     *
     * @return \SimpleXMLElement|IiifResourceInterface A PHP object representation of
     * the current document. SimpleXMLElement for METS, IiifResourceInterface for IIIF
     */
    protected abstract function getDocument();

    /**
     * This extracts all the metadata for a logical structure node
     *
     * @access public
     *
     * @abstract
     *
     * @param string $id: The @ID attribute of the logical structure node (METS) or the @id property
     * of the Manifest / Range (IIIF)
     * @param int $cPid: The PID for the metadata definitions
     *                       (defaults to $this->cPid or $this->pid)
     *
     * @return array The logical structure node's / the IIIF resource's parsed metadata array
     */
    public abstract function getMetadata($id, $cPid = 0);

    /**
     * This gets the location of a downloadable file for a physical page or track
     *
     * @access public
     *
     * @abstract
     *
     * @param string $id: The @ID attribute of the file node (METS) or the @id property of the IIIF resource
     *
     * @return string    The file's location as URL
     */
    public abstract function getDownloadLocation($id);

    /**
     * This gets the location of a file representing a physical page or track
     *
     * @access public
     *
     * @abstract
     *
     * @param string $id: The @ID attribute of the file node (METS) or the @id property of the IIIF resource
     *
     * @return string The file's location as URL
     */
    public abstract function getFileLocation($id);

    /**
     * This gets the MIME type of a file representing a physical page or track
     *
     * @access public
     *
     * @abstract
     *
     * @param string $id: The @ID attribute of the file node
     *
     * @return string The file's MIME type
     */
    public abstract function getFileMimeType($id);

    /**
     * This gets details about a logical structure element
     *
     * @access public
     *
     * @abstract
     *
     * @param string $id: The @ID attribute of the logical structure node (METS) or
     * the @id property of the Manifest / Range (IIIF)
     * @param bool $recursive: Whether to include the child elements / resources
     *
     * @return array Array of the element's id, label, type and physical page indexes/mptr link
     */
    public abstract function getLogicalStructure($id, $recursive = false);

    /**
     * This returns the first corresponding physical page number of a given logical page label
     *
     * @access public
     *
     * @param string $logicalPage: The label (or a part of the label) of the logical page
     *
     * @return int The physical page number
     */
    public function getPhysicalPage($logicalPage)
    {
        if (
            !empty($this->lastSearchedPhysicalPage['logicalPage'])
            && $this->lastSearchedPhysicalPage['logicalPage'] == $logicalPage
        ) {
            return $this->lastSearchedPhysicalPage['physicalPage'];
        } else {
            $physicalPage = 0;
            foreach ($this->physicalStructureInfo as $page) {
                if (strpos($page['orderlabel'], $logicalPage) !== false) {
                    $this->lastSearchedPhysicalPage['logicalPage'] = $logicalPage;
                    $this->lastSearchedPhysicalPage['physicalPage'] = $physicalPage;
                    return $physicalPage;
                }
                $physicalPage++;
            }
        }
        return 1;
    }

    /**
     * This determines a title for the given document
     *
     * @access public
     *
     * @static
     *
     * @param int $uid: The UID of the document
     * @param bool $recursive: Search superior documents for a title, too?
     *
     * @return string The title of the document itself or a parent document
     */
    public static function getTitle($uid, $recursive = false)
    {
        $logger = GeneralUtility::makeInstance(LogManager::class)->getLogger(__CLASS__);

        $title = '';
        // Sanitize input.
        $uid = max(intval($uid), 0);
        if ($uid) {
            $queryBuilder = GeneralUtility::makeInstance(ConnectionPool::class)
                ->getQueryBuilderForTable('tx_dlf_documents');

            $result = $queryBuilder
                ->select(
                    'tx_dlf_documents.title',
                    'tx_dlf_documents.partof'
                )
                ->from('tx_dlf_documents')
                ->where(
                    $queryBuilder->expr()->eq('tx_dlf_documents.uid', $uid),
                    Helper::whereExpression('tx_dlf_documents')
                )
                ->setMaxResults(1)
                ->execute();

            if ($resArray = $result->fetch()) {
                // Get title information.
                $title = $resArray['title'];
                $partof = $resArray['partof'];
                // Search parent documents recursively for a title?
                if (
                    $recursive
                    && empty($title)
                    && intval($partof)
                    && $partof != $uid
                ) {
                    $title = self::getTitle($partof, true);
                }
            } else {
                $logger->warning('No document with UID ' . $uid . ' found or document not accessible');
            }
        } else {
            $logger->error('Invalid UID ' . $uid . ' for document');
        }
        return $title;
    }

    /**
     * This extracts all the metadata for the toplevel logical structure node / resource
     *
     * @access public
     *
     * @param int $cPid: The PID for the metadata definitions
     *
     * @return array The logical structure node's / resource's parsed metadata array
     */
    public function getTitleData($cPid = 0)
    {
        $titledata = $this->getMetadata($this->_getToplevelId(), $cPid);
        // Add information from METS structural map to titledata array.
        if ($this instanceof MetsDocument) {
            $this->addMetadataFromMets($titledata, $this->_getToplevelId());
        }
        // Set record identifier for METS file / IIIF manifest if not present.
        if (
            is_array($titledata)
            && array_key_exists('record_id', $titledata)
        ) {
            if (
                !empty($this->recordId)
                && !in_array($this->recordId, $titledata['record_id'])
            ) {
                array_unshift($titledata['record_id'], $this->recordId);
            }
        }
        return $titledata;
    }

    /**
     * Get the tree depth of a logical structure element within the table of contents
     *
     * @access public
     *
     * @param string $logId: The id of the logical structure element whose depth is requested
     * @return int|bool tree depth as integer or false if no element with $logId exists within the TOC.
     */
    public function getStructureDepth($logId)
    {
        return $this->getTreeDepth($this->_getTableOfContents(), 1, $logId);
    }

    /**
     * Register all available namespaces for a \SimpleXMLElement object
     *
     * @access public
     *
     * @param \SimpleXMLElement|\DOMXPath &$obj: \SimpleXMLElement or \DOMXPath object
     *
     * @return void
     */
    public function registerNamespaces(&$obj)
    {
        // TODO Check usage. XML specific method does not seem to be used anywhere outside this class within the project, but it is public and may be used by extensions.
        $this->loadFormats();
        // Do we have a \SimpleXMLElement or \DOMXPath object?
        if ($obj instanceof \SimpleXMLElement) {
            $method = 'registerXPathNamespace';
        } elseif ($obj instanceof \DOMXPath) {
            $method = 'registerNamespace';
        } else {
            $this->logger->error('Given object is neither a SimpleXMLElement nor a DOMXPath instance');

Bug: The method error() does not exist on TYPO3\CMS\Core\Log\LogManager.

If this is a false positive, you can also ignore this issue in your code via the ignore-call annotation:

            $this->logger->/** @scrutinizer ignore-call */
                           error('Given object is neither a SimpleXMLElement nor a DOMXPath instance');

This check looks for calls to methods that do not seem to exist on a given type. It looks for the method on the type itself as well as in inherited classes or implemented interfaces.

This is most likely a typographical error or the method has been renamed.

            return;
        }
        // Register metadata format's namespaces.
        foreach ($this->formats as $enc => $conf) {
            $obj->$method(strtolower($enc), $conf['namespaceURI']);
        }
    }

    /**
     * This saves the document to the database and index
     *
     * @access public
     *
     * @param int $pid: The PID of the saved record
     * @param int $core: The UID of the Solr core for indexing
     * @param int|string $owner: UID or index_name of owner to set while indexing
     *
     * @return bool true on success or false on failure
     */
    public function save($pid = 0, $core = 0, $owner = null)
    {
        if (\TYPO3_MODE !== 'BE') {
            $this->logger->error('Saving a document is only allowed in the backend');
            return false;
        }
        // Make sure $pid is a non-negative integer.
        $pid = max(intval($pid), 0);
        // Make sure $core is a non-negative integer.
        $core = max(intval($core), 0);
        // If $pid is not given, try to get it elsewhere.
        if (
            !$pid
            && $this->pid
        ) {
            // Retain current PID.
            $pid = $this->pid;
        } elseif (!$pid) {
            $this->logger->error('Invalid PID ' . $pid . ' for document saving');
            return false;
        }
        // Set PID for metadata definitions.
        $this->cPid = $pid;
        // Set UID placeholder if not updating existing record.
        if ($pid != $this->pid) {
            $this->uid = uniqid('NEW');

Bug: The property uid is declared read-only in Kitodo\Dlf\Common\Document\Document.

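The read-only declaration stems from the @property-read tag in the class docblock, while the assignment happens inside the class on a protected property. How the two can coexist is sketched below with a hypothetical SketchRecord class. The magic methods shown are an assumption for illustration; the part of Document that implements property access is not shown on this page.

```php
<?php
/**
 * Hypothetical class for illustration only.
 *
 * @property-read string $uid Readable from outside, writable only inside.
 */
class SketchRecord
{
    /** @var string */
    protected $uid = '0';

    public function __get($name)
    {
        // Expose protected properties for reading only.
        return property_exists($this, $name) ? $this->$name : null;
    }

    public function __set($name, $value)
    {
        // Reject writes coming from outside the class.
        throw new \LogicException('Property ' . $name . ' is read-only');
    }

    public function markAsNew(): void
    {
        // Inside the class the protected property is assigned directly,
        // so __set() is not involved.
        $this->uid = uniqid('NEW');
    }
}
```

Under these assumptions the internal write is legitimate at runtime, so the finding may be a false positive that comes from the docblock tag rather than from the code.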
        }
        // Get metadata array.
        $metadata = $this->getTitleData($pid);
        // Check for record identifier.
        if (empty($metadata['record_id'][0])) {
            $this->logger->error('No record identifier found to avoid duplication');
            return false;
        }
        // Load plugin configuration.
        $conf = GeneralUtility::makeInstance(ExtensionConfiguration::class)->get(self::$extKey);

        $queryBuilder = GeneralUtility::makeInstance(ConnectionPool::class)
            ->getQueryBuilderForTable('tx_dlf_structures');

        // Get UID for structure type.
        $result = $queryBuilder
            ->select('tx_dlf_structures.uid AS uid')
            ->from('tx_dlf_structures')
            ->where(
                $queryBuilder->expr()->eq('tx_dlf_structures.pid', intval($pid)),
                $queryBuilder->expr()->eq('tx_dlf_structures.index_name', $queryBuilder->expr()->literal($metadata['type'][0])),
                Helper::whereExpression('tx_dlf_structures')
            )
            ->setMaxResults(1)
            ->execute();

        if ($resArray = $result->fetch()) {
            $structure = $resArray['uid'];
        } else {
            $this->logger->error('Could not identify document/structure type "' . $queryBuilder->expr()->literal($metadata['type'][0]) . '"');
            return false;
        }
        $metadata['type'][0] = $structure;

        // Remove appended "valueURI" from authors' names for storing in database.
        foreach ($metadata['author'] as $i => $author) {
            $splitName = explode(chr(31), $author);
            $metadata['author'][$i] = $splitName[0];
        }

        $queryBuilder = GeneralUtility::makeInstance(ConnectionPool::class)
            ->getQueryBuilderForTable('tx_dlf_collections');

        // Get UIDs for collections.
        $result = $queryBuilder
            ->select(
                'tx_dlf_collections.index_name AS index_name',
                'tx_dlf_collections.uid AS uid'
            )
            ->from('tx_dlf_collections')
            ->where(
                $queryBuilder->expr()->eq('tx_dlf_collections.pid', intval($pid)),
                $queryBuilder->expr()->in('tx_dlf_collections.sys_language_uid', [-1, 0]),
                Helper::whereExpression('tx_dlf_collections')
            )
            ->execute();

        $collUid = [];
        while ($resArray = $result->fetch()) {
            $collUid[$resArray['index_name']] = $resArray['uid'];
        }
        $collections = [];
        foreach ($metadata['collection'] as $collection) {
            if (!empty($collUid[$collection])) {
                // Add existing collection's UID.
                $collections[] = $collUid[$collection];
            } else {
                // Insert new collection.
                $collNewUid = uniqid('NEW');
                $collData['tx_dlf_collections'][$collNewUid] = [
                    'pid' => $pid,
                    'label' => $collection,
                    'index_name' => $collection,
                    'oai_name' => (!empty($conf['publishNewCollections']) ? Helper::getCleanString($collection) : ''),
                    'description' => '',
                    'documents' => 0,
                    'owner' => 0,
                    'status' => 0,
                ];
                $substUid = Helper::processDBasAdmin($collData);
                // Prevent double insertion.
                unset($collData);
                // Add new collection's UID.
                $collections[] = $substUid[$collNewUid];
                if (!(\TYPO3_REQUESTTYPE & \TYPO3_REQUESTTYPE_CLI)) {
                    Helper::addMessage(
                        htmlspecialchars(sprintf(Helper::getMessage('flash.newCollection'), $collection, $substUid[$collNewUid])),
                        Helper::getMessage('flash.attention', true),
                        \TYPO3\CMS\Core\Messaging\FlashMessage::INFO,
                        true
                    );
                }
            }
        }
        $metadata['collection'] = $collections;

        $queryBuilder = GeneralUtility::makeInstance(ConnectionPool::class)
            ->getQueryBuilderForTable('tx_dlf_libraries');

        // Get UID for owner.
        if (empty($owner)) {
            // Use the owner from the metadata if present, otherwise fall back to 'default'.
            $owner = !empty($metadata['owner'][0]) ? $metadata['owner'][0] : 'default';
        }
        if (!MathUtility::canBeInterpretedAsInteger($owner)) {
            $result = $queryBuilder
                ->select('tx_dlf_libraries.uid AS uid')
                ->from('tx_dlf_libraries')
                ->where(
                    $queryBuilder->expr()->eq('tx_dlf_libraries.pid', intval($pid)),
                    $queryBuilder->expr()->eq('tx_dlf_libraries.index_name', $queryBuilder->expr()->literal($owner)),
                    Helper::whereExpression('tx_dlf_libraries')
                )
                ->setMaxResults(1)
                ->execute();

            if ($resArray = $result->fetch()) {
                $ownerUid = $resArray['uid'];
            } else {
                // Insert new library.
                $libNewUid = uniqid('NEW');
                $libData['tx_dlf_libraries'][$libNewUid] = [
                    'pid' => $pid,
                    'label' => $owner,
                    'index_name' => $owner,
                    'website' => '',
                    'contact' => '',
                    'image' => '',
                    'oai_label' => '',
                    'oai_base' => '',
                    'opac_label' => '',
                    'opac_base' => '',
                    'union_label' => '',
                    'union_base' => '',
                ];
                $substUid = Helper::processDBasAdmin($libData);
                // Add new library's UID.
                $ownerUid = $substUid[$libNewUid];
                if (!(\TYPO3_REQUESTTYPE & \TYPO3_REQUESTTYPE_CLI)) {
                    Helper::addMessage(
                        htmlspecialchars(sprintf(Helper::getMessage('flash.newLibrary'), $owner, $ownerUid)),
                        Helper::getMessage('flash.attention', true),
                        \TYPO3\CMS\Core\Messaging\FlashMessage::INFO,
                        true
                    );
                }
            }
            $owner = $ownerUid;
        }
        $metadata['owner'][0] = $owner;
        // Get UID of parent document.
        $partof = $this->getParentDocumentUidForSaving($pid, $core, $owner);
        // Use the date of publication or title as alternative sorting metric for parts of multi-part works.
        if (!empty($partof)) {
            if (
                empty($metadata['volume'][0])
                && !empty($metadata['year'][0])
            ) {
                $metadata['volume'] = $metadata['year'];
            }
            if (empty($metadata['volume_sorting'][0])) {
                // If METS @ORDER is given, it is preferred over year_sorting and year.
                if (!empty($metadata['mets_order'][0])) {
                    $metadata['volume_sorting'][0] = $metadata['mets_order'][0];
                } elseif (!empty($metadata['year_sorting'][0])) {
                    $metadata['volume_sorting'][0] = $metadata['year_sorting'][0];
                } elseif (!empty($metadata['year'][0])) {
                    $metadata['volume_sorting'][0] = $metadata['year'][0];
                }
            }
            // If volume_sorting is still empty, fall back to title_sorting or METS @ORDERLABEL (workaround for newspapers).
            if (empty($metadata['volume_sorting'][0])) {
                if (!empty($metadata['title_sorting'][0])) {
                    $metadata['volume_sorting'][0] = $metadata['title_sorting'][0];
                } elseif (!empty($metadata['mets_orderlabel'][0])) {
                    $metadata['volume_sorting'][0] = $metadata['mets_orderlabel'][0];
                }
            }
        }

        $queryBuilder = GeneralUtility::makeInstance(ConnectionPool::class)
            ->getQueryBuilderForTable('tx_dlf_metadata');

        // Get metadata for lists and sorting.
        $result = $queryBuilder
            ->select(
                'tx_dlf_metadata.index_name AS index_name',
                'tx_dlf_metadata.is_listed AS is_listed',
                'tx_dlf_metadata.is_sortable AS is_sortable'
            )
            ->from('tx_dlf_metadata')
            ->where(
                $queryBuilder->expr()->orX(
                    $queryBuilder->expr()->eq('tx_dlf_metadata.is_listed', 1),
                    $queryBuilder->expr()->eq('tx_dlf_metadata.is_sortable', 1)
                ),
                $queryBuilder->expr()->eq('tx_dlf_metadata.pid', intval($pid)),
                Helper::whereExpression('tx_dlf_metadata')
            )
            ->execute();

        $listed = [];
        $sortable = [];

        while ($resArray = $result->fetch()) {
            if (!empty($metadata[$resArray['index_name']])) {
                if ($resArray['is_listed']) {
                    $listed[$resArray['index_name']] = $metadata[$resArray['index_name']];
                }
                if ($resArray['is_sortable']) {
                    $sortable[$resArray['index_name']] = $metadata[$resArray['index_name']][0];
                }
            }
        }
        // Fill data array.
        $data['tx_dlf_documents'][$this->uid] = [
            'pid' => $pid,
            $GLOBALS['TCA']['tx_dlf_documents']['ctrl']['enablecolumns']['starttime'] => 0,
            $GLOBALS['TCA']['tx_dlf_documents']['ctrl']['enablecolumns']['endtime'] => 0,
            'prod_id' => $metadata['prod_id'][0],
            'location' => $this->location,
            'record_id' => $metadata['record_id'][0],
            'opac_id' => $metadata['opac_id'][0],
            'union_id' => $metadata['union_id'][0],
            'urn' => $metadata['urn'][0],
            'purl' => $metadata['purl'][0],
            'title' => $metadata['title'][0],
            'title_sorting' => $metadata['title_sorting'][0],
            'author' => implode('; ', $metadata['author']),
            'year' => implode('; ', $metadata['year']),
            'place' => implode('; ', $metadata['place']),
            'thumbnail' => $this->_getThumbnail(true),
            'metadata' => serialize($listed),
            'metadata_sorting' => serialize($sortable),
            'structure' => $metadata['type'][0],
            'partof' => $partof,
            'volume' => $metadata['volume'][0],
            'volume_sorting' => $metadata['volume_sorting'][0],
            'license' => $metadata['license'][0],
            'terms' => $metadata['terms'][0],
            'restrictions' => $metadata['restrictions'][0],
            'out_of_print' => $metadata['out_of_print'][0],
            'rights_info' => $metadata['rights_info'][0],
            'collections' => $metadata['collection'],
            'mets_label' => $metadata['mets_label'][0],
            'mets_orderlabel' => $metadata['mets_orderlabel'][0],
            'mets_order' => $metadata['mets_order'][0],
            'owner' => $metadata['owner'][0],
            'solrcore' => $core,
            'status' => 0,
            'document_format' => $metadata['document_format'][0],
        ];
        // Unhide hidden documents.
        if (!empty($conf['unhideOnIndex'])) {
            $data['tx_dlf_documents'][$this->uid][$GLOBALS['TCA']['tx_dlf_documents']['ctrl']['enablecolumns']['disabled']] = 0;
        }
        // Process data.
        $newIds = Helper::processDBasAdmin($data);
        // Replace placeholder with actual UID.
        if (strpos($this->uid, 'NEW') === 0) {
            $this->uid = $newIds[$this->uid];
            $this->pid = $pid;
The property pid is declared read-only in Kitodo\Dlf\Common\Document\Document.
            $this->parentId = $partof;
The property parentId is declared read-only in Kitodo\Dlf\Common\Document\Document.
        }
        if (!(\TYPO3_REQUESTTYPE & \TYPO3_REQUESTTYPE_CLI)) {
            Helper::addMessage(
                htmlspecialchars(sprintf(Helper::getMessage('flash.documentSaved'), $metadata['title'][0], $this->uid)),
                Helper::getMessage('flash.done', true),
                \TYPO3\CMS\Core\Messaging\FlashMessage::OK,
                true
            );
        }
        // Add document to index.
        if ($core) {
            Indexer::add($this, $core);
        } else {
            $this->logger->notice('Invalid UID "' . $core . '" for Solr core');
The method notice() does not exist on TYPO3\CMS\Core\Log\LogManager. (Ignorable by Annotation)

If this is a false-positive, you can also ignore this issue in your code via the ignore-call annotation

            $this->logger->/** @scrutinizer ignore-call */
                           notice('Invalid UID "' . $core . '" for Solr core');

This check looks for calls to methods that do not seem to exist on a given type. It looks for the method on the type itself as well as in inherited classes or implemented interfaces.

This is most likely a typographical error or the method has been renamed.
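In TYPO3, `LogManager::getLogger()` returns a logger object that provides `notice()`, so this warning may be a false positive caused by the type the analyzer infers for `$this->logger`. A minimal sketch of the idea, using hypothetical stand-in classes (`DemoLogManager`, `DemoLogger`, `DemoDocument`) rather than the TYPO3 API: the property is annotated with the logger type, not the manager type, so the call can be resolved statically.

```php
<?php

// Hypothetical stand-ins for illustration; these are not the TYPO3 classes.
class DemoLogger
{
    public $records = [];

    public function notice($message)
    {
        $this->records[] = ['notice', $message];
    }
}

class DemoLogManager
{
    public function getLogger($name = '')
    {
        return new DemoLogger();
    }
}

class DemoDocument
{
    /**
     * Typed as the logger (not the manager), so notice() is known to exist.
     *
     * @var DemoLogger
     */
    protected $logger;

    public function __construct(DemoLogManager $manager)
    {
        $this->logger = $manager->getLogger(static::class);
    }

    public function warnAboutCore($core)
    {
        $this->logger->notice('Invalid UID "' . $core . '" for Solr core');
        return $this->logger->records;
    }
}

$records = (new DemoDocument(new DemoLogManager()))->warnAboutCore(0);
```

Whether this applies here depends on how `$this->logger` is declared and assigned elsewhere in the class, which this excerpt does not show.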
        }
        return true;
    }
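The `volume_sorting` fallback that `save()` applies to parts of multi-part works can be read as a priority list. The sketch below restates it as a standalone function; `pickVolumeSorting()` is a hypothetical helper introduced only for illustration and does not exist in the class.

```php
<?php

/**
 * Hypothetical helper: return the first non-empty candidate for volume_sorting.
 * The order mirrors the checks in save(): an existing volume_sorting wins, then
 * mets_order, year_sorting, year, title_sorting and finally mets_orderlabel.
 */
function pickVolumeSorting(array $metadata)
{
    $candidates = ['volume_sorting', 'mets_order', 'year_sorting', 'year', 'title_sorting', 'mets_orderlabel'];
    foreach ($candidates as $key) {
        // empty() also treats '0' and '' as missing, as the original checks do.
        if (!empty($metadata[$key][0])) {
            return $metadata[$key][0];
        }
    }
    return null;
}

$fromYear = pickVolumeSorting(['year' => ['1890'], 'title_sorting' => ['Zeitung']]);
$fromOrder = pickVolumeSorting(['mets_order' => ['3'], 'year' => ['1890']]);
```

In `save()` this chain only runs when a parent document was found, and the result is written to `$metadata['volume_sorting'][0]` rather than returned.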

    /**
     * This ensures that the recordId, if it exists, is retrieved from the document
     *
     * @access protected
     *
     * @abstract
     *
     * @param int $pid: ID of the configuration page with the recordId config
     *
     */
    protected abstract function establishRecordId($pid);

    /**
     * Get the ID of the parent document if the current document has one. Also save a parent document
     * to the database and the Solr index if its $pid and the current $pid differ.
     * Currently only applies to METS documents.
     *
     * @access protected
     *
     * @abstract
     *
     * @return int The parent document's id.
     */
    protected abstract function getParentDocumentUidForSaving($pid, $core, $owner);

    /**
     * This sets some basic class properties
     *
     * @access protected
     *
     * @abstract
     *
     * @return void
     */
    protected abstract function init();

    /**
     * METS/IIIF specific part of loading a location
     *
     * @access protected
     *
     * @abstract
     *
     * @param string $location: The URL of the file to load
     *
     * @return bool true on success or false on failure
     */
    protected abstract function loadLocation($location);

    /**
     * Reuse any document object that might have been already loaded to determine whether document is METS or IIIF
     *
     * @access protected
     *
     * @abstract
     *
     * @param \SimpleXMLElement|IiifResourceInterface $preloadedDocument: any instance that has already been loaded
     *
     * @return bool true if $preloadedDocument can actually be reused, false if it has to be loaded again
     */
    protected abstract function setPreloadedDocument($preloadedDocument);

    /**
     * Load XML file / IIIF resource from URL
     *
     * @access protected
     *
     * @param string $location: The URL of the file to load
     *
     * @return bool true on success or false on failure
     */
    protected function load($location)
    {
        // Load XML / JSON-LD file.
        if (GeneralUtility::isValidUrl($location)) {
            // Load extension configuration
            $extConf = GeneralUtility::makeInstance(ExtensionConfiguration::class)->get(self::$extKey);
            // Set user-agent to identify self when fetching XML / JSON-LD data.
            if (!empty($extConf['useragent'])) {
                @ini_set('user_agent', $extConf['useragent']);
            }
            // the actual loading is format specific
            return $this->loadLocation($location);
        } else {
            $this->logger->error('Invalid file location "' . $location . '" for document loading');
        }
        return false;
    }

    /**
     * Register all available data formats
     *
     * @access protected
     *
     * @return void
     */
    protected function loadFormats()
    {
        if (!$this->formatsLoaded) {
            $queryBuilder = GeneralUtility::makeInstance(ConnectionPool::class)
                ->getQueryBuilderForTable('tx_dlf_formats');

            // Get available data formats from database.
            $result = $queryBuilder
                ->select(
                    'tx_dlf_formats.type AS type',
                    'tx_dlf_formats.root AS root',
                    'tx_dlf_formats.namespace AS namespace',
                    'tx_dlf_formats.class AS class'
                )
                ->from('tx_dlf_formats')
                ->where(
                    $queryBuilder->expr()->eq('tx_dlf_formats.pid', 0)
                )
                ->execute();

            while ($resArray = $result->fetch()) {
                // Update format registry.
                $this->formats[$resArray['type']] = [
                    'rootElement' => $resArray['root'],
                    'namespaceURI' => $resArray['namespace'],
                    'class' => $resArray['class']
                ];
            }
            $this->formatsLoaded = true;
        }
    }

    /**
     * Traverse a logical (sub-) structure tree to find the structure with the requested logical id and return its depth.
     *
     * @access protected
     *
     * @param array $structure: logical structure array
     * @param int $depth: current tree depth
     * @param string $logId: ID of the logical structure whose depth is requested
     *
     * @return int|bool: false if structure with $logId is not a child of this substructure,
     * or the actual depth.
     */
    protected function getTreeDepth($structure, $depth, $logId)
    {
        foreach ($structure as $element) {
            if ($element['id'] == $logId) {
                return $depth;
            } elseif (array_key_exists('children', $element)) {
                $foundInChildren = $this->getTreeDepth($element['children'], $depth + 1, $logId);
                if ($foundInChildren !== false) {
                    return $foundInChildren;
                }
            }
        }
        return false;
    }
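To see how the recursive depth search behaves, the following sketch repeats the same logic as a free function and runs it on a small sample tree. The function name `findTreeDepth()`, the sample IDs and the starting depth of 1 are assumptions chosen for illustration.

```php
<?php

// Standalone copy of the traversal, for illustration only.
function findTreeDepth(array $structure, $depth, $logId)
{
    foreach ($structure as $element) {
        if ($element['id'] == $logId) {
            return $depth;
        } elseif (array_key_exists('children', $element)) {
            $found = findTreeDepth($element['children'], $depth + 1, $logId);
            // Strict comparison, so that a legitimate depth of 0 is not mistaken for "not found".
            if ($found !== false) {
                return $found;
            }
        }
    }
    return false;
}

// Hypothetical logical structure: one root with two children, one of them nested further.
$toc = [
    ['id' => 'LOG_0000', 'children' => [
        ['id' => 'LOG_0001'],
        ['id' => 'LOG_0002', 'children' => [
            ['id' => 'LOG_0003'],
        ]],
    ]],
];

$rootDepth = findTreeDepth($toc, 1, 'LOG_0000');
$leafDepth = findTreeDepth($toc, 1, 'LOG_0003');
$missing = findTreeDepth($toc, 1, 'LOG_9999');
```

The search is depth-first and returns on the first match, so the result depends on the initial `$depth` the caller passes in.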

    /**
     * This returns $this->cPid via __get()
     *
     * @access protected
     *
     * @return int The PID of the metadata definitions
     */
    protected function _getCPid()
    {
        return $this->cPid;
    }

    /**
     * This returns $this->location via __get()
     *
     * @access protected
     *
     * @return string The location of the document
     */
    protected function _getLocation()
    {
        return $this->location;
    }

    /**
     * Format specific part of building the document's metadata array
     *
     * @access protected
     *
     * @abstract
     *
     * @param int $cPid
     */
    protected abstract function prepareMetadataArray($cPid);

    /**
     * This builds an array of the document's metadata
     *
     * @access protected
     *
     * @return array Array of metadata with their corresponding logical structure node ID as key
     */
    protected function _getMetadataArray()
    {
        // Set metadata definitions' PID.
        $cPid = ($this->cPid ? $this->cPid : $this->pid);
        if (!$cPid) {
            $this->logger->error('Invalid PID ' . $cPid . ' for metadata definitions');
            return [];
        }
        if (
            !$this->metadataArrayLoaded
            || $this->metadataArray[0] != $cPid
        ) {
            $this->prepareMetadataArray($cPid);
            $this->metadataArray[0] = $cPid;
The property metadataArray is declared read-only in Kitodo\Dlf\Common\Document\Document.
            $this->metadataArrayLoaded = true;
        }
        return $this->metadataArray;
    }

    /**
     * This returns $this->numPages via __get()
     *
     * @access protected
     *
     * @return int The total number of pages and/or tracks
     */
    protected function _getNumPages()
    {
        $this->_getPhysicalStructure();
        return $this->numPages;
    }

    /**
     * This returns $this->parentId via __get()
     *
     * @access protected
     *
     * @return int The UID of the parent document or zero if not applicable
     */
    protected function _getParentId()
    {
        return $this->parentId;
    }

    /**
     * This builds an array of the document's physical structure
     *
     * @access protected
     *
     * @abstract
     *
     * @return array Array of physical elements' id, type, label and file representations ordered
     * by @ORDER attribute / IIIF Sequence's Canvases
     */
    protected abstract function _getPhysicalStructure();

    /**
     * This gives an array of the document's physical structure metadata
     *
     * @access protected
     *
     * @return array Array of elements' type, label and file representations ordered by @ID attribute / Canvas order
     */
    protected function _getPhysicalStructureInfo()
    {
        // Is there no physical structure array yet?
        if (!$this->physicalStructureLoaded) {
            // Build physical structure array.
            $this->_getPhysicalStructure();
        }
        return $this->physicalStructureInfo;
    }

    /**
     * This returns $this->pid via __get()
     *
     * @access protected
     *
     * @return int The PID of the document or zero if not in database
     */
    protected function _getPid()
    {
        return $this->pid;
    }

    /**
     * This returns $this->ready via __get()
     *
     * @access protected
     *
     * @return bool Is the document instantiated successfully?
     */
    protected function _getReady()
    {
        return $this->ready;
    }

    /**
     * This returns $this->recordId via __get()
     *
     * @access protected
     *
     * @return mixed The METS file's / IIIF manifest's record identifier
     */
    protected function _getRecordId()
    {
        return $this->recordId;
    }

    /**
     * This returns $this->rootId via __get()
     *
     * @access protected
     *
     * @return int The UID of the root document or zero if not applicable
     */
    protected function _getRootId()
    {
        if (!$this->rootIdLoaded) {
            if ($this->parentId) {
                $parent = self::getInstance($this->parentId, $this->pid);
                $this->rootId = $parent->rootId;
The property rootId is declared read-only in Kitodo\Dlf\Common\Document\Document.
            }
            $this->rootIdLoaded = true;
        }
        return $this->rootId;
    }

    /**
     * This returns the smLinks between logical and physical structMap (METS) and models the
     * relation between IIIF Canvases and Manifests / Ranges in the same way
     *
     * @access protected
     *
     * @abstract
     *
     * @return array The links between logical and physical nodes / Range, Manifest and Canvas
     */
    protected abstract function _getSmLinks();

    /**
     * This builds an array of the document's logical structure
     *
     * @access protected
     *
     * @return array Array of structure nodes' id, label, type and physical page indexes/mptr / Canvas link with original hierarchy preserved
     */
    protected function _getTableOfContents()
    {
        // Is there no logical structure array yet?
        if (!$this->tableOfContentsLoaded) {
            // Get all logical structures.
            $this->getLogicalStructure('', true);
            $this->tableOfContentsLoaded = true;
        }
        return $this->tableOfContents;
    }

    /**
     * This returns the document's thumbnail location
     *
     * @access protected
     *
     * @abstract
     *
     * @param bool $forceReload: Force reloading the thumbnail instead of returning the cached value
     *
     * @return string The document's thumbnail location
     */
    protected abstract function _getThumbnail($forceReload = false);

    /**
     * This returns the ID of the toplevel logical structure node
     *
     * @access protected
     *
     * @abstract
     *
     * @return string The logical structure node's ID
     */
    protected abstract function _getToplevelId();

    /**
     * This returns $this->uid via __get()
     *
     * @access protected
     *
     * @return mixed The UID or the URL of the document
     */
    protected function _getUid()
    {
        return $this->uid;
    }

    /**
     * This sets $this->cPid via __set()
     *
     * @access protected
     *
     * @param int $value: The new PID for the metadata definitions
     *
     * @return void
     */
    protected function _setCPid($value)
    {
        $this->cPid = max(intval($value), 0);
    }

    /**
     * This magic method is invoked each time a clone is called on the object variable
     *
     * @access protected
     *
     * @return void
     */
    protected function __clone()
    {
        // This method is defined as protected because singleton objects should not be cloned.
    }

    /**
     * This is a singleton class, thus the constructor should be private/protected
     * (Get an instance of this class by calling \Kitodo\Dlf\Common\Document\Document::getInstance())
     *
     * @access protected
     *
     * @param int $uid: The UID of the document to parse or URL to XML file
     * @param int $pid: If > 0, then only document with this PID gets loaded
     * @param \SimpleXMLElement|IiifResourceInterface $preloadedDocument: Either null or the \SimpleXMLElement
     * or IiifResourceInterface that has been loaded to determine the basic document format.
     *
     * @return void
     */
    protected function __construct($uid, $pid, $preloadedDocument)
    {
        $queryBuilder = GeneralUtility::makeInstance(ConnectionPool::class)
            ->getQueryBuilderForTable('tx_dlf_documents');
        $location = '';
        // Prepare to check database for the requested document.
        if (MathUtility::canBeInterpretedAsInteger($uid)) {
            $whereClause = $queryBuilder->expr()->andX(
                $queryBuilder->expr()->eq('tx_dlf_documents.uid', intval($uid)),
                Helper::whereExpression('tx_dlf_documents')
            );
        } else {
            // Try to load METS file / IIIF manifest.
            if ($this->setPreloadedDocument($preloadedDocument) || (GeneralUtility::isValidUrl($uid)
                && $this->load($uid))) {
                // Initialize core METS object.
                $this->init();
                if ($this->getDocument() !== null) {
                    // Cast to string for safety reasons.
                    $location = (string) $uid;
                    $this->establishRecordId($pid);
                } else {
                    // No METS / IIIF part found.
                    return;
                }
            } else {
                // Loading failed.
                return;
            }
            if (
                !empty($location)
                && !empty($this->recordId)
            ) {
                // Try to match record identifier or location (both should be unique).
                $whereClause = $queryBuilder->expr()->andX(
                    $queryBuilder->expr()->orX(
                        $queryBuilder->expr()->eq('tx_dlf_documents.location', $queryBuilder->expr()->literal($location)),
                        $queryBuilder->expr()->eq('tx_dlf_documents.record_id', $queryBuilder->expr()->literal($this->recordId))
                    ),
                    Helper::whereExpression('tx_dlf_documents')
                );
            } else {
                // Can't persistently identify document, don't try to match at all.
                $whereClause = '1=-1';
            }
        }
        // Check for PID if needed.
        if ($pid) {
            $whereClause = $queryBuilder->expr()->andX(
                $whereClause,
                $queryBuilder->expr()->eq('tx_dlf_documents.pid', intval($pid))
            );
        }
1540
        // Get document PID and location from database.
1541
        $result = $queryBuilder
1542
            ->select(
1543
                'tx_dlf_documents.uid AS uid',
1544
                'tx_dlf_documents.pid AS pid',
1545
                'tx_dlf_documents.record_id AS record_id',
1546
                'tx_dlf_documents.partof AS partof',
1547
                'tx_dlf_documents.thumbnail AS thumbnail',
1548
                'tx_dlf_documents.location AS location'
1549
            )
1550
            ->from('tx_dlf_documents')
1551
            ->where($whereClause)
1552
            ->setMaxResults(1)
1553
            ->execute();
1554
1555
        if ($resArray = $result->fetch()) {
            $this->uid = $resArray['uid'];
Bug: The property uid is declared read-only in Kitodo\Dlf\Common\Document\Document.
            $this->pid = $resArray['pid'];
Bug: The property pid is declared read-only in Kitodo\Dlf\Common\Document\Document.
            $this->recordId = $resArray['record_id'];
Bug: The property recordId is declared read-only in Kitodo\Dlf\Common\Document\Document.
            $this->parentId = $resArray['partof'];
Bug: The property parentId is declared read-only in Kitodo\Dlf\Common\Document\Document.
            $this->thumbnail = $resArray['thumbnail'];
Bug: The property thumbnail is declared read-only in Kitodo\Dlf\Common\Document\Document.
            $this->location = $resArray['location'];
Bug: The property location is declared read-only in Kitodo\Dlf\Common\Document\Document.
            $this->thumbnailLoaded = true;
            // Load XML file if necessary...
            if (
                $this->getDocument() === null
                && $this->load($this->location)
            ) {
                // ...and set some basic properties.
                $this->init();
            }
            // Do we have a METS / IIIF object now?
            if ($this->getDocument() !== null) {
                // Set new location if necessary.
                if (!empty($location)) {
                    $this->location = $location;
                }
                // Document ready!
                $this->ready = true;
Bug: The property ready is declared read-only in Kitodo\Dlf\Common\Document\Document.
            }
        } elseif ($this->getDocument() !== null) {
            // Set location as UID for documents not in database.
            $this->uid = $location;
            $this->location = $location;
            // Document ready!
            $this->ready = true;
        } else {
            $this->logger->error('No document with UID ' . $uid . ' found or document not accessible');
        }
    }

    /**
     * This magic method is called each time an inaccessible property is read from outside the object
     *
     * @access public
     *
     * @param string $var: Name of variable to get
     *
     * @return mixed Value of $this->$var
     */
    public function __get($var)
    {
        $method = '_get' . ucfirst($var);
        if (
            !property_exists($this, $var)
            || !method_exists($this, $method)
        ) {
            $this->logger->warning('There is no getter function for property "' . $var . '"');
Bug: The method warning() does not exist on TYPO3\CMS\Core\Log\LogManager. (Ignorable by annotation)

If this is a false positive, you can also ignore this issue in your code via the ignore-call annotation:

            $this->logger->/** @scrutinizer ignore-call */
                           warning('There is no getter function for property "' . $var . '"');

This check looks for calls to methods that do not seem to exist on a given type. It looks for the method on the type itself as well as in inherited classes or implemented interfaces.

This is most likely a typographical error or the method has been renamed.

            return;
        } else {
            return $this->$method();
        }
    }

    /**
     * This magic method is called each time an inaccessible property is checked with isset() or empty()
     *
     * @access public
     *
     * @param string $var: Name of variable to check
     *
     * @return bool true if variable is set and not empty, false otherwise
     */
    public function __isset($var)
    {
        return !empty($this->__get($var));
    }

    /**
     * This magic method is called each time an inaccessible property is assigned a value from outside the object
     *
     * @access public
     *
     * @param string $var: Name of variable to set
     * @param mixed $value: New value of variable
     *
     * @return void
     */
    public function __set($var, $value)
    {
        $method = '_set' . ucfirst($var);
        if (
            !property_exists($this, $var)
            || !method_exists($this, $method)
        ) {
            $this->logger->warning('There is no setter function for property "' . $var . '"');
        } else {
            $this->$method($value);
        }
    }
}
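
The magic accessors above can be tried out in isolation. The following is a minimal sketch, not the Kitodo implementation: the class name MagicAccessorDemo, the $warnings array (standing in for the logger), the sample properties, and their _get/_set methods are all assumptions made for illustration. It shows that a property with only a getter behaves as read-only from outside, and that __isset() reports false for a property whose value is empty (such as 0).

```php
<?php

/**
 * Hypothetical stand-in for the reviewed class: protected properties are
 * reachable from outside only through "_get..." / "_set..." methods.
 */
class MagicAccessorDemo
{
    /** @var string Getter only, so effectively read-only from outside (assumed sample value). */
    protected $location = 'https://example.org/mets.xml';

    /** @var int Has both a getter and a setter (hypothetical property). */
    protected $cPid = 0;

    /** @var string[] Collected warnings; stands in for $this->logger. */
    public $warnings = [];

    protected function _getLocation()
    {
        return $this->location;
    }

    protected function _getCPid()
    {
        return $this->cPid;
    }

    protected function _setCPid($value)
    {
        // Clamp to a non-negative integer.
        $this->cPid = max((int) $value, 0);
    }

    public function __get($var)
    {
        $method = '_get' . ucfirst($var);
        if (!property_exists($this, $var) || !method_exists($this, $method)) {
            $this->warnings[] = 'There is no getter function for property "' . $var . '"';
            return null;
        }
        return $this->$method();
    }

    public function __isset($var)
    {
        // Same rule as in the reviewed code: "set" means "not empty".
        return !empty($this->__get($var));
    }

    public function __set($var, $value)
    {
        $method = '_set' . ucfirst($var);
        if (!property_exists($this, $var) || !method_exists($this, $method)) {
            $this->warnings[] = 'There is no setter function for property "' . $var . '"';
            return;
        }
        $this->$method($value);
    }
}

$demo = new MagicAccessorDemo();
var_dump(isset($demo->cPid));   // cPid is 0, which counts as empty
$demo->cPid = 42;               // routed through _setCPid()
var_dump($demo->cPid);          // routed through _getCPid()
$demo->location = 'other';      // no _setLocation(): warning recorded, value unchanged
var_dump($demo->location);
var_dump($demo->warnings);
```

Because writes from outside go through __set(), the read-only guarantee holds only for external callers. Code inside the class can still assign the protected properties directly, which is the kind of assignment the inspection flags above.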