Passed
Pull Request — master (#329)
by Sebastian
03:28
created

Document::getLogicalStructureInfo()   F

Complexity

Conditions 21
Paths 3456

Size

Total Lines 74
Code Lines 46

Duplication

Lines 0
Ratio 0 %

Importance

Changes 0
Metric Value
cc 21
eloc 46
nc 3456
nop 2
dl 0
loc 74
rs 0
c 0
b 0
f 0

How to fix   Long Method    Complexity   

Long Method

Small methods make your code easier to understand, in particular if combined with a good name. Besides, if your method is small, finding a good name is usually much easier.

For example, if you find yourself adding comments to a method's body, this is usually a good sign to extract the commented part to a new method, and use the comment as a starting point when coming up with a good name for this new method.
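
As a minimal sketch of this refactoring, the block that getLogicalStructureInfo() introduces with the comment "Get the files this structure element is pointing at" could be moved into its own method, with the comment supplying the name. The function name collectFiles, its signature, and the sample XML are hypothetical and not part of the reviewed class; the sketch assumes PHP 7 or later for the type declarations.

```php
<?php
// Hypothetical helper, named after the comment it replaces.
// It is an illustration only and does not exist in the reviewed code.
function collectFiles(\SimpleXMLElement $structure, array $fileUse): array
{
    $files = [];
    // Walk the mets:fptr children of the structure node.
    foreach ($structure->children('http://www.loc.gov/METS/')->fptr as $fptr) {
        $fileId = (string) $fptr->attributes()->FILEID;
        // Keep only files whose ID maps to a known @USE value.
        if (!empty($fileUse[$fileId])) {
            $files[$fileUse[$fileId]] = $fileId;
        }
    }
    return $files;
}

// Made-up sample data for demonstration purposes.
$div = new \SimpleXMLElement(
    '<mets:div xmlns:mets="http://www.loc.gov/METS/" ID="LOG_0001">'
    . '<mets:fptr FILEID="FILE_0001_DEFAULT"/>'
    . '<mets:fptr FILEID="FILE_0001_UNKNOWN"/>'
    . '</mets:div>'
);
$files = collectFiles($div, ['FILE_0001_DEFAULT' => 'DEFAULT']);
```

The calling method would then shrink to a single line such as `$details['files'] = collectFiles($structure, $fileUse);`, and the explanatory comment becomes unnecessary because the method name carries the same information.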

Commonly applied refactorings include:

1
<?php
2
namespace Kitodo\Dlf\Common;
3
4
/**
5
 * (c) Kitodo. Key to digital objects e.V. <[email protected]>
6
 *
7
 * This file is part of the Kitodo and TYPO3 projects.
8
 *
9
 * @license GNU General Public License version 3 or later.
10
 * For the full copyright and license information, please read the
11
 * LICENSE.txt file that was distributed with this source code.
12
 */
13
14
/**
15
 * Document class for the 'dlf' extension
16
 *
17
 * @author Sebastian Meyer <[email protected]>
18
 * @author Henrik Lochmann <[email protected]>
19
 * @package TYPO3
20
 * @subpackage dlf
21
 * @access public
22
 */
23
final class Document {
24
    /**
25
     * This holds the whole XML file as string for serialization purposes
26
     * @see __sleep() / __wakeup()
27
     *
28
     * @var string
29
     * @access protected
30
     */
31
    protected $asXML = '';
32
33
    /**
34
     * This holds the PID for the configuration
35
     *
36
     * @var integer
37
     * @access protected
38
     */
39
    protected $cPid = 0;
40
41
    /**
42
     * This holds the XML file's dmdSec parts with their IDs as array key
43
     *
44
     * @var array
45
     * @access protected
46
     */
47
    protected $dmdSec = [];
48
49
    /**
50
     * Are the METS file's dmdSecs loaded?
51
     * @see $dmdSec
52
     *
53
     * @var boolean
54
     * @access protected
55
     */
56
    protected $dmdSecLoaded = FALSE;
57
58
    /**
59
     * The extension key
60
     *
61
     * @var string
62
     * @access public
63
     */
64
    public static $extKey = 'dlf';
65
66
    /**
67
     * This holds the file ID -> USE concordance
68
     * @see _getFileGrps()
69
     *
70
     * @var array
71
     * @access protected
72
     */
73
    protected $fileGrps = [];
74
75
    /**
76
     * Are the file groups loaded?
77
     * @see $fileGrps
78
     *
79
     * @var boolean
80
     * @access protected
81
     */
82
    protected $fileGrpsLoaded = FALSE;
83
84
    /**
85
     * This holds the configuration for all supported metadata encodings
86
     * @see loadFormats()
87
     *
88
     * @var array
89
     * @access protected
90
     */
91
    protected $formats = [
92
        'OAI' => [
93
            'rootElement' => 'OAI-PMH',
94
            'namespaceURI' => 'http://www.openarchives.org/OAI/2.0/',
95
        ],
96
        'METS' => [
97
            'rootElement' => 'mets',
98
            'namespaceURI' => 'http://www.loc.gov/METS/',
99
        ],
100
        'XLINK' => [
101
            'rootElement' => 'xlink',
102
            'namespaceURI' => 'http://www.w3.org/1999/xlink',
103
        ]
104
    ];
105
106
    /**
107
     * Are the available metadata formats loaded?
108
     * @see $formats
109
     *
110
     * @var boolean
111
     * @access protected
112
     */
113
    protected $formatsLoaded = FALSE;
114
115
    /**
116
     * Are there any fulltext files available?
117
     *
118
     * @var boolean
119
     * @access protected
120
     */
121
    protected $hasFulltext = FALSE;
122
123
    /**
124
     * Last searched logical and physical page
125
     *
126
     * @var array
127
     * @access protected
128
     */
129
    protected $lastSearchedPhysicalPage = ['logicalPage' => NULL, 'physicalPage' => NULL];
130
131
    /**
132
     * This holds the document's location
133
     *
134
     * @var string
135
     * @access protected
136
     */
137
    protected $location = '';
138
139
    /**
140
     * This holds the logical units
141
     *
142
     * @var array
143
     * @access protected
144
     */
145
    protected $logicalUnits = [];
146
147
    /**
148
     * This holds the document's parsed metadata arrays with their corresponding structMap//div's ID as array key
149
     *
150
     * @var array
151
     * @access protected
152
     */
153
    protected $metadataArray = [];
154
155
    /**
156
     * Is the metadata array loaded?
157
     * @see $metadataArray
158
     *
159
     * @var boolean
160
     * @access protected
161
     */
162
    protected $metadataArrayLoaded = FALSE;
163
164
    /**
165
     * This holds the XML file's METS part as \SimpleXMLElement object
166
     *
167
     * @var \SimpleXMLElement
168
     * @access protected
169
     */
170
    protected $mets;
171
172
    /**
173
     * This holds the total number of pages
174
     *
175
     * @var integer
176
     * @access protected
177
     */
178
    protected $numPages = 0;
179
180
    /**
181
     * This holds the UID of the parent document or zero if not multi-volumed
182
     *
183
     * @var integer
184
     * @access protected
185
     */
186
    protected $parentId = 0;
187
188
    /**
189
     * This holds the physical structure
190
     *
191
     * @var array
192
     * @access protected
193
     */
194
    protected $physicalStructure = [];
195
196
    /**
197
     * This holds the physical structure metadata
198
     *
199
     * @var array
200
     * @access protected
201
     */
202
    protected $physicalStructureInfo = [];
203
204
    /**
205
     * Is the physical structure loaded?
206
     * @see $physicalStructure
207
     *
208
     * @var boolean
209
     * @access protected
210
     */
211
    protected $physicalStructureLoaded = FALSE;
212
213
    /**
214
     * This holds the PID of the document or zero if not in database
215
     *
216
     * @var integer
217
     * @access protected
218
     */
219
    protected $pid = 0;
220
221
    /**
222
     * This holds the document's raw text pages with their corresponding structMap//div's ID as array key
223
     *
224
     * @var array
225
     * @access protected
226
     */
227
    protected $rawTextArray = [];
228
229
    /**
230
     * Is the document instantiated successfully?
231
     *
232
     * @var boolean
233
     * @access protected
234
     */
235
    protected $ready = FALSE;
236
237
    /**
238
     * The METS file's record identifier
239
     *
240
     * @var string
241
     * @access protected
242
     */
243
    protected $recordId;
244
245
    /**
246
     * This holds the singleton object of the document
247
     *
248
     * @var array (\Kitodo\Dlf\Common\Document)
249
     * @access protected
250
     */
251
    protected static $registry = [];
252
253
    /**
254
     * This holds the UID of the root document or zero if not multi-volumed
255
     *
256
     * @var integer
257
     * @access protected
258
     */
259
    protected $rootId = 0;
260
261
    /**
262
     * Is the root id loaded?
263
     * @see $rootId
264
     *
265
     * @var boolean
266
     * @access protected
267
     */
268
    protected $rootIdLoaded = FALSE;
269
270
    /**
271
     * This holds the smLinks between logical and physical structMap
272
     *
273
     * @var array
274
     * @access protected
275
     */
276
    protected $smLinks = ['l2p' => [], 'p2l' => []];
277
278
    /**
279
     * Are the smLinks loaded?
280
     * @see $smLinks
281
     *
282
     * @var boolean
283
     * @access protected
284
     */
285
    protected $smLinksLoaded = FALSE;
286
287
    /**
288
     * This holds the logical structure
289
     *
290
     * @var array
291
     * @access protected
292
     */
293
    protected $tableOfContents = [];
294
295
    /**
296
     * Is the table of contents loaded?
297
     * @see $tableOfContents
298
     *
299
     * @var boolean
300
     * @access protected
301
     */
302
    protected $tableOfContentsLoaded = FALSE;
303
304
    /**
305
     * This holds the document's thumbnail location.
306
     *
307
     * @var string
308
     * @access protected
309
     */
310
    protected $thumbnail = '';
311
312
    /**
313
     * Is the document's thumbnail location loaded?
314
     * @see $thumbnail
315
     *
316
     * @var boolean
317
     * @access protected
318
     */
319
    protected $thumbnailLoaded = FALSE;
320
321
    /**
322
     * This holds the toplevel structure's @ID
323
     *
324
     * @var string
325
     * @access protected
326
     */
327
    protected $toplevelId = '';
328
329
    /**
330
     * This holds the UID or the URL of the document
331
     *
332
     * @var mixed
333
     * @access protected
334
     */
335
    protected $uid = 0;
336
337
    /**
338
     * This holds the whole XML file as \SimpleXMLElement object
339
     *
340
     * @var \SimpleXMLElement
341
     * @access protected
342
     */
343
    protected $xml;
344
345
    /**
346
     * This clears the static registry to prevent memory exhaustion
347
     *
348
     * @access public
349
     *
350
     * @return void
351
     */
352
    public static function clearRegistry() {
353
        // Reset registry array.
354
        self::$registry = [];
355
    }
356
357
    /**
358
     * This gets the location of a file representing a physical page or track
359
     *
360
     * @access public
361
     *
362
     * @param string $id: The @ID attribute of the file node
363
     *
364
     * @return string The file's location as URL
365
     */
366
    public function getFileLocation($id) {
367
        if (!empty($id)
368
            && ($location = $this->mets->xpath('./mets:fileSec/mets:fileGrp/mets:file[@ID="'.$id.'"]/mets:FLocat[@LOCTYPE="URL"]'))) {
369
            return (string) $location[0]->attributes('http://www.w3.org/1999/xlink')->href;
370
        } else {
371
            Helper::devLog('There is no file node with @ID "'.$id.'"', DEVLOG_SEVERITY_WARNING);
372
            return '';
373
        }
374
    }
375
376
    /**
377
     * This gets the MIME type of a file representing a physical page or track
378
     *
379
     * @access public
380
     *
381
     * @param string $id: The @ID attribute of the file node
382
     *
383
     * @return string The file's MIME type
384
     */
385
    public function getFileMimeType($id) {
386
        if (!empty($id)
387
            && ($mimetype = $this->mets->xpath('./mets:fileSec/mets:fileGrp/mets:file[@ID="'.$id.'"]/@MIMETYPE'))) {
388
            return (string) $mimetype[0];
389
        } else {
390
            Helper::devLog('There is no file node with @ID "'.$id.'" or no MIME type specified', DEVLOG_SEVERITY_WARNING);
391
            return '';
392
        }
393
    }
394
395
    /**
396
     * This is a singleton class, thus an instance must be created by this method
397
     *
398
     * @access public
399
     *
400
     * @param mixed $uid: The unique identifier of the document to parse or URL of XML file
401
     * @param integer $pid: If > 0, then only document with this PID gets loaded
402
     * @param boolean $forceReload: Force reloading the document instead of returning the cached instance
403
     *
404
     * @return \Kitodo\Dlf\Common\Document Instance of this class
405
     */
406
    public static function &getInstance($uid, $pid = 0, $forceReload = FALSE) {
407
        // Sanitize input.
408
        $pid = max(intval($pid), 0);
409
        if (!$forceReload) {
410
            $regObj = md5($uid);
411
            if (is_object(self::$registry[$regObj])
412
                && self::$registry[$regObj] instanceof self) {
413
                // Check if instance has given PID.
414
                if (!$pid
415
                    || !self::$registry[$regObj]->pid
416
                    || $pid == self::$registry[$regObj]->pid) {
417
                    // Return singleton instance if available.
418
                    return self::$registry[$regObj];
419
                }
420
            } else {
421
                // Check the user's session...
422
                $sessionData = Helper::loadFromSession(get_called_class());
423
                if (is_object($sessionData[$regObj])
424
                    && $sessionData[$regObj] instanceof self) {
425
                    // Check if instance has given PID.
426
                    if (!$pid
427
                        || !$sessionData[$regObj]->pid
428
                        || $pid == $sessionData[$regObj]->pid) {
429
                        // ...and restore registry.
430
                        self::$registry[$regObj] = $sessionData[$regObj];
431
                        return self::$registry[$regObj];
432
                    }
433
                }
434
            }
435
        }
436
        // Create new instance...
437
        $instance = new self($uid, $pid);
438
        // ...and save it to registry.
439
        if ($instance->ready) {
440
            self::$registry[md5($instance->uid)] = $instance;
441
            if ($instance->uid != $instance->location) {
442
                self::$registry[md5($instance->location)] = $instance;
443
            }
444
            // Load extension configuration
445
            $extConf = unserialize($GLOBALS['TYPO3_CONF_VARS']['EXT']['extConf']['dlf']);
446
            // Save registry to session if caching is enabled.
447
            if (!empty($extConf['caching'])) {
448
                Helper::saveToSession(self::$registry, get_class($instance));
449
            }
450
        }
451
        // Return new instance.
452
        return $instance;
453
    }
454
455
    /**
456
     * This gets details about a logical structure element
457
     *
458
     * @access public
459
     *
460
     * @param string $id: The @ID attribute of the logical structure node
461
     * @param boolean $recursive: Whether to include the child elements
462
     *
463
     * @return array Array of the element's id, label, type and physical page indexes/mptr link
464
     */
465
    public function getLogicalStructure($id, $recursive = FALSE) {
466
        $details = [];
467
        // Is the requested logical unit already loaded?
468
        if (!$recursive
469
            && !empty($this->logicalUnits[$id])) {
470
            // Yes. Return it.
471
            return $this->logicalUnits[$id];
472
        } elseif (!empty($id)) {
473
            // Get specified logical unit.
474
            $divs = $this->mets->xpath('./mets:structMap[@TYPE="LOGICAL"]//mets:div[@ID="'.$id.'"]');
475
        } else {
476
            // Get all logical units at top level.
477
            $divs = $this->mets->xpath('./mets:structMap[@TYPE="LOGICAL"]/mets:div');
478
        }
479
        if (!empty($divs)) {
480
            if (!$recursive) {
481
                // Get the details for the first xpath hit.
482
                $details = $this->getLogicalStructureInfo($divs[0]);
483
            } else {
484
                // Walk the logical structure recursively and fill the whole table of contents.
485
                foreach ($divs as $div) {
486
                    $this->tableOfContents[] = $this->getLogicalStructureInfo($div, TRUE);
487
                }
488
            }
489
        }
490
        return $details;
491
    }
492
493
    /**
494
     * This gets details about a logical structure element
495
     *
496
     * @access protected
497
     *
498
     * @param \SimpleXMLElement $structure: The logical structure node
499
     * @param boolean $recursive: Whether to include the child elements
500
     *
501
     * @return array Array of the element's id, label, type and physical page indexes/mptr link
502
     */
503
    protected function getLogicalStructureInfo(\SimpleXMLElement $structure, $recursive = FALSE) {
504
        // Get attributes.
505
        foreach ($structure->attributes() as $attribute => $value) {
506
            $attributes[$attribute] = (string) $value;
507
        }
508
        // Load plugin configuration.
509
        $extConf = unserialize($GLOBALS['TYPO3_CONF_VARS']['EXT']['extConf'][self::$extKey]);
510
        // Extract identity information.
511
        $details = [];
512
        $details['id'] = $attributes['ID'];
1 ignored issue (Comprehensibility, Best Practice)
The variable $attributes is only defined inside the foreach loop on line 505. If the node has no attributes, the loop body never runs and $attributes is undefined at this point.
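One possible way to address this issue is to initialize the array before the loop, so that it exists even when the node has no attributes. The following is a minimal standalone sketch; the function name collectAttributes and the sample elements are hypothetical and not part of the reviewed class.

```php
<?php
// Hypothetical standalone version of the attribute loop from line 505.
function collectAttributes(\SimpleXMLElement $structure): array
{
    // Initialize first so the variable is defined even if the loop never runs.
    $attributes = [];
    foreach ($structure->attributes() as $attribute => $value) {
        $attributes[$attribute] = (string) $value;
    }
    return $attributes;
}

// Made-up sample elements: one without and one with attributes.
$withoutAttributes = collectAttributes(new \SimpleXMLElement('<div/>'));
$withAttributes = collectAttributes(
    new \SimpleXMLElement('<div ID="LOG_0001" TYPE="monograph"/>')
);
```

Initializing the array only guarantees that the variable is defined. Code that reads keys such as 'ID' or 'TYPE' without an isset() check would still need a guard if those attributes can be missing.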
513
        $details['dmdId'] = (isset($attributes['DMDID']) ? $attributes['DMDID'] : '');
514
        $details['label'] = (isset($attributes['LABEL']) ? $attributes['LABEL'] : '');
515
        $details['orderlabel'] = (isset($attributes['ORDERLABEL']) ? $attributes['ORDERLABEL'] : '');
516
        $details['contentIds'] = (isset($attributes['CONTENTIDS']) ? $attributes['CONTENTIDS'] : '');
517
        $details['volume'] = '';
518
        // Set volume information only if no label is set and this is the toplevel structure element.
519
        if (empty($details['label'])
520
            && $details['id'] == $this->_getToplevelId()) {
521
            $metadata = $this->getMetadata($details['id']);
522
            if (!empty($metadata['volume'][0])) {
523
                $details['volume'] = $metadata['volume'][0];
524
            }
525
        }
526
        $details['pagination'] = '';
527
        $details['type'] = $attributes['TYPE'];
528
        $details['thumbnailId'] = '';
529
        // Load smLinks.
530
        $this->_getSmLinks();
531
        // Load physical structure.
532
        $this->_getPhysicalStructure();
533
        // Get the physical page or external file this structure element is pointing at.
534
        $details['points'] = '';
535
        // Is there a mptr node?
536
        if (count($structure->children('http://www.loc.gov/METS/')->mptr)) {
537
            // Yes. Get the file reference.
538
            $details['points'] = (string) $structure->children('http://www.loc.gov/METS/')->mptr[0]->attributes('http://www.w3.org/1999/xlink')->href;
539
        } elseif (!empty($this->physicalStructure)
540
            && array_key_exists($details['id'], $this->smLinks['l2p'])) { // Are there any physical elements and is this logical unit linked to at least one of them?
541
            $details['points'] = max(intval(array_search($this->smLinks['l2p'][$details['id']][0], $this->physicalStructure, TRUE)), 1);
542
            if (!empty($this->physicalStructureInfo[$this->smLinks['l2p'][$details['id']][0]]['files'][$extConf['fileGrpThumbs']])) {
543
                $details['thumbnailId'] = $this->physicalStructureInfo[$this->smLinks['l2p'][$details['id']][0]]['files'][$extConf['fileGrpThumbs']];
544
            }
545
            // Get page/track number of the first page/track related to this structure element.
546
            $details['pagination'] = $this->physicalStructureInfo[$this->smLinks['l2p'][$details['id']][0]]['orderlabel'];
547
        } elseif ($details['id'] == $this->_getToplevelId()) { // Is this the toplevel structure element?
548
            // Yes. Point to itself.
549
            $details['points'] = 1;
550
            if (!empty($this->physicalStructure)
551
                && !empty($this->physicalStructureInfo[$this->physicalStructure[1]]['files'][$extConf['fileGrpThumbs']])) {
552
                $details['thumbnailId'] = $this->physicalStructureInfo[$this->physicalStructure[1]]['files'][$extConf['fileGrpThumbs']];
553
            }
554
        }
555
        // Get the files this structure element is pointing at.
556
        $details['files'] = [];
557
        $fileUse = $this->_getFileGrps();
558
        // Get the file representations from fileSec node.
559
        foreach ($structure->children('http://www.loc.gov/METS/')->fptr as $fptr) {
560
            // Check if file has valid @USE attribute.
561
            if (!empty($fileUse[(string) $fptr->attributes()->FILEID])) {
562
                $details['files'][$fileUse[(string) $fptr->attributes()->FILEID]] = (string) $fptr->attributes()->FILEID;
563
            }
564
        }
565
        // Keep for later usage.
566
        $this->logicalUnits[$details['id']] = $details;
567
        // Walk the structure recursively? And are there any children of the current element?
568
        if ($recursive
569
            && count($structure->children('http://www.loc.gov/METS/')->div)) {
570
            $details['children'] = [];
571
            foreach ($structure->children('http://www.loc.gov/METS/')->div as $child) {
572
                // Repeat for all children.
573
                $details['children'][] = $this->getLogicalStructureInfo($child, TRUE);
574
            }
575
        }
576
        return $details;
577
    }
578
579
    /**
580
     * This extracts all the metadata for a logical structure node
581
     *
582
     * @access public
583
     *
584
     * @param string $id: The @ID attribute of the logical structure node
585
     * @param integer $cPid: The PID for the metadata definitions
586
     *                       (defaults to $this->cPid or $this->pid)
587
     *
588
     * @return array The logical structure node's parsed metadata array
589
     */
590
    public function getMetadata($id, $cPid = 0) {
591
        // Make sure $cPid is a non-negative integer.
592
        $cPid = max(intval($cPid), 0);
593
        // If $cPid is not given, try to get it elsewhere.
594
        if (!$cPid
595
            && ($this->cPid || $this->pid)) {
596
            // Retain current PID.
597
            $cPid = ($this->cPid ? $this->cPid : $this->pid);
598
        } elseif (!$cPid) {
599
            Helper::devLog('Invalid PID '.$cPid.' for metadata definitions', DEVLOG_SEVERITY_WARNING);
600
            return [];
601
        }
602
        // Get metadata from parsed metadata array if available.
603
        if (!empty($this->metadataArray[$id])
604
            && $this->metadataArray[0] == $cPid) {
605
            return $this->metadataArray[$id];
606
        }
607
        // Initialize metadata array with empty values.
608
        $metadata = [
609
            'title' => [],
610
            'title_sorting' => [],
611
            'author' => [],
612
            'place' => [],
613
            'year' => [],
614
            'prod_id' => [],
615
            'record_id' => [],
616
            'opac_id' => [],
617
            'union_id' => [],
618
            'urn' => [],
619
            'purl' => [],
620
            'type' => [],
621
            'volume' => [],
622
            'volume_sorting' => [],
623
            'collection' => [],
624
            'owner' => [],
625
        ];
626
        // Get the logical structure node's DMDID.
627
        if (!empty($this->logicalUnits[$id])) {
628
            $dmdId = $this->logicalUnits[$id]['dmdId'];
629
        } else {
630
            $dmdId = $this->mets->xpath('./mets:structMap[@TYPE="LOGICAL"]//mets:div[@ID="'.$id.'"]/@DMDID');
631
            $dmdId = (string) $dmdId[0];
632
        }
633
        if (!empty($dmdId)) {
634
            // Load available metadata formats and dmdSecs.
635
            $this->loadFormats();
636
            $this->_getDmdSec();
637
            // Is this metadata format supported?
638
            if (!empty($this->formats[$this->dmdSec[$dmdId]['type']])) {
639
                if (!empty($this->formats[$this->dmdSec[$dmdId]['type']]['class'])) {
640
                    $class = $this->formats[$this->dmdSec[$dmdId]['type']]['class'];
641
                    // Get the metadata from class.
642
                    if (class_exists($class)
643
                        && ($obj = \TYPO3\CMS\Core\Utility\GeneralUtility::makeInstance($class)) instanceof MetadataInterface) {
644
                        $obj->extractMetadata($this->dmdSec[$dmdId]['xml'], $metadata);
645
                    } else {
646
                        Helper::devLog('Invalid class/method "'.$class.'->extractMetadata()" for metadata format "'.$this->dmdSec[$dmdId]['type'].'"', DEVLOG_SEVERITY_WARNING);
647
                    }
648
                }
649
            } else {
650
                Helper::devLog('Unsupported metadata format "'.$this->dmdSec[$dmdId]['type'].'" in dmdSec with @ID "'.$dmdId.'"', DEVLOG_SEVERITY_WARNING);
651
                return [];
652
            }
653
            // Get the structure's type.
654
            if (!empty($this->logicalUnits[$id])) {
655
                $metadata['type'] = [$this->logicalUnits[$id]['type']];
656
            } else {
657
                $struct = $this->mets->xpath('./mets:structMap[@TYPE="LOGICAL"]//mets:div[@ID="'.$id.'"]/@TYPE');
658
                $metadata['type'] = [(string) $struct[0]];
659
            }
660
            // Get the additional metadata from database.
661
            $result = $GLOBALS['TYPO3_DB']->exec_SELECTquery(
662
                'tx_dlf_metadata.index_name AS index_name,tx_dlf_metadataformat.xpath AS xpath,tx_dlf_metadataformat.xpath_sorting AS xpath_sorting,tx_dlf_metadata.is_sortable AS is_sortable,tx_dlf_metadata.default_value AS default_value,tx_dlf_metadata.format AS format',
663
                'tx_dlf_metadata,tx_dlf_metadataformat,tx_dlf_formats',
664
                'tx_dlf_metadata.pid='.$cPid
665
                    .' AND tx_dlf_metadataformat.pid='.$cPid
666
                    .' AND ((tx_dlf_metadata.uid=tx_dlf_metadataformat.parent_id AND tx_dlf_metadataformat.encoded=tx_dlf_formats.uid AND tx_dlf_formats.type='.$GLOBALS['TYPO3_DB']->fullQuoteStr($this->dmdSec[$dmdId]['type'], 'tx_dlf_formats').') OR tx_dlf_metadata.format=0)'
667
                    .Helper::whereClause('tx_dlf_metadata', TRUE)
668
                    .Helper::whereClause('tx_dlf_metadataformat')
669
                    .Helper::whereClause('tx_dlf_formats'),
670
                '',
671
                '',
672
                ''
673
            );
674
            // We need a \DOMDocument here, because SimpleXML doesn't support XPath functions properly.
675
            $domNode = dom_import_simplexml($this->dmdSec[$dmdId]['xml']);
676
            $domXPath = new \DOMXPath($domNode->ownerDocument);
677
            $this->registerNamespaces($domXPath);
678
            // OK, now make the XPath queries.
679
            while ($resArray = $GLOBALS['TYPO3_DB']->sql_fetch_assoc($result)) {
680
                // Set metadata field's value(s).
681
                if ($resArray['format'] > 0
682
                    && !empty($resArray['xpath'])
683
                    && ($values = $domXPath->evaluate($resArray['xpath'], $domNode))) {
684
                    if ($values instanceof \DOMNodeList
685
                        && $values->length > 0) {
686
                        $metadata[$resArray['index_name']] = [];
687
                        foreach ($values as $value) {
688
                            $metadata[$resArray['index_name']][] = trim((string) $value->nodeValue);
689
                        }
690
                    } elseif (!($values instanceof \DOMNodeList)) {
691
                        $metadata[$resArray['index_name']] = [trim((string) $values)];
692
                    }
693
                }
694
                // Set default value if applicable.
695
                // '!empty($resArray['default_value'])' is not possible, because '0' is a valid default value.
696
                // Setting an empty default value creates a lot of empty fields within the index.
697
                // These empty fields are then shown within the search facets as 'empty'.
698
                if (empty($metadata[$resArray['index_name']][0])
699
                    && strlen($resArray['default_value']) > 0) {
700
                    $metadata[$resArray['index_name']] = [$resArray['default_value']];
701
                }
702
                // Set sorting value if applicable.
703
                if (!empty($metadata[$resArray['index_name']])
704
                    && $resArray['is_sortable']) {
705
                    if ($resArray['format'] > 0
706
                        && !empty($resArray['xpath_sorting'])
707
                        && ($values = $domXPath->evaluate($resArray['xpath_sorting'], $domNode))) {
708
                        if ($values instanceof \DOMNodeList
709
                            && $values->length > 0) {
710
                            $metadata[$resArray['index_name'].'_sorting'][0] = trim((string) $values->item(0)->nodeValue);
711
                        } elseif (!($values instanceof \DOMNodeList)) {
712
                            $metadata[$resArray['index_name'].'_sorting'][0] = trim((string) $values);
713
                        }
714
                    }
715
                    if (empty($metadata[$resArray['index_name'].'_sorting'][0])) {
716
                        $metadata[$resArray['index_name'].'_sorting'][0] = $metadata[$resArray['index_name']][0];
717
                    }
718
                }
719
            }
720
            // Set title to empty string if not present.
721
            if (empty($metadata['title'][0])) {
722
                $metadata['title'][0] = '';
723
                $metadata['title_sorting'][0] = '';
724
            }
725
            // Add collections from database to toplevel element if document is already saved.
726
            if (\TYPO3\CMS\Core\Utility\MathUtility::canBeInterpretedAsInteger($this->uid)
                && $id == $this->_getToplevelId()) {
                $result = $GLOBALS['TYPO3_DB']->exec_SELECT_mm_query(
                    'tx_dlf_collections.index_name AS index_name',
                    'tx_dlf_documents',
                    'tx_dlf_relations',
                    'tx_dlf_collections',
                    'AND tx_dlf_collections.pid='.intval($cPid)
                        .' AND tx_dlf_documents.uid='.intval($this->uid)
                        .' AND tx_dlf_relations.ident='.$GLOBALS['TYPO3_DB']->fullQuoteStr('docs_colls', 'tx_dlf_relations')
                        .' AND tx_dlf_collections.sys_language_uid IN (-1,0)'
                        .Helper::whereClause('tx_dlf_documents')
                        .Helper::whereClause('tx_dlf_collections'),
                    'tx_dlf_collections.index_name',
                    '',
                    ''
                );
                while ($resArray = $GLOBALS['TYPO3_DB']->sql_fetch_assoc($result)) {
                    if (!in_array($resArray['index_name'], $metadata['collection'])) {
                        $metadata['collection'][] = $resArray['index_name'];
                    }
                }
            }
        } else {
            // There is no dmdSec for this structure node.
            return [];
        }
        return $metadata;
    }

    /**
     * This returns the first corresponding physical page number of a given logical page label
     *
     * @access public
     *
     * @param string $logicalPage: The label (or a part of the label) of the logical page
     *
     * @return integer The physical page number
     */
    public function getPhysicalPage($logicalPage) {
        if (!empty($this->lastSearchedPhysicalPage['logicalPage'])
            && $this->lastSearchedPhysicalPage['logicalPage'] == $logicalPage) {
            return $this->lastSearchedPhysicalPage['physicalPage'];
        } else {
            $physicalPage = 0;
            foreach ($this->physicalStructureInfo as $page) {
                if (strpos($page['orderlabel'], $logicalPage) !== FALSE) {
                    $this->lastSearchedPhysicalPage['logicalPage'] = $logicalPage;
                    $this->lastSearchedPhysicalPage['physicalPage'] = $physicalPage;
                    return $physicalPage;
                }
                $physicalPage++;
            }
        }
        return 1;
    }

    /**
     * This extracts the raw text for a physical structure node
     *
     * @access public
     *
     * @param string $id: The @ID attribute of the physical structure node
     *
     * @return string The physical structure node's raw text
     */
    public function getRawText($id) {
        $rawText = '';
        // Get text from raw text array if available.
        if (!empty($this->rawTextArray[$id])) {
            return $this->rawTextArray[$id];
        }
        // Load fileGrps and check for fulltext files.
        $this->_getFileGrps();
        if ($this->hasFulltext) {
            // Load available text formats, ...
            $this->loadFormats();
            // ... physical structure ...
            $this->_getPhysicalStructure();
            // ... and extension configuration.
            $extConf = unserialize($GLOBALS['TYPO3_CONF_VARS']['EXT']['extConf'][self::$extKey]);
            if (!empty($this->physicalStructureInfo[$id])) {
                // Get fulltext file.
                $file = $this->getFileLocation($this->physicalStructureInfo[$id]['files'][$extConf['fileGrpFulltext']]);
                // Turn off libxml's error logging.
                $libxmlErrors = libxml_use_internal_errors(TRUE);
                // Disables the functionality to allow external entities to be loaded when parsing the XML, must be kept.
                $previousValueOfEntityLoader = libxml_disable_entity_loader(TRUE);
                // Load XML from file.
                $rawTextXml = simplexml_load_string(\TYPO3\CMS\Core\Utility\GeneralUtility::getUrl($file));
Bug: It seems like TYPO3\CMS\Core\Utility\G...lUtility::getUrl($file) can also be of type false; however, parameter $data of simplexml_load_string() only seems to accept string. Consider adding a type check. (Ignorable by annotation.)

If this is a false positive, you can ignore this issue in your code via the ignore-type annotation:

                $rawTextXml = simplexml_load_string(/** @scrutinizer ignore-type */ \TYPO3\CMS\Core\Utility\GeneralUtility::getUrl($file));
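One way to add the suggested type check, sketched outside TYPO3. The helper name `parseFetchedXml()` is hypothetical and not part of the reviewed class; the sketch only assumes that the fetch step returns either a string or FALSE, as the issue text states for getUrl().

```php
<?php
/**
 * Hypothetical helper: parse XML only when the fetch step returned a string.
 *
 * @param string|false $content Result of a fetch call such as getUrl()
 *
 * @return \SimpleXMLElement|null Parsed document, or NULL if fetching or parsing failed
 */
function parseFetchedXml($content) {
    // Reject FALSE (fetch failed) and empty payloads before parsing.
    if (!is_string($content) || $content === '') {
        return NULL;
    }
    // Collect libxml errors internally instead of emitting warnings.
    $previous = libxml_use_internal_errors(TRUE);
    $xml = simplexml_load_string($content);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    // simplexml_load_string() returns FALSE on malformed input.
    return $xml instanceof \SimpleXMLElement ? $xml : NULL;
}
```

A caller can then test for NULL once, which also covers the later getRawText() call that expects a SimpleXMLElement.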
                // Reset entity loader setting.
                libxml_disable_entity_loader($previousValueOfEntityLoader);
                // Reset libxml's error logging.
                libxml_use_internal_errors($libxmlErrors);
                // Get the root element's name as text format.
                $textFormat = strtoupper($rawTextXml->getName());
            } else {
                Helper::devLog('Invalid structure node @ID "'.$id.'"', DEVLOG_SEVERITY_WARNING);
                return $rawText;
            }
            // Is this text format supported?
            if (!empty($this->formats[$textFormat])) {
                if (!empty($this->formats[$textFormat]['class'])) {
                    $class = $this->formats[$textFormat]['class'];
                    // Get the raw text from class.
                    if (class_exists($class)
                        && ($obj = \TYPO3\CMS\Core\Utility\GeneralUtility::makeInstance($class)) instanceof FulltextInterface) {
                        $rawText = $obj->getRawText($rawTextXml);
Bug: It seems like $rawTextXml can also be of type false; however, parameter $xml of Kitodo\Dlf\Common\FulltextInterface::getRawText() only seems to accept SimpleXMLElement. Consider adding a type check. (Ignorable by annotation.)

If this is a false positive, you can ignore this issue in your code via the ignore-type annotation:

                        $rawText = $obj->getRawText(/** @scrutinizer ignore-type */ $rawTextXml);
                        $this->rawTextArray[$id] = $rawText;
                    } else {
                        Helper::devLog('Invalid class/method "'.$class.'->getRawText()" for text format "'.$textFormat.'"', DEVLOG_SEVERITY_WARNING);
                    }
                }
            } else {
                Helper::devLog('Unsupported text format "'.$textFormat.'" in physical node with @ID "'.$id.'"', DEVLOG_SEVERITY_WARNING);
            }
        }
        return $rawText;
    }

    /**
     * This determines a title for the given document
     *
     * @access public
     *
     * @param integer $uid: The UID of the document
     * @param boolean $recursive: Search superior documents for a title, too?
     *
     * @return string The title of the document itself or a parent document
     */
    public static function getTitle($uid, $recursive = FALSE) {
        $title = '';
        // Sanitize input.
        $uid = max(intval($uid), 0);
        if ($uid) {
            $result = $GLOBALS['TYPO3_DB']->exec_SELECTquery(
                'tx_dlf_documents.title,tx_dlf_documents.partof',
                'tx_dlf_documents',
                'tx_dlf_documents.uid='.$uid
                    .Helper::whereClause('tx_dlf_documents'),
                '',
                '',
                '1'
            );
            if ($GLOBALS['TYPO3_DB']->sql_num_rows($result)) {
                // Get title information.
                list ($title, $partof) = $GLOBALS['TYPO3_DB']->sql_fetch_row($result);
                // Search parent documents recursively for a title?
                if ($recursive
                    && empty($title)
                    && intval($partof)
                    && $partof != $uid) {
                    $title = self::getTitle($partof, TRUE);
                }
            } else {
                Helper::devLog('No document with UID '.$uid.' found or document not accessible', DEVLOG_SEVERITY_WARNING);
            }
        } else {
            Helper::devLog('Invalid UID '.$uid.' for document', DEVLOG_SEVERITY_ERROR);
        }
        return $title;
    }

    /**
     * This extracts all the metadata for the toplevel logical structure node
     *
     * @access public
     *
     * @param integer $cPid: The PID for the metadata definitions
     *
     * @return array The logical structure node's parsed metadata array
     */
    public function getTitledata($cPid = 0) {
        $titledata = $this->getMetadata($this->_getToplevelId(), $cPid);
        // Set record identifier for METS file if not present.
        if (is_array($titledata)
            && array_key_exists('record_id', $titledata)) {
            if (!empty($this->recordId)
                && !in_array($this->recordId, $titledata['record_id'])) {
                array_unshift($titledata['record_id'], $this->recordId);
            }
        }
        return $titledata;
    }

    /**
     * This sets some basic class properties
     *
     * @access protected
     *
     * @return void
     */
    protected function init() {
        // Get METS node from XML file.
        $this->registerNamespaces($this->xml);
        $mets = $this->xml->xpath('//mets:mets');
        if ($mets) {
            $this->mets = $mets[0];
            // Register namespaces.
            $this->registerNamespaces($this->mets);
        } else {
            Helper::devLog('No METS part found in document with UID '.$this->uid, DEVLOG_SEVERITY_ERROR);
        }
    }

    /**
     * Load XML file from URL
     *
     * @access protected
     *
     * @param string $location: The URL of the file to load
     *
     * @return boolean TRUE on success or FALSE on failure
     */
    protected function load($location) {
        // Load XML file.
        if (\TYPO3\CMS\Core\Utility\GeneralUtility::isValidUrl($location)) {
            // Load extension configuration
            $extConf = unserialize($GLOBALS['TYPO3_CONF_VARS']['EXT']['extConf']['dlf']);
            // Set user-agent to identify self when fetching XML data.
            if (!empty($extConf['useragent'])) {
                @ini_set('user_agent', $extConf['useragent']);
Security Best Practice: It seems like you do not handle an error condition for ini_set(). This can introduce security issues and is generally not recommended. (Ignorable by annotation.)

If this is a false positive, you can ignore this issue in your code via the ignore-unhandled annotation:

                /** @scrutinizer ignore-unhandled */ @ini_set('user_agent', $extConf['useragent']);

If you suppress an error, we recommend checking for the error condition explicitly:

// For example instead of
@mkdir($dir);

// Better use
if (@mkdir($dir) === false) {
    throw new \RuntimeException('The directory '.$dir.' could not be created.');
}
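Applied to ini_set() itself, the same idea might look like the sketch below. The function name `setUserAgent()` is hypothetical; the only library behaviour relied on is that ini_set() returns the previous value on success and FALSE on failure.

```php
<?php
/**
 * Hypothetical helper: set the HTTP user agent and report whether it worked.
 *
 * @param string $userAgent The user agent string to announce
 *
 * @return boolean TRUE if the setting was applied, FALSE otherwise
 */
function setUserAgent($userAgent) {
    // Compare strictly: the previous value may be an empty string,
    // which must not be mistaken for a failure.
    return ini_set('user_agent', $userAgent) !== FALSE;
}
```

The caller can then log a warning when the helper returns FALSE instead of silencing the call with the @ operator.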
            }
            // Turn off libxml's error logging.
            $libxmlErrors = libxml_use_internal_errors(TRUE);
            // Disables the functionality to allow external entities to be loaded when parsing the XML, must be kept.
            $previousValueOfEntityLoader = libxml_disable_entity_loader(TRUE);
            // Load XML from file.
            $xml = simplexml_load_string(\TYPO3\CMS\Core\Utility\GeneralUtility::getUrl($location));
Bug: It seems like TYPO3\CMS\Core\Utility\G...lity::getUrl($location) can also be of type false; however, parameter $data of simplexml_load_string() only seems to accept string. Consider adding a type check. (Ignorable by annotation.)

If this is a false positive, you can ignore this issue in your code via the ignore-type annotation:

            $xml = simplexml_load_string(/** @scrutinizer ignore-type */ \TYPO3\CMS\Core\Utility\GeneralUtility::getUrl($location));
            // Reset entity loader setting.
            libxml_disable_entity_loader($previousValueOfEntityLoader);
            // Reset libxml's error logging.
            libxml_use_internal_errors($libxmlErrors);
            // Set some basic properties.
            if ($xml !== FALSE) {
                $this->xml = $xml;
                return TRUE;
            } else {
                Helper::devLog('Could not load XML file from "'.$location.'"', DEVLOG_SEVERITY_ERROR);
            }
        } else {
            Helper::devLog('Invalid file location "'.$location.'" for document loading', DEVLOG_SEVERITY_ERROR);
        }
        return FALSE;
    }

    /**
     * Register all available data formats
     *
     * @access protected
     *
     * @return void
     */
    protected function loadFormats() {
        if (!$this->formatsLoaded) {
            // Get available data formats from database.
            $result = $GLOBALS['TYPO3_DB']->exec_SELECTquery(
                'tx_dlf_formats.type AS type,tx_dlf_formats.root AS root,tx_dlf_formats.namespace AS namespace,tx_dlf_formats.class AS class',
                'tx_dlf_formats',
                'tx_dlf_formats.pid=0'
                    .Helper::whereClause('tx_dlf_formats'),
                '',
                '',
                ''
            );
            while ($resArray = $GLOBALS['TYPO3_DB']->sql_fetch_assoc($result)) {
                // Update format registry.
                $this->formats[$resArray['type']] = [
                    'rootElement' => $resArray['root'],
                    'namespaceURI' => $resArray['namespace'],
                    'class' => $resArray['class']
                ];
            }
            $this->formatsLoaded = TRUE;
        }
    }

    /**
     * Register all available namespaces for a \SimpleXMLElement object
     *
     * @access public
     *
     * @param \SimpleXMLElement|\DOMXPath &$obj: \SimpleXMLElement or \DOMXPath object
     *
     * @return void
     */
    public function registerNamespaces(&$obj) {
        $this->loadFormats();
        // Do we have a \SimpleXMLElement or \DOMXPath object?
        if ($obj instanceof \SimpleXMLElement) {
            $method = 'registerXPathNamespace';
        } elseif ($obj instanceof \DOMXPath) {
            $method = 'registerNamespace';
        } else {
            Helper::devLog('Given object is neither a SimpleXMLElement nor a DOMXPath instance', DEVLOG_SEVERITY_ERROR);
            return;
        }
        // Register metadata format's namespaces.
        foreach ($this->formats as $enc => $conf) {
            $obj->$method(strtolower($enc), $conf['namespaceURI']);
        }
    }

    /**
     * This saves the document to the database and index
     *
     * @access public
     *
     * @param integer $pid: The PID of the saved record
     * @param integer $core: The UID of the Solr core for indexing
     *
     * @return boolean TRUE on success or FALSE on failure
     */
    public function save($pid = 0, $core = 0) {
        if (TYPO3_MODE !== 'BE') {
Issue: The condition Kitodo\Dlf\Common\TYPO3_MODE !== 'BE' is always false.
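The namespaced name in this message hints at how the constant was resolved. Inside a namespace, PHP looks up an unqualified constant under the current namespace first and falls back to the global constant at run time; an analyzer that cannot see the global definition may only consider the namespaced name (an assumption about the tool, not something the report states). The sketch below uses a hypothetical constant `APP_MODE` in a hypothetical namespace to show both spellings.

```php
<?php
namespace Example\Demo;

// Global constant standing in for a framework-defined one (hypothetical name and value).
\define('APP_MODE', 'BE');

function isBackend() {
    // Unqualified: PHP tries Example\Demo\APP_MODE, then falls back to \APP_MODE.
    return APP_MODE === 'BE';
}

function isBackendQualified() {
    // Fully qualified: always refers to the global constant.
    return \APP_MODE === 'BE';
}
```

Both functions behave the same at run time; the qualified form only removes the ambiguity for readers and tools.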
            Helper::devLog('Saving a document is only allowed in the backend', DEVLOG_SEVERITY_ERROR);
            return FALSE;
        }
        // Make sure $pid is a non-negative integer.
        $pid = max(intval($pid), 0);
        // Make sure $core is a non-negative integer.
        $core = max(intval($core), 0);
        // If $pid is not given, try to get it elsewhere.
        if (!$pid
            && $this->pid) {
            // Retain current PID.
            $pid = $this->pid;
        } elseif (!$pid) {
            Helper::devLog('Invalid PID '.$pid.' for document saving', DEVLOG_SEVERITY_ERROR);
            return FALSE;
        }
        // Set PID for metadata definitions.
        $this->cPid = $pid;
        // Set UID placeholder if not updating existing record.
        if ($pid != $this->pid) {
            $this->uid = uniqid('NEW');
        }
        // Get metadata array.
        $metadata = $this->getTitledata($pid);
        // Check for record identifier.
        if (empty($metadata['record_id'][0])) {
            Helper::devLog('No record identifier found to avoid duplication', DEVLOG_SEVERITY_ERROR);
            return FALSE;
        }
        // Load plugin configuration.
        $conf = unserialize($GLOBALS['TYPO3_CONF_VARS']['EXT']['extConf'][self::$extKey]);
        // Get UID for user "_cli_dlf".
        $result = $GLOBALS['TYPO3_DB']->exec_SELECTquery(
            'be_users.uid AS uid',
            'be_users',
            'username='.$GLOBALS['TYPO3_DB']->fullQuoteStr('_cli_dlf', 'be_users')
                .\TYPO3\CMS\Backend\Utility\BackendUtility::BEenableFields('be_users')
                .\TYPO3\CMS\Backend\Utility\BackendUtility::deleteClause('be_users'),
            '',
            '',
            '1'
        );
        if (!$GLOBALS['TYPO3_DB']->sql_num_rows($result)) {
            Helper::devLog('Backend user "_cli_dlf" not found or disabled', DEVLOG_SEVERITY_ERROR);
            return FALSE;
        }
        // Get UID for structure type.
        $result = $GLOBALS['TYPO3_DB']->exec_SELECTquery(
            'tx_dlf_structures.uid AS uid',
            'tx_dlf_structures',
            'tx_dlf_structures.pid='.intval($pid)
                .' AND tx_dlf_structures.index_name='.$GLOBALS['TYPO3_DB']->fullQuoteStr($metadata['type'][0], 'tx_dlf_structures')
                .Helper::whereClause('tx_dlf_structures'),
            '',
            '',
            '1'
        );
        if ($GLOBALS['TYPO3_DB']->sql_num_rows($result)) {
            list ($structure) = $GLOBALS['TYPO3_DB']->sql_fetch_row($result);
        } else {
            Helper::devLog('Could not identify document/structure type "'.$GLOBALS['TYPO3_DB']->fullQuoteStr($metadata['type'][0], 'tx_dlf_structures').'"', DEVLOG_SEVERITY_ERROR);
            return FALSE;
        }
        $metadata['type'][0] = $structure;
        // Get UIDs for collections.
        $collections = [];
        $result = $GLOBALS['TYPO3_DB']->exec_SELECTquery(
            'tx_dlf_collections.index_name AS index_name,tx_dlf_collections.uid AS uid',
            'tx_dlf_collections',
            'tx_dlf_collections.pid='.intval($pid)
                .' AND tx_dlf_collections.sys_language_uid IN (-1,0)'
                .Helper::whereClause('tx_dlf_collections'),
            '',
            '',
            ''
        );
        while ($resArray = $GLOBALS['TYPO3_DB']->sql_fetch_assoc($result)) {
            $collUid[$resArray['index_name']] = $resArray['uid'];
        }
        foreach ($metadata['collection'] as $collection) {
            if (!empty($collUid[$collection])) {
                // Add existing collection's UID.
                $collections[] = $collUid[$collection];
Comprehensibility Best Practice: The variable $collUid does not seem to be defined for all execution paths leading up to this point.
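The path in question is the one where the collections query returns no rows, so the while loop never assigns the variable. A minimal sketch of the usual remedy, initializing the map before the loop, follows; `buildCollectionMap()` and its row shape are hypothetical stand-ins for the database loop in save().

```php
<?php
/**
 * Hypothetical stand-in for the lookup loop: map index names to UIDs.
 *
 * @param array $rows Rows shaped like the tx_dlf_collections result set
 *
 * @return array Map of index_name => uid
 */
function buildCollectionMap(array $rows) {
    // Initialize first so the map exists even when $rows is empty.
    $collUid = [];
    foreach ($rows as $row) {
        $collUid[$row['index_name']] = $row['uid'];
    }
    return $collUid;
}
```

The later !empty() check tolerates an undefined variable, which is why the issue is rated as a comprehensibility matter rather than a bug.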
            } else {
                // Insert new collection.
                $collNewUid = uniqid('NEW');
                $collData['tx_dlf_collections'][$collNewUid] = [
                    'pid' => $pid,
                    'label' => $collection,
                    'index_name' => $collection,
                    'oai_name' => (!empty($conf['publishNewCollections']) ? Helper::getCleanString($collection) : ''),
                    'description' => '',
                    'documents' => 0,
                    'owner' => 0,
                    'status' => 0,
                ];
                $substUid = Helper::processDB($collData);
                // Prevent double insertion.
                unset ($collData);
                // Add new collection's UID.
                $collections[] = $substUid[$collNewUid];
                if ((TYPO3_REQUESTTYPE & TYPO3_REQUESTTYPE_CLI) == FALSE) {
Bug Best Practice: It seems like you are loosely comparing Kitodo\Dlf\Common\TYPO3_...n\TYPO3_REQUESTTYPE_CLI of type integer to the boolean FALSE. If you are specifically checking for 0, consider using something more explicit like === 0 instead.
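A bitwise AND of two integers always yields an integer, so the flag test can be written with a strict comparison against 0. The sketch below uses hypothetical flag constants for illustration; they are not claimed to match the framework's actual values.

```php
<?php
// Hypothetical request-type flags, one bit each.
const REQUEST_FE = 1;
const REQUEST_BE = 2;
const REQUEST_CLI = 4;

/**
 * Check whether the CLI bit is set in a request-type bitmask.
 *
 * @param integer $requestType Bitmask of REQUEST_* flags
 *
 * @return boolean TRUE if the CLI flag is set
 */
function isCliRequest($requestType) {
    // The AND result is an integer, so compare it to 0 strictly.
    return ($requestType & REQUEST_CLI) !== 0;
}
```

With such a helper, the guard around the flash message would read `if (!isCliRequest(...))`, which states the intent without relying on type juggling.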
                    Helper::addMessage(
                        htmlspecialchars(sprintf(Helper::getMessage('flash.newCollection'), $collection, $substUid[$collNewUid])),
                        Helper::getMessage('flash.attention', TRUE),
                        \TYPO3\CMS\Core\Messaging\FlashMessage::INFO,
                        TRUE
                    );
                }
            }
        }
        $metadata['collection'] = $collections;
        // Get UID for owner.
        $owner = !empty($metadata['owner'][0]) ? $metadata['owner'][0] : 'default';
        $result = $GLOBALS['TYPO3_DB']->exec_SELECTquery(
            'tx_dlf_libraries.uid AS uid',
            'tx_dlf_libraries',
            'tx_dlf_libraries.pid='.intval($pid)
                .' AND tx_dlf_libraries.index_name='.$GLOBALS['TYPO3_DB']->fullQuoteStr($owner, 'tx_dlf_libraries')
                .Helper::whereClause('tx_dlf_libraries'),
            '',
            '',
            '1'
        );
        if ($GLOBALS['TYPO3_DB']->sql_num_rows($result)) {
            list ($ownerUid) = $GLOBALS['TYPO3_DB']->sql_fetch_row($result);
        } else {
            // Insert new library.
            $libNewUid = uniqid('NEW');
            $libData['tx_dlf_libraries'][$libNewUid] = [
Comprehensibility Best Practice: $libData was never initialized. Although not strictly required by PHP, it is generally a good practice to add $libData = array(); before regardless.
                'pid' => $pid,
                'label' => $owner,
                'index_name' => $owner,
                'website' => '',
                'contact' => '',
                'image' => '',
                'oai_label' => '',
                'oai_base' => '',
                'opac_label' => '',
                'opac_base' => '',
                'union_label' => '',
                'union_base' => '',
            ];
            $substUid = Helper::processDB($libData);
            // Add new library's UID.
            $ownerUid = $substUid[$libNewUid];
            if ((TYPO3_REQUESTTYPE & TYPO3_REQUESTTYPE_CLI) == FALSE) {
Bug Best Practice: It seems like you are loosely comparing Kitodo\Dlf\Common\TYPO3_...n\TYPO3_REQUESTTYPE_CLI of type integer to the boolean FALSE. If you are specifically checking for 0, consider using something more explicit like === 0 instead.
                Helper::addMessage(
                    htmlspecialchars(sprintf(Helper::getMessage('flash.newLibrary'), $owner, $ownerUid)),
                    Helper::getMessage('flash.attention', TRUE),
                    \TYPO3\CMS\Core\Messaging\FlashMessage::INFO,
                    TRUE
                );
            }
        }
        $metadata['owner'][0] = $ownerUid;
        // Get UID of parent document.
        $partof = 0;
        // Get the closest ancestor of the current document which has a MPTR child.
        $parentMptr = $this->mets->xpath('./mets:structMap[@TYPE="LOGICAL"]//mets:div[@ID="'.$this->_getToplevelId().'"]/ancestor::mets:div[./mets:mptr][1]/mets:mptr');
        if (!empty($parentMptr[0])) {
            $parentLocation = (string) $parentMptr[0]->attributes('http://www.w3.org/1999/xlink')->href;
            if ($parentLocation != $this->location) {
                $parentDoc = self::getInstance($parentLocation, $pid);
                if ($parentDoc->ready) {
                    if ($parentDoc->pid != $pid) {
                        $parentDoc->save($pid, $core);
                    }
                    $partof = $parentDoc->uid;
                }
            }
        }
        // Use the date of publication or title as alternative sorting metric for parts of multi-part works.
        if (!empty($partof)) {
            if (empty($metadata['volume'][0])
                && !empty($metadata['year'][0])) {
                $metadata['volume'] = $metadata['year'];
            }
            if (empty($metadata['volume_sorting'][0])) {
                if (!empty($metadata['year_sorting'][0])) {
                    $metadata['volume_sorting'][0] = $metadata['year_sorting'][0];
                } elseif (!empty($metadata['year'][0])) {
                    $metadata['volume_sorting'][0] = $metadata['year'][0];
                }
            }
            // If volume_sorting is still empty, try to use title_sorting finally (workaround for newspapers)
            if (empty($metadata['volume_sorting'][0])) {
                if (!empty($metadata['title_sorting'][0])) {
                    $metadata['volume_sorting'][0] = $metadata['title_sorting'][0];
                }
            }
        }
        // Get metadata for lists and sorting.
        $listed = [];
        $sortable = [];
        $result = $GLOBALS['TYPO3_DB']->exec_SELECTquery(
            'tx_dlf_metadata.index_name AS index_name,tx_dlf_metadata.is_listed AS is_listed,tx_dlf_metadata.is_sortable AS is_sortable',
            'tx_dlf_metadata',
            '(tx_dlf_metadata.is_listed=1 OR tx_dlf_metadata.is_sortable=1)'
                .' AND tx_dlf_metadata.pid='.intval($pid)
                .Helper::whereClause('tx_dlf_metadata'),
            '',
            '',
            ''
        );
        while ($resArray = $GLOBALS['TYPO3_DB']->sql_fetch_assoc($result)) {
            if (!empty($metadata[$resArray['index_name']])) {
                if ($resArray['is_listed']) {
                    $listed[$resArray['index_name']] = $metadata[$resArray['index_name']];
                }
                if ($resArray['is_sortable']) {
                    $sortable[$resArray['index_name']] = $metadata[$resArray['index_name']][0];
                }
            }
        }
        // Fill data array.
        $data['tx_dlf_documents'][$this->uid] = [
Comprehensibility Best Practice: $data was never initialized. Although not strictly required by PHP, it is generally a good practice to add $data = array(); before regardless.
            'pid' => $pid,
            $GLOBALS['TCA']['tx_dlf_documents']['ctrl']['enablecolumns']['starttime'] => 0,
            $GLOBALS['TCA']['tx_dlf_documents']['ctrl']['enablecolumns']['endtime'] => 0,
            'prod_id' => $metadata['prod_id'][0],
            'location' => $this->location,
            'record_id' => $metadata['record_id'][0],
            'opac_id' => $metadata['opac_id'][0],
            'union_id' => $metadata['union_id'][0],
            'urn' => $metadata['urn'][0],
            'purl' => $metadata['purl'][0],
            'title' => $metadata['title'][0],
            'title_sorting' => $metadata['title_sorting'][0],
            'author' => implode('; ', $metadata['author']),
            'year' => implode('; ', $metadata['year']),
            'place' => implode('; ', $metadata['place']),
            'thumbnail' => $this->_getThumbnail(TRUE),
            'metadata' => serialize($listed),
            'metadata_sorting' => serialize($sortable),
            'structure' => $metadata['type'][0],
            'partof' => $partof,
            'volume' => $metadata['volume'][0],
            'volume_sorting' => $metadata['volume_sorting'][0],
            'collections' => $metadata['collection'],
            'owner' => $metadata['owner'][0],
            'solrcore' => $core,
            'status' => 0,
        ];
        // Unhide hidden documents.
        if (!empty($conf['unhideOnIndex'])) {
            $data['tx_dlf_documents'][$this->uid][$GLOBALS['TCA']['tx_dlf_documents']['ctrl']['enablecolumns']['disabled']] = 0;
        }
        // Process data.
        $newIds = Helper::processDB($data);
        // Replace placeholder with actual UID.
        if (strpos($this->uid, 'NEW') === 0) {
            $this->uid = $newIds[$this->uid];
            $this->pid = $pid;
            $this->parentId = $partof;
        }
        if ((TYPO3_REQUESTTYPE & TYPO3_REQUESTTYPE_CLI) == FALSE) {
Bug Best Practice: It seems like you are loosely comparing Kitodo\Dlf\Common\TYPO3_...n\TYPO3_REQUESTTYPE_CLI of type integer to the boolean FALSE. If you are specifically checking for 0, consider using something more explicit like === 0 instead.
            Helper::addMessage(
                htmlspecialchars(sprintf(Helper::getMessage('flash.documentSaved'), $metadata['title'][0], $this->uid)),
                Helper::getMessage('flash.done', TRUE),
                \TYPO3\CMS\Core\Messaging\FlashMessage::OK,
                TRUE
            );
        }
        // Add document to index.
        if ($core) {
            Indexer::add($this, $core);
        } else {
            Helper::devLog('Invalid UID "'.$core.'" for Solr core', DEVLOG_SEVERITY_NOTICE);
        }
        return TRUE;
    }

    /**
     * This returns $this->cPid via __get()
     *
     * @access protected
     *
     * @return integer The PID of the metadata definitions
     */
    protected function _getCPid() {
        return $this->cPid;
    }

    /**
     * This builds an array of the document's dmdSecs
     *
     * @access protected
     *
     * @return array Array of dmdSecs with their IDs as array key
     */
    protected function _getDmdSec() {
        if (!$this->dmdSecLoaded) {
            // Get available data formats.
            $this->loadFormats();
            // Get dmdSec nodes from METS.
            $dmdIds = $this->mets->xpath('./mets:dmdSec/@ID');
            foreach ($dmdIds as $dmdId) {
                if ($type = $this->mets->xpath('./mets:dmdSec[@ID="'.(string) $dmdId.'"]/mets:mdWrap[not(@MDTYPE="OTHER")]/@MDTYPE')) {
                    if (!empty($this->formats[(string) $type[0]])) {
                        $type = (string) $type[0];
                        $xml = $this->mets->xpath('./mets:dmdSec[@ID="'.(string) $dmdId.'"]/mets:mdWrap[@MDTYPE="'.$type.'"]/mets:xmlData/'.strtolower($type).':'.$this->formats[$type]['rootElement']);
                    }
                } elseif ($type = $this->mets->xpath('./mets:dmdSec[@ID="'.(string) $dmdId.'"]/mets:mdWrap[@MDTYPE="OTHER"]/@OTHERMDTYPE')) {
                    if (!empty($this->formats[(string) $type[0]])) {
                        $type = (string) $type[0];
                        $xml = $this->mets->xpath('./mets:dmdSec[@ID="'.(string) $dmdId.'"]/mets:mdWrap[@MDTYPE="OTHER"][@OTHERMDTYPE="'.$type.'"]/mets:xmlData/'.strtolower($type).':'.$this->formats[$type]['rootElement']);
                    }
                }
                if ($xml) {
Comprehensibility Best Practice: The variable $xml does not seem to be defined for all execution paths leading up to this point.
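Because $xml is assigned only inside conditional branches of the loop body, it is undefined on a first iteration that matches no registered format, and on later iterations it can still hold the value from an earlier dmdSec. A reduced sketch of a per-iteration reset follows; the function `collectPayloads()` and its array shapes are hypothetical and only mirror the control flow, not the XPath logic.

```php
<?php
/**
 * Hypothetical reduction of the loop: keep a payload per section, but only
 * for sections whose type is registered.
 *
 * @param array $sections Map of section ID => ['type' => string, 'data' => mixed]
 * @param array $knownTypes List of registered type names
 *
 * @return array Map of section ID => payload
 */
function collectPayloads(array $sections, array $knownTypes) {
    $result = [];
    foreach ($sections as $id => $section) {
        // Reset per iteration so a match from an earlier section cannot leak.
        $payload = NULL;
        if (in_array($section['type'], $knownTypes, TRUE)) {
            $payload = $section['data'];
        }
        if ($payload !== NULL) {
            $result[$id] = $payload;
        }
    }
    return $result;
}
```

In the reviewed method, the equivalent change would be to clear the variable at the top of the foreach body before the two format checks.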
                    $this->dmdSec[(string) $dmdId]['type'] = $type;
                    $this->dmdSec[(string) $dmdId]['xml'] = $xml[0];
                    $this->registerNamespaces($this->dmdSec[(string) $dmdId]['xml']);
                }
            }
            $this->dmdSecLoaded = TRUE;
        }
        return $this->dmdSec;
    }

    /**
     * This builds the file ID -> USE concordance
     *
     * @access protected
     *
     * @return array Array of file use groups with file IDs
     */
    protected function _getFileGrps() {
        if (!$this->fileGrpsLoaded) {
            // Get configured USE attributes.
            $extConf = unserialize($GLOBALS['TYPO3_CONF_VARS']['EXT']['extConf'][self::$extKey]);
            $useGrps = \TYPO3\CMS\Core\Utility\GeneralUtility::trimExplode(',', $extConf['fileGrps']);
            if (!empty($extConf['fileGrpThumbs'])) {
                $useGrps[] = $extConf['fileGrpThumbs'];
            }
            if (!empty($extConf['fileGrpDownload'])) {
                $useGrps[] = $extConf['fileGrpDownload'];
            }
            if (!empty($extConf['fileGrpFulltext'])) {
                $useGrps[] = $extConf['fileGrpFulltext'];
            }
            if (!empty($extConf['fileGrpAudio'])) {
                $useGrps[] = $extConf['fileGrpAudio'];
            }
            // Get all file groups.
            $fileGrps = $this->mets->xpath('./mets:fileSec/mets:fileGrp');
            // Build concordance for configured USE attributes.
            foreach ($fileGrps as $fileGrp) {
                if (in_array((string) $fileGrp['USE'], $useGrps)) {
                    foreach ($fileGrp->children('http://www.loc.gov/METS/')->file as $file) {
                        $this->fileGrps[(string) $file->attributes()->ID] = (string) $fileGrp['USE'];
                    }
                }
            }
1395
            // Are there any fulltext files available?
1396
            if (!empty($extConf['fileGrpFulltext'])
1397
                && in_array($extConf['fileGrpFulltext'], $this->fileGrps)) {
1398
                $this->hasFulltext = TRUE;
1399
            }
1400
            $this->fileGrpsLoaded = TRUE;
1401
        }
1402
        return $this->fileGrps;
1403
    }
1404
1405
    /**
1406
     * This returns $this->hasFulltext via __get()
1407
     *
1408
     * @access protected
1409
     *
1410
     * @return boolean Are there any fulltext files available?
1411
     */
1412
    protected function _getHasFulltext() {
1413
        // Are the fileGrps already loaded?
1414
        if (!$this->fileGrpsLoaded) {
1415
            $this->_getFileGrps();
1416
        }
1417
        return $this->hasFulltext;
1418
    }
1419
1420
    /**
1421
     * This returns $this->location via __get()
1422
     *
1423
     * @access protected
1424
     *
1425
     * @return string The location of the document
1426
     */
1427
    protected function _getLocation() {
1428
        return $this->location;
1429
    }
1430
1431
    /**
1432
     * This builds an array of the document's metadata
1433
     *
1434
     * @access protected
1435
     *
1436
     * @return array Array of metadata with their corresponding logical structure node ID as key
1437
     */
1438
    protected function _getMetadataArray() {
1439
        // Set metadata definitions' PID.
1440
        $cPid = ($this->cPid ? $this->cPid : $this->pid);
1441
        if (!$cPid) {
1442
            Helper::devLog('Invalid PID '.$cPid.' for metadata definitions', DEVLOG_SEVERITY_ERROR);
1443
            return [];
1444
        }
1445
        if (!$this->metadataArrayLoaded
1446
            || $this->metadataArray[0] != $cPid) {
1447
            // Get all logical structure nodes with metadata.
1448
            if (($ids = $this->mets->xpath('./mets:structMap[@TYPE="LOGICAL"]//mets:div[@DMDID]/@ID'))) {
1449
                foreach ($ids as $id) {
1450
                    $this->metadataArray[(string) $id] = $this->getMetadata((string) $id, $cPid);
1451
                }
1452
            }
1453
            // Set current PID for metadata definitions.
1454
            $this->metadataArray[0] = $cPid;
1455
            $this->metadataArrayLoaded = TRUE;
1456
        }
1457
        return $this->metadataArray;
1458
    }
1459
1460
    /**
1461
     * This returns $this->mets via __get()
1462
     *
1463
     * @access protected
1464
     *
1465
     * @return \SimpleXMLElement The XML's METS part as \SimpleXMLElement object
1466
     */
1467
    protected function _getMets() {
1468
        return $this->mets;
1469
    }
1470
1471
    /**
1472
     * This returns $this->numPages via __get()
1473
     *
1474
     * @access protected
1475
     *
1476
     * @return integer The total number of pages and/or tracks
1477
     */
1478
    protected function _getNumPages() {
1479
        $this->_getPhysicalStructure();
1480
        return $this->numPages;
1481
    }
1482
1483
    /**
1484
     * This returns $this->parentId via __get()
1485
     *
1486
     * @access protected
1487
     *
1488
     * @return integer The UID of the parent document or zero if not applicable
1489
     */
1490
    protected function _getParentId() {
1491
        return $this->parentId;
1492
    }
1493
1494
    /**
1495
     * This builds an array of the document's physical structure
1496
     *
1497
     * @access protected
1498
     *
1499
     * @return array Array of physical elements' id, type, label and file representations ordered by @ORDER attribute
1500
     */
1501
    protected function _getPhysicalStructure() {
1502
        // Is there no physical structure array yet?
1503
        if (!$this->physicalStructureLoaded) {
1504
            // Does the document have a structMap node of type "PHYSICAL"?
1505
            $elementNodes = $this->mets->xpath('./mets:structMap[@TYPE="PHYSICAL"]/mets:div[@TYPE="physSequence"]/mets:div');
1506
            if ($elementNodes) {
Issue (Bug, Best Practice): The expression $elementNodes of type SimpleXMLElement[] is implicitly converted to a boolean; are you sure this is intended? If so, consider using !empty($expr) instead to make it clear that you intend to check for an array without elements.

This check marks implicit conversions of arrays to boolean values in a comparison. While in PHP an empty array is considered to be equal (but not identical) to false, this is not always apparent.

Consider making the comparison explicit by using empty(...) or !empty(...) instead.
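A small sketch of the explicit check that this report suggests. The XML snippet and the variable names are made up for illustration and are not taken from the reviewed METS handling; the only library call used is `SimpleXMLElement::xpath()`, which returns an array of matching nodes.

```php
<?php
// Illustrative document only; not taken from the reviewed METS file.
$doc = new SimpleXMLElement('<root><div ID="a"/><div ID="b"/></root>');

$found = $doc->xpath('./div');
$missing = $doc->xpath('./span');

// !empty() states the intent: "is there at least one matching node?"
// It also covers a non-array return value in case of an error.
$hasDivs = !empty($found);
$hasSpans = !empty($missing);

var_dump($hasDivs, $hasSpans);
```

The behaviour is the same as the implicit conversion, so this is purely a readability change.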
1507
                // Get file groups.
1508
                $fileUse = $this->_getFileGrps();
1509
                // Get the physical sequence's metadata.
1510
                $physNode = $this->mets->xpath('./mets:structMap[@TYPE="PHYSICAL"]/mets:div[@TYPE="physSequence"]');
1511
                $physSeq[0] = (string) $physNode[0]['ID'];
Issue (Comprehensibility, Best Practice): $physSeq was never initialized. Although not strictly required by PHP, it is generally good practice to add $physSeq = array(); beforehand.
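The initialization this report asks for could look roughly like the sketch below. The IDs and the `$pages` array are hypothetical sample values; only the pattern of declaring both arrays before the first write, then sorting and merging them, mirrors the listing.

```php
<?php
// Hypothetical values standing in for the METS attributes.
$physNodeId = 'PHYS_0000';
$pages = [2 => 'PHYS_0002', 1 => 'PHYS_0001'];

// Declare the arrays before the first write instead of relying on
// PHP to create them implicitly.
$physSeq = [];
$physSeq[0] = $physNodeId;

$elements = [];
foreach ($pages as $order => $id) {
    $elements[(int) $order] = $id;
}
// Sort by key, which corresponds to the @ORDER attribute.
ksort($elements);

// array_merge() re-indexes numeric keys, so the sequence node ends up
// at index 0 and the pages follow in sorted order.
$physicalStructure = array_merge($physSeq, $elements);
print_r($physicalStructure);
```

Declaring `$elements` up front also keeps the later `ksort()` call well defined if the loop body is ever skipped.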
1512
                $this->physicalStructureInfo[$physSeq[0]]['id'] = (string) $physNode[0]['ID'];
1513
                $this->physicalStructureInfo[$physSeq[0]]['dmdId'] = (isset($physNode[0]['DMDID']) ? (string) $physNode[0]['DMDID'] : '');
1514
                $this->physicalStructureInfo[$physSeq[0]]['label'] = (isset($physNode[0]['LABEL']) ? (string) $physNode[0]['LABEL'] : '');
1515
                $this->physicalStructureInfo[$physSeq[0]]['orderlabel'] = (isset($physNode[0]['ORDERLABEL']) ? (string) $physNode[0]['ORDERLABEL'] : '');
1516
                $this->physicalStructureInfo[$physSeq[0]]['type'] = (string) $physNode[0]['TYPE'];
1517
                $this->physicalStructureInfo[$physSeq[0]]['contentIds'] = (isset($physNode[0]['CONTENTIDS']) ? (string) $physNode[0]['CONTENTIDS'] : '');
1518
                // Get the file representations from fileSec node.
1519
                foreach ($physNode[0]->children('http://www.loc.gov/METS/')->fptr as $fptr) {
1520
                    // Check if file has valid @USE attribute.
1521
                    if (!empty($fileUse[(string) $fptr->attributes()->FILEID])) {
1522
                        $this->physicalStructureInfo[$physSeq[0]]['files'][$fileUse[(string) $fptr->attributes()->FILEID]] = (string) $fptr->attributes()->FILEID;
1523
                    }
1524
                }
1525
                // Build the physical elements' array from the physical structMap node.
1526
                foreach ($elementNodes as $elementNode) {
1527
                    $elements[(int) $elementNode['ORDER']] = (string) $elementNode['ID'];
1528
                    $this->physicalStructureInfo[$elements[(int) $elementNode['ORDER']]]['id'] = (string) $elementNode['ID'];
Issue (Comprehensibility, Best Practice): The variable $elements seems to be defined later in this foreach loop on line 1527. Are you sure it is defined here?
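A possible way to make this loop easier to follow is to read the element ID once into a local variable and use it as the key, rather than looking it up through `$elements[(int) $elementNode['ORDER']]` on every line. The sketch below uses plain arrays as hypothetical stand-ins for the `mets:div` nodes, and `$info` as a stand-in for `$this->physicalStructureInfo`.

```php
<?php
// Hypothetical attribute data standing in for the mets:div nodes.
$elementNodes = [
    ['ORDER' => '2', 'ID' => 'PHYS_0002', 'TYPE' => 'page'],
    ['ORDER' => '1', 'ID' => 'PHYS_0001', 'TYPE' => 'page', 'LABEL' => 'Cover'],
];

$elements = [];
$info = [];
foreach ($elementNodes as $node) {
    // Read the ID once and reuse it as the key below.
    $id = (string) $node['ID'];
    $elements[(int) $node['ORDER']] = $id;
    $info[$id] = [
        'id' => $id,
        'label' => isset($node['LABEL']) ? (string) $node['LABEL'] : '',
        'type' => (string) $node['TYPE'],
    ];
}
// Sort by key (= ORDER) while keeping the keys.
ksort($elements);
print_r($elements);
```

This keeps the resulting data the same while removing the repeated nested index expression that the analyzer stumbles over.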
1529
                    $this->physicalStructureInfo[$elements[(int) $elementNode['ORDER']]]['dmdId'] = (isset($elementNode['DMDID']) ? (string) $elementNode['DMDID'] : '');
1530
                    $this->physicalStructureInfo[$elements[(int) $elementNode['ORDER']]]['label'] = (isset($elementNode['LABEL']) ? (string) $elementNode['LABEL'] : '');
1531
                    $this->physicalStructureInfo[$elements[(int) $elementNode['ORDER']]]['orderlabel'] = (isset($elementNode['ORDERLABEL']) ? (string) $elementNode['ORDERLABEL'] : '');
1532
                    $this->physicalStructureInfo[$elements[(int) $elementNode['ORDER']]]['type'] = (string) $elementNode['TYPE'];
1533
                    $this->physicalStructureInfo[$elements[(int) $elementNode['ORDER']]]['contentIds'] = (isset($elementNode['CONTENTIDS']) ? (string) $elementNode['CONTENTIDS'] : '');
1534
                    // Get the file representations from fileSec node.
1535
                    foreach ($elementNode->children('http://www.loc.gov/METS/')->fptr as $fptr) {
1536
                        // Check if file has valid @USE attribute.
1537
                        if (!empty($fileUse[(string) $fptr->attributes()->FILEID])) {
1538
                            $this->physicalStructureInfo[$elements[(int) $elementNode['ORDER']]]['files'][$fileUse[(string) $fptr->attributes()->FILEID]] = (string) $fptr->attributes()->FILEID;
1539
                        }
1540
                    }
1541
                }
1542
                // Sort array by keys (= @ORDER).
1543
                if (ksort($elements)) {
1544
                    // Set total number of pages/tracks.
1545
                    $this->numPages = count($elements);
1546
                    // Merge and re-index the array to get nice numeric indexes.
1547
                    $this->physicalStructure = array_merge($physSeq, $elements);
1548
                }
1549
            }
1550
            $this->physicalStructureLoaded = TRUE;
1551
        }
1552
        return $this->physicalStructure;
1553
    }
1554
1555
    /**
1556
     * This gives an array of the document's physical structure metadata
1557
     *
1558
     * @access protected
1559
     *
1560
     * @return array Array of elements' type, label and file representations indexed by @ID attribute
1561
     */
1562
    protected function _getPhysicalStructureInfo() {
1563
        // Is there no physical structure array yet?
1564
        if (!$this->physicalStructureLoaded) {
1565
            // Build physical structure array.
1566
            $this->_getPhysicalStructure();
1567
        }
1568
        return $this->physicalStructureInfo;
1569
    }
1570
1571
    /**
1572
     * This returns $this->pid via __get()
1573
     *
1574
     * @access protected
1575
     *
1576
     * @return integer The PID of the document or zero if not in database
1577
     */
1578
    protected function _getPid() {
1579
        return $this->pid;
1580
    }
1581
1582
    /**
1583
     * This returns $this->ready via __get()
1584
     *
1585
     * @access protected
1586
     *
1587
     * @return boolean Is the document instantiated successfully?
1588
     */
1589
    protected function _getReady() {
1590
        return $this->ready;
1591
    }
1592
1593
    /**
1594
     * This returns $this->recordId via __get()
1595
     *
1596
     * @access protected
1597
     *
1598
     * @return mixed The METS file's record identifier
1599
     */
1600
    protected function _getRecordId() {
1601
        return $this->recordId;
1602
    }
1603
1604
    /**
1605
     * This returns $this->rootId via __get()
1606
     *
1607
     * @access protected
1608
     *
1609
     * @return integer The UID of the root document or zero if not applicable
1610
     */
1611
    protected function _getRootId() {
1612
        if (!$this->rootIdLoaded) {
1613
            if ($this->parentId) {
1614
                $parent = self::getInstance($this->parentId, $this->pid);
1615
                $this->rootId = $parent->rootId;
1616
            }
1617
            $this->rootIdLoaded = TRUE;
1618
        }
1619
        return $this->rootId;
1620
    }
1621
1622
    /**
1623
     * This returns the smLinks between logical and physical structMap
1624
     *
1625
     * @access protected
1626
     *
1627
     * @return array The links between logical and physical nodes
1628
     */
1629
    protected function _getSmLinks() {
1630
        if (!$this->smLinksLoaded) {
1631
            $smLinks = $this->mets->xpath('./mets:structLink/mets:smLink');
1632
            foreach ($smLinks as $smLink) {
1633
                $this->smLinks['l2p'][(string) $smLink->attributes('http://www.w3.org/1999/xlink')->from][] = (string) $smLink->attributes('http://www.w3.org/1999/xlink')->to;
1634
                $this->smLinks['p2l'][(string) $smLink->attributes('http://www.w3.org/1999/xlink')->to][] = (string) $smLink->attributes('http://www.w3.org/1999/xlink')->from;
1635
            }
1636
            $this->smLinksLoaded = TRUE;
1637
        }
1638
        return $this->smLinks;
1639
    }
1640
1641
    /**
1642
     * This builds an array of the document's logical structure
1643
     *
1644
     * @access protected
1645
     *
1646
     * @return array Array of structure nodes' id, label, type and physical page indexes/mptr link with original hierarchy preserved
1647
     */
1648
    protected function _getTableOfContents() {
1649
        // Is there no logical structure array yet?
1650
        if (!$this->tableOfContentsLoaded) {
1651
            // Get all logical structures.
1652
            $this->getLogicalStructure('', TRUE);
1653
            $this->tableOfContentsLoaded = TRUE;
1654
        }
1655
        return $this->tableOfContents;
1656
    }
1657
1658
    /**
1659
     * This returns the document's thumbnail location
1660
     *
1661
     * @access protected
1662
     *
1663
     * @param boolean $forceReload: Force reloading the thumbnail instead of returning the cached value
1664
     *
1665
     * @return string The document's thumbnail location
1666
     */
1667
    protected function _getThumbnail($forceReload = FALSE) {
1668
        if (!$this->thumbnailLoaded
1669
            || $forceReload) {
1670
            // Retain current PID.
1671
            $cPid = ($this->cPid ? $this->cPid : $this->pid);
1672
            if (!$cPid) {
1673
                Helper::devLog('Invalid PID '.$cPid.' for structure definitions', DEVLOG_SEVERITY_ERROR);
1674
                $this->thumbnailLoaded = TRUE;
1675
                return $this->thumbnail;
1676
            }
1677
            // Load extension configuration.
1678
            $extConf = unserialize($GLOBALS['TYPO3_CONF_VARS']['EXT']['extConf'][self::$extKey]);
1679
            if (empty($extConf['fileGrpThumbs'])) {
1680
                Helper::devLog('No fileGrp for thumbnails specified', DEVLOG_SEVERITY_WARNING);
1681
                $this->thumbnailLoaded = TRUE;
1682
                return $this->thumbnail;
1683
            }
1684
            $strctId = $this->_getToplevelId();
1685
            $metadata = $this->getTitledata($cPid);
1686
            // Get structure element to get thumbnail from.
1687
            $result = $GLOBALS['TYPO3_DB']->exec_SELECTquery(
1688
                'tx_dlf_structures.thumbnail AS thumbnail',
1689
                'tx_dlf_structures',
1690
                'tx_dlf_structures.pid='.intval($cPid)
1691
                    .' AND tx_dlf_structures.index_name='.$GLOBALS['TYPO3_DB']->fullQuoteStr($metadata['type'][0], 'tx_dlf_structures')
1692
                    .Helper::whereClause('tx_dlf_structures'),
1693
                '',
1694
                '',
1695
                '1'
1696
            );
1697
            if ($GLOBALS['TYPO3_DB']->sql_num_rows($result) > 0) {
1698
                $resArray = $GLOBALS['TYPO3_DB']->sql_fetch_assoc($result);
1699
                // Get desired thumbnail structure if not the toplevel structure itself.
1700
                if (!empty($resArray['thumbnail'])) {
1701
                    $strctType = Helper::getIndexNameFromUid($resArray['thumbnail'], 'tx_dlf_structures', $cPid);
1702
                    // Check if this document has a structure element of the desired type.
1703
                    $strctIds = $this->mets->xpath('./mets:structMap[@TYPE="LOGICAL"]//mets:div[@TYPE="'.$strctType.'"]/@ID');
1704
                    if (!empty($strctIds)) {
1705
                        $strctId = (string) $strctIds[0];
1706
                    }
1707
                }
1708
                // Load smLinks.
1709
                $this->_getSmLinks();
1710
                // Get thumbnail location.
1711
                if ($this->_getPhysicalStructure()
1712
                    && !empty($this->smLinks['l2p'][$strctId])) {
1713
                    $this->thumbnail = $this->getFileLocation($this->physicalStructureInfo[$this->smLinks['l2p'][$strctId][0]]['files'][$extConf['fileGrpThumbs']]);
1714
                } else {
1715
                    $this->thumbnail = $this->getFileLocation($this->physicalStructureInfo[$this->physicalStructure[1]]['files'][$extConf['fileGrpThumbs']]);
1716
                }
1717
            } else {
1718
                Helper::devLog('No structure of type "'.$metadata['type'][0].'" found in database', DEVLOG_SEVERITY_ERROR);
1719
            }
1720
            $this->thumbnailLoaded = TRUE;
1721
        }
1722
        return $this->thumbnail;
1723
    }
1724
1725
    /**
1726
     * This returns the ID of the toplevel logical structure node
1727
     *
1728
     * @access protected
1729
     *
1730
     * @return string The logical structure node's ID
1731
     */
1732
    protected function _getToplevelId() {
1733
        if (empty($this->toplevelId)) {
1734
            // Get all logical structure nodes with metadata, but without associated METS-Pointers.
1735
            if (($divs = $this->mets->xpath('./mets:structMap[@TYPE="LOGICAL"]//mets:div[@DMDID and not(./mets:mptr)]'))) {
1736
                // Load smLinks.
1737
                $this->_getSmLinks();
1738
                foreach ($divs as $div) {
1739
                    $id = (string) $div['ID'];
1740
                    // Are there physical structure nodes for this logical structure?
1741
                    if (array_key_exists($id, $this->smLinks['l2p'])) {
1742
                        // Yes. That's what we're looking for.
1743
                        $this->toplevelId = $id;
1744
                        break;
1745
                    } elseif (empty($this->toplevelId)) {
1746
                        // No. Remember this anyway, but keep looking for a better one.
1747
                        $this->toplevelId = $id;
1748
                    }
1749
                }
1750
            }
1751
        }
1752
        return $this->toplevelId;
1753
    }
1754
1755
    /**
1756
     * This returns $this->uid via __get()
1757
     *
1758
     * @access protected
1759
     *
1760
     * @return mixed The UID or the URL of the document
1761
     */
1762
    protected function _getUid() {
1763
        return $this->uid;
1764
    }
1765
1766
    /**
1767
     * This sets $this->cPid via __set()
1768
     *
1769
     * @access protected
1770
     *
1771
     * @param integer $value: The new PID for the metadata definitions
1772
     *
1773
     * @return void
1774
     */
1775
    protected function _setCPid($value) {
1776
        $this->cPid = max(intval($value), 0);
1777
    }
1778
1779
    /**
1780
     * This magic method is invoked each time the object is cloned
1781
     * (This method is defined as private/protected because singleton objects should not be cloned)
1782
     *
1783
     * @access protected
1784
     *
1785
     * @return void
1786
     */
1787
    protected function __clone() {}
1788
1789
    /**
1790
     * This is a singleton class, thus the constructor should be private/protected
1791
     * (Get an instance of this class by calling \Kitodo\Dlf\Common\Document::getInstance())
1792
     *
1793
     * @access protected
1794
     *
1795
     * @param mixed $uid: The UID of the document to parse or the URL of the XML file
1796
     * @param integer $pid: If > 0, then only document with this PID gets loaded
1797
     *
1798
     * @return void
1799
     */
1800
    protected function __construct($uid, $pid) {
1801
        // Prepare to check database for the requested document.
1802
        if (\TYPO3\CMS\Core\Utility\MathUtility::canBeInterpretedAsInteger($uid)) {
1803
            $whereClause = 'tx_dlf_documents.uid='.intval($uid).Helper::whereClause('tx_dlf_documents');
1804
        } else {
1805
            // Try to load METS file.
1806
            if (\TYPO3\CMS\Core\Utility\GeneralUtility::isValidUrl($uid)
1807
                && $this->load($uid)) {
1808
                // Initialize core METS object.
1809
                $this->init();
1810
                if ($this->mets !== NULL) {
1811
                    // Cast to string for safety reasons.
1812
                    $location = (string) $uid;
1813
                    // Check for METS object @OBJID.
1814
                    if (!empty($this->mets['OBJID'])) {
1815
                        $this->recordId = (string) $this->mets['OBJID'];
1816
                    }
1817
                    // Get hook objects.
1818
                    $hookObjects = Helper::getHookObjects('Classes/Common/Document.php');
1819
                    // Apply hooks.
1820
                    foreach ($hookObjects as $hookObj) {
1821
                        if (method_exists($hookObj, 'construct_postProcessRecordId')) {
1822
                            $hookObj->construct_postProcessRecordId($this->xml, $this->recordId);
1823
                        }
1824
                    }
1825
                } else {
1826
                    // No METS part found.
1827
                    return;
1828
                }
1829
            } else {
1830
                // Loading failed.
1831
                return;
1832
            }
1833
            if (!empty($location)
1834
                && !empty($this->recordId)) {
1835
                // Try to match record identifier or location (both should be unique).
1836
                $whereClause = '(tx_dlf_documents.location='.$GLOBALS['TYPO3_DB']->fullQuoteStr($location, 'tx_dlf_documents').' OR tx_dlf_documents.record_id='.$GLOBALS['TYPO3_DB']->fullQuoteStr($this->recordId, 'tx_dlf_documents').')'.Helper::whereClause('tx_dlf_documents');
1837
            } else {
1838
                // Can't persistently identify document, don't try to match at all.
1839
                $whereClause = '1=-1';
1840
            }
1841
        }
1842
        // Check for PID if needed.
1843
        if ($pid) {
1844
            $whereClause .= ' AND tx_dlf_documents.pid='.intval($pid);
1845
        }
1846
        // Get document PID and location from database.
1847
        $result = $GLOBALS['TYPO3_DB']->exec_SELECTquery(
1848
            'tx_dlf_documents.uid AS uid,tx_dlf_documents.pid AS pid,tx_dlf_documents.record_id AS record_id,tx_dlf_documents.partof AS partof,tx_dlf_documents.thumbnail AS thumbnail,tx_dlf_documents.location AS location',
1849
            'tx_dlf_documents',
1850
            $whereClause,
1851
            '',
1852
            '',
1853
            '1'
1854
        );
1855
        if ($GLOBALS['TYPO3_DB']->sql_num_rows($result) > 0) {
1856
            list ($this->uid, $this->pid, $this->recordId, $this->parentId, $this->thumbnail, $this->location) = $GLOBALS['TYPO3_DB']->sql_fetch_row($result);
1857
            $this->thumbnailLoaded = TRUE;
1858
            // Load XML file if necessary...
1859
            if ($this->mets === NULL
1860
                && $this->load($this->location)) {
1861
                // ...and set some basic properties.
1862
                $this->init();
1863
            }
1864
            // Do we have a METS object now?
1865
            if ($this->mets !== NULL) {
1866
                // Set new location if necessary.
1867
                if (!empty($location)) {
1868
                    $this->location = $location;
1869
                }
1870
                // Document ready!
1871
                $this->ready = TRUE;
1872
            }
1873
        } elseif ($this->mets !== NULL) {
1874
            // Set location as UID for documents not in database.
1875
            $this->uid = $location;
Issue (Comprehensibility, Best Practice): The variable $location does not seem to be defined for all execution paths leading up to this point.
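In the listing, `$this->mets` appears to be set only on the URL path, so this line may never actually run with an undefined `$location`. Even so, defining the variable before the first branch makes that explicit and should satisfy the analyzer. The sketch below reduces the constructor's control flow to a hypothetical helper, `resolveLocation()`, and uses `is_numeric()` as a simplified stand-in for the TYPO3 integer check.

```php
<?php
// Hypothetical reduction of the constructor's control flow: $uid is either
// a numeric database UID or a URL to an XML file.
function resolveLocation($uid)
{
    // Define the variable on every path before any branch can skip it.
    $location = '';
    if (!is_numeric($uid)) {
        // Cast to string, as the listing does for the URL case.
        $location = (string) $uid;
    }
    return $location;
}

var_dump(resolveLocation(42));
var_dump(resolveLocation('https://example.org/mets.xml'));
```

An empty string keeps the existing `!empty($location)` checks working unchanged.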
1876
            $this->location = $location;
1877
            // Document ready!
1878
            $this->ready = TRUE;
1879
        } else {
1880
            Helper::devLog('No document with UID '.$uid.' found or document not accessible', DEVLOG_SEVERITY_ERROR);
1881
        }
1882
    }
1883
1884
    /**
1885
     * This magic method is called each time an inaccessible property of the object is read
1886
     *
1887
     * @access public
1888
     *
1889
     * @param string $var: Name of variable to get
1890
     *
1891
     * @return mixed Value of $this->$var
1892
     */
1893
    public function __get($var) {
1894
        $method = '_get'.ucfirst($var);
1895
        if (!property_exists($this, $var)
1896
            || !method_exists($this, $method)) {
1897
            Helper::devLog('There is no getter function for property "'.$var.'"', DEVLOG_SEVERITY_WARNING);
1898
            return;
1899
        } else {
1900
            return $this->$method();
1901
        }
1902
    }
1903
1904
    /**
1905
     * This magic method is called each time a value is assigned to an inaccessible property of the object
1906
     *
1907
     * @access public
1908
     *
1909
     * @param string $var: Name of variable to set
1910
     * @param mixed $value: New value of variable
1911
     *
1912
     * @return void
1913
     */
1914
    public function __set($var, $value) {
1915
        $method = '_set'.ucfirst($var);
1916
        if (!property_exists($this, $var)
1917
            || !method_exists($this, $method)) {
1918
            Helper::devLog('There is no setter function for property "'.$var.'"', DEVLOG_SEVERITY_WARNING);
1919
        } else {
1920
            $this->$method($value);
1921
        }
1922
    }
1923
1924
    /**
1925
     * This magic method is executed prior to any serialization of the object
1926
     * @see __wakeup()
1927
     *
1928
     * @access public
1929
     *
1930
     * @return array Properties to be serialized
1931
     */
1932
    public function __sleep() {
1933
        // \SimpleXMLElement objects can't be serialized, thus save the XML as string for serialization
1934
        $this->asXML = $this->xml->asXML();
1935
        return ['uid', 'pid', 'recordId', 'parentId', 'asXML'];
1936
    }
1937
1938
    /**
1939
     * This magic method is called whenever the object is converted to a string
1940
     *
1941
     * @access public
1942
     *
1943
     * @return string String representing the METS object
1944
     */
1945
    public function __toString() {
1946
        $xml = new \DOMDocument('1.0', 'utf-8');
1947
        $xml->appendChild($xml->importNode(dom_import_simplexml($this->mets), TRUE));
1948
        $xml->formatOutput = TRUE;
1949
        return $xml->saveXML();
1950
    }
1951
1952
    /**
1953
     * This magic method is executed after the object is deserialized
1954
     * @see __sleep()
1955
     *
1956
     * @access public
1957
     *
1958
     * @return void
1959
     */
1960
    public function __wakeup() {
1961
        // Let libxml collect errors internally instead of emitting warnings.
1962
        $libxmlErrors = libxml_use_internal_errors(TRUE);
1963
        // Reload XML from string.
1964
        $xml = @simplexml_load_string($this->asXML);
1965
        // Restore libxml's previous error handling setting.
1966
        libxml_use_internal_errors($libxmlErrors);
1967
        if ($xml !== FALSE) {
1968
            $this->asXML = '';
1969
            $this->xml = $xml;
1970
            // Rebuild the unserializable properties.
1971
            $this->init();
1972
        } else {
1973
            Helper::devLog('Could not load XML after deserialization', DEVLOG_SEVERITY_ERROR);
1974
        }
1975
    }
1976
}
1977
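The `__sleep()` / `__wakeup()` pair at the end of the class works around the fact that `\SimpleXMLElement` objects cannot be serialized: the XML is stored as a string before serialization and parsed again afterwards. The sketch below isolates that pattern in a hypothetical minimal class, `XmlHolder`, which is not part of the extension and omits the `init()` and logging calls of the original.

```php
<?php
// Hypothetical minimal class; only the serialization pattern mirrors the listing.
class XmlHolder
{
    public $xml;
    public $asXML = '';

    public function __construct($source)
    {
        $this->xml = new SimpleXMLElement($source);
    }

    public function __sleep()
    {
        // SimpleXMLElement cannot be serialized, so keep a string copy.
        $this->asXML = $this->xml->asXML();
        // Only the string copy is written to the serialized form.
        return ['asXML'];
    }

    public function __wakeup()
    {
        // Collect libxml errors internally instead of emitting warnings.
        $previous = libxml_use_internal_errors(true);
        $xml = simplexml_load_string($this->asXML);
        // Restore the previous libxml setting.
        libxml_use_internal_errors($previous);
        if ($xml !== false) {
            $this->asXML = '';
            $this->xml = $xml;
        }
    }
}

$copy = unserialize(serialize(new XmlHolder('<mets OBJID="demo"/>')));
echo (string) $copy->xml['OBJID'], "\n";
```

After the round trip the copy holds a freshly parsed XML object, and the temporary string is cleared again.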