Passed
Pull Request — master (#711)
by Beatrycze
02:45
created

Indexer::getUpdateQuery()   A

Complexity

Conditions 1
Paths 1

Size

Total Lines 14
Code Lines 12

Duplication

Lines 0
Ratio 0 %

Importance

Changes 1
Bugs 0 Features 0
Metric Value
eloc 12
c 1
b 0
f 0
dl 0
loc 14
rs 9.8666
cc 1
nc 1
nop 3
<?php

/**
 * (c) Kitodo. Key to digital objects e.V. <[email protected]>
 *
 * This file is part of the Kitodo and TYPO3 projects.
 *
 * @license GNU General Public License version 3 or later.
 * For the full copyright and license information, please read the
 * LICENSE.txt file that was distributed with this source code.
 */

namespace Kitodo\Dlf\Common;

use TYPO3\CMS\Core\Configuration\ExtensionConfiguration;
use TYPO3\CMS\Core\Database\ConnectionPool;
use TYPO3\CMS\Core\Messaging\FlashMessage;
use TYPO3\CMS\Core\Utility\GeneralUtility;
use TYPO3\CMS\Core\Utility\MathUtility;
use Ubl\Iiif\Presentation\Common\Model\Resources\AnnotationContainerInterface;
use Ubl\Iiif\Tools\IiifHelper;

/**
 * Indexer class for the 'dlf' extension
 *
 * @author Sebastian Meyer <[email protected]>
 * @package TYPO3
 * @subpackage dlf
 * @access public
 */
class Indexer
{
    /**
     * The extension key
     *
     * @var string
     * @access public
     */
    public static $extKey = 'dlf';

    /**
     * Array of metadata fields' configuration
     * @see loadIndexConf()
     *
     * @var array
     * @access protected
     */
    protected static $fields = [
        'autocomplete' => [],
        'facets' => [],
        'sortables' => [],
        'indexed' => [],
        'stored' => [],
        'tokenized' => [],
        'fieldboost' => []
    ];

    /**
     * Is the index configuration loaded?
     * @see $fields
     *
     * @var bool
     * @access protected
     */
    protected static $fieldsLoaded = false;

    /**
     * List of already processed documents
     *
     * @var array
     * @access protected
     */
    protected static $processedDocs = [];

    /**
     * Instance of \Kitodo\Dlf\Common\Solr class
     *
     * @var \Kitodo\Dlf\Common\Solr
     * @access protected
     */
    protected static $solr;

    /**
     * Insert given document into Solr index
     *
     * @access public
     *
     * @param \Kitodo\Dlf\Common\Document &$doc: The document to add
     * @param int $core: UID of the Solr core to use
     *
     * @return bool true on success or false on failure
     */
    public static function add(Document &$doc, $core = 0)
    {
        if (in_array($doc->uid, self::$processedDocs)) {
            return true;
        } elseif (self::solrConnect($core, $doc->pid)) {
            $success = true;
            // Handle multi-volume documents.
            if ($doc->parentId) {
                $parent = Document::getInstance($doc->parentId, 0, true);
                if ($parent->ready) {
                    $success = self::add($parent, $core);
                } else {
                    Helper::log('Could not load parent document with UID ' . $doc->parentId, LOG_SEVERITY_ERROR);
                    return false;
                }
            }
            try {
                // Add document to list of processed documents.
                self::$processedDocs[] = $doc->uid;
                // Delete old Solr documents.
                $updateQuery = self::$solr->service->createUpdate();
                $updateQuery->addDeleteQuery('uid:' . $doc->uid);
                self::$solr->service->update($updateQuery);

                // Index every logical unit as separate Solr document.
                foreach ($doc->tableOfContents as $logicalUnit) {
                    if ($success) {
                        $success = self::processLogical($doc, $logicalUnit);
                    } else {
                        break;
                    }
                }
                // Index full text files if available.
                if ($doc->hasFulltext) {
                    foreach ($doc->physicalStructure as $pageNumber => $xmlId) {
                        if ($success) {
                            $success = self::processPhysical($doc, $pageNumber, $doc->physicalStructureInfo[$xmlId]);
                        } else {
                            break;
                        }
                    }
                }
                // Commit all changes.
                $updateQuery = self::$solr->service->createUpdate();
                $updateQuery->addCommit();
                self::$solr->service->update($updateQuery);

                $queryBuilder = GeneralUtility::makeInstance(ConnectionPool::class)
                    ->getQueryBuilderForTable('tx_dlf_documents');

                // Get document title from database.
                $result = $queryBuilder
                    ->select('tx_dlf_documents.title AS title')
                    ->from('tx_dlf_documents')
                    ->where(
                        $queryBuilder->expr()->eq('tx_dlf_documents.uid', intval($doc->uid)),
                        Helper::whereExpression('tx_dlf_documents')
                    )
                    ->setMaxResults(1)
                    ->execute();

                $allResults = $result->fetchAll();
                $resArray = $allResults[0];
                if (!(\TYPO3_REQUESTTYPE & \TYPO3_REQUESTTYPE_CLI)) {
                    if ($success) {
                        Helper::addMessage(
                            htmlspecialchars(sprintf(Helper::getMessage('flash.documentIndexed'), $resArray['title'], $doc->uid)),
                            Helper::getMessage('flash.done', true),
                            FlashMessage::OK,
                            true,
                            'core.template.flashMessages'
                        );
                    } else {
                        Helper::addMessage(
                            htmlspecialchars(sprintf(Helper::getMessage('flash.documentNotIndexed'), $resArray['title'], $doc->uid)),
                            Helper::getMessage('flash.error', true),
                            FlashMessage::ERROR,
                            true,
                            'core.template.flashMessages'
                        );
                    }
                }
                return $success;
            } catch (\Exception $e) {
                if (!(\TYPO3_REQUESTTYPE & \TYPO3_REQUESTTYPE_CLI)) {
                    Helper::addMessage(
                        Helper::getMessage('flash.solrException', true) . '<br />' . htmlspecialchars($e->getMessage()),
                        Helper::getMessage('flash.error', true),
                        FlashMessage::ERROR,
                        true,
                        'core.template.flashMessages'
                    );
                }
                Helper::log('Apache Solr threw exception: "' . $e->getMessage() . '"', LOG_SEVERITY_ERROR);
                return false;
            }
        } else {
            if (!(\TYPO3_REQUESTTYPE & \TYPO3_REQUESTTYPE_CLI)) {
                Helper::addMessage(
                    Helper::getMessage('flash.solrNoConnection', true),
                    Helper::getMessage('flash.warning', true),
                    FlashMessage::WARNING,
                    true,
                    'core.template.flashMessages'
                );
            }
            Helper::log('Could not connect to Apache Solr server', LOG_SEVERITY_ERROR);
            return false;
        }
    }

    /**
     * Returns the dynamic index field name for the given metadata field.
     *
     * @access public
     *
     * @param string $index_name: The metadata field's name in database
     * @param int $pid: UID of the configuration page
     *
     * @return string The field's dynamic index name
     */
    public static function getIndexFieldName($index_name, $pid = 0)
    {
        // Sanitize input.
        $pid = max(intval($pid), 0);
        if (!$pid) {
            Helper::log('Invalid PID ' . $pid . ' for metadata configuration', LOG_SEVERITY_ERROR);
            return '';
        }
        // Load metadata configuration.
        self::loadIndexConf($pid);
        // Build field's suffix.
        $suffix = (in_array($index_name, self::$fields['tokenized']) ? 't' : 'u');
        $suffix .= (in_array($index_name, self::$fields['stored']) ? 's' : 'u');
        $suffix .= (in_array($index_name, self::$fields['indexed']) ? 'i' : 'u');
        $index_name .= '_' . $suffix;
        return $index_name;
    }

    /**
     * Load indexing configuration
     *
     * @access protected
     *
     * @param int $pid: The configuration page's UID
     *
     * @return void
     */
    protected static function loadIndexConf($pid)
    {
        if (!self::$fieldsLoaded) {
            $queryBuilder = GeneralUtility::makeInstance(ConnectionPool::class)
                ->getQueryBuilderForTable('tx_dlf_metadata');

            // Get the metadata indexing options.
            $result = $queryBuilder
                ->select(
                    'tx_dlf_metadata.index_name AS index_name',
                    'tx_dlf_metadata.index_tokenized AS index_tokenized',
                    'tx_dlf_metadata.index_stored AS index_stored',
                    'tx_dlf_metadata.index_indexed AS index_indexed',
                    'tx_dlf_metadata.is_sortable AS is_sortable',
                    'tx_dlf_metadata.is_facet AS is_facet',
                    'tx_dlf_metadata.is_listed AS is_listed',
                    'tx_dlf_metadata.index_autocomplete AS index_autocomplete',
                    'tx_dlf_metadata.index_boost AS index_boost'
                )
                ->from('tx_dlf_metadata')
                ->where(
                    $queryBuilder->expr()->eq('tx_dlf_metadata.pid', intval($pid)),
                    Helper::whereExpression('tx_dlf_metadata')
                )
                ->execute();

            while ($indexing = $result->fetch()) {
                if ($indexing['index_tokenized']) {
                    self::$fields['tokenized'][] = $indexing['index_name'];
                }
                if (
                    $indexing['index_stored']
                    || $indexing['is_listed']
                ) {
                    self::$fields['stored'][] = $indexing['index_name'];
                }
                if (
                    $indexing['index_indexed']
                    || $indexing['index_autocomplete']
                ) {
                    self::$fields['indexed'][] = $indexing['index_name'];
                }
                if ($indexing['is_sortable']) {
                    self::$fields['sortables'][] = $indexing['index_name'];
                }
                if ($indexing['is_facet']) {
                    self::$fields['facets'][] = $indexing['index_name'];
                }
                if ($indexing['index_autocomplete']) {
                    self::$fields['autocomplete'][] = $indexing['index_name'];
                }
                if ($indexing['index_boost'] > 0.0) {
                    self::$fields['fieldboost'][$indexing['index_name']] = floatval($indexing['index_boost']);
                } else {
                    self::$fields['fieldboost'][$indexing['index_name']] = false;
                }
            }
            self::$fieldsLoaded = true;
        }
    }

    /**
     * Processes a logical unit (and its children) for the Solr index
     *
     * @access protected
     *
     * @param \Kitodo\Dlf\Common\Document &$doc: The METS document
     * @param array $logicalUnit: Array of the logical unit to process
     *
     * @return bool true on success or false on failure
     */
    protected static function processLogical(Document &$doc, array $logicalUnit)
    {
        $success = true;
        // Get metadata for logical unit.
        $metadata = $doc->metadataArray[$logicalUnit['id']];
        if (!empty($metadata)) {
            $metadata['author'] = self::removeAppendsFromAuthor($metadata['author']);
            // Replace owner UID with index_name.
            if (
                !empty($metadata['owner'][0])
                && MathUtility::canBeInterpretedAsInteger($metadata['owner'][0])
            ) {
                $metadata['owner'][0] = Helper::getIndexNameFromUid($metadata['owner'][0], 'tx_dlf_libraries', $doc->cPid);
            }
            // Create new Solr document.
            $updateQuery = self::getUpdateQuery($doc, $logicalUnit);
            if (MathUtility::canBeInterpretedAsInteger($logicalUnit['points'])) {
                $solrDoc->setField('page', $logicalUnit['points']);
                // Scrutinizer issue (Comprehensibility, Best Practice): The variable $solrDoc seems to be never defined.
            }
            if ($logicalUnit['id'] == $doc->toplevelId) {
                $solrDoc->setField('thumbnail', $doc->thumbnail);
            } elseif (!empty($logicalUnit['thumbnailId'])) {
                $solrDoc->setField('thumbnail', $doc->getFileLocation($logicalUnit['thumbnailId']));
            }
            // There can be only one toplevel unit per UID, independently of backend configuration
            $solrDoc->setField('toplevel', $logicalUnit['id'] == $doc->toplevelId ? true : false);
            $solrDoc->setField('title', $metadata['title'][0], self::$fields['fieldboost']['title']);
            $solrDoc->setField('volume', $metadata['volume'][0], self::$fields['fieldboost']['volume']);
            $solrDoc->setField('record_id', $metadata['record_id'][0]);
            $solrDoc->setField('purl', $metadata['purl'][0]);
            $solrDoc->setField('location', $doc->location);
            $solrDoc->setField('urn', $metadata['urn']);
            $solrDoc->setField('license', $metadata['license']);
            $solrDoc->setField('terms', $metadata['terms']);
            $solrDoc->setField('restrictions', $metadata['restrictions']);
            $coordinates = json_decode($metadata['coordinates'][0]);
            if (is_object($coordinates)) {
                $solrDoc->setField('geom', json_encode($coordinates->features[0]));
            }
            $autocomplete = [];
            foreach ($metadata as $index_name => $data) {
                if (
                    !empty($data)
                    && substr($index_name, -8) !== '_sorting'
                ) {
                    $solrDoc->setField(self::getIndexFieldName($index_name, $doc->pid), $data, self::$fields['fieldboost'][$index_name]);
                    if (in_array($index_name, self::$fields['sortables'])) {
                        // Add sortable fields to index.
                        $solrDoc->setField($index_name . '_sorting', $metadata[$index_name . '_sorting'][0]);
                    }
                    if (in_array($index_name, self::$fields['facets'])) {
                        // Add facets to index.
                        $solrDoc->setField($index_name . '_faceting', $data);
                    }
                    if (in_array($index_name, self::$fields['autocomplete'])) {
                        $autocomplete = array_merge($autocomplete, $data);
                    }
                }
            }
            // Add autocomplete values to index.
            if (!empty($autocomplete)) {
                $solrDoc->setField('autocomplete', $autocomplete);
            }
            // Add collection information to logical sub-elements if applicable.
            if (
                in_array('collection', self::$fields['facets'])
                && empty($metadata['collection'])
                && !empty($doc->metadataArray[$doc->toplevelId]['collection'])
            ) {
                $solrDoc->setField('collection_faceting', $doc->metadataArray[$doc->toplevelId]['collection']);
            }
            try {
                $updateQuery->addDocument($solrDoc);
                self::$solr->service->update($updateQuery);
            } catch (\Exception $e) {
                if (!(\TYPO3_REQUESTTYPE & \TYPO3_REQUESTTYPE_CLI)) {
                    Helper::addMessage(
                        Helper::getMessage('flash.solrException', true) . '<br />' . htmlspecialchars($e->getMessage()),
                        Helper::getMessage('flash.error', true),
                        FlashMessage::ERROR,
                        true,
                        'core.template.flashMessages'
                    );
                }
                Helper::log('Apache Solr threw exception: "' . $e->getMessage() . '"', LOG_SEVERITY_ERROR);
                return false;
            }
        }
        // Check for child elements...
        if (!empty($logicalUnit['children'])) {
            foreach ($logicalUnit['children'] as $child) {
                if ($success) {
                    // ...and process them, too.
                    $success = self::processLogical($doc, $child);
                } else {
                    break;
                }
            }
        }
        return $success;
    }

    /**
     * Processes a physical unit for the Solr index
     *
     * @access protected
     *
     * @param \Kitodo\Dlf\Common\Document &$doc: The METS document
     * @param int $page: The page number
     * @param array $physicalUnit: Array of the physical unit to process
     *
     * @return bool true on success or false on failure
     */
    protected static function processPhysical(Document &$doc, $page, array $physicalUnit)
    {
        if ($doc->hasFulltext && $fullText = $doc->getFullText($physicalUnit['id'])) {
            // Read extension configuration.
            $extConf = GeneralUtility::makeInstance(ExtensionConfiguration::class)->get(self::$extKey);
            // Create new Solr document.
            $updateQuery = self::getUpdateQuery($doc, $physicalUnit, $fullText);
            $solrDoc->setField('page', $page);
            // Scrutinizer issue (Comprehensibility, Best Practice): The variable $solrDoc seems to be never defined.
            $fileGrpsThumb = GeneralUtility::trimExplode(',', $extConf['fileGrpThumbs']);
            while ($fileGrpThumb = array_shift($fileGrpsThumb)) {
                if (!empty($physicalUnit['files'][$fileGrpThumb])) {
                    $solrDoc->setField('thumbnail', $doc->getFileLocation($physicalUnit['files'][$fileGrpThumb]));
                    break;
                }
            }
            $solrDoc->setField('toplevel', false);
            // Add faceting information to physical sub-elements if applicable.
            foreach ($doc->metadataArray[$doc->toplevelId] as $index_name => $data) {
                if (
                    !empty($data)
                    && substr($index_name, -8) !== '_sorting'
                ) {

                    if (in_array($index_name, self::$fields['facets'])) {
                        if ($index_name == 'author') {
                            $data = self::removeAppendsFromAuthor($data);
                        }
                        // Add facets to index.
                        $solrDoc->setField($index_name . '_faceting', $data);
                    }
                }
            }
            // Add collection information to physical sub-elements if applicable.
            if (
                in_array('collection', self::$fields['facets'])
                && !empty($doc->metadataArray[$doc->toplevelId]['collection'])
            ) {
                $solrDoc->setField('collection_faceting', $doc->metadataArray[$doc->toplevelId]['collection']);
            }
            try {
                $updateQuery->addDocument($solrDoc);
                self::$solr->service->update($updateQuery);
            } catch (\Exception $e) {
                if (!(\TYPO3_REQUESTTYPE & \TYPO3_REQUESTTYPE_CLI)) {
                    Helper::addMessage(
                        Helper::getMessage('flash.solrException', true) . '<br />' . htmlspecialchars($e->getMessage()),
                        Helper::getMessage('flash.error', true),
                        FlashMessage::ERROR,
                        true,
                        'core.template.flashMessages'
                    );
                }
                Helper::log('Apache Solr threw exception: "' . $e->getMessage() . '"', LOG_SEVERITY_ERROR);
                return false;
            }
        }
        return true;
    }

    /**
     * Connects to Solr server.
     *
     * @access protected
     *
     * @param int $core: UID of the Solr core
     * @param int $pid: UID of the configuration page
     *
     * @return bool true on success or false on failure
     */
    protected static function solrConnect($core, $pid = 0)
    {
        // Get Solr instance.
        if (!self::$solr) {
            // Connect to Solr server.
            $solr = Solr::getInstance($core);
            if ($solr->ready) {
                self::$solr = $solr;
                // Load indexing configuration if needed.
                if ($pid) {
                    self::loadIndexConf($pid);
                }
            } else {
                return false;
            }
        }
        return true;
    }

    /**
     * Get update query with set standard fields (identical for logical and physical unit)
     *
     * @access private
     *
     * @param \Kitodo\Dlf\Common\Document &$doc: The METS document
     * @param array $unit: Array of the logical or physical unit to process
     * @param string $fullText: Text containing full text for indexing
     *
     * @return \Solarium\Core\Query\AbstractQuery|\Solarium\QueryType\Update\Query\Query
     */
    private static function getUpdateQuery($doc, $unit, $fullText = '') {
        $updateQuery = self::$solr->service->createUpdate();
        $solrDoc = $updateQuery->createDocument();
        // Create unique identifier from document's UID and unit's XML ID.
        $solrDoc->setField('id', $doc->uid . $unit['id']);
        // Scrutinizer issue (Bug): The method setField() does not exist on
        // Solarium\Core\Query\DocumentInterface. It seems like you code against a sub-type of
        // Solarium\Core\Query\DocumentInterface such as Solarium\Plugin\MinimumScoreFilter\Document
        // or Solarium\QueryType\Update\Query\Document.
        // If this is a false positive, the issue can be ignored in code by placing the
        // annotation "/** @scrutinizer ignore-call */" between "$solrDoc->" and "setField(".
        $solrDoc->setField('uid', $doc->uid);
        $solrDoc->setField('pid', $doc->pid);
        $solrDoc->setField('partof', $doc->parentId);
        $solrDoc->setField('root', $doc->rootId);
        $solrDoc->setField('sid', $unit['id']);
        $solrDoc->setField('type', $unit['type'], self::$fields['fieldboost']['type']);
        $solrDoc->setField('collection', $doc->metadataArray[$doc->toplevelId]['collection']);
        $solrDoc->setField('fulltext', $fullText);
        return $updateQuery;
    }

    /**
     * Remove appended "valueURI" from authors' names for indexing.
     *
     * @access private
     *
     * @param array|string $authors: Array or string containing author/authors
     *
     * @return array|string
     */
    private static function removeAppendsFromAuthor($authors) {
        if (is_array($authors)) {
            foreach ($authors as $i => $author) {
                $splitName = explode(chr(31), $author);
                $authors[$i] = $splitName[0];
            }
        }
        return $authors;
    }

    /**
     * Prevent instantiation by hiding the constructor
     *
     * @access private
     */
    private function __construct()
    {
        // This is a static class, thus no instances should be created.
    }
}
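
The inspection reports that `$solrDoc` is never defined in `processLogical()` and `processPhysical()`: `getUpdateQuery()` creates the Solr document internally but returns only the update query. One possible way to address this is to return both objects from the helper. The sketch below illustrates the idea; `StubDocument`, `StubUpdateQuery` and `buildUpdate()` are hypothetical stand-ins invented for this example so that it runs without Solarium or TYPO3, and they are not part of the reviewed code or of the Solarium API.

```php
<?php

// Hypothetical stand-in for a Solr update document (not the Solarium class).
final class StubDocument
{
    public $fields = [];

    public function setField($name, $value)
    {
        $this->fields[$name] = $value;
    }
}

// Hypothetical stand-in for a Solr update query (not the Solarium class).
final class StubUpdateQuery
{
    public $documents = [];

    public function createDocument()
    {
        return new StubDocument();
    }

    public function addDocument(StubDocument $doc)
    {
        $this->documents[] = $doc;
    }
}

// Assumed variant of getUpdateQuery(): sets the shared fields and returns
// the query together with the document, so callers can keep adding fields.
function buildUpdate(StubUpdateQuery $updateQuery, $uid, array $unit)
{
    $solrDoc = $updateQuery->createDocument();
    // Unique identifier from the document's UID and the unit's XML ID.
    $solrDoc->setField('id', $uid . $unit['id']);
    $solrDoc->setField('uid', $uid);
    $solrDoc->setField('sid', $unit['id']);
    return [$updateQuery, $solrDoc];
}

// Caller side: both variables are now defined before use.
list($updateQuery, $solrDoc) = buildUpdate(new StubUpdateQuery(), 42, ['id' => 'LOG_0001']);
$solrDoc->setField('page', 1);
$updateQuery->addDocument($solrDoc);
```

Returning a pair keeps the shared field setup in one place while making the document visible to the caller. An alternative with the same effect would be to let the helper return only the document and have each caller create the update query itself.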