Passed
Pull Request — master (#711)
by Beatrycze
04:16 queued 01:37
created

Indexer   F

Complexity

Total Complexity 79

Size/Duplication

Total Lines 537
Duplicated Lines 0 %

Importance

Changes 5
Bugs 0 Features 1
Metric Value
wmc 79
eloc 272
c 5
b 0
f 1
dl 0
loc 537
rs 2.08

9 Methods

Rating  Name                       Duplication  Size  Complexity
F       add()                      0            108   15
A       getIndexFieldName()        0            16    5
C       loadIndexConf()            0            58    12
C       processPhysical()          0            58    14
F       processLogical()           0            101   24
A       solrConnect()              0            17    4
A       getSolrDocument()          0            13    1
A       __construct()              0            2     1
A       removeAppendsFromAuthor()  0            8     3

How to fix   Complexity   

Complex Class

Complex classes like Indexer often do a lot of different things. To break such a class down, we need to identify a cohesive component within it. A common way to find such a component is to look for fields and methods that share the same prefix or suffix.

Once you have determined the fields that belong together, you can apply the Extract Class refactoring. If the component makes sense as a sub-class, Extract Subclass is also a candidate, and is often faster.
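
As a rough illustration of Extract Class, the sketch below moves the field lists and the suffix rule of getIndexFieldName() into a class of their own. The class name IndexFieldConfiguration and its constructor are hypothetical and not part of the reviewed code; the sketch also uses strict in_array() comparisons, which the original does not.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical class extracted from Indexer: it owns the field lists
 * that Indexer currently keeps in the static $fields array.
 */
final class IndexFieldConfiguration
{
    /** @var string[] */
    private $tokenized;

    /** @var string[] */
    private $stored;

    /** @var string[] */
    private $indexed;

    public function __construct(array $tokenized, array $stored, array $indexed)
    {
        $this->tokenized = $tokenized;
        $this->stored = $stored;
        $this->indexed = $indexed;
    }

    /**
     * Builds the dynamic field name with the same suffix rule as
     * Indexer::getIndexFieldName(): t/u, then s/u, then i/u.
     */
    public function getIndexFieldName(string $indexName): string
    {
        $suffix = in_array($indexName, $this->tokenized, true) ? 't' : 'u';
        $suffix .= in_array($indexName, $this->stored, true) ? 's' : 'u';
        $suffix .= in_array($indexName, $this->indexed, true) ? 'i' : 'u';
        return $indexName . '_' . $suffix;
    }
}

// Example data, made up for this sketch.
$config = new IndexFieldConfiguration(['title'], ['title', 'year'], ['title']);
echo $config->getIndexFieldName('title'), PHP_EOL; // title_tsi
echo $config->getIndexFieldName('year'), PHP_EOL;  // year_usu
```

With such a class in place, Indexer would only need to hold one configuration object instead of several static arrays, which is one possible way to reduce its size and complexity.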

While breaking up the class, it is a good idea to analyze how other classes use Indexer, and based on these observations, apply Extract Interface, too.
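
A minimal sketch of Extract Interface follows, assuming that callers only need a single add operation. DocumentIndexerInterface, InMemoryIndexer and indexAll() are hypothetical names invented for this example; the in-memory class merely stands in for a real Solr-backed implementation.

```php
<?php

declare(strict_types=1);

/** Hypothetical interface: the one operation callers are assumed to need. */
interface DocumentIndexerInterface
{
    public function add(string $documentUid): bool;
}

/** In-memory stand-in used only for this sketch; it does not talk to Solr. */
final class InMemoryIndexer implements DocumentIndexerInterface
{
    /** @var string[] */
    private $processed = [];

    public function add(string $documentUid): bool
    {
        // Like Indexer::add(), treat an already processed document as success.
        if (!in_array($documentUid, $this->processed, true)) {
            $this->processed[] = $documentUid;
        }
        return true;
    }

    public function count(): int
    {
        return count($this->processed);
    }
}

/** Callers depend on the interface, not on a concrete class. */
function indexAll(DocumentIndexerInterface $indexer, array $uids): bool
{
    foreach ($uids as $uid) {
        if (!$indexer->add($uid)) {
            return false;
        }
    }
    return true;
}

$indexer = new InMemoryIndexer();
var_dump(indexAll($indexer, ['1', '2', '1'])); // bool(true)
echo $indexer->count(), PHP_EOL;               // 2
```

Depending on an interface like this would let tests swap in a stand-in, whereas the static calls in the reviewed class are harder to replace.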

<?php

/**
 * (c) Kitodo. Key to digital objects e.V. <[email protected]>
 *
 * This file is part of the Kitodo and TYPO3 projects.
 *
 * @license GNU General Public License version 3 or later.
 * For the full copyright and license information, please read the
 * LICENSE.txt file that was distributed with this source code.
 */

namespace Kitodo\Dlf\Common;

use TYPO3\CMS\Core\Configuration\ExtensionConfiguration;
use TYPO3\CMS\Core\Database\ConnectionPool;
use TYPO3\CMS\Core\Messaging\FlashMessage;
use TYPO3\CMS\Core\Utility\GeneralUtility;
use TYPO3\CMS\Core\Utility\MathUtility;
use Ubl\Iiif\Presentation\Common\Model\Resources\AnnotationContainerInterface;
use Ubl\Iiif\Tools\IiifHelper;

/**
 * Indexer class for the 'dlf' extension
 *
 * @author Sebastian Meyer <[email protected]>
 * @package TYPO3
 * @subpackage dlf
 * @access public
 */
class Indexer
{
    /**
     * The extension key
     *
     * @var string
     * @access public
     */
    public static $extKey = 'dlf';

    /**
     * Array of metadata fields' configuration
     * @see loadIndexConf()
     *
     * @var array
     * @access protected
     */
    protected static $fields = [
        'autocomplete' => [],
        'facets' => [],
        'sortables' => [],
        'indexed' => [],
        'stored' => [],
        'tokenized' => [],
        'fieldboost' => []
    ];

    /**
     * Is the index configuration loaded?
     * @see $fields
     *
     * @var bool
     * @access protected
     */
    protected static $fieldsLoaded = false;

    /**
     * List of already processed documents
     *
     * @var array
     * @access protected
     */
    protected static $processedDocs = [];

    /**
     * Instance of \Kitodo\Dlf\Common\Solr class
     *
     * @var \Kitodo\Dlf\Common\Solr
     * @access protected
     */
    protected static $solr;

    /**
     * Insert given document into Solr index
     *
     * @access public
     *
     * @param \Kitodo\Dlf\Common\Document &$doc: The document to add
     * @param int $core: UID of the Solr core to use
     *
     * @return bool true on success or false on failure
     */
    public static function add(Document &$doc, $core = 0)
    {
        if (in_array($doc->uid, self::$processedDocs)) {
            return true;
        } elseif (self::solrConnect($core, $doc->pid)) {
            $success = true;
            // Handle multi-volume documents.
            if ($doc->parentId) {
                $parent = Document::getInstance($doc->parentId, 0, true);
                if ($parent->ready) {
                    $success = self::add($parent, $core);
                } else {
                    Helper::log('Could not load parent document with UID ' . $doc->parentId, LOG_SEVERITY_ERROR);
                    return false;
                }
            }
            try {
                // Add document to list of processed documents.
                self::$processedDocs[] = $doc->uid;
                // Delete old Solr documents.
                $updateQuery = self::$solr->service->createUpdate();
                $updateQuery->addDeleteQuery('uid:' . $doc->uid);
                self::$solr->service->update($updateQuery);

                // Index every logical unit as separate Solr document.
                foreach ($doc->tableOfContents as $logicalUnit) {
                    if ($success) {
                        $success = self::processLogical($doc, $logicalUnit);
                    } else {
                        break;
                    }
                }
                // Index full text files if available.
                if ($doc->hasFulltext) {
                    foreach ($doc->physicalStructure as $pageNumber => $xmlId) {
                        if ($success) {
                            $success = self::processPhysical($doc, $pageNumber, $doc->physicalStructureInfo[$xmlId]);
                        } else {
                            break;
                        }
                    }
                }
                // Commit all changes.
                $updateQuery = self::$solr->service->createUpdate();
                $updateQuery->addCommit();
                self::$solr->service->update($updateQuery);

                $queryBuilder = GeneralUtility::makeInstance(ConnectionPool::class)
                    ->getQueryBuilderForTable('tx_dlf_documents');

                // Get document title from database.
                $result = $queryBuilder
                    ->select('tx_dlf_documents.title AS title')
                    ->from('tx_dlf_documents')
                    ->where(
                        $queryBuilder->expr()->eq('tx_dlf_documents.uid', intval($doc->uid)),
                        Helper::whereExpression('tx_dlf_documents')
                    )
                    ->setMaxResults(1)
                    ->execute();

                $allResults = $result->fetchAll();
                $resArray = $allResults[0];
                if (!(\TYPO3_REQUESTTYPE & \TYPO3_REQUESTTYPE_CLI)) {
                    if ($success) {
                        Helper::addMessage(
                            htmlspecialchars(sprintf(Helper::getMessage('flash.documentIndexed'), $resArray['title'], $doc->uid)),
                            Helper::getMessage('flash.done', true),
                            FlashMessage::OK,
                            true,
                            'core.template.flashMessages'
                        );
                    } else {
                        Helper::addMessage(
                            htmlspecialchars(sprintf(Helper::getMessage('flash.documentNotIndexed'), $resArray['title'], $doc->uid)),
                            Helper::getMessage('flash.error', true),
                            FlashMessage::ERROR,
                            true,
                            'core.template.flashMessages'
                        );
                    }
                }
                return $success;
            } catch (\Exception $e) {
                if (!(\TYPO3_REQUESTTYPE & \TYPO3_REQUESTTYPE_CLI)) {
                    Helper::addMessage(
                        Helper::getMessage('flash.solrException', true) . '<br />' . htmlspecialchars($e->getMessage()),
                        Helper::getMessage('flash.error', true),
                        FlashMessage::ERROR,
                        true,
                        'core.template.flashMessages'
                    );
                }
                Helper::log('Apache Solr threw exception: "' . $e->getMessage() . '"', LOG_SEVERITY_ERROR);
                return false;
            }
        } else {
            if (!(\TYPO3_REQUESTTYPE & \TYPO3_REQUESTTYPE_CLI)) {
                Helper::addMessage(
                    Helper::getMessage('flash.solrNoConnection', true),
                    Helper::getMessage('flash.warning', true),
                    FlashMessage::WARNING,
                    true,
                    'core.template.flashMessages'
                );
            }
            Helper::log('Could not connect to Apache Solr server', LOG_SEVERITY_ERROR);
            return false;
        }
    }

    /**
     * Returns the dynamic index field name for the given metadata field.
     *
     * @access public
     *
     * @param string $index_name: The metadata field's name in database
     * @param int $pid: UID of the configuration page
     *
     * @return string The field's dynamic index name
     */
    public static function getIndexFieldName($index_name, $pid = 0)
    {
        // Sanitize input.
        $pid = max(intval($pid), 0);
        if (!$pid) {
            Helper::log('Invalid PID ' . $pid . ' for metadata configuration', LOG_SEVERITY_ERROR);
            return '';
        }
        // Load metadata configuration.
        self::loadIndexConf($pid);
        // Build field's suffix.
        $suffix = (in_array($index_name, self::$fields['tokenized']) ? 't' : 'u');
        $suffix .= (in_array($index_name, self::$fields['stored']) ? 's' : 'u');
        $suffix .= (in_array($index_name, self::$fields['indexed']) ? 'i' : 'u');
        $index_name .= '_' . $suffix;
        return $index_name;
    }

    /**
     * Load indexing configuration
     *
     * @access protected
     *
     * @param int $pid: The configuration page's UID
     *
     * @return void
     */
    protected static function loadIndexConf($pid)
    {
        if (!self::$fieldsLoaded) {
            $queryBuilder = GeneralUtility::makeInstance(ConnectionPool::class)
                ->getQueryBuilderForTable('tx_dlf_metadata');

            // Get the metadata indexing options.
            $result = $queryBuilder
                ->select(
                    'tx_dlf_metadata.index_name AS index_name',
                    'tx_dlf_metadata.index_tokenized AS index_tokenized',
                    'tx_dlf_metadata.index_stored AS index_stored',
                    'tx_dlf_metadata.index_indexed AS index_indexed',
                    'tx_dlf_metadata.is_sortable AS is_sortable',
                    'tx_dlf_metadata.is_facet AS is_facet',
                    'tx_dlf_metadata.is_listed AS is_listed',
                    'tx_dlf_metadata.index_autocomplete AS index_autocomplete',
                    'tx_dlf_metadata.index_boost AS index_boost'
                )
                ->from('tx_dlf_metadata')
                ->where(
                    $queryBuilder->expr()->eq('tx_dlf_metadata.pid', intval($pid)),
                    Helper::whereExpression('tx_dlf_metadata')
                )
                ->execute();

            while ($indexing = $result->fetch()) {
                if ($indexing['index_tokenized']) {
                    self::$fields['tokenized'][] = $indexing['index_name'];
                }
                if (
                    $indexing['index_stored']
                    || $indexing['is_listed']
                ) {
                    self::$fields['stored'][] = $indexing['index_name'];
                }
                if (
                    $indexing['index_indexed']
                    || $indexing['index_autocomplete']
                ) {
                    self::$fields['indexed'][] = $indexing['index_name'];
                }
                if ($indexing['is_sortable']) {
                    self::$fields['sortables'][] = $indexing['index_name'];
                }
                if ($indexing['is_facet']) {
                    self::$fields['facets'][] = $indexing['index_name'];
                }
                if ($indexing['index_autocomplete']) {
                    self::$fields['autocomplete'][] = $indexing['index_name'];
                }
                if ($indexing['index_boost'] > 0.0) {
                    self::$fields['fieldboost'][$indexing['index_name']] = floatval($indexing['index_boost']);
                } else {
                    self::$fields['fieldboost'][$indexing['index_name']] = false;
                }
            }
            self::$fieldsLoaded = true;
        }
    }

    /**
     * Processes a logical unit (and its children) for the Solr index
     *
     * @access protected
     *
     * @param \Kitodo\Dlf\Common\Document &$doc: The METS document
     * @param array $logicalUnit: Array of the logical unit to process
     *
     * @return bool true on success or false on failure
     */
    protected static function processLogical(Document &$doc, array $logicalUnit)
    {
        $success = true;
        // Get metadata for logical unit.
        $metadata = $doc->metadataArray[$logicalUnit['id']];
        if (!empty($metadata)) {
            $metadata['author'] = self::removeAppendsFromAuthor($metadata['author']);
            // Replace owner UID with index_name.
            if (
                !empty($metadata['owner'][0])
                && MathUtility::canBeInterpretedAsInteger($metadata['owner'][0])
            ) {
                $metadata['owner'][0] = Helper::getIndexNameFromUid($metadata['owner'][0], 'tx_dlf_libraries', $doc->cPid);
            }
            // Create new Solr document.
            $updateQuery = self::$solr->service->createUpdate();
            $solrDoc = self::getSolrDocument($updateQuery, $doc, $logicalUnit);
            if (MathUtility::canBeInterpretedAsInteger($logicalUnit['points'])) {
                $solrDoc->setField('page', $logicalUnit['points']);

Bug: The method setField() does not exist on Solarium\Core\Query\DocumentInterface. It seems you are coding against a sub-type of Solarium\Core\Query\DocumentInterface such as Solarium\Plugin\MinimumScoreFilter\Document or Solarium\QueryType\Update\Query\Document. (Ignorable by annotation)

If this is a false positive, you can also ignore this issue in your code via the ignore-call annotation:

                $solrDoc->/** @scrutinizer ignore-call */
                          setField('page', $logicalUnit['points']);

            }
            if ($logicalUnit['id'] == $doc->toplevelId) {
                $solrDoc->setField('thumbnail', $doc->thumbnail);
            } elseif (!empty($logicalUnit['thumbnailId'])) {
                $solrDoc->setField('thumbnail', $doc->getFileLocation($logicalUnit['thumbnailId']));
            }
            // There can be only one toplevel unit per UID, independently of backend configuration
            $solrDoc->setField('toplevel', $logicalUnit['id'] == $doc->toplevelId ? true : false);
            $solrDoc->setField('title', $metadata['title'][0], self::$fields['fieldboost']['title']);
            $solrDoc->setField('volume', $metadata['volume'][0], self::$fields['fieldboost']['volume']);
            $solrDoc->setField('record_id', $metadata['record_id'][0]);
            $solrDoc->setField('purl', $metadata['purl'][0]);
            $solrDoc->setField('location', $doc->location);
            $solrDoc->setField('urn', $metadata['urn']);
            $solrDoc->setField('license', $metadata['license']);
            $solrDoc->setField('terms', $metadata['terms']);
            $solrDoc->setField('restrictions', $metadata['restrictions']);
            $coordinates = json_decode($metadata['coordinates'][0]);
            if (is_object($coordinates)) {
                $solrDoc->setField('geom', json_encode($coordinates->features[0]));
            }
            $autocomplete = [];
            foreach ($metadata as $index_name => $data) {
                if (
                    !empty($data)
                    && substr($index_name, -8) !== '_sorting'
                ) {
                    $solrDoc->setField(self::getIndexFieldName($index_name, $doc->pid), $data, self::$fields['fieldboost'][$index_name]);
                    if (in_array($index_name, self::$fields['sortables'])) {
                        // Add sortable fields to index.
                        $solrDoc->setField($index_name . '_sorting', $metadata[$index_name . '_sorting'][0]);
                    }
                    if (in_array($index_name, self::$fields['facets'])) {
                        // Add facets to index.
                        $solrDoc->setField($index_name . '_faceting', $data);
                    }
                    if (in_array($index_name, self::$fields['autocomplete'])) {
                        $autocomplete = array_merge($autocomplete, $data);
                    }
                }
            }
            // Add autocomplete values to index.
            if (!empty($autocomplete)) {
                $solrDoc->setField('autocomplete', $autocomplete);
            }
            // Add collection information to logical sub-elements if applicable.
            if (
                in_array('collection', self::$fields['facets'])
                && empty($metadata['collection'])
                && !empty($doc->metadataArray[$doc->toplevelId]['collection'])
            ) {
                $solrDoc->setField('collection_faceting', $doc->metadataArray[$doc->toplevelId]['collection']);
            }
            try {
                $updateQuery->addDocument($solrDoc);
                self::$solr->service->update($updateQuery);
            } catch (\Exception $e) {
                if (!(\TYPO3_REQUESTTYPE & \TYPO3_REQUESTTYPE_CLI)) {
                    Helper::addMessage(
                        Helper::getMessage('flash.solrException', true) . '<br />' . htmlspecialchars($e->getMessage()),
                        Helper::getMessage('flash.error', true),
                        FlashMessage::ERROR,
                        true,
                        'core.template.flashMessages'
                    );
                }
                Helper::log('Apache Solr threw exception: "' . $e->getMessage() . '"', LOG_SEVERITY_ERROR);
                return false;
            }
        }
        // Check for child elements...
        if (!empty($logicalUnit['children'])) {
            foreach ($logicalUnit['children'] as $child) {
                if ($success) {
                    // ...and process them, too.
                    $success = self::processLogical($doc, $child);
                } else {
                    break;
                }
            }
        }
        return $success;
    }

    /**
     * Processes a physical unit for the Solr index
     *
     * @access protected
     *
     * @param \Kitodo\Dlf\Common\Document &$doc: The METS document
     * @param int $page: The page number
     * @param array $physicalUnit: Array of the physical unit to process
     *
     * @return bool true on success or false on failure
     */
    protected static function processPhysical(Document &$doc, $page, array $physicalUnit)
    {
        if ($doc->hasFulltext && $fullText = $doc->getFullText($physicalUnit['id'])) {
            // Read extension configuration.
            $extConf = GeneralUtility::makeInstance(ExtensionConfiguration::class)->get(self::$extKey);
            // Create new Solr document.
            $updateQuery = self::$solr->service->createUpdate();
            $solrDoc = self::getSolrDocument($updateQuery, $doc, $physicalUnit, $fullText);
            $solrDoc->setField('page', $page);
            $fileGrpsThumb = GeneralUtility::trimExplode(',', $extConf['fileGrpThumbs']);
            while ($fileGrpThumb = array_shift($fileGrpsThumb)) {
                if (!empty($physicalUnit['files'][$fileGrpThumb])) {
                    $solrDoc->setField('thumbnail', $doc->getFileLocation($physicalUnit['files'][$fileGrpThumb]));
                    break;
                }
            }
            $solrDoc->setField('toplevel', false);
            // Add faceting information to physical sub-elements if applicable.
            foreach ($doc->metadataArray[$doc->toplevelId] as $index_name => $data) {
                if (
                    !empty($data)
                    && substr($index_name, -8) !== '_sorting'
                ) {

                    if (in_array($index_name, self::$fields['facets'])) {
                        if ($index_name == 'author') {
                            $data = self::removeAppendsFromAuthor($data);
                        }
                        // Add facets to index.
                        $solrDoc->setField($index_name . '_faceting', $data);
                    }
                }
            }
            // Add collection information to physical sub-elements if applicable.
            if (
                in_array('collection', self::$fields['facets'])
                && !empty($doc->metadataArray[$doc->toplevelId]['collection'])
            ) {
                $solrDoc->setField('collection_faceting', $doc->metadataArray[$doc->toplevelId]['collection']);
            }
            try {
                $updateQuery->addDocument($solrDoc);
                self::$solr->service->update($updateQuery);
            } catch (\Exception $e) {
                if (!(\TYPO3_REQUESTTYPE & \TYPO3_REQUESTTYPE_CLI)) {
                    Helper::addMessage(
                        Helper::getMessage('flash.solrException', true) . '<br />' . htmlspecialchars($e->getMessage()),
                        Helper::getMessage('flash.error', true),
                        FlashMessage::ERROR,
                        true,
                        'core.template.flashMessages'
                    );
                }
                Helper::log('Apache Solr threw exception: "' . $e->getMessage() . '"', LOG_SEVERITY_ERROR);
                return false;
            }
        }
        return true;
    }

    /**
     * Connects to Solr server.
     *
     * @access protected
     *
     * @param int $core: UID of the Solr core
     * @param int $pid: UID of the configuration page
     *
     * @return bool true on success or false on failure
     */
    protected static function solrConnect($core, $pid = 0)
    {
        // Get Solr instance.
        if (!self::$solr) {
            // Connect to Solr server.
            $solr = Solr::getInstance($core);
            if ($solr->ready) {
                self::$solr = $solr;
                // Load indexing configuration if needed.
                if ($pid) {
                    self::loadIndexConf($pid);
                }
            } else {
                return false;
            }
        }
        return true;
    }

    /**
     * Get SOLR document with set standard fields (identical for logical and physical unit)
     *
     * @access private
     *
     * @param \Solarium\QueryType\Update\Query\Query $updateQuery solarium query
     * @param \Kitodo\Dlf\Common\Document &$doc: The METS document
     * @param array $unit: Array of the logical or physical unit to process
     * @param string $fullText: Text containing full text for indexing
     *
     * @return \Solarium\Core\Query\DocumentInterface
     */
    private static function getSolrDocument($updateQuery, $doc, $unit, $fullText = '') {
        $solrDoc = $updateQuery->createDocument();
        // Create unique identifier from document's UID and unit's XML ID.
        $solrDoc->setField('id', $doc->uid . $unit['id']);
        $solrDoc->setField('uid', $doc->uid);
        $solrDoc->setField('pid', $doc->pid);
        $solrDoc->setField('partof', $doc->parentId);
        $solrDoc->setField('root', $doc->rootId);
        $solrDoc->setField('sid', $unit['id']);
        $solrDoc->setField('type', $unit['type'], self::$fields['fieldboost']['type']);
        $solrDoc->setField('collection', $doc->metadataArray[$doc->toplevelId]['collection']);
        $solrDoc->setField('fulltext', $fullText);
        return $solrDoc;
    }

    /**
     * Remove appended "valueURI" from authors' names for indexing.
     *
     * @access private
     *
     * @param array|string $authors: Array or string containing author/authors
     *
     * @return array|string
     */
    private static function removeAppendsFromAuthor($authors) {
        if (is_array($authors)) {
            foreach ($authors as $i => $author) {
                $splitName = explode(chr(31), $author);
                $authors[$i] = $splitName[0];
            }
        }
        return $authors;
    }

    /**
     * Prevent instantiation by hiding the constructor
     *
     * @access private
     */
    private function __construct()
    {
        // This is a static class, thus no instances should be created.
    }
}