Passed
Pull Request — master (#821)
by
unknown
03:20
created

DocumentRepository::fetchMetadataFromSolr()   B

Complexity

Conditions 10
Paths 6

Size

Total Lines 38
Code Lines 20

Duplication

Lines 0
Ratio 0 %

Importance

Changes 5
Bugs 0 Features 0
Metric Value
cc 10
eloc 20
c 5
b 0
f 0
nc 6
nop 2
dl 0
loc 38
rs 7.6666

How to fix   Complexity   

Long Method

Small methods make your code easier to understand, in particular if combined with a good name. Besides, if your method is small, finding a good name is usually much easier.

For example, if you find yourself adding comments to a method's body, this is usually a good sign to extract the commented part to a new method, and use the comment as a starting point when coming up with a good name for this new method.
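As a minimal sketch of that advice: the class below is a hypothetical example (`InvoiceFormatter`, `sumPrices()` and `renderSummary()` are illustrative names, not part of the reviewed repository). Each step that might otherwise have been a commented block inside one long method is extracted into a small method whose name carries the meaning of the comment.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical example class; it is not taken from DocumentRepository.
 */
final class InvoiceFormatter
{
    /**
     * @param array<int, array{name: string, price: float}> $items
     */
    public function format(array $items): string
    {
        // Formerly "// sum up all prices" followed by inline code.
        $total = $this->sumPrices($items);

        // Formerly "// render the summary line" followed by inline code.
        return $this->renderSummary(count($items), $total);
    }

    /**
     * @param array<int, array{name: string, price: float}> $items
     */
    private function sumPrices(array $items): float
    {
        return array_sum(array_column($items, 'price'));
    }

    private function renderSummary(int $itemCount, float $total): string
    {
        return sprintf('%d items, total %.2f', $itemCount, $total);
    }
}

$formatter = new InvoiceFormatter();
echo $formatter->format([
    ['name' => 'pen', 'price' => 1.5],
    ['name' => 'book', 'price' => 8.0],
]), PHP_EOL; // prints "2 items, total 9.50"
```

The public method now reads as a short list of named steps, and each helper can be understood and tested on its own.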

Commonly applied refactorings include:

<?php

/**
 * (c) Kitodo. Key to digital objects e.V. <[email protected]>
 *
 * This file is part of the Kitodo and TYPO3 projects.
 *
 * @license GNU General Public License version 3 or later.
 * For the full copyright and license information, please read the
 * LICENSE.txt file that was distributed with this source code.
 */

namespace Kitodo\Dlf\Domain\Repository;

use Kitodo\Dlf\Common\Doc;
use Kitodo\Dlf\Common\Helper;
use Kitodo\Dlf\Common\Indexer;
use Kitodo\Dlf\Common\Solr;
use Kitodo\Dlf\Domain\Model\Document;
use Kitodo\Dlf\Common\SolrSearchResult\ResultDocument;
use TYPO3\CMS\Core\Cache\CacheManager;
use TYPO3\CMS\Core\Database\ConnectionPool;
use TYPO3\CMS\Core\Database\Connection;
use TYPO3\CMS\Core\Utility\GeneralUtility;
use TYPO3\CMS\Core\Utility\MathUtility;
use TYPO3\CMS\Extbase\Persistence\QueryInterface;

class DocumentRepository extends \TYPO3\CMS\Extbase\Persistence\Repository
{
    /**
     * The controller settings passed to the repository for some special actions.
     *
     * @var array
     * @access protected
     */
    protected $settings;

    /**
     * Find one document by given parameters
     *
     * GET parameters may be:
     *
     * - 'id': the uid of the document
     * - 'location': the URL of the location of the XML file
     * - 'recordId': the record_id of the document
     *
     * Currently used by EXT:slub_digitalcollections
     *
     * @param array $parameters
     *
     * @return \Kitodo\Dlf\Domain\Model\Document|null
     */
    public function findOneByParameters($parameters)
    {
        $doc = null;
        $document = null;

        if (isset($parameters['id']) && MathUtility::canBeInterpretedAsInteger($parameters['id'])) {

            $document = $this->findOneByIdAndSettings($parameters['id']);

        } else if (isset($parameters['recordId'])) {

            $document = $this->findOneByRecordId($parameters['recordId']);
Bug: The method findOneByRecordId() does not exist on Kitodo\Dlf\Domain\Repository\DocumentRepository. Since you implemented __call, consider adding a @method annotation.

If this is a false positive, you can also ignore this issue in your code via the ignore-call annotation:

            /** @scrutinizer ignore-call */
            $document = $this->findOneByRecordId($parameters['recordId']);
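A minimal sketch of the suggested @method annotation follows. `MagicFinderRepository` and its in-memory rows are hypothetical stand-ins for a repository that resolves `findOneBy...` calls through `__call`; in the reviewed class the annotation would presumably declare `\Kitodo\Dlf\Domain\Model\Document|null` as the return type, which is an assumption here.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical stand-in for a repository with magic finder methods.
 * The @method tag documents the magic call for static analysis and IDEs.
 *
 * @method array|null findOneByRecordId(string $recordId)
 */
class MagicFinderRepository
{
    /** @var array<int, array<string, string>> */
    private array $rows = [
        ['recordId' => 'oai:example:1', 'title' => 'First'],
        ['recordId' => 'oai:example:2', 'title' => 'Second'],
    ];

    /**
     * Resolves findOneBy<Property>($value) calls against the in-memory rows.
     */
    public function __call(string $name, array $arguments)
    {
        if (strpos($name, 'findOneBy') === 0) {
            $property = lcfirst(substr($name, strlen('findOneBy')));
            foreach ($this->rows as $row) {
                if (($row[$property] ?? null) === ($arguments[0] ?? null)) {
                    return $row;
                }
            }
            return null;
        }

        throw new BadMethodCallException('Unknown method ' . $name);
    }
}

$repository = new MagicFinderRepository();
$match = $repository->findOneByRecordId('oai:example:2');
echo $match['title'], PHP_EOL; // prints "Second"
```

The annotation does not change runtime behaviour; it only tells analysis tools that the magic method exists and what it returns.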

        } else if (isset($parameters['location']) && GeneralUtility::isValidUrl($parameters['location'])) {

            $doc = Doc::getInstance($parameters['location'], [], true);

            if ($doc->recordId) {
                $document = $this->findOneByRecordId($doc->recordId);
            }

            if ($document === null) {
                // create new (dummy) Document object
                $document = GeneralUtility::makeInstance(Document::class);
                $document->setLocation($parameters['location']);
            }

        }

        if ($document !== null && $doc === null) {
            $doc = Doc::getInstance($document->getLocation(), [], true);
Bug: The method getLocation() does not exist on TYPO3\CMS\Extbase\Persistence\QueryResultInterface.

If this is a false positive, you can also ignore this issue in your code via the ignore-call annotation:

            $doc = Doc::getInstance($document->/** @scrutinizer ignore-call */ getLocation(), [], true);

This check looks for calls to methods that do not seem to exist on a given type. It looks for the method on the type itself as well as in inherited classes or implemented interfaces.

This is most likely a typographical error or the method has been renamed.
        }

        if ($doc !== null) {
            $document->setDoc($doc);
Bug: The method setDoc() does not exist on TYPO3\CMS\Extbase\Persistence\QueryResultInterface.

If this is a false positive, you can also ignore this issue in your code via the ignore-call annotation:

            $document->/** @scrutinizer ignore-call */
                       setDoc($doc);

This check looks for calls to methods that do not seem to exist on a given type. It looks for the method on the type itself as well as in inherited classes or implemented interfaces.

This is most likely a typographical error or the method has been renamed.
        }

        return $document;
Bug, Best Practice: The expression return $document could also return the type TYPO3\CMS\Extbase\Persis...y<mixed,object>|integer, which is incompatible with the documented return type Kitodo\Dlf\Domain\Model\Document|null.
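One possible way to address the type-related findings in this method is to narrow the value with instanceof before calling model methods or returning it. The sketch below is an assumption about how that could look, not the project's fix; `ExampleDocument` and `narrowToDocument()` are hypothetical names.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical stand-in for the Document model.
 */
final class ExampleDocument
{
    private string $location = '';

    public function setLocation(string $location): void
    {
        $this->location = $location;
    }

    public function getLocation(): string
    {
        return $this->location;
    }
}

/**
 * Returns the candidate only if it really is a document, otherwise null.
 * After this call, static analysis knows the exact type of the result.
 *
 * @param mixed $candidate e.g. the loosely typed result of a magic finder
 */
function narrowToDocument($candidate): ?ExampleDocument
{
    return $candidate instanceof ExampleDocument ? $candidate : null;
}

$found = new ExampleDocument();
$found->setLocation('https://example.org/mets.xml');

$document = narrowToDocument($found);
if ($document !== null) {
    // getLocation() is now known to exist on the narrowed type.
    echo $document->getLocation(), PHP_EOL;
}
```

With such a guard the method can only return a document or null, which matches the documented return type.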
    }

    /**
     * Find the oldest document
     *
     * @return \Kitodo\Dlf\Domain\Model\Document|null
     */
    public function findOldestDocument()
    {
        $query = $this->createQuery();

        $query->setOrderings(['tstamp' => QueryInterface::ORDER_ASCENDING]);
        $query->setLimit(1);

        return $query->execute()->getFirst();
    }

    /**
     * @param int $partOf
     * @param  \Kitodo\Dlf\Domain\Model\Structure $structure
     * @return array|\TYPO3\CMS\Extbase\Persistence\QueryResultInterface
     */
    public function getChildrenOfYearAnchor($partOf, $structure)
    {
        $query = $this->createQuery();

        $query->matching($query->equals('structure', $structure));
        $query->matching($query->equals('partof', $partOf));

        $query->setOrderings([
            'mets_orderlabel' => \TYPO3\CMS\Extbase\Persistence\QueryInterface::ORDER_ASCENDING
        ]);

        return $query->execute();
    }

    /**
     * Finds all documents for the given settings
     *
     * @param int $uid
     * @param array $settings
     *
     * @return \Kitodo\Dlf\Domain\Model\Document|null
     */
    public function findOneByIdAndSettings($uid, $settings = [])
Unused Code: The parameter $settings is not used and could be removed.

If this is a false positive, you can also ignore this issue in your code via the ignore-unused annotation:

    public function findOneByIdAndSettings($uid, /** @scrutinizer ignore-unused */ $settings = [])

This check looks for parameters that have been defined for a function or method, but which are not used in the method body.
    {
        $settings = ['documentSets' => $uid];

        return $this->findDocumentsBySettings($settings)->getFirst();
    }

    /**
     * Finds all documents for the given settings
     *
     * @param array $settings
     *
     * @return array|\TYPO3\CMS\Extbase\Persistence\QueryResultInterface
     */
    public function findDocumentsBySettings($settings = [])
    {
        $query = $this->createQuery();

        $constraints = [];

        if ($settings['documentSets']) {
            $constraints[] = $query->in('uid', GeneralUtility::intExplode(',', $settings['documentSets']));
        }

        if (isset($settings['excludeOther']) && (int) $settings['excludeOther'] === 0) {
            $query->getQuerySettings()->setRespectStoragePage(false);
        }

        if (count($constraints)) {
            $query->matching(
                $query->logicalAnd($constraints)
            );
        }

        return $query->execute();
    }

    /**
     * Finds all documents for the given collections
     *
     * @param array $collections
     * @param int $limit
     *
     * @return array|\TYPO3\CMS\Extbase\Persistence\QueryResultInterface
     */
    public function findAllByCollectionsLimited($collections, $limit = 50)
    {
        $query = $this->createQuery();

        // order by start_date -> start_time...
        $query->setOrderings(
            ['tstamp' => QueryInterface::ORDER_DESCENDING]
        );

        $constraints = [];
        if ($collections) {
Bug, Best Practice: The expression $collections of type array is implicitly converted to a boolean; are you sure this is intended? If so, consider using ! empty($expr) instead to make it clear that you intend to check for an array without elements.

This check marks implicit conversions of arrays to boolean values in a comparison. While in PHP an empty array is considered to be equal (but not identical) to false, this is not always apparent.

Consider making the comparison explicit by using empty(...) or ! empty(...) instead.
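The explicit check could look roughly like the sketch below. `buildCollectionConstraints()` is a hypothetical helper that produces plain strings in place of Extbase constraint objects, so that the example runs without TYPO3.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: builds constraint strings for the given collection uids.
 *
 * @param int[] $collections
 *
 * @return string[]
 */
function buildCollectionConstraints(array $collections): array
{
    $constraints = [];

    // Explicit emptiness check instead of the implicit array-to-boolean cast.
    if (!empty($collections)) {
        $constraints[] = 'collections.uid IN (' . implode(',', $collections) . ')';
    }

    return $constraints;
}

var_dump(buildCollectionConstraints([]));     // empty array: no constraint added
var_dump(buildCollectionConstraints([3, 7])); // one constraint for uids 3 and 7
```

The behaviour is the same as with the implicit conversion; the intent of testing for an array without elements is just stated in the code.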
            $constraints[] = $query->in('collections.uid', $collections);
        }

        if (count($constraints)) {
            $query->matching(
                $query->logicalAnd($constraints)
            );
        }

        if ($limit > 0) {
            $query->setLimit((int) $limit);
        }

        return $query->execute();
    }

    /**
     * Count the titles and volumes for statistics
     *
     * Volumes are documents that are both
     *  a) "leaf" elements i.e. partof != 0
     *  b) "root" elements that are not referenced by other documents ("root" elements that have no descendants)

     * @param array $settings
     *
     * @return array
     */
    public function getStatisticsForSelectedCollection($settings)
    {
        if ($settings['collections']) {
            // Include only selected collections.
            $queryBuilder = GeneralUtility::makeInstance(ConnectionPool::class)
            ->getQueryBuilderForTable('tx_dlf_documents');

            $countTitles = $queryBuilder
                ->count('tx_dlf_documents.uid')
                ->from('tx_dlf_documents')
                ->innerJoin(
                    'tx_dlf_documents',
                    'tx_dlf_relations',
                    'tx_dlf_relations_joins',
                    $queryBuilder->expr()->eq(
                        'tx_dlf_relations_joins.uid_local',
                        'tx_dlf_documents.uid'
                    )
                )
                ->innerJoin(
                    'tx_dlf_relations_joins',
                    'tx_dlf_collections',
                    'tx_dlf_collections_join',
                    $queryBuilder->expr()->eq(
                        'tx_dlf_relations_joins.uid_foreign',
                        'tx_dlf_collections_join.uid'
                    )
                )
                ->where(
                    $queryBuilder->expr()->eq('tx_dlf_documents.pid', intval($settings['storagePid'])),
                    $queryBuilder->expr()->eq('tx_dlf_collections_join.pid', intval($settings['storagePid'])),
                    $queryBuilder->expr()->eq('tx_dlf_documents.partof', 0),
                    $queryBuilder->expr()->in('tx_dlf_collections_join.uid', $queryBuilder->createNamedParameter(GeneralUtility::intExplode(',', $settings['collections']), Connection::PARAM_INT_ARRAY)),
                    $queryBuilder->expr()->eq('tx_dlf_relations_joins.ident', $queryBuilder->createNamedParameter('docs_colls'))
                )
                ->execute()
                ->fetchColumn(0);

                $queryBuilder = GeneralUtility::makeInstance(ConnectionPool::class)
                    ->getQueryBuilderForTable('tx_dlf_documents');
                $subQueryBuilder = GeneralUtility::makeInstance(ConnectionPool::class)
                    ->getQueryBuilderForTable('tx_dlf_documents');

                $subQuery = $subQueryBuilder
                    ->select('tx_dlf_documents.partof')
                    ->from('tx_dlf_documents')
                    ->where(
                        $subQueryBuilder->expr()->neq('tx_dlf_documents.partof', 0)
                    )
                    ->groupBy('tx_dlf_documents.partof')
                    ->getSQL();

                $countVolumes = $queryBuilder
                    ->count('tx_dlf_documents.uid')
                    ->from('tx_dlf_documents')
                    ->innerJoin(
                        'tx_dlf_documents',
                        'tx_dlf_relations',
                        'tx_dlf_relations_joins',
                        $queryBuilder->expr()->eq(
                            'tx_dlf_relations_joins.uid_local',
                            'tx_dlf_documents.uid'
                        )
                    )
                    ->innerJoin(
                        'tx_dlf_relations_joins',
                        'tx_dlf_collections',
                        'tx_dlf_collections_join',
                        $queryBuilder->expr()->eq(
                            'tx_dlf_relations_joins.uid_foreign',
                            'tx_dlf_collections_join.uid'
                        )
                    )
                    ->where(
                        $queryBuilder->expr()->eq('tx_dlf_documents.pid', intval($settings['storagePid'])),
                        $queryBuilder->expr()->eq('tx_dlf_collections_join.pid', intval($settings['storagePid'])),
                        $queryBuilder->expr()->notIn('tx_dlf_documents.uid', $subQuery),
                        $queryBuilder->expr()->in('tx_dlf_collections_join.uid', $queryBuilder->createNamedParameter(GeneralUtility::intExplode(',', $settings['collections']), Connection::PARAM_INT_ARRAY)),
                        $queryBuilder->expr()->eq('tx_dlf_relations_joins.ident', $queryBuilder->createNamedParameter('docs_colls'))
                    )
                    ->execute()
                    ->fetchColumn(0);
        } else {
            $queryBuilder = GeneralUtility::makeInstance(ConnectionPool::class)
                ->getQueryBuilderForTable('tx_dlf_documents');

            // Include all collections.
            $countTitles = $queryBuilder
                ->count('tx_dlf_documents.uid')
                ->from('tx_dlf_documents')
                ->where(
                    $queryBuilder->expr()->eq('tx_dlf_documents.pid', intval($settings['storagePid'])),
                    $queryBuilder->expr()->eq('tx_dlf_documents.partof', 0),
                    Helper::whereExpression('tx_dlf_documents')
                )
                ->execute()
                ->fetchColumn(0);

            $queryBuilder = GeneralUtility::makeInstance(ConnectionPool::class)
                ->getQueryBuilderForTable('tx_dlf_documents');
            $subQueryBuilder = GeneralUtility::makeInstance(ConnectionPool::class)
                ->getQueryBuilderForTable('tx_dlf_documents');

            $subQuery = $subQueryBuilder
                ->select('tx_dlf_documents.partof')
                ->from('tx_dlf_documents')
                ->where(
                    $subQueryBuilder->expr()->neq('tx_dlf_documents.partof', 0)
                )
                ->groupBy('tx_dlf_documents.partof')
                ->getSQL();

            $countVolumes = $queryBuilder
                ->count('tx_dlf_documents.uid')
                ->from('tx_dlf_documents')
                ->where(
                    $queryBuilder->expr()->eq('tx_dlf_documents.pid', intval($settings['storagePid'])),
                    $queryBuilder->expr()->notIn('tx_dlf_documents.uid', $subQuery)
                )
                ->execute()
                ->fetchColumn(0);
        }

        return ['titles' => $countTitles, 'volumes' => $countVolumes];
    }

    /**
     * Build table of contents
     *
     * @param int $uid
     * @param int $pid
     * @param array $settings
     *
     * @return \TYPO3\CMS\Extbase\Persistence\QueryResultInterface
     */
    public function getTableOfContentsFromDb($uid, $pid, $settings)
    {
        // Build table of contents from database.
        $queryBuilder = GeneralUtility::makeInstance(ConnectionPool::class)
            ->getQueryBuilderForTable('tx_dlf_documents');

        $excludeOtherWhere = '';
        if ($settings['excludeOther']) {
            $excludeOtherWhere = 'tx_dlf_documents.pid=' . intval($settings['storagePid']);
        }
        // Check if there are any metadata to suggest.
        $result = $queryBuilder
            ->select(
                'tx_dlf_documents.uid AS uid',
                'tx_dlf_documents.title AS title',
                'tx_dlf_documents.volume AS volume',
                'tx_dlf_documents.mets_label AS mets_label',
                'tx_dlf_documents.mets_orderlabel AS mets_orderlabel',
                'tx_dlf_structures_join.index_name AS type'
            )
            ->innerJoin(
                'tx_dlf_documents',
                'tx_dlf_structures',
                'tx_dlf_structures_join',
                $queryBuilder->expr()->eq(
                    'tx_dlf_structures_join.uid',
                    'tx_dlf_documents.structure'
                )
            )
            ->from('tx_dlf_documents')
            ->where(
                $queryBuilder->expr()->eq('tx_dlf_documents.partof', intval($uid)),
                $queryBuilder->expr()->eq('tx_dlf_structures_join.pid', intval($pid)),
                $excludeOtherWhere
            )
            ->addOrderBy('tx_dlf_documents.volume_sorting')
            ->addOrderBy('tx_dlf_documents.mets_orderlabel')
            ->execute();
        return $result;
    }

    /**
     * Find one document by given settings and identifier
     *
     * @param array $settings
     * @param array $parameters
     *
     * @return array The found document object
     */
    public function getOaiRecord($settings, $parameters)
    {
        $where = '';

        if (!$settings['show_userdefined']) {
            $where .= 'AND tx_dlf_collections.fe_cruser_id=0 ';
        }

        $connection = GeneralUtility::makeInstance(ConnectionPool::class)
            ->getConnectionForTable('tx_dlf_documents');

        $sql = 'SELECT `tx_dlf_documents`.*, GROUP_CONCAT(DISTINCT `tx_dlf_collections`.`oai_name` ORDER BY `tx_dlf_collections`.`oai_name` SEPARATOR " ") AS `collections` ' .
            'FROM `tx_dlf_documents` ' .
            'INNER JOIN `tx_dlf_relations` ON `tx_dlf_relations`.`uid_local` = `tx_dlf_documents`.`uid` ' .
            'INNER JOIN `tx_dlf_collections` ON `tx_dlf_collections`.`uid` = `tx_dlf_relations`.`uid_foreign` ' .
            'WHERE `tx_dlf_documents`.`record_id` = ? ' .
            'AND `tx_dlf_relations`.`ident`="docs_colls" ' .
            $where;

        $values = [
            $parameters['identifier']
        ];

        $types = [
            Connection::PARAM_STR
        ];

        // Create a prepared statement for the passed SQL query, bind the given params with their binding types and execute the query
        $statement = $connection->executeQuery($sql, $values, $types);

        return $statement->fetch();
    }

    /**
     * Finds all documents for the given settings
     *
     * @param array $settings
     * @param array $documentsToProcess
     *
     * @return array The found document objects
     */
    public function getOaiDocumentList($settings, $documentsToProcess)
Unused Code: The parameter $settings is not used and could be removed.

If this is a false positive, you can also ignore this issue in your code via the ignore-unused annotation:

    public function getOaiDocumentList(/** @scrutinizer ignore-unused */ $settings, $documentsToProcess)

This check looks for parameters that have been defined for a function or method, but which are not used in the method body.
    {
        $connection = GeneralUtility::makeInstance(ConnectionPool::class)
            ->getConnectionForTable('tx_dlf_documents');

        $sql = 'SELECT `tx_dlf_documents`.*, GROUP_CONCAT(DISTINCT `tx_dlf_collections`.`oai_name` ORDER BY `tx_dlf_collections`.`oai_name` SEPARATOR " ") AS `collections` ' .
            'FROM `tx_dlf_documents` ' .
            'INNER JOIN `tx_dlf_relations` ON `tx_dlf_relations`.`uid_local` = `tx_dlf_documents`.`uid` ' .
            'INNER JOIN `tx_dlf_collections` ON `tx_dlf_collections`.`uid` = `tx_dlf_relations`.`uid_foreign` ' .
            'WHERE `tx_dlf_documents`.`uid` IN ( ? ) ' .
            'AND `tx_dlf_relations`.`ident`="docs_colls" ' .
            'AND ' . Helper::whereExpression('tx_dlf_collections') . ' ' .
            'GROUP BY `tx_dlf_documents`.`uid` ';

        $values = [
            $documentsToProcess,
        ];

        $types = [
            Connection::PARAM_INT_ARRAY,
        ];

        // Create a prepared statement for the passed SQL query, bind the given params with their binding types and execute the query
        $documents = $connection->executeQuery($sql, $values, $types);

        return $documents;
    }

    /**
     * Finds all documents with given uids
     *
     * @param array $uids
     * @param array $checkPartof Whether or not to also match $uids against partof.
     *
     * @return array
     */
    private function findAllByUids($uids, $checkPartof = false)
    {
        // get all documents from db we are talking about
482
        $connectionPool = GeneralUtility::makeInstance(ConnectionPool::class);
483
        $queryBuilder = $connectionPool->getQueryBuilderForTable('tx_dlf_documents');
484
        // Fetch document info for UIDs in $documentSet from DB
485
        $kitodoDocuments = $queryBuilder
486
            ->select(
487
                'tx_dlf_documents.uid AS uid',
488
                'tx_dlf_documents.title AS title',
489
                'tx_dlf_documents.structure AS structure',
490
                'tx_dlf_documents.thumbnail AS thumbnail',
491
                'tx_dlf_documents.volume_sorting AS volumeSorting',
492
                'tx_dlf_documents.mets_orderlabel AS metsOrderlabel',
493
                'tx_dlf_documents.partof AS partOf'
494
            )
495
            ->from('tx_dlf_documents')
496
            ->where(
497
                $queryBuilder->expr()->in('tx_dlf_documents.pid', $this->settings['storagePid']),
498
                $queryBuilder->expr()->orX(
499
                    $queryBuilder->expr()->in('tx_dlf_documents.uid', $uids),
500
                    $checkPartof
501
                        ? $queryBuilder->expr()->in('tx_dlf_documents.partof', $uids)
502
                        : null
503
                )
504
            )
505
            ->addOrderBy('tx_dlf_documents.volume_sorting', 'asc')
506
            ->addOrderBy('tx_dlf_documents.mets_orderlabel', 'asc')
507
            ->execute();
508
509
        $allDocuments = [];
510
        $documentStructures = Helper::getDocumentStructures($this->settings['storagePid']);
511
        // Process documents in a usable array structure
512
        while ($resArray = $kitodoDocuments->fetch()) {
513
            $resArray['structure'] = $documentStructures[$resArray['structure']];
514
            $allDocuments[$resArray['uid']] = $resArray;
515
        }
516
517
        return $allDocuments;
518
    }
519
520
    /**
521
     *
522
     *
523
     * @param array $uids
524
     *
525
     * @return array
526
     */
527
    protected function findChildrenOfEach(array $uids)
528
    {
529
        $allDocuments = $this->findAllByUids($uids, true);
Bug: true of type true is incompatible with the type array expected by parameter $checkPartof of Kitodo\Dlf\Domain\Reposi...sitory::findAllByUids().

If this is a false positive, you can also ignore this issue in your code via the ignore-type annotation:

        $allDocuments = $this->findAllByUids($uids, /** @scrutinizer ignore-type */ true);
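This finding most likely stems from the docblock of findAllByUids(), which documents $checkPartof as array although the default value and the call site use a boolean. A sketch of a consistent declaration is shown below; `describeUidFilter()` is a hypothetical function that returns a filter string instead of running a query, so the example stays self-contained.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper mirroring the parameter list of findAllByUids().
 *
 * @param int[] $uids
 * @param bool $checkPartof Whether or not to also match $uids against partof.
 *
 * @return string
 */
function describeUidFilter(array $uids, bool $checkPartof = false): string
{
    $list = implode(',', $uids);
    $filter = 'uid IN (' . $list . ')';

    // The documented type (bool) now agrees with the default value and the call sites.
    if ($checkPartof) {
        $filter .= ' OR partof IN (' . $list . ')';
    }

    return $filter;
}

echo describeUidFilter([1, 2]), PHP_EOL;       // prints "uid IN (1,2)"
echo describeUidFilter([1, 2], true), PHP_EOL; // prints "uid IN (1,2) OR partof IN (1,2)"
```

Correcting the annotation is usually preferable to suppressing the issue with ignore-type, because the mismatch is in the documentation rather than in the call.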

        $result = [];
        foreach ($allDocuments as $doc) {
            if ($doc['partOf']) {
                $result[$doc['partOf']][] = $doc;
            }
        }
        return $result;
    }

    /**
     * Find all documents with given collection from Solr
     *
     * @param \TYPO3\CMS\Extbase\Persistence\Generic\QueryResult|\Kitodo\Dlf\Domain\Model\Collection $collection
     * @param array $settings
     * @param array $searchParams
     * @param \TYPO3\CMS\Extbase\Persistence\Generic\QueryResult $listedMetadata
     * @return array
     */
    public function findSolrByCollection($collection, $settings, $searchParams, $listedMetadata = null)
    {
        // set settings global inside this repository
        $this->settings = $settings;

        // Prepare query parameters.
        $params = [];
        $matches = [];
        $fields = Solr::getFields();

        // Set search query.
        if (
            (!empty($searchParams['fulltext']))
            || preg_match('/' . $fields['fulltext'] . ':\((.*)\)/', trim($searchParams['query']), $matches)
        ) {
            // If the query already is a fulltext query e.g using the facets
            $searchParams['query'] = empty($matches[1]) ? $searchParams['query'] : $matches[1];
            // Search in fulltext field if applicable. Query must not be empty!
            if (!empty($searchParams['query'])) {
                $query = $fields['fulltext'] . ':(' . Solr::escapeQuery(trim($searchParams['query'])) . ')';
            }
            $params['fulltext'] = true;
        } else {
            // Retain given search field if valid.
            if (!empty($searchParams['query'])) {
                $query = Solr::escapeQueryKeepField(trim($searchParams['query']), $this->settings['storagePid']);
            }
        }

        // Add extended search query.
        if (
            !empty($searchParams['extQuery'])
            && is_array($searchParams['extQuery'])
        ) {
            $allowedOperators = ['AND', 'OR', 'NOT'];
            $numberOfExtQueries = count($searchParams['extQuery']);
            for ($i = 0; $i < $numberOfExtQueries; $i++) {
                if (!empty($searchParams['extQuery'][$i])) {
                    if (
                        in_array($searchParams['extOperator'][$i], $allowedOperators)
                    ) {
                        if (!empty($query)) {
                            $query .= ' ' . $searchParams['extOperator'][$i] . ' ';
                        }
                        $query .= Indexer::getIndexFieldName($searchParams['extField'][$i], $this->settings['storagePid']) . ':(' . Solr::escapeQuery($searchParams['extQuery'][$i]) . ')';
Comprehensibility, Best Practice: The variable $query does not seem to be defined for all execution paths leading up to this point.
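A common remedy for this kind of finding is to initialise the variable before the first branch, so that every later append operates on a defined string. The sketch below is a simplified, hypothetical version of the query assembly (`buildQuery()` and its fixed AND operator are assumptions, and no Solr escaping is performed).

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical, simplified query builder.
 *
 * @param array{query?: string, extQuery?: string[]} $searchParams
 */
function buildQuery(array $searchParams): string
{
    // Initialised up front, so it is defined on every execution path.
    $query = '';

    if (!empty($searchParams['query'])) {
        $query = trim($searchParams['query']);
    }

    foreach ($searchParams['extQuery'] ?? [] as $extQuery) {
        if ($extQuery === '') {
            continue;
        }
        if ($query !== '') {
            $query .= ' AND ';
        }
        $query .= '(' . $extQuery . ')';
    }

    // Fall back to a match-all query when nothing was given.
    return $query !== '' ? $query : '*';
}

echo buildQuery(['extQuery' => ['a', '', 'b']]), PHP_EOL; // prints "(a) AND (b)"
```

Because the existing code already guards with !empty($query), an empty-string default would not change which branches are taken.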
594
                    }
595
                }
596
            }
597
        }
598
599
            // Add filter query for faceting.
600
        if (isset($searchParams['fq']) && is_array($searchParams['fq'])) {
601
            foreach ($searchParams['fq'] as $filterQuery) {
602
                $params['filterquery'][]['query'] = $filterQuery;
603
            }
604
        }
605
606
        // Add filter query for in-document searching.
607
        if (
608
            !empty($searchParams['documentId'])
609
            && \TYPO3\CMS\Core\Utility\MathUtility::canBeInterpretedAsInteger($searchParams['documentId'])
610
        ) {
611
            // Search in document and all subordinates (valid for up to three levels of hierarchy).
612
            $params['filterquery'][]['query'] = '_query_:"{!join from='
613
                . $fields['uid'] . ' to=' . $fields['partof'] . '}'
614
                . $fields['uid'] . ':{!join from=' . $fields['uid'] . ' to=' . $fields['partof'] . '}'
615
                . $fields['uid'] . ':' . $searchParams['documentId'] . '"' . ' OR {!join from='
616
                . $fields['uid'] . ' to=' . $fields['partof'] . '}'
617
                . $fields['uid'] . ':' . $searchParams['documentId'] . ' OR '
618
                . $fields['uid'] . ':' . $searchParams['documentId'];
619
        }
620
621
        // if a collection is given, we prepare the collection query string
        if ($collection) {
            if ($collection instanceof \Kitodo\Dlf\Domain\Model\Collection) {
                $collectionsQueryString = '"' . $collection->getIndexName() . '"';
            } else {
                $collectionsQueryString = '';
                foreach ($collection as $index => $collectionEntry) {
                    $collectionsQueryString .= ($index > 0 ? ' OR ' : '') . '"' . $collectionEntry->getIndexName() . '"';
                }
            }

            if (empty($query)) {
                $params['filterquery'][]['query'] = 'toplevel:true';
                $params['filterquery'][]['query'] = 'partof:0';
            }
            $params['filterquery'][]['query'] = 'collection_faceting:(' . $collectionsQueryString . ')';
        }

        // Set some query parameters.
        $params['query'] = !empty($query) ? $query : '*';
        $params['start'] = 0;
        $params['rows'] = 10000;

        // Order the results as given, or by year and title by default.
        if (!empty($searchParams['orderBy'])) {
            $querySort = [
                $searchParams['orderBy'] => $searchParams['order']
            ];
        } else {
            $querySort = [
                'year_sorting' => 'asc',
                'title_sorting' => 'asc'
            ];
        }

        $params['sort'] = $querySort;
        $params['listMetadataRecords'] = [];

        // Restrict the fields to the required ones.
        $params['fields'] = 'uid,id,page,title,thumbnail,partof,toplevel,type';

        if ($listedMetadata) {
            foreach ($listedMetadata as $metadata) {
                if ($metadata->getIndexStored() || $metadata->getIndexIndexed()) {
                    $listMetadataRecord = $metadata->getIndexName() . '_' . ($metadata->getIndexTokenized() ? 't' : 'u') . ($metadata->getIndexStored() ? 's' : 'u') . ($metadata->getIndexIndexed() ? 'i' : 'u');
                    $params['fields'] .= ',' . $listMetadataRecord;
                    $params['listMetadataRecords'][$metadata->getIndexName()] = $listMetadataRecord;
                }
            }
        }

        // Perform search.
        $result = $this->searchSolr($params, true);

        // Initialize values
        $numberOfToplevels = 0;
        $documents = [];

        // searchSolr() may return an empty array if Solr is not available.
        if (!empty($result['numFound'])) {
            // flat array with uids from Solr search
            $documentSet = array_unique(array_column($result['documents'], 'uid'));

            if (empty($documentSet)) {
                // Nothing found: return the same keys as the regular return below.
                return ['solrResults' => [], 'numberOfToplevels' => 0, 'documents' => []];
            }

            // get the Extbase document objects for all uids
            $allDocuments = $this->findAllByUids($documentSet);
            $childrenOf = $this->findChildrenOfEach($documentSet);

            foreach ($result['documents'] as $doc) {
                if (empty($documents[$doc['uid']]) && !empty($allDocuments[$doc['uid']])) {
                    $documents[$doc['uid']] = $allDocuments[$doc['uid']];
                }
                if (!empty($documents[$doc['uid']])) {
                    if ($doc['toplevel'] === false) {
                        // this may be a chapter, article, ..., year
                        if ($doc['type'] === 'year') {
                            continue;
                        }
                        if (!empty($doc['page'])) {
                            // it's probably a fulltext or metadata search
                            $searchResult = [];
                            $searchResult['page'] = $doc['page'];
                            $searchResult['thumbnail'] = $doc['thumbnail'];
                            $searchResult['structure'] = $doc['type'];
                            $searchResult['title'] = $doc['title'];
                            foreach ($params['listMetadataRecords'] as $indexName => $solrField) {
                                if (isset($doc['metadata'][$indexName])) {
                                    $documents[$doc['uid']]['metadata'][$indexName] = $doc['metadata'][$indexName];
                                    $searchResult['metadata'][$indexName] = $doc['metadata'][$indexName];
                                }
                            }
                            if ($searchParams['fulltext'] == '1') {
                                $searchResult['snippet'] = $doc['snippet'];
                                $searchResult['highlight'] = $doc['highlight'];
                                $searchResult['highlight_word'] = $searchParams['query'];
                            }
                            $documents[$doc['uid']]['searchResults'][] = $searchResult;
                        }
                    } elseif ($doc['toplevel'] === true) {
                        $numberOfToplevels++;
                        foreach ($params['listMetadataRecords'] as $indexName => $solrField) {
                            if (isset($doc['metadata'][$indexName])) {
                                $documents[$doc['uid']]['metadata'][$indexName] = $doc['metadata'][$indexName];
                            }
                        }
                        if ($searchParams['fulltext'] != '1') {
                            $documents[$doc['uid']]['page'] = 1;
                            $children = $childrenOf[$doc['uid']] ?? [];
                            foreach ($children as $docChild) {
                                // We need only a few fields from the children, but we need them as array.
                                $childDocument = [
                                    'thumbnail' => $docChild['thumbnail'],
                                    'title' => $docChild['title'],
                                    'structure' => $docChild['structure'],
                                    'metsOrderlabel' => $docChild['metsOrderlabel'],
                                    'uid' => $docChild['uid'],
                                    'metadata' => $this->fetchMetadataFromSolr($docChild['uid'], $listedMetadata)
                                ];
                                $documents[$doc['uid']]['children'][$docChild['uid']] = $childDocument;
                            }
                        }
                    }
                    if (empty($documents[$doc['uid']]['metadata'])) {
                        $documents[$doc['uid']]['metadata'] = $this->fetchMetadataFromSolr($doc['uid'], $listedMetadata);
                    }
                    // get title of parent if empty
                    if (empty($documents[$doc['uid']]['title']) && ($documents[$doc['uid']]['partOf'] > 0)) {
                        $parentDocument = $this->findByUid($documents[$doc['uid']]['partOf']);
                        if ($parentDocument) {
                            $documents[$doc['uid']]['title'] = '[' . $parentDocument->getTitle() . ']';
                        }
                    }
                }
            }
        }

        return ['solrResults' => $result, 'numberOfToplevels' => $numberOfToplevels, 'documents' => $documents];
    }

    /**
     * Fetch all listed metadata for the given document from Solr
     *
     * @param int $uid the uid of the document
     * @param array|\TYPO3\CMS\Extbase\Persistence\Generic\QueryResult $listedMetadata the listed metadata records
     * @return array the metadata of the document, or an empty array if none was found
     */
    protected function fetchMetadataFromSolr($uid, $listedMetadata = [])
    {
        // Prepare query parameters.
        $params = [];
        $metadataArray = [];

        // Set some query parameters.
        $params['query'] = 'uid:' . $uid;
        $params['start'] = 0;
        $params['rows'] = 1;
        $params['sort'] = ['score' => 'desc'];
        $params['listMetadataRecords'] = [];

        // Restrict the fields to the required ones.
        $params['fields'] = 'uid,toplevel';

        if ($listedMetadata) {
            foreach ($listedMetadata as $metadata) {
                if ($metadata->getIndexStored() || $metadata->getIndexIndexed()) {
                    $listMetadataRecord = $metadata->getIndexName() . '_' . ($metadata->getIndexTokenized() ? 't' : 'u') . ($metadata->getIndexStored() ? 's' : 'u') . ($metadata->getIndexIndexed() ? 'i' : 'u');
                    $params['fields'] .= ',' . $listMetadataRecord;
                    $params['listMetadataRecords'][$metadata->getIndexName()] = $listMetadataRecord;
                }
            }
        }
        // Set filter query to just get toplevel documents.
        $params['filterquery'][] = ['query' => 'toplevel:true'];

        // Perform search.
        $result = $this->searchSolr($params, true);

        // searchSolr() may return an empty array if Solr is not available.
        if (!empty($result['numFound'])) {
            // At most one document is returned because of rows = 1 and toplevel:true.
            if (isset($result['documents'][0]['metadata'])) {
                $metadataArray = $result['documents'][0]['metadata'];
            }
        }
        return $metadataArray;
    }

    /**
     * Processes a search request
     *
     * @access protected
     *
     * @param array $parameters: Additional search parameters
     * @param boolean $enableCache: Enable caching of Solr requests
     *
     * @return array The result set with the keys 'documents' and 'numFound'
     */
    protected function searchSolr($parameters = [], $enableCache = true)
    {
        // Set additional query parameters.
        $parameters['start'] = 0;
        // Set query.
        $parameters['query'] = isset($parameters['query']) ? $parameters['query'] : '*';
        $parameters['filterquery'] = isset($parameters['filterquery']) ? $parameters['filterquery'] : [];

        // Perform Solr query.
        // Instantiate search object.
        $solr = Solr::getInstance($this->settings['solrcore']);
        if (!$solr->ready) {
            Helper::log('Apache Solr not available', LOG_SEVERITY_ERROR);
            // Return an empty result set with the keys the callers rely on.
            return ['documents' => [], 'numFound' => 0];
        }

        $cacheIdentifier = '';
        $cache = null;
        // Calculate cache identifier.
        if ($enableCache === true) {
            // print_r() returns a string when its second argument is true.
            $cacheIdentifier = Helper::digest($solr->core . print_r($parameters, true));
            $cache = GeneralUtility::makeInstance(CacheManager::class)->getCache('tx_dlf_solr');
        }
        $resultSet = [
            'documents' => [],
            'numFound' => 0,
        ];
        if ($enableCache === false || ($entry = $cache->get($cacheIdentifier)) === false) {
            // Not every caller sets 'fulltext', so guard against a missing key.
            $isFulltextSearch = isset($parameters['fulltext']) && $parameters['fulltext'] === true;

            $selectQuery = $solr->service->createSelect($parameters);

            if ($isFulltextSearch) {
                // get highlighting component and apply settings
                $selectQuery->getHighlighting();
            }

            $solrRequest = $solr->service->createRequest($selectQuery);

            if ($isFulltextSearch) {
                // If it is a fulltext search, enable highlighting.
                // The field on which highlighting is performed;
                // required for OCR highlighting.
                $solrRequest->addParam('hl.ocr.fl', 'fulltext');
                // return the coordinates of highlighted search as absolute coordinates
                $solrRequest->addParam('hl.ocr.absoluteHighlights', 'on');
                // max amount of snippets for a single page
                $solrRequest->addParam('hl.snippets', 20);
                // we store the fulltext on page level and can disable this option
                $solrRequest->addParam('hl.ocr.trackPages', 'off');
            }

            // Perform search for all documents with the same uid that either match the search or are marked as toplevel.
            $response = $solr->service->executeRequest($solrRequest);
            $result = $solr->service->createResult($selectQuery, $response);

            /** @scrutinizer ignore-call */
            $resultSet['numFound'] = $result->getNumFound();
            $highlighting = [];
            if ($isFulltextSearch) {
                $data = $result->getData();
                $highlighting = $data['ocrHighlighting'];
            }
            $fields = Solr::getFields();

            foreach ($result as $record) {
                $resultDocument = new ResultDocument($record, $highlighting, $fields);
885
886
                $document = [
887
                    'id' => $resultDocument->getId(),
888
                    'page' => $resultDocument->getPage(),
889
                    'snippet' => $resultDocument->getSnippets(),
890
                    'thumbnail' => $resultDocument->getThumbnail(),
891
                    'title' => $resultDocument->getTitle(),
892
                    'toplevel' => $resultDocument->getToplevel(),
893
                    'type' => $resultDocument->getType(),
894
                    'uid' => !empty($resultDocument->getUid()) ? $resultDocument->getUid() : $parameters['uid'],
895
                    'highlight' => $resultDocument->getHighlightsIds(),
896
                ];
897
                foreach ($parameters['listMetadataRecords'] as $indexName => $solrField) {
898
                    if (!empty($record->$solrField)) {
899
                        $document['metadata'][$indexName] = $record->$solrField;
900
                    }
901
                }
902
                $resultSet['documents'][] = $document;
903
            }
904
905
            // Save value in cache.
906
            if (!empty($resultSet) && $enableCache === true) {
907
                $cache->set($cacheIdentifier, $resultSet);
908
            }
909
        } else {
910
            // Return cache hit.
911
            $resultSet = $entry;
912
        }
913
        return $resultSet;
914
    }
915
916
}
917