Passed
Push — deprecations ( 0237b0...c7191d )
by Tomas Norre
06:21
created

CrawlerApi   B

Complexity

Total Complexity 45

Size/Duplication

Total Lines 450
Duplicated Lines 0 %

Test Coverage

Coverage 38.02%

Importance

Changes 1
Bugs 0 Features 0
Metric Value
eloc 158
dl 0
loc 450
ccs 92
cts 242
cp 0.3802
rs 8.8
c 1
b 0
f 0
wmc 45

19 Methods

Rating   Name   Duplication   Size   Complexity  
A getLastProcessedQueueEntries() 0 3 1
A addPageToQueue() 0 5 1
A filterUnallowedConfigurations() 0 12 4
A setAllowedConfigurations() 0 3 1
A addPageToQueueTimed() 0 30 3
A getCurrentCrawlingSpeed() 0 33 5
A getQueueStatistics() 0 5 1
A getAllowedConfigurations() 0 3 1
A getLatestCrawlTimestampForPage() 0 32 4
A overwriteSetId() 0 3 1
A getSetId() 0 3 1
A getCrawlHistoryForPage() 0 17 2
A __construct() 0 6 1
B getPerformanceData() 0 48 8
A countEntriesInQueueForPageByScheduleTime() 0 26 2
A findCrawler() 0 11 3
A getQueueStatisticsByConfiguration() 0 12 2
A getCrawlerProcInstructions() 0 10 3
A getActiveProcessesCount() 0 5 1

How to fix   Complexity   

Complex Class

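Complex classes like CrawlerApi often do a lot of different things. To break such a class down, we need to identify a cohesive component within that class. A common approach to find such a component is to look for fields and methods that share the same prefixes or suffixes. In CrawlerApi, for example, getQueueStatistics(), getQueueStatisticsByConfiguration() and getLastProcessedQueueEntries() all read from $queueRepository.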

Once you have determined the fields that belong together, you can apply the Extract Class refactoring. If the component makes sense as a sub-class, Extract Subclass is also a candidate, and is often faster.
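As a rough illustration of Extract Class, the sketch below moves the logic of getQueueStatistics() into its own class. The names QueueStatistics and StubQueueRepository are hypothetical and not part of the crawler extension; the stub only stands in for the real QueueRepository so that the example runs on its own.

```php
<?php

declare(strict_types=1);

// Hypothetical stand-in for AOE\Crawler\Domain\Repository\QueueRepository.
// It offers only the two counting methods that getQueueStatistics() calls.
final class StubQueueRepository
{
    public function countAllAssignedPendingItems(): int
    {
        return 3;
    }

    public function countAllPendingItems(): int
    {
        return 7;
    }
}

// Hypothetical extracted class: owns the queue statistics that
// CrawlerApi::getQueueStatistics() currently builds itself.
final class QueueStatistics
{
    /** @var StubQueueRepository */
    private $queueRepository;

    public function __construct(StubQueueRepository $queueRepository)
    {
        $this->queueRepository = $queueRepository;
    }

    public function getQueueStatistics(): array
    {
        return [
            'assignedButUnprocessed' => $this->queueRepository->countAllAssignedPendingItems(),
            'unprocessed' => $this->queueRepository->countAllPendingItems(),
        ];
    }
}

$stats = (new QueueStatistics(new StubQueueRepository()))->getQueueStatistics();
```

CrawlerApi could then keep a thin getQueueStatistics() that delegates to the extracted class until the deprecated facade is removed.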

While breaking up the class, it is a good idea to analyze how other classes use CrawlerApi, and based on these observations, apply Extract Interface, too.
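A minimal sketch of Extract Interface, assuming some callers only schedule pages: they could depend on a narrow interface instead of the whole CrawlerApi. PageQueueInterface, RecordingPageQueue and schedulePages() are hypothetical names used for illustration, and the int type hints are an assumption (the original methods are untyped).

```php
<?php

declare(strict_types=1);

// Hypothetical interface: the subset of CrawlerApi used by a caller
// that only adds pages to the queue.
interface PageQueueInterface
{
    public function addPageToQueue(int $uid): void;

    public function addPageToQueueTimed(int $uid, int $time): void;
}

// Recording fake that stands in for CrawlerApi in this sketch.
final class RecordingPageQueue implements PageQueueInterface
{
    /** @var array[] list of ['uid' => int, 'time' => int] */
    public $scheduled = [];

    public function addPageToQueue(int $uid): void
    {
        // Mirrors CrawlerApi: untimed entries are added with timestamp 0.
        $this->addPageToQueueTimed($uid, 0);
    }

    public function addPageToQueueTimed(int $uid, int $time): void
    {
        $this->scheduled[] = ['uid' => $uid, 'time' => $time];
    }
}

// Hypothetical caller that depends on the interface, not on CrawlerApi.
function schedulePages(PageQueueInterface $queue, array $pageUids): void
{
    foreach ($pageUids as $uid) {
        $queue->addPageToQueue((int) $uid);
    }
}

$queue = new RecordingPageQueue();
schedulePages($queue, [12, 34]);
```

A caller written this way keeps working when the scheduling logic moves from CrawlerApi to another class that implements the same interface.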

<?php

declare(strict_types=1);

namespace AOE\Crawler\Api;

/***************************************************************
 *  Copyright notice
 *
 *  (c) 2018 AOE GmbH <[email protected]>
 *
 *  All rights reserved
 *
 *  This script is part of the TYPO3 project. The TYPO3 project is
 *  free software; you can redistribute it and/or modify
 *  it under the terms of the GNU General Public License as published by
 *  the Free Software Foundation; either version 3 of the License, or
 *  (at your option) any later version.
 *
 *  The GNU General Public License can be found at
 *  http://www.gnu.org/copyleft/gpl.html.
 *
 *  This script is distributed in the hope that it will be useful,
 *  but WITHOUT ANY WARRANTY; without even the implied warranty of
 *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 *  GNU General Public License for more details.
 *
 *  This copyright notice MUST APPEAR in all copies of the script!
 ***************************************************************/

use AOE\Crawler\Controller\CrawlerController;
use AOE\Crawler\Domain\Repository\ProcessRepository;
use AOE\Crawler\Domain\Repository\QueueRepository;
use TYPO3\CMS\Core\Compatibility\PublicMethodDeprecationTrait;
use TYPO3\CMS\Core\Compatibility\PublicPropertyDeprecationTrait;
use TYPO3\CMS\Core\Database\ConnectionPool;
use TYPO3\CMS\Core\Database\Query\QueryBuilder;
use TYPO3\CMS\Core\Utility\GeneralUtility;
use TYPO3\CMS\Extbase\Object\ObjectManager;
use TYPO3\CMS\Frontend\Page\PageRepository;

/**
 * Class CrawlerApi
 * @deprecated The CrawlerApi is deprecated and will be removed in v10.x
 */
class CrawlerApi
{
    /**
     * @var QueueRepository
     * @deprecated
     */
    protected $queueRepository;

    /**
     * @var array
     * @deprecated
     */
    protected $allowedConfigurations = [];

    /**
     * @var QueryBuilder
     * @deprecated
     */
    protected $queryBuilder;

    /**
     * @var string
     * @deprecated
     */
    protected $tableName = 'tx_crawler_queue';

    /**
     * @var CrawlerController
     * @deprecated
     */
    protected $crawlerController;

    /**
     * CrawlerApi constructor.
     * @deprecated
     */
    public function __construct()
    {
        // E_USER_DEPRECATED marks this as a deprecation notice instead of the default E_USER_NOTICE
        trigger_error('The ' . __CLASS__ . ' is deprecated and will be removed in v10.x; most of its functionality has moved to the respective Domain Model and Repository, and some to the CrawlerController.', E_USER_DEPRECATED);
        /** @var ObjectManager $objectManager */
        $objectManager = GeneralUtility::makeInstance(ObjectManager::class);
        $this->queueRepository = $objectManager->get(QueueRepository::class);
    }

    /**
     * Each crawler run has a setid, this facade method delegates
     * it to the crawler object
     *
     * @param int $id
     * @throws \Exception
     */
    public function overwriteSetId(int $id): void
    {
        $this->findCrawler()->setID = $id;
    }

    /**
     * This method is used to limit the configuration selection to
     * a set of configurations.
     *
     * @param array $allowedConfigurations
     */
    public function setAllowedConfigurations(array $allowedConfigurations): void
    {
        $this->allowedConfigurations = $allowedConfigurations;
    }

    /**
     * @return array
     */
    public function getAllowedConfigurations()
    {
        return $this->allowedConfigurations;
    }

    /**
     * Returns the setID of the crawler
     *
     * @return int
     */
    public function getSetId()
    {
        return $this->findCrawler()->setID;
    }

    /**
     * Method to get an instance of the internal crawler singleton
     *
     * @return CrawlerController Instance of the crawler lib
     *
     * @throws \Exception
     */
    protected function findCrawler()
    {
        if (!is_object($this->crawlerController)) {
            $this->crawlerController = GeneralUtility::makeInstance(CrawlerController::class);
            $this->crawlerController->setID = GeneralUtility::md5int(microtime());
        }

        if (is_object($this->crawlerController)) {
            return $this->crawlerController;
        } else {
            throw new \Exception('no crawler object', 1512659759);
        }
    }

    /**
     * Adds a page to the crawlerqueue by uid
     *
     * @param int $uid uid
     */
    public function addPageToQueue($uid): void
    {
        $uid = intval($uid);
        //non timed elements will be added with timestamp 0
        $this->addPageToQueueTimed($uid, 0);
    }

    /**
     * This method is used to limit the processing instructions to the processing instructions
     * that are allowed.
     */
    protected function filterUnallowedConfigurations(array $configurations): array
    {
        if (count($this->allowedConfigurations) > 0) {
            // remove configuration that does not match the current selection
            foreach ($configurations as $confKey => $confArray) {
                if (!in_array($confKey, $this->allowedConfigurations)) {
                    unset($configurations[$confKey]);
                }
            }
        }

        return $configurations;
    }

    /**
     * Adds a page to the crawlerqueue by uid and sets a
     * timestamp when the page should be crawled.
     *
     * @param int $uid pageid
     * @param int $time timestamp
     *
     * @return void
     * @throws \Exception
     */
    public function addPageToQueueTimed($uid, $time): void
    {
        $uid = intval($uid);
        $time = intval($time);

        $crawler = $this->findCrawler();
        $pageData = GeneralUtility::makeInstance(PageRepository::class)->getPage($uid);
        $configurations = $crawler->getUrlsForPageRow($pageData);
        // filterUnallowedConfigurations() is declared to return an array,
        // so no additional is_array() check is needed before iterating
        $configurations = $this->filterUnallowedConfigurations($configurations);
        $downloadUrls = [];
        $duplicateTrack = [];

        foreach ($configurations as $cv) {
            //enable inserting of entries
            $crawler->registerQueueEntriesInternallyOnly = false;
            $crawler->urlListFromUrlArray(
                $cv,
                $pageData,
                $time,
                300,
                true,
                false,
                $duplicateTrack,
                $downloadUrls,
                array_keys($this->getCrawlerProcInstructions())
            );

            //reset the queue because the entries have been written to the db
            unset($crawler->queueEntries);
        }
    }

    /**
     * Counts all entries in the database which are scheduled for a given page id and a schedule timestamp.
     *
     * @param int $page_uid
     * @param int $schedule_timestamp
     *
     * @return int
     */
    protected function countEntriesInQueueForPageByScheduleTime($page_uid, $schedule_timestamp)
    {
        $queryBuilder = GeneralUtility::makeInstance(ConnectionPool::class)->getQueryBuilderForTable($this->tableName);
        $count = $queryBuilder
            ->count('*')
            ->from($this->tableName);

        //is the same page scheduled for the same time and has not been executed yet?
        //un-timed elements need an exec_time with 0 because they can occur multiple times
        if ($schedule_timestamp == 0) {
            $count->where(
                $queryBuilder->expr()->eq('page_id', $queryBuilder->createNamedParameter($page_uid, \PDO::PARAM_INT)),
                $queryBuilder->expr()->eq('exec_time', 0),
                $queryBuilder->expr()->eq('scheduled', $queryBuilder->createNamedParameter($schedule_timestamp, \PDO::PARAM_INT))
            );
        } else {
            //timed elements have a fixed schedule time; if a record with this time
            //exists, it is either queued for the future or was queued in the past and
            //has therefore already been processed.
            $count->where(
                $queryBuilder->expr()->eq('page_id', $queryBuilder->createNamedParameter($page_uid, \PDO::PARAM_INT)),
                $queryBuilder->expr()->eq('scheduled', $queryBuilder->createNamedParameter($schedule_timestamp, \PDO::PARAM_INT))
            );
        }

        // COUNT(*) yields a single row, so read its value instead of counting result rows
        return (int) $count->execute()->fetchColumn(0);
    }

    /**
     * Returns the latest crawl timestamp for a page.
     *
     * @param int $uid uid id of the page
     * @param bool $future_crawldates_only
     * @param bool $unprocessed_only
     *
     * @return int
     */
    public function getLatestCrawlTimestampForPage($uid, $future_crawldates_only = false, $unprocessed_only = false)
    {
        $uid = intval($uid);

        $queryBuilder = GeneralUtility::makeInstance(ConnectionPool::class)->getQueryBuilderForTable($this->tableName);
        $query = $queryBuilder
            ->from($this->tableName)
            ->selectLiteral('max(scheduled) as latest')
            ->where(
                $queryBuilder->expr()->eq('page_id', $queryBuilder->createNamedParameter($uid))
            );

        if ($future_crawldates_only) {
            $query->andWhere(
                $queryBuilder->expr()->gt('scheduled', time())
            );
        }

        if ($unprocessed_only) {
            $query->andWhere(
                $queryBuilder->expr()->eq('exec_time', 0)
            );
        }

        $row = $query->execute()->fetch(0);
        if ($row['latest']) {
            $res = $row['latest'];
        } else {
            $res = 0;
        }

        return intval($res);
    }

    /**
     * Returns an array with timestamps when the page has been scheduled for crawling and
     * at what time the scheduled crawl has been executed. The array also contains items that are
     * scheduled but have not been crawled yet.
     *
     * @return array array with the crawl-history of a page => 0 : scheduled time , 1 : executed_time, 2 : set_id
     */
    public function getCrawlHistoryForPage(int $uid, int $limit = 0)
    {
        $uid = intval($uid);
        $limit = intval($limit);

        $queryBuilder = GeneralUtility::makeInstance(ConnectionPool::class)->getQueryBuilderForTable($this->tableName);
        $statement = $queryBuilder
            ->from($this->tableName)
            ->select('scheduled', 'exec_time', 'set_id')
            ->where(
                $queryBuilder->expr()->eq('page_id', $queryBuilder->createNamedParameter($uid, \PDO::PARAM_INT))
            );
        if ($limit) {
            $statement->setMaxResults($limit);
        }

        return $statement->execute()->fetchAll();
    }

    /**
     * Reads the registered processingInstructions of the crawler
     *
     * @return array
     */
    private function getCrawlerProcInstructions(): array
    {
        $crawlerProcInstructions = [];
        if (!empty($GLOBALS['TYPO3_CONF_VARS']['EXTCONF']['crawler']['procInstructions'])) {
            foreach ($GLOBALS['TYPO3_CONF_VARS']['EXTCONF']['crawler']['procInstructions'] as $configuration) {
                $crawlerProcInstructions[$configuration['key']] = $configuration['value'];
            }
        }

        return $crawlerProcInstructions;
    }

    /**
     * Get queue statistics
     */
    public function getQueueStatistics(): array
    {
        return [
            'assignedButUnprocessed' => $this->queueRepository->countAllAssignedPendingItems(),
            'unprocessed' => $this->queueRepository->countAllPendingItems(),
        ];
    }

    /**
     * Get queue statistics by configuration
     *
     * @return array array of array('configuration' => <>, 'assignedButUnprocessed' => <>, 'unprocessed' => <>)
     */
    public function getQueueStatisticsByConfiguration()
    {
        $statistics = $this->queueRepository->countPendingItemsGroupedByConfigurationKey();
        $setIds = $this->queueRepository->getSetIdWithUnprocessedEntries();
        $totals = $this->queueRepository->getTotalQueueEntriesByConfiguration($setIds);

        // "merge" arrays; a configuration without a total counts as 0
        foreach ($statistics as &$value) {
            $value['total'] = $totals[$value['configuration']] ?? 0;
        }
        // break the reference so later writes to $value cannot alter $statistics
        unset($value);

        return $statistics;
    }

    /**
     * Get active processes count
     */
    public function getActiveProcessesCount(): int
    {
        $processRepository = new ProcessRepository();

        return $processRepository->countActive();
    }

    /**
     * @param int $limit
     * @return array
     */
    public function getLastProcessedQueueEntries(int $limit): array
    {
        return $this->queueRepository->getLastProcessedEntries($limit);
    }

    /**
     * Get current crawling speed
     *
     * @return int|float|bool
     */
    public function getCurrentCrawlingSpeed()
    {
        $lastProcessedEntries = $this->queueRepository->getLastProcessedEntriesTimestamps();

        if (count($lastProcessedEntries) < 10) {
            // not enough information
            return false;
        }

        $tooOldDelta = 60; // time between two entries is "too old"

        $compareValue = time();
        $startTime = $lastProcessedEntries[0];

        $pages = 0;

        reset($lastProcessedEntries);
        foreach ($lastProcessedEntries as $timestamp) {
            if ($compareValue - $timestamp > $tooOldDelta) {
                break;
            }
            $compareValue = $timestamp;
            $pages++;
        }

        if ($pages < 10) {
            // not enough information
            return false;
        }
        $oldestTimestampThatIsNotTooOld = $compareValue;
        $time = $startTime - $oldestTimestampThatIsNotTooOld;

        if ($time <= 0) {
            // all counted entries share one timestamp; avoid a division by zero
            return false;
        }

        return $pages / ($time / 60);
    }

    /**
     * Get some performance data
     *
     * @param integer $start
     * @param integer $end
     * @param integer $resolution
     *
     * @return array data
     *
     * @throws \Exception
     */
    public function getPerformanceData($start, $end, $resolution)
    {
        $data = [];

        $data['urlcount'] = 0;
        $data['start'] = $start;
        $data['end'] = $end;
        $data['duration'] = $data['end'] - $data['start'];

        if ($data['duration'] < 1) {
            throw new \Exception('End timestamp must be after start timestamp', 1512659945);
        }

        for ($slotStart = $start; $slotStart < $end; $slotStart += $resolution) {
            $slotEnd = min($slotStart + $resolution - 1, $end);
            $slotData = $this->queueRepository->getPerformanceData($slotStart, $slotEnd);

            $slotUrlCount = 0;
            foreach ($slotData as &$processData) {
                $duration = $processData['end'] - $processData['start'];
                if ($processData['urlcount'] > 5 && $duration > 0) {
                    $processData['speed'] = 60 * 1 / ($duration / $processData['urlcount']);
                }
                $slotUrlCount += $processData['urlcount'];
            }
            // break the reference before $slotData is reused
            unset($processData);

            $data['urlcount'] += $slotUrlCount;

            $data['slots'][$slotEnd] = [
                'amountProcesses' => count($slotData),
                'urlcount' => $slotUrlCount,
                'processes' => $slotData,
            ];

            $slotDuration = $slotEnd - $slotStart;
            if ($slotUrlCount > 5 && $slotDuration > 0) {
                // URLs per minute: divide the slot duration, not only $slotStart, by the URL count
                $data['slots'][$slotEnd]['speed'] = 60 * 1 / ($slotDuration / $slotUrlCount);
            } else {
                $data['slots'][$slotEnd]['speed'] = 0;
            }
        }

        if ($data['urlcount'] > 5) {
            $data['speed'] = 60 * 1 / ($data['duration'] / $data['urlcount']);
        } else {
            $data['speed'] = 0;
        }

        return $data;
    }
}
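
The speed calculation in getCurrentCrawlingSpeed() can be sketched as a stand-alone function, which is easier to test than the method on the class. This is an illustration under assumptions, not the extension's code: crawlingSpeed() and its parameters are hypothetical, timestamps are expected newest first, and it returns null where the original returns false.

```php
<?php

declare(strict_types=1);

/**
 * Pages per minute computed from timestamps ordered newest first,
 * or null when there is too little recent data.
 */
function crawlingSpeed(array $timestamps, int $now, int $maxGap = 60, int $minPages = 10): ?float
{
    if (count($timestamps) < $minPages) {
        return null;
    }

    $previous = $now;
    $pages = 0;
    foreach ($timestamps as $timestamp) {
        // Stop at the first gap longer than $maxGap seconds.
        if ($previous - $timestamp > $maxGap) {
            break;
        }
        $previous = $timestamp;
        $pages++;
    }

    // Seconds between the newest entry and the oldest entry still counted.
    $seconds = $timestamps[0] - $previous;
    if ($pages < $minPages || $seconds <= 0) {
        return null;
    }

    return $pages / ($seconds / 60);
}

// Ten entries, one every 6 seconds, the newest exactly at $now.
$now = 1000;
$timestamps = [];
for ($i = 0; $i < 10; $i++) {
    $timestamps[] = $now - 6 * $i;
}
$speed = crawlingSpeed($timestamps, $now);
$tooFew = crawlingSpeed([1000, 994], $now);
```

Passing the current time in as $now, instead of calling time() inside the function, keeps the result reproducible in tests.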