<?php

declare(strict_types=1);

namespace AOE\Crawler;

/*
 * (c) 2020 AOE GmbH <[email protected]>
 *
 * This file is part of the TYPO3 Crawler Extension.
 *
 * It is free software; you can redistribute it and/or modify it under
 * the terms of the GNU General Public License, either version 2
 * of the License, or any later version.
 *
 * For the full copyright and license information, please read the
 * LICENSE.txt file that was distributed with this source code.
 *
 * The TYPO3 project - inspiring people to share!
 */

use AOE\Crawler\Controller\CrawlerController;
use AOE\Crawler\Converter\JsonCompatibilityConverter;
use AOE\Crawler\CrawlStrategy\CallbackExecutionStrategy;
use AOE\Crawler\CrawlStrategy\CrawlStrategyFactory;
use AOE\Crawler\Utility\SignalSlotUtility;
use TYPO3\CMS\Core\Http\Uri;
use TYPO3\CMS\Core\SingletonInterface;
use TYPO3\CMS\Core\Utility\GeneralUtility;

/**
 * Fetches a URL based on the selected strategy or via a callback.
 */
class QueueExecutor implements SingletonInterface
{
    /**
     * @var CrawlStrategy
     */
    protected $crawlStrategy;

    public function __construct(CrawlStrategyFactory $crawlStrategyFactory)
    {
        $this->crawlStrategy = $crawlStrategyFactory->create();
    }

    /**
     * Takes a queue record and fetches the contents of the URL.
     * In the future, updating the queue item & additional signal/slot/events should also happen in here.
     *
     * @return array|bool|mixed|string
     */
    public function executeQueueItem(array $queueItem, CrawlerController $crawlerController)
    {
        $parameters = '';
        if (isset($queueItem['parameters'])) {
            // Decode parameters:
            /** @var JsonCompatibilityConverter $jsonCompatibleConverter */
            $jsonCompatibleConverter = GeneralUtility::makeInstance(JsonCompatibilityConverter::class);
            $parameters = $jsonCompatibleConverter->convert($queueItem['parameters']);
        }

        if (! is_array($parameters) || empty($parameters)) {
            return 'ERROR';
        }
        if ($parameters['_CALLBACKOBJ']) {
            $className = $parameters['_CALLBACKOBJ'];
            unset($parameters['_CALLBACKOBJ']);
            $result = GeneralUtility::makeInstance(CallbackExecutionStrategy::class)
                ->fetchByCallback($className, $parameters, $crawlerController);
            $result = ['content' => json_encode($result)];
        } else {
            // Regular FE request
            $crawlerId = $this->generateCrawlerIdFromQueueItem($queueItem);

            // Get result:
            $url = new Uri($parameters['url']);
            $result = $this->crawlStrategy->fetchUrlContents($url, $crawlerId);
            if ($result !== false) {
                $result = ['content' => json_encode($result)];
            }

            $signalPayload = ['url' => $parameters['url'], 'result' => $result];
            SignalSlotUtility::emitSignal(
                self::class,
                SignalSlotUtility::SIGNAL_URL_CRAWLED,
                $signalPayload
            );
        }
        return $result;
    }

    protected function generateCrawlerIdFromQueueItem(array $queueItem): string
    {
        return $queueItem['qid'] . ':' . md5($queueItem['qid'] . '|' . $queueItem['set_id'] . '|' . $GLOBALS['TYPO3_CONF_VARS']['SYS']['encryptionKey']);
    }
}

Our type inference engine has found an assignment to a property that is incompatible with the declared type of that property.
Either this assignment is in error, or the assigned type should be added to the documentation/type hint for that property.
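
One plausible reading of this report: the file has no `use` statement for `CrawlStrategy`, so the `@var CrawlStrategy` annotation is resolved relative to the current namespace (`AOE\Crawler\CrawlStrategy`) rather than to whatever type `CrawlStrategyFactory::create()` actually returns. The sketch below illustrates that resolution rule and the import that would fix it. It uses stand-in classes in a hypothetical `Sketch\Crawler` namespace. The interface name, its location next to the factory, the `NullStrategy` class and the `getCrawlStrategy()` accessor are all assumptions made for illustration, not the extension's confirmed API.

```php
<?php

declare(strict_types=1);

namespace Sketch\Crawler\CrawlStrategy;

// Hypothetical stand-in for the contract returned by the factory.
interface CrawlStrategy
{
    public function fetchUrlContents(string $url, string $crawlerId);
}

// Hypothetical strategy that never fetches anything.
final class NullStrategy implements CrawlStrategy
{
    public function fetchUrlContents(string $url, string $crawlerId)
    {
        return false;
    }
}

// Hypothetical factory with an explicit return type.
final class CrawlStrategyFactory
{
    public function create(): CrawlStrategy
    {
        return new NullStrategy();
    }
}

namespace Sketch\Crawler;

// Without this import, "CrawlStrategy" below would resolve to
// Sketch\Crawler\CrawlStrategy, a class that does not exist.
use Sketch\Crawler\CrawlStrategy\CrawlStrategy;
use Sketch\Crawler\CrawlStrategy\CrawlStrategyFactory;

class QueueExecutor
{
    /**
     * @var CrawlStrategy
     */
    protected $crawlStrategy;

    public function __construct(CrawlStrategyFactory $crawlStrategyFactory)
    {
        // The declared property type now matches the factory's return type.
        $this->crawlStrategy = $crawlStrategyFactory->create();
    }

    // Accessor added only so the sketch can be checked from outside.
    public function getCrawlStrategy(): CrawlStrategy
    {
        return $this->crawlStrategy;
    }
}

$executor = new QueueExecutor(new CrawlStrategyFactory());
// Shows which fully qualified name the short name resolves to.
echo CrawlStrategy::class, PHP_EOL;
```

If the project requires PHP 7.4 or later, declaring the property as `protected CrawlStrategy $crawlStrategy;` would let the engine enforce the type at runtime instead of relying on the docblock alone.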