Duplicated code is one of the most pungent code smells. A common rule of thumb is to restructure code once it is duplicated in three or more places.
Common duplication problems and their corresponding solutions are:
    <?php
    // ... (lines collapsed in this view)
    class Indexer extends SearchDbConnected
    {
        const SQLITE_MAX_COMPOUND_SELECT = 100;
        protected $filters = array(
            'DutchStopWords',
            'EnglishStopWords'
        );
        protected $storageDir;
        /**
         * @var double
         */
        protected $loggingStart;
        /**
         * @var string
         */
        protected $log;
        /**
         * @var double
         */
        protected $lastLog;

        const SEARCH_TEMP_DB = 'search_tmp.db';

        /**
         * Creates a new temporary search db, cleans it if it exists
         * then calculates and stores the search index in this db
         * and finally if indexing completed replaces the current search
         * db with the temporary one. Returns the log in string format.
         * @return string
         */
        public function updateIndex()
        // ... (body collapsed in this view)

        /**
         * Count how often a term is used in a document
         *
         * @param $documents
         */
        public function createDocumentTermCount($documents)
        // ... (body collapsed in this view)

        /**
         * Calculate the frequency index for a term with
         * a field
         */
        public function createDocumentTermFrequency()
        // ... (body collapsed in this view)


        /**
         * Resets the entire index
         */
        public function resetIndex()
        // ... (body collapsed in this view)

        /**
         * Calculates the inverse document frequency for each
         * term. This is a representation of how often a certain
         * term is used in comparison to all terms.
         */
        public function createInverseDocumentFrequency()
        // ... (body collapsed in this view)

        /**
         * @return int|mixed
         */
        private function getTotalDocumentCount()
        // ... (body collapsed in this view)

        /**
         * Calculates the Term Field Length Norm.
         * This is an index determining how important a
         * term is, based on the total length of the field
         * it comes from.
         */
        private function createTermFieldLengthNorm()
        // ... (body collapsed in this view)

        /**
         * Stores the time the indexing started in memory
         */
        private function startLogging()
        // ... (body collapsed in this view)

        /**
         * Adds a logline with the time since last log
         * @param $string
         */
        private function addLog($string)
        // ... (body collapsed in this view)

        /**
         * Creates the SQLite \PDO object if it doesnt
         * exist and returns it.
         * @return \PDO
         */
        protected function getSearchDbHandle() // flagged: code duplication
        // ... (body collapsed in this view)

        /**
         * Replaces the old search index database with the new one.
         */
        public function replaceOldIndex()
        // ... (body collapsed in this view)
    }
If you need to duplicate the same code in three or more different places, we strongly encourage you to extract it into a single class or operation.
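As a minimal sketch of that extraction, assume the flagged `getSearchDbHandle()` repeats lazy connection logic that also exists in a related class (this page does not show the duplicate, so that is a guess). The shared part could then move into a single base class. `SqliteConnected`, `getDbPath()`, `LiveSearch`, `TempIndexer`, and the `search.db` file name are hypothetical names for illustration; only `getSearchDbHandle()` and `search_tmp.db` come from the listing.

```php
<?php

// Hypothetical sketch: class and method names other than
// getSearchDbHandle() are illustrative, not from the real code base.
abstract class SqliteConnected
{
    /** @var \PDO|null Lazily created connection, shared per instance */
    private $dbHandle;

    /**
     * Path of the SQLite file this class should open.
     * @return string
     */
    abstract protected function getDbPath();

    /**
     * Creates the SQLite \PDO object if it doesn't exist and returns it.
     * This is the single copy of the formerly duplicated logic.
     * @return \PDO
     */
    protected function getSearchDbHandle()
    {
        if ($this->dbHandle === null) {
            $this->dbHandle = new \PDO('sqlite:' . $this->getDbPath());
            // Throw exceptions on SQL errors instead of failing silently.
            $this->dbHandle->setAttribute(\PDO::ATTR_ERRMODE, \PDO::ERRMODE_EXCEPTION);
        }
        return $this->dbHandle;
    }
}

// Each subclass now only states which file it works on.
class LiveSearch extends SqliteConnected
{
    protected function getDbPath()
    {
        return 'search.db'; // assumed name of the live index file
    }
}

class TempIndexer extends SqliteConnected
{
    protected function getDbPath()
    {
        return 'search_tmp.db'; // matches SEARCH_TEMP_DB in the listing
    }
}
```

With this shape, the only thing that differs between the two classes is the file path, so a change to how the connection is opened happens in one place. A trait would work similarly if the classes cannot share a parent.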