Completed
Push — prado-3.3 ( e90646...0b76d5 )
by Fabio
23:37 queued 03:01
created

Zend_Search_Lucene_Search_Query_MultiTerm   C

Complexity

Total Complexity 56

Size/Duplication

Total Lines 402
Duplicated Lines 7.71 %

Coupling/Cohesion

Components 1
Dependencies 4

Importance

Changes 0
Metric Value
dl 31
loc 402
rs 5.5199
c 0
b 0
f 0
wmc 56
lcom 1
cbo 4

11 Methods

Rating  Name                              Duplication  Size  Complexity
A       __construct()                     0            20    5
A       addTerm()                         0            20    4
A       getTerms()                        0            4     1
A       getSigns()                        0            4     1
A       setWeight()                       0            4     1
A       _createWeight()                   0            4     1
B       _calculateConjunctionResult()     31           31    8
D       _calculateNonConjunctionResult()  0            82    18
A       _conjunctionScore()               0            17    3
B       _nonConjunctionScore()            0            34    8
B       score()                           0            24    6


Duplicated Code

Duplicate code is one of the most pungent code smells. A rule that is often used is to re-structure code once it is duplicated in three or more places.
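
As a minimal sketch of what such a restructuring could look like for this class: the array-based loop marked "substitute for bitset_intersection" in the listing appears in both _calculateConjunctionResult() and _calculateNonConjunctionResult(). The helper names below (intersectDocSets, unionDocSets) are hypothetical and not part of Zend Framework; they assume document sets keyed by document ID, as produced by array_flip($reader->termDocs($term)).

```php
<?php
/**
 * Hypothetical helper: keep only the entries of $left whose document ID
 * (the array key) also exists in $right. Stands in for the repeated
 * "substitute for bitset_intersection" loops.
 */
function intersectDocSets(array $left, array $right)
{
    return array_intersect_key($left, $right);
}

/**
 * Hypothetical helper: merge two document sets by key. Only the keys
 * matter here, so values from $left win when a key exists in both.
 * Stands in for the "substitute for bitset_union" loops.
 */
function unionDocSets(array $left, array $right)
{
    return $left + $right;
}

// Document sets keyed by document ID, as array_flip() would produce them.
$first  = array_flip(array(1, 3, 5));
$second = array_flip(array(3, 5, 7));

$common = array_keys(intersectDocSets($first, $second));
$either = array_keys(unionDocSets($first, $second));
```

With helpers like these, each duplicated loop in the class could shrink to a single call, and the intersection logic would live in one place.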

Common duplication problems, and corresponding solutions are:

Complex Class

Tip: Before tackling complexity, make sure that you eliminate any duplication first. This can often reduce the size of classes significantly.

Complex classes like Zend_Search_Lucene_Search_Query_MultiTerm often do a lot of different things. To break such a class down, we need to identify a cohesive component within that class. A common approach to finding such a component is to look for fields and methods that share the same prefixes or suffixes. You can also look at the cohesion graph to spot any unconnected or weakly connected components.

Once you have determined the fields that belong together, you can apply the Extract Class refactoring. If the component makes sense as a sub-class, Extract Subclass is also a candidate, and is often faster.
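
One candidate component in this class is the pair of $_terms and $_signs fields, which are always read and written together. The following is only a sketch of Extract Class under that assumption: QueryTermList and its methods are hypothetical names, and plain strings stand in for Zend_Search_Lucene_Index_Term objects so that the example runs on its own.

```php
<?php
/**
 * Hypothetical extracted class (not part of Zend Framework): keeps query
 * terms and their signs together so the two lists cannot drift apart.
 * Signs: true = required, false = prohibited, null = neither.
 */
final class QueryTermList
{
    /** @var array terms, in insertion order */
    private $terms = array();

    /** @var array signs, always an array with one entry per term */
    private $signs = array();

    public function add($term, $sign = null)
    {
        $this->terms[] = $term;
        $this->signs[] = $sign;
    }

    /**
     * A conjunction query is one in which every term is required.
     * An empty list is treated as a conjunction in this sketch.
     */
    public function isConjunction()
    {
        foreach ($this->signs as $sign) {
            if ($sign !== true) {
                return false;
            }
        }
        return true;
    }

    public function getTerms()
    {
        return $this->terms;
    }

    public function getSigns()
    {
        return $this->signs;
    }
}

// Usage: strings stand in for term objects.
$list = new QueryTermList();
$list->add('something', true);
$list->add('another', false);
```

Keeping the signs as an array at all times would also remove the "sometimes array, sometimes null" state that the @todo in addTerm() describes; the query class could ask isConjunction() instead of testing for null.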

While breaking up the class, it is a good idea to analyze how other classes use Zend_Search_Lucene_Search_Query_MultiTerm, and based on these observations, apply Extract Interface, too.
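
As a rough illustration of Extract Interface, suppose a collaborator only ever calls getTerms() and getSigns(). The interface and class names below (TermSource, FixedTermSource) and the function countRequiredTerms() are hypothetical; the null-means-all-required convention is the one documented on the $_signs property in the listing.

```php
<?php
/**
 * Hypothetical interface: the narrow view of a multi-term query that a
 * collaborator needs in this sketch.
 */
interface TermSource
{
    /** @return array list of terms */
    public function getTerms();

    /** @return array|null list of signs, or null when every term is required */
    public function getSigns();
}

/** Minimal stand-in implementation used only for this example. */
final class FixedTermSource implements TermSource
{
    private $terms;
    private $signs;

    public function __construct(array $terms, $signs = null)
    {
        $this->terms = $terms;
        $this->signs = $signs;
    }

    public function getTerms()
    {
        return $this->terms;
    }

    public function getSigns()
    {
        return $this->signs;
    }
}

/** Depends on the interface only, not on a concrete query class. */
function countRequiredTerms(TermSource $source)
{
    $signs = $source->getSigns();
    if ($signs === null) {
        // Null signs: all terms are required.
        return count($source->getTerms());
    }
    // Strict search so that only boolean true counts as "required".
    return count(array_keys($signs, true, true));
}

$mixed       = new FixedTermSource(array('a', 'b', 'c'), array(true, false, null));
$conjunction = new FixedTermSource(array('a', 'b'));
```

Code written against TermSource would keep working if the query class were later split up, as long as the replacement still implements the interface.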

<?php
/**
 * Zend Framework
 *
 * LICENSE
 *
 * This source file is subject to version 1.0 of the Zend Framework
 * license, that is bundled with this package in the file LICENSE, and
 * is available through the world-wide-web at the following URL:
 * http://www.zend.com/license/framework/1_0.txt. If you did not receive
 * a copy of the Zend Framework license and are unable to obtain it
 * through the world-wide-web, please send a note to [email protected]
 * so we can mail you a copy immediately.
 *
 * @package    Zend_Search_Lucene
 * @subpackage Search
 * @copyright  Copyright (c) 2005-2006 Zend Technologies USA Inc. (http://www.zend.com)
 * @license    http://www.zend.com/license/framework/1_0.txt Zend Framework License version 1.0
 */


/** Zend_Search_Lucene_Search_Query */
require_once 'Zend/Search/Lucene/Search/Query.php';

/** Zend_Search_Lucene_Search_Weight_MultiTerm */
require_once 'Zend/Search/Lucene/Search/Weight/MultiTerm.php';


/**
 * @package    Zend_Search_Lucene
 * @subpackage Search
 * @copyright  Copyright (c) 2005-2006 Zend Technologies USA Inc. (http://www.zend.com)
 * @license    http://www.zend.com/license/framework/1_0.txt Zend Framework License version 1.0
 */
class Zend_Search_Lucene_Search_Query_MultiTerm extends Zend_Search_Lucene_Search_Query
{

    /**
     * Terms to find.
     * Array of Zend_Search_Lucene_Index_Term
     *
     * @var array
     */
    private $_terms = array();

    /**
     * Term signs.
     * If true then term is required.
     * If false then term is prohibited.
     * If null then term is neither prohibited, nor required
     *
     * If array is null then all terms are required
     *
     * @var array
     */

    private $_signs = array();

    /**
     * Result vector.
     * Bitset or array of document IDs
     * (depending from Bitset extension availability).
     *
     * @var mixed
     */
    private $_resVector = null;

    /**
     * Terms positions vectors.
     * Array of Arrays:
     * term1Id => (docId => array( pos1, pos2, ... ), ...)
     * term2Id => (docId => array( pos1, pos2, ... ), ...)
     *
     * @var array
     */
    private $_termsPositions = array();


    /**
     * A score factor based on the fraction of all query terms
     * that a document contains.
     * float for conjunction queries
     * array of float for non conjunction queries
     *
     * @var mixed
     */
    private $_coord = null;


    /**
     * Terms weights
     * array of Zend_Search_Lucene_Search_Weight
     *
     * @var array
     */
    private $_weights = array();


    /**
     * Class constructor.  Create a new multi-term query object.
     *
     * @param array $terms    Array of Zend_Search_Lucene_Index_Term objects
     * @param array $signs    Array of signs.  Sign is boolean|null.
     * @return void
Comprehensibility Best Practice
Adding a @return annotation to a constructor is not recommended, since a constructor does not have a meaningful return value.

Please refer to the PHP core documentation on constructors.
     */
    public function __construct($terms = null, $signs = null)
    {
        /**
         * @todo Check contents of $terms and $signs before adding them.
         */
        if (is_array($terms)) {
            $this->_terms = $terms;

            $this->_signs = null;
Documentation Bug
It seems like null of type null is incompatible with the declared type array of property $_signs.

Our type inference engine has found an assignment to a property that is incompatible with the declared type of that property.

Either this assignment is in error or the assigned type should be added to the documentation/type hint for that property.
            // Check if all terms are required
            if (is_array($signs)) {
                foreach ($signs as $sign ) {
                    if ($sign !== true) {
                        $this->_signs = $signs;
                        continue;
                    }
                }
            }
        }
    }


    /**
     * Add a $term (Zend_Search_Lucene_Index_Term) to this query.
     *
     * The sign is specified as:
     *     TRUE  - term is required
     *     FALSE - term is prohibited
     *     NULL  - term is neither prohibited, nor required
     *
     * @param  Zend_Search_Lucene_Index_Term $term
     * @param  boolean|null $sign
     * @return void
     */
    public function addTerm(Zend_Search_Lucene_Index_Term $term, $sign=null) {
        $this->_terms[] = $term;

        /**
         * @todo This is not good.  Sometimes $this->_signs is an array, sometimes
         * it is null, even when there are terms.  It will be changed so that
         * it is always an array.
         */
        if ($this->_signs === null) {
            if ($sign !== null) {
                $this->_signs = array();
                foreach ($this->_terms as $term) {
                    $this->_signs[] = null;
                }
                $this->_signs[] = $sign;
            }
        } else {
            $this->_signs[] = $sign;
        }
    }


    /**
     * Returns query term
     *
     * @return array
     */
    public function getTerms()
    {
        return $this->_terms;
    }


    /**
     * Return terms signs
     *
     * @return array
     */
    public function getSigns()
    {
        return $this->_signs;
    }


    /**
     * Set weight for specified term
     *
     * @param integer $num
     * @param Zend_Search_Lucene_Search_Weight_Term $weight
     */
    public function setWeight($num, $weight)
    {
        $this->_weights[$num] = $weight;
    }


    /**
     * Constructs an appropriate Weight implementation for this query.
     *
     * @param Zend_Search_Lucene $reader
     * @return Zend_Search_Lucene_Search_Weight
     */
    protected function _createWeight($reader)
    {
        return new Zend_Search_Lucene_Search_Weight_MultiTerm($this, $reader);
    }


    /**
     * Calculate result vector for Conjunction query
     * (like '+something +another')
     *
     * @param Zend_Search_Lucene $reader
     */
    private function _calculateConjunctionResult($reader)
Duplication
This method seems to be duplicated in your project.

Duplicated code is one of the most pungent code smells. If you need to duplicate the same code in three or more different places, we strongly encourage you to look into extracting the code into a single class or operation.

You can also find more detailed suggestions in the “Code” section of your repository.
    {
        if (extension_loaded('bitset')) {
            foreach( $this->_terms as $termId=>$term ) {
                if($this->_resVector === null) {
                    $this->_resVector = bitset_from_array($reader->termDocs($term));
                } else {
                    $this->_resVector = bitset_intersection(
                                $this->_resVector,
                                bitset_from_array($reader->termDocs($term)) );
                }

                $this->_termsPositions[$termId] = $reader->termPositions($term);
            }
        } else {
            foreach( $this->_terms as $termId=>$term ) {
                if($this->_resVector === null) {
                    $this->_resVector = array_flip($reader->termDocs($term));
                } else {
                    $termDocs = array_flip($reader->termDocs($term));
                    foreach($this->_resVector as $key=>$value) {
                        if (!isset( $termDocs[$key] )) {
                            unset( $this->_resVector[$key] );
                        }
                    }
                }

                $this->_termsPositions[$termId] = $reader->termPositions($term);
            }
        }
    }


    /**
     * Calculate result vector for non Conjunction query
     * (like '+something -another')
     *
     * @param Zend_Search_Lucene $reader
     */
    private function _calculateNonConjunctionResult($reader)
    {
        if (extension_loaded('bitset')) {
            $required   = null;
            $neither    = bitset_empty();
            $prohibited = bitset_empty();

            foreach ($this->_terms as $termId => $term) {
                $termDocs = bitset_from_array($reader->termDocs($term));

                if ($this->_signs[$termId] === true) {
                    // required
                    if ($required !== null) {
                        $required = bitset_intersection($required, $termDocs);
                    } else {
                        $required = $termDocs;
                    }
                } elseif ($this->_signs[$termId] === false) {
                    // prohibited
                    $prohibited = bitset_union($prohibited, $termDocs);
                } else {
                    // neither required, nor prohibited
                    $neither = bitset_union($neither, $termDocs);
                }

                $this->_termsPositions[$termId] = $reader->termPositions($term);
            }

            if ($required === null) {
                $required = $neither;
            }
            $this->_resVector = bitset_intersection( $required,
                                                     bitset_invert($prohibited, $reader->count()) );
        } else {
            $required   = null;
            $neither    = array();
            $prohibited = array();

            foreach ($this->_terms as $termId => $term) {
                $termDocs = array_flip($reader->termDocs($term));

                if ($this->_signs[$termId] === true) {
                    // required
                    if ($required !== null) {
                        // substitute for bitset_intersection
                        foreach ($required as $key => $value) {
show
Bug introduced by
The expression $required of type null|array is not guaranteed to be traversable. How about adding an additional type check?

There are different options of fixing this problem.

  1. If you want to be on the safe side, you can add an additional type-check:

    $collection = json_decode($data, true);
    if ( ! is_array($collection)) {
        throw new \RuntimeException('$collection must be an array.');
    }
    
    foreach ($collection as $item) { /** ... */ }
    
  2. If you are sure that the expression is traversable, you might want to add a doc comment cast to improve IDE auto-completion and static analysis:

    /** @var array $collection */
    $collection = json_decode($data, true);
    
    foreach ($collection as $item) { /** .. */ }
    
  3. Mark the issue as a false-positive: Just hover the remove button, in the top-right corner of this issue for more options.

Loading history...
                            if (!isset( $termDocs[$key] )) {
                                unset($required[$key]);
                            }
                        }
                    } else {
                        $required = $termDocs;
                    }
                } elseif ($this->_signs[$termId] === false) {
                    // prohibited
                    // substitute for bitset_union
                    foreach ($termDocs as $key => $value) {
                        $prohibited[$key] = $value;
                    }
                } else {
                    // neither required, nor prohibited
                    // substitute for bitset_union
                    foreach ($termDocs as $key => $value) {
                        $neither[$key] = $value;
                    }
                }

                $this->_termsPositions[$termId] = $reader->termPositions($term);
            }

            if ($required === null) {
                $required = $neither;
            }

            foreach ($required as $key=>$value) {
                if (isset( $prohibited[$key] )) {
                    unset($required[$key]);
                }
            }
            $this->_resVector = $required;
        }
    }


    /**
     * Score calculator for conjunction queries (all terms are required)
     *
     * @param integer $docId
     * @param Zend_Search_Lucene $reader
     * @return float
     */
    public function _conjunctionScore($docId, $reader)
    {
        if ($this->_coord === null) {
            $this->_coord = $reader->getSimilarity()->coord(count($this->_terms),
                                                            count($this->_terms) );
        }

        $score = 0.0;

        foreach ($this->_terms as $termId=>$term) {
            $score += $reader->getSimilarity()->tf(count($this->_termsPositions[$termId][$docId]) ) *
                      $this->_weights[$termId]->getValue() *
                      $reader->norm($docId, $term->field);
        }

        return $score * $this->_coord;
    }


    /**
     * Score calculator for non conjunction queries (not all terms are required)
     *
     * @param integer $docId
     * @param Zend_Search_Lucene $reader
     * @return float
     */
    public function _nonConjunctionScore($docId, $reader)
    {
        if ($this->_coord === null) {
            $this->_coord = array();

            $maxCoord = 0;
            foreach ($this->_signs as $sign) {
                if ($sign !== false /* not prohibited */) {
                    $maxCoord++;
                }
            }

            for ($count = 0; $count <= $maxCoord; $count++) {
                $this->_coord[$count] = $reader->getSimilarity()->coord($count, $maxCoord);
            }
        }

        $score = 0.0;
        $matchedTerms = 0;
        foreach ($this->_terms as $termId=>$term) {
            // Check if term is
            if ($this->_signs[$termId] !== false &&            // not prohibited
                isset($this->_termsPositions[$termId][$docId]) // matched
               ) {
                $matchedTerms++;
                $score +=
                      $reader->getSimilarity()->tf(count($this->_termsPositions[$termId][$docId]) ) *
                      $this->_weights[$termId]->getValue() *
                      $reader->norm($docId, $term->field);
            }
        }

        return $score * $this->_coord[$matchedTerms];
    }

    /**
     * Score specified document
     *
     * @param integer $docId
     * @param Zend_Search_Lucene $reader
     * @return float
     */
    public function score($docId, $reader)
    {
        if($this->_resVector === null) {
            if ($this->_signs === null) {
                $this->_calculateConjunctionResult($reader);
            } else {
                $this->_calculateNonConjunctionResult($reader);
            }

            $this->_initWeight($reader);
        }

        if ( (extension_loaded('bitset')) ?
                bitset_in($this->_resVector, $docId) :
                isset($this->_resVector[$docId])  ) {
            if ($this->_signs === null) {
                return $this->_conjunctionScore($docId, $reader);
            } else {
                return $this->_nonConjunctionScore($docId, $reader);
            }
        } else {
            return 0;
        }
    }
}