Completed
Push — release-2.1 ( 02c5c7...5a39ee )
by Colin
09:26
created

fulltext_search::supportsMethod()   B

Complexity

Conditions 6
Paths 10

Size

Total Lines 24
Code Lines 15

Duplication

Lines 24
Ratio 100 %

Importance

Changes 0
Metric Value
cc 6
eloc 15
nc 10
nop 2
dl 24
loc 24
rs 8.5125
c 0
b 0
f 0
<?php

/**
 * Simple Machines Forum (SMF)
 *
 * @package SMF
 * @author Simple Machines http://www.simplemachines.org
 * @copyright 2018 Simple Machines and individual contributors
 * @license http://www.simplemachines.org/about/smf/license.php BSD
 *
 * @version 2.1 Beta 4
 */

if (!defined('SMF'))
	die('No direct access...');

/**
 * Class fulltext_search
 * Used for fulltext index searching
 */
class fulltext_search extends search_api
{
	/**
	 * @var array Which words are banned
	 */
	protected $bannedWords = array();

	/**
	 * @var int The minimum word length
	 */
	protected $min_word_length = 4;

	/**
	 * @var array Which databases support this method?
	 */
	protected $supported_databases = array('mysql','postgresql');

	/**
	 * The constructor function
	 */
	public function __construct()
	{
		global $modSettings, $db_type;
Compatibility Best Practice introduced by
Use of global functionality is not recommended; it makes your code harder to test, and less reusable.

Instead of relying on global state, we recommend one of these alternatives:

1. Pass all data via parameters

function myFunction($a, $b) {
    // Do something
}

2. Create a class that maintains your state

class MyClass {
    private $a;
    private $b;

    public function __construct($a, $b) {
        $this->a = $a;
        $this->b = $b;
    }

    public function myFunction() {
        // Do something
    }
}
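
Applied to this constructor, the first alternative might look roughly like the sketch below. The class name FulltextSearchSketch, the public $is_supported property, and the getBannedWords() accessor are hypothetical and exist only for illustration; the reviewed class may need to keep its globals for compatibility with the rest of SMF.

```php
<?php
// Hypothetical stand-in for the reviewed class; names are illustrative only.
class FulltextSearchSketch
{
	protected $supported_databases = array('mysql', 'postgresql');
	protected $bannedWords = array();
	public $is_supported = true;

	// $modSettings and $db_type arrive as arguments instead of globals.
	public function __construct(array $modSettings, $db_type)
	{
		// Is this database supported?
		if (!in_array($db_type, $this->supported_databases))
		{
			$this->is_supported = false;
			return;
		}

		$this->bannedWords = empty($modSettings['search_banned_words']) ? array() : explode(',', $modSettings['search_banned_words']);
	}

	public function getBannedWords()
	{
		return $this->bannedWords;
	}
}

$search = new FulltextSearchSketch(array('search_banned_words' => 'foo,bar'), 'mysql');
```

With the inputs passed in, a test can construct the object with any settings array and database type without touching global state.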
		// Is this database supported?
		if (!in_array($db_type, $this->supported_databases))
		{
			$this->is_supported = false;
			return;
		}

		$this->bannedWords = empty($modSettings['search_banned_words']) ? array() : explode(',', $modSettings['search_banned_words']);
		$this->min_word_length = $this->_getMinWordLength();
	}

	/**
	 * {@inheritDoc}
	 */
	public function supportsMethod($methodName, $query_params = null)
Duplication introduced by
This method seems to be duplicated in your project.

Duplicated code is one of the most pungent code smells. If you need to duplicate the same code in three or more different places, we strongly encourage you to look into extracting the code into a single class or operation.

You can also find more detailed suggestions in the “Code” section of your repository.
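
One possible way to extract the duplicated method, sketched under assumptions: the base class below (SearchApiSketch) and its $supported_methods property are hypothetical and are not SMF's actual search_api. Each subclass would only declare the method names it implements, and the lookup itself would live in one place.

```php
<?php
// Hypothetical base class; SMF's real search_api may look different.
abstract class SearchApiSketch
{
	// Subclasses list the method names they implement.
	protected $supported_methods = array();

	public function supportsMethod($methodName, $query_params = null)
	{
		return in_array($methodName, $this->supported_methods, true);
	}
}

class FulltextSketch extends SearchApiSketch
{
	protected $supported_methods = array('searchSort', 'prepareIndexes', 'indexedWordQuery', 'postRemoved');
}

$api = new FulltextSketch();
```

Any extra checks the real parent performs would have to be kept in the shared method; this sketch leaves them out.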

	{
		$return = false;
Unused Code introduced by
$return is not used, you could remove the assignment.

This check looks for variable assignments that are either overwritten by other assignments or where the variable is not used subsequently.

$myVar = 'Value';
$higher = false;

if (rand(1, 6) > 3) {
    $higher = true;
} else {
    $higher = false;
}

Both the $myVar assignment in line 1 and the $higher assignment in line 2 are dead. The first is dead because $myVar is never used, and the second because $higher is always overwritten on every possible execution path.
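
For the method flagged here, the dead assignment could be avoided by returning straight from the switch. The sketch below is a free function rather than the real method, and the $parentSupports callable is a hypothetical stand-in for parent::supportsMethod().

```php
<?php
// Illustrative free function; $parentSupports stands in for parent::supportsMethod().
function supportsMethodSketch($methodName, callable $parentSupports)
{
	switch ($methodName)
	{
		case 'searchSort':
		case 'prepareIndexes':
		case 'indexedWordQuery':
		case 'postRemoved':
			return true;
	}

	// Anything else is left to the parent.
	return $parentSupports($methodName);
}
```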

		switch ($methodName)
		{
			case 'searchSort':
			case 'prepareIndexes':
			case 'indexedWordQuery':
			case 'postRemoved':
				$return = true;
			break;

			// All other methods, too bad dunno you.
			default:
				$return = false;
			break;
		}

		// Maybe parent got support
		if (!$return)
			$return = parent::supportsMethod($methodName, $query_params);

		return $return;
	}

	/**
	 * fulltext_search::_getMinWordLength()
	 *
	 * What is the minimum word length full text supports?
	 *
	 * @return int The minimum word length
	 */
	protected function _getMinWordLength()
	{
		global $smcFunc, $db_type;
Compatibility Best Practice introduced by
Use of global functionality is not recommended; it makes your code harder to test, and less reusable.

		if ($db_type == 'postgresql')
			return 0;
		// Try to determine the minimum number of letters for a fulltext search.
		$request = $smcFunc['db_search_query']('max_fulltext_length', '
			SHOW VARIABLES
			LIKE {string:fulltext_minimum_word_length}',
			array(
				'fulltext_minimum_word_length' => 'ft_min_word_len',
			)
		);
		if ($request !== false && $smcFunc['db_num_rows']($request) == 1)
		{
			list (, $min_word_length) = $smcFunc['db_fetch_row']($request);
			$smcFunc['db_free_result']($request);
		}
		// 4 is the MySQL default...
		else
			$min_word_length = 4;

		return $min_word_length;
	}

	/**
	 * {@inheritDoc}
	 */
	public function searchSort($a, $b)
Comprehensibility introduced by
Avoid variables with short names like $a. Configured minimum length is 3.

Short variable names may make your code harder to understand. Variable names should be self-descriptive. This check looks for variable names that are shorter than a configured minimum.
Comprehensibility introduced by
Avoid variables with short names like $b. Configured minimum length is 3.
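
A sketch of the same comparison with descriptive names follows. It is a free function, not the reviewed method: strlen() stands in for $smcFunc['strlen'], and the excluded-word list is passed in rather than read from a global. Both choices are assumptions made so the example runs on its own.

```php
<?php
// strlen() replaces $smcFunc['strlen'] and $excludedWords is a parameter;
// both are assumptions for the sake of a self-contained example.
function compareSearchWords($firstWord, $secondWord, array $excludedWords)
{
	$firstWeight = strlen($firstWord) - (in_array($firstWord, $excludedWords) ? 1000 : 0);
	$secondWeight = strlen($secondWord) - (in_array($secondWord, $excludedWords) ? 1000 : 0);

	// Longer words sort first; excluded words sink to the end.
	return $firstWeight < $secondWeight ? 1 : ($firstWeight > $secondWeight ? -1 : 0);
}

$words = array('cat', 'elephant', 'horse');
usort($words, function ($left, $right) {
	return compareSearchWords($left, $right, array('elephant'));
});
```

Here 'elephant' is treated as excluded, so it ends up last even though it is the longest word.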

	{
		global $excludedWords, $smcFunc;
Compatibility Best Practice introduced by
Use of global functionality is not recommended; it makes your code harder to test, and less reusable.

		$x = $smcFunc['strlen']($a) - (in_array($a, $excludedWords) ? 1000 : 0);
Comprehensibility introduced by
Avoid variables with short names like $x. Configured minimum length is 3.

		$y = $smcFunc['strlen']($b) - (in_array($b, $excludedWords) ? 1000 : 0);
Comprehensibility introduced by
Avoid variables with short names like $y. Configured minimum length is 3.

		return $x < $y ? 1 : ($x > $y ? -1 : 0);
	}

	/**
	 * {@inheritDoc}
	 */
	public function prepareIndexes($word, array &$wordsSearch, array &$wordsExclude, $isExcluded)
	{
		global $modSettings, $smcFunc;
Compatibility Best Practice introduced by
Use of global functionality is not recommended; it makes your code harder to test, and less reusable.

		$subwords = text2words($word, null, false);

		if (empty($modSettings['search_force_index']))
		{
			// A boolean capable search engine and not forced to only use an index, we may use a non indexed search
			// this is harder on the server so we are restrictive here
			if (count($subwords) > 1 && preg_match('~[.:@$]~', $word))
			{
				// using special characters that a full index would ignore and the remaining words are short which would also be ignored
				if (($smcFunc['strlen'](current($subwords)) < $this->min_word_length) && ($smcFunc['strlen'](next($subwords)) < $this->min_word_length))
				{
					$wordsSearch['words'][] = trim($word, "/*- ");
Coding Style Comprehensibility introduced by
The string literal /*- does not require double quotes, as per coding-style, please use single quotes.

PHP provides two ways to mark string literals: either with single quotes 'literal' or with double quotes "literal". The difference between these is that string literals in double quotes may contain variables, which are evaluated at run-time, as well as escape sequences.

String literals in single quotes, on the other hand, are evaluated very literally, and the only two characters that need escaping in the literal are the single quote itself (\') and the backslash (\\). Every other character is displayed as is.

Double quoted string literals may contain other variables or more complex escape sequences.

<?php

$singleQuoted = 'Value';
$doubleQuoted = "\tSingle is $singleQuoted";

print $doubleQuoted;

will print an indented: Single is Value

If your string literal does not contain variables or escape sequences, it should be defined using single quotes to make that fact clear.

For more information on PHP string literals and available escape sequences see the PHP core documentation.

					$wordsSearch['complex_words'][] = count($subwords) === 1 ? $word : '"' . $word . '"';
				}
			}
			elseif ($smcFunc['strlen'](trim($word, "/*- ")) < $this->min_word_length)
Coding Style Comprehensibility introduced by
The string literal /*- does not require double quotes, as per coding-style, please use single quotes.

			{
				// short words have feelings too
				$wordsSearch['words'][] = trim($word, "/*- ");
Coding Style Comprehensibility introduced by
The string literal /*- does not require double quotes, as per coding-style, please use single quotes.

				$wordsSearch['complex_words'][] = count($subwords) === 1 ? $word : '"' . $word . '"';
			}
		}

		$fulltextWord = count($subwords) === 1 ? $word : '"' . $word . '"';
		$wordsSearch['indexed_words'][] = $fulltextWord;
		if ($isExcluded)
			$wordsExclude[] = $fulltextWord;
	}

	/**
	 * {@inheritDoc}
	 */
	public function indexedWordQuery(array $words, array $search_data)
	{
		global $modSettings, $smcFunc;
Compatibility Best Practice introduced by
Use of global functionality is not recommended; it makes your code harder to test, and less reusable.

		$query_select = array(
			'id_msg' => 'm.id_msg',
		);
		$query_where = array();
		$query_params = $search_data['params'];

		if( $smcFunc['db_title'] == "PostgreSQL")
Coding Style Comprehensibility introduced by
The string literal PostgreSQL does not require double quotes, as per coding-style, please use single quotes.

			$modSettings['search_simple_fulltext'] = true;

		if ($query_params['id_search'])
			$query_select['id_search'] = '{int:id_search}';

		$count = 0;
		if (empty($modSettings['search_simple_fulltext']))
			foreach ($words['words'] as $regularWord)
			{
				$query_where[] = 'm.body' . (in_array($regularWord, $query_params['excluded_words']) ? ' NOT' : '') . (empty($modSettings['search_match_words']) || $search_data['no_regexp'] ? ' LIKE ' : 'RLIKE') . '{string:complex_body_' . $count . '}';
				$query_params['complex_body_' . $count++] = empty($modSettings['search_match_words']) || $search_data['no_regexp'] ? '%' . strtr($regularWord, array('_' => '\\_', '%' => '\\%')) . '%' : '[[:<:]]' . addcslashes(preg_replace(array('/([\[\]$.+*?|{}()])/'), array('[$1]'), $regularWord), '\\\'') . '[[:>:]]';
Coding Style introduced by
Increment and decrement operators must be bracketed when used in string concatenation
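
One way to satisfy this rule is to move the increment onto its own line, so the key is built from the current counter and the side effect is explicit. The loop below is a reduced, hypothetical version of the flagged code with made-up input words; bracketing the operator as ($count++) would also address the finding.

```php
<?php
$query_params = array();
$count = 0;

foreach (array('alpha', 'beta') as $regularWord)
{
	// Build the key from the current counter, then increment separately.
	$query_params['complex_body_' . $count] = '%' . $regularWord . '%';
	$count++;
}
```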
			}

		if ($query_params['user_query'])
			$query_where[] = '{raw:user_query}';
		if ($query_params['board_query'])
			$query_where[] = 'm.id_board {raw:board_query}';

		if ($query_params['topic'])
			$query_where[] = 'm.id_topic = {int:topic}';
		if ($query_params['min_msg_id'])
			$query_where[] = 'm.id_msg >= {int:min_msg_id}';
		if ($query_params['max_msg_id'])
			$query_where[] = 'm.id_msg <= {int:max_msg_id}';

		$count = 0;
		if (!empty($query_params['excluded_phrases']) && empty($modSettings['search_force_index']))
			foreach ($query_params['excluded_phrases'] as $phrase)
			{
				$query_where[] = 'subject NOT ' . (empty($modSettings['search_match_words']) || $search_data['no_regexp'] ? ' LIKE ' : 'RLIKE') . '{string:exclude_subject_phrase_' . $count . '}';
				$query_params['exclude_subject_phrase_' . $count++] = empty($modSettings['search_match_words']) || $search_data['no_regexp'] ? '%' . strtr($phrase, array('_' => '\\_', '%' => '\\%')) . '%' : '[[:<:]]' . addcslashes(preg_replace(array('/([\[\]$.+*?|{}()])/'), array('[$1]'), $phrase), '\\\'') . '[[:>:]]';
Coding Style introduced by
Increment and decrement operators must be bracketed when used in string concatenation
			}
		$count = 0;
		if (!empty($query_params['excluded_subject_words']) && empty($modSettings['search_force_index']))
			foreach ($query_params['excluded_subject_words'] as $excludedWord)
			{
				$query_where[] = 'subject NOT ' . (empty($modSettings['search_match_words']) || $search_data['no_regexp'] ? ' LIKE ' : 'RLIKE') . '{string:exclude_subject_words_' . $count . '}';
				$query_params['exclude_subject_words_' . $count++] = empty($modSettings['search_match_words']) || $search_data['no_regexp'] ? '%' . strtr($excludedWord, array('_' => '\\_', '%' => '\\%')) . '%' : '[[:<:]]' . addcslashes(preg_replace(array('/([\[\]$.+*?|{}()])/'), array('[$1]'), $excludedWord), '\\\'') . '[[:>:]]';
Coding Style introduced by
Increment and decrement operators must be bracketed when used in string concatenation
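
The three loops in this method build their LIKE or RLIKE pattern with the same expression, which suggests a small shared helper. The function below is a sketch: buildMatchPattern() and its $useRegexp flag are hypothetical names, not part of SMF, and the caller would still decide between LIKE and RLIKE from $modSettings['search_match_words'] and $search_data['no_regexp'].

```php
<?php
// Hypothetical helper; the name and signature are not part of SMF.
function buildMatchPattern($term, $useRegexp)
{
	// LIKE: escape the wildcard characters and wrap the term in %...%.
	if (!$useRegexp)
		return '%' . strtr($term, array('_' => '\\_', '%' => '\\%')) . '%';

	// RLIKE: bracket regex metacharacters, then add the word-boundary markers.
	$escaped = preg_replace('/([\[\]$.+*?|{}()])/', '[$1]', $term);
	return '[[:<:]]' . addcslashes($escaped, '\\\'') . '[[:>:]]';
}

$likePattern = buildMatchPattern('50%_off', false);
$regexpPattern = buildMatchPattern('a.b', true);
```

Each loop body would then reduce to one call, which also gives the pattern logic a single place to be tested.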
			}

		if (!empty($modSettings['search_simple_fulltext']))
		{
			if ($smcFunc['db_title'] == "PostgreSQL")
Coding Style Comprehensibility introduced by
The string literal PostgreSQL does not require double quotes, as per coding-style, please use single quotes.

			{
				$language_ftx = $smcFunc['db_search_language']();

				$query_where[] = 'to_tsvector({string:language_ftx},body) @@ plainto_tsquery({string:language_ftx},{string:body_match})';
				$query_params['language_ftx'] = $language_ftx;
			}
			else
				$query_where[] = 'MATCH (body) AGAINST ({string:body_match})';
			$query_params['body_match'] = implode(' ', array_diff($words['indexed_words'], $query_params['excluded_index_words']));
		}
		else
		{
			$query_params['boolean_match'] = '';

			// remove any indexed words that are used in the complex body search terms
			$words['indexed_words'] = array_diff($words['indexed_words'], $words['complex_words']);

			if ($smcFunc['db_title'] == "PostgreSQL")
Coding Style Comprehensibility introduced by
The string literal PostgreSQL does not require double quotes, as per coding-style, please use single quotes.

			{
				$row = 0;
				foreach ($words['indexed_words'] as $fulltextWord)
				{
					$query_params['boolean_match'] .= ($row <> 0 ? '&' : '');
					$query_params['boolean_match'] .= (in_array($fulltextWord, $query_params['excluded_index_words']) ? '!' : '') . $fulltextWord . ' ';
					$row++;
				}
			}
			else
				foreach ($words['indexed_words'] as $fulltextWord)
					$query_params['boolean_match'] .= (in_array($fulltextWord, $query_params['excluded_index_words']) ? '-' : '+') . $fulltextWord . ' ';

			$query_params['boolean_match'] = substr($query_params['boolean_match'], 0, -1);

			// if we have bool terms to search, add them in
			if ($query_params['boolean_match'])
			{
				if($smcFunc['db_title'] == "PostgreSQL")
Coding Style Comprehensibility introduced by
The string literal PostgreSQL does not require double quotes, as per coding-style, please use single quotes.

				{
					$language_ftx = $smcFunc['db_search_language']();

					$query_where[] = 'to_tsvector({string:language_ftx},body) @@ plainto_tsquery({string:language_ftx},{string:boolean_match})';
					$query_params['language_ftx'] = $language_ftx;
				}
				else
					$query_where[] = 'MATCH (body) AGAINST ({string:boolean_match} IN BOOLEAN MODE)';
			}
		}

		$ignoreRequest = $smcFunc['db_search_query']('insert_into_log_messages_fulltext', ($smcFunc['db_support_ignore'] ? ( '
			INSERT IGNORE INTO {db_prefix}' . $search_data['insert_into'] . '
				(' . implode(', ', array_keys($query_select)) . ')') : '') . '
			SELECT ' . implode(', ', $query_select) . '
			FROM {db_prefix}messages AS m
			WHERE ' . implode('
				AND ', $query_where) . (empty($search_data['max_results']) ? '' : '
			LIMIT ' . ($search_data['max_results'] - $search_data['indexed_results'])),
			$query_params
		);

		return $ignoreRequest;
	}
}

?>