Completed
Push — add/sync-partial-sync-checksum... ( b214f6...f71524 )
by
unknown
08:04
created

Table_Checksum::build_checksum_query()   A

Complexity

Conditions 3
Paths 4

Size

Total Lines 49

Duplication

Lines 0
Ratio 0 %

Importance

Changes 0
Metric Value
cc 3
nc 4
nop 4
dl 0
loc 49
rs 9.1127
c 0
b 0
f 0
1
<?php
2
3
namespace Automattic\Jetpack\Sync\Replicastore;
4
5
use Automattic\Jetpack\Sync\Settings;
6
use Exception;
7
use WP_error;
8
9
// TODO add rest endpoints to work with this, hopefully in the same folder
10
11
class Table_Checksum {
12
	public $table                 = '';
13
	public $table_configuration   = array();
14
	public $range_field           = '';
15
	public $key_fields            = array();
16
	public $checksum_fields       = array();
17
	public $additional_filter_sql = '';
18
19
20
	public $default_tables = array();
21
22
	public $salt = '';
23
24
	public $allowed_tables = array();
25
26
	/**
27
	 * Table_Checksum constructor.
28
	 *
29
	 * @param string $table
30
	 * @param string $salt
31
	 * @param string $range_field
0 ignored issues
show
Documentation introduced by
Should the type for parameter $salt not be string|null?

This check looks for @param annotations where the type inferred by our type inference engine differs from the declared type.

It makes a suggestion as to what type it considers more descriptive.

Most often this is a case of a parameter that can be null in addition to its declared types.

Loading history...
32
	 * @param null   $key_fields
0 ignored issues
show
Bug introduced by
There is no parameter named $range_field. Was it maybe removed?

This check looks for PHPDoc comments describing methods or function parameters that do not exist on the corresponding method or function.

Consider the following example. The parameter $italy is not defined by the method finale(...).

/**
 * @param array $germany
 * @param array $island
 * @param array $italy
 */
function finale($germany, $island) {
    return "2:1";
}

The most likely cause is that the parameter was removed, but the annotation was not.

Loading history...
33
	 * @param null   $filter_field
0 ignored issues
show
Bug introduced by
There is no parameter named $key_fields. Was it maybe removed?

This check looks for PHPDoc comments describing methods or function parameters that do not exist on the corresponding method or function.

Consider the following example. The parameter $italy is not defined by the method finale(...).

/**
 * @param array $germany
 * @param array $island
 * @param array $italy
 */
function finale($germany, $island) {
    return "2:1";
}

The most likely cause is that the parameter was removed, but the annotation was not.

Loading history...
34
	 * @param array  $checksum_fields
0 ignored issues
show
Bug introduced by
There is no parameter named $filter_field. Was it maybe removed?

This check looks for PHPDoc comments describing methods or function parameters that do not exist on the corresponding method or function.

Consider the following example. The parameter $italy is not defined by the method finale(...).

/**
 * @param array $germany
 * @param array $island
 * @param array $italy
 */
function finale($germany, $island) {
    return "2:1";
}

The most likely cause is that the parameter was removed, but the annotation was not.

Loading history...
35
	 */
0 ignored issues
show
Bug introduced by
There is no parameter named $checksum_fields. Was it maybe removed?

This check looks for PHPDoc comments describing methods or function parameters that do not exist on the corresponding method or function.

Consider the following example. The parameter $italy is not defined by the method finale(...).

/**
 * @param array $germany
 * @param array $island
 * @param array $italy
 */
function finale($germany, $island) {
    return "2:1";
}

The most likely cause is that the parameter was removed, but the annotation was not.

Loading history...
36
37
	public function __construct( $table, $salt = null ) {
38
		$this->salt = $salt;
39
40
		global $wpdb;
41
42
		$this->default_tables = array(
43
			'posts'              => array(
44
				'table'           => $wpdb->posts,
45
				'range_field'     => 'ID',
46
				'key_fields'      => array( 'ID' ),
47
				'checksum_fields' => array( 'post_modified' ),
48
				'filter_sql'      => Settings::get_blacklisted_post_types_sql(),
49
			),
50
			'postmeta'           => array(
51
				'table'           => $wpdb->postmeta,
52
				'range_field'     => 'post_id',
53
				'key_fields'      => array( 'post_id', 'meta_key' ),
54
				'checksum_fields' => array( 'meta_key', 'meta_value' ),
55
				'filter_sql'      => Settings::get_whitelisted_post_meta_sql(),
56
			),
57
			'comments'           => array(
58
				'table'           => $wpdb->comments,
59
				'range_field'     => 'comment_ID',
60
				'key_fields'      => array( 'comment_ID' ),
61
				'checksum_fields' => array( 'comment_content' ),
62
				'filter_sql'      => Settings::get_comments_filter_sql(),
63
			),
64
			'commentmeta'        => array(
65
				'table'           => $wpdb->commentmeta,
66
				'range_field'     => 'comment_id',
67
				'key_fields'      => array( 'comment_id', 'meta_key' ),
68
				'checksum_fields' => array( 'meta_key', 'meta_value' ),
69
				'filter_sql'      => Settings::get_whitelisted_comment_meta_sql(),
70
			),
71
			'terms'              => array(
72
				'table'           => $wpdb->terms,
73
				'range_field'     => 'term_id',
74
				'key_fields'      => array( 'term_id' ),
75
				'checksum_fields' => array( 'term_id', 'name', 'slug' ),
76
			),
77
			'termmeta'           => array(
78
				'table'           => $wpdb->termmeta,
79
				'range_field'     => 'term_id',
80
				'key_fields'      => array( 'term_id', 'meta_key' ),
81
				'checksum_fields' => array( 'meta_key', 'meta_value' ),
82
			),
83
			'term_relationships' => array(
84
				'table'           => $wpdb->term_relationships,
85
				'range_field'     => 'object_id',
86
				'key_fields'      => array( 'object_id' ),
87
				'checksum_fields' => array( /** TODO */ ),
88
			),
89
			'term_taxonomy'      => array(
90
				'table'           => $wpdb->term_taxonomy,
91
				'range_field'     => 'term_taxonomy_id',
92
				'key_fields'      => array( 'term_taxonomy_id' ),
93
				'checksum_fields' => array( /** TODO */ ),
94
			),
95
			'links'              => $wpdb->links, // TODO describe in the array format or add exceptions
96
			'options'            => $wpdb->options, // TODO describe in the array format or add exceptions
97
		);
98
99
		// TODO change filters to allow the array format
100
		// TODO add get_fields or similar method to get things out of the table
101
		// TODO extract this configuration in a better way, still make it work with `$wpdb` names.
102
		// TODO take over the replicastore functions and move them over to this class
103
		// TODO make the API work
104
105
		$this->allowed_tables = apply_filters( 'jetpack_sync_checksum_allowed_tables', $this->default_tables );
106
107
		$this->table               = $this->validate_table_name( $table );
108
		$this->table_configuration = $this->allowed_tables[ $table ];
109
110
		$this->prepare_fields( $this->table_configuration );
111
112
	}
113
114
	private function prepare_fields( $table_configuration ) {
115
		$this->key_fields            = $table_configuration['key_fields'];
116
		$this->range_field           = $table_configuration['range_field'];
117
		$this->checksum_fields       = $table_configuration['checksum_fields'];
118
		$this->additional_filter_sql = ! empty( $table_configuration['filter_sql'] ) ? $table_configuration['filter_sql'] : '';
119
	}
120
121
	private function validate_table_name( $table ) {
122
		if ( empty( $table ) ) {
123
			throw new Exception( 'Invalid table name: empty' );
124
		}
125
126
		if ( ! array_key_exists( $table, $this->allowed_tables ) ) {
127
			throw new Exception( 'Invalid table name: not allowed' );
128
		}
129
130
		// TODO other checks if such are needed.
131
132
		return $this->allowed_tables[ $table ]['table'];
133
	}
134
135
	private function validate_fields( $fields ) {
136
		foreach ( $fields as $field ) {
137
			if ( ! preg_match( '/^[0-9,a-z,A-Z$_]+$/i', $field ) ) {
138
				throw new Exception( "Invalid field name: {$field} is not allowed" );
139
			}
140
141
			// TODO other verifications of the field names
142
		}
143
	}
144
145
	private function validate_fields_against_table( $fields ) {
146
		global $wpdb;
147
148
		// TODO: Is this safe enough?
149
		$result = $wpdb->get_row( "SELECT * FROM {$this->table} LIMIT 1", ARRAY_A );
150
151
		if ( ! is_array( $result ) ) {
152
			throw new Exception( 'Unexpected $wpdb->query output: not array' );
153
		}
154
155
		// Check if the fields are actually contained in the table
156
		foreach ( $fields as $field_to_check ) {
157
			if ( ! array_key_exists( $field_to_check, $result ) ) {
158
				throw new Exception( "Invalid field name: field '{$field_to_check}' doesn't exist in table {$this->table}" );
159
			}
160
		}
161
162
		return true;
163
	}
164
165
	private function validate_input() {
166
		$fields = array_merge( array( $this->range_field ), $this->key_fields, $this->checksum_fields );
167
168
		$this->validate_fields( $fields );
169
		$this->validate_fields_against_table( $fields );
170
	}
171
172
173
	private function build_filter_statement( $range_from = null, $range_to = null, $filter_values = null ) {
174
		global $wpdb;
175
176
		/**
177
		 * Prepare the ranges
178
		 */
179
180
		$filter_array = array();
181 View Code Duplication
		if ( $range_from !== null ) {
182
			$filter_array[] = $wpdb->prepare( "{$this->range_field} > %d", array( intval( $range_from ) ) );
183
		}
184 View Code Duplication
		if ( $range_to !== null ) {
185
			$filter_array[] = $wpdb->prepare( "{$this->range_field} < %d", array( intval( $range_to ) ) );
186
		}
187
188
		/**
189
		 * End prepare the ranges
190
		 */
191
192
		/**
193
		 * Prepare data filters
194
		 */
195
		// TODO add support for multiple filter fields from array syntax (i.e. filter => values, filter => values, ...).
196
		// TODO this doesn't work right now, until we properly migrate all the filtering functions to array syntax
197
		$filter_prepared_statement = '';
0 ignored issues
show
Unused Code introduced by
$filter_prepared_statement is not used, you could remove the assignment.

This check looks for variable assignements that are either overwritten by other assignments or where the variable is not used subsequently.

$myVar = 'Value';
$higher = false;

if (rand(1, 6) > 3) {
    $higher = true;
} else {
    $higher = false;
}

Both the $myVar assignment in line 1 and the $higher assignment in line 2 are dead. The first because $myVar is never used and the second because $higher is always overwritten for every possible time line.

Loading history...
198
		if ( 0 & ! empty( $filter_values ) ) {
199
			// Prepare filtering
200
			$filter_placeholders = "AND {$this->filter_field} IN(" . implode( ',', array_fill( 0, count( $filter_values ), '%s' ) ) . ')';
0 ignored issues
show
Bug introduced by
The property filter_field does not exist. Did you maybe forget to declare it?

In PHP it is possible to write to properties without declaring them. For example, the following is perfectly valid PHP code:

class MyClass { }

$x = new MyClass();
$x->foo = true;

Generally, it is a good practice to explictly declare properties to avoid accidental typos and provide IDE auto-completion:

class MyClass {
    public $foo;
}

$x = new MyClass();
$x->foo = true;
Loading history...
201
			$filter_array[]      = $wpdb->prepare( $filter_placeholders, $filter_values );
202
		}
203
204
		// Add any additional filters via direct SQL statement.
205
		// Currently used only because the above isn't done ( `$filter_values` )
206
		$additional_filter_sql = '';
0 ignored issues
show
Unused Code introduced by
$additional_filter_sql is not used, you could remove the assignment.

This check looks for variable assignements that are either overwritten by other assignments or where the variable is not used subsequently.

$myVar = 'Value';
$higher = false;

if (rand(1, 6) > 3) {
    $higher = true;
} else {
    $higher = false;
}

Both the $myVar assignment in line 1 and the $higher assignment in line 2 are dead. The first because $myVar is never used and the second because $higher is always overwritten for every possible time line.

Loading history...
207
		if ( $this->additional_filter_sql ) {
208
			$filter_array[] = $this->additional_filter_sql;
209
		}
210
211
		/**
212
		 * End prepare data filters
213
		 */
214
		return implode( ' AND ', $filter_array );
215
	}
216
217
	// TODO make sure the function is described as DOESN'T DO VALIDATION
218
	private function build_checksum_query( $range_from = null, $range_to = null, $filter_values = null, $granular_result = false ) {
219
		global $wpdb;
220
221
		// Escape the salt
222
		$salt = $wpdb->prepare( '%s', $this->salt ); // TODO escape or prepare statement
223
224
		// Prepare the compound key
225
		$key_fields = implode( ',', $this->key_fields );
226
227
		// Prepare the checksum fields
228
		$checksum_fields_string = implode( ',', array_merge( $this->checksum_fields, array( $salt ) ) );
229
230
		$additional_fields = '';
231
		if ( $granular_result ) {
232
			// TODO uniq the fields as sometimes(most) range_index is the key and there's no need to select the same field twice
233
			$additional_fields = "
234
				{$this->range_field} as range_index,
235
			    {$key_fields},
236
			";
237
		}
238
239
		$filter_stamenet = $this->build_filter_statement( $range_from, $range_to, $filter_values );
240
241
		$query = "
242
			SELECT
243
				{$additional_fields}
244
				SUM(
245
					CRC32(
246
						CONCAT_WS( '#', {$salt}, {$checksum_fields_string} )
247
					)
248
				)  AS checksum
249
			 FROM
250
			    {$this->table}
251
			 WHERE
252
				{$filter_stamenet}
253
		";
254
255
		/**
256
		 * We need the GROUP BY only for compound keys
257
		 */
258
		if ( $granular_result ) {
259
			$query .= "
260
				GROUP BY {$key_fields}
261
			";
262
		}
263
264
		return $query;
265
266
	}
267
268
	public function get_range_edges( $range_from = null, $range_to = null, $limit = null ) {
269
		global $wpdb;
270
271
		$this->validate_fields( array( $this->range_field ) );
272
273
		// `trim()` to make sure we don't add the statement if it's empty
274
		$filters = trim( $this->build_filter_statement( $range_from, $range_to ) );
275
276
		$filter_statement = '';
277
		if ( ! empty( $filters ) ) {
278
			$filter_statement = "
279
				WHERE
280
					{$filters}
281
			";
282
		}
283
284
		// Only make the distinct count when we know there can be multiple entries for the range column
285
		$distinct_count = count( $this->key_fields ) > 1 ? 'DISTINCT' : '';
286
287
		$query = "
288
			SELECT
289
			       MIN({$this->range_field}) as min_range,
290
			       MAX({$this->range_field}) as max_range,
291
			       COUNT( {$distinct_count} {$this->range_field}) as item_count
292
			FROM
293
		";
294
295
		/**
296
		 * If `$limit` is not specified, we can directly use the table
297
		 */
298
		if ( ! $limit ) {
299
			$query .= "
300
				{$this->table}
301
	            {$filter_statement}
302
			";
303
		} else {
304
			/**
305
			 * If there is `$limit` specified, we can't directly use `MIN/MAX()` as they don't work with `LIMIT`.
306
			 * That's why we will alter the query for this case.
307
			 */
308
			$limit = intval( $limit );
309
310
			$query .= "
311
				(
312
					SELECT
313
						{$distinct_count} {$this->range_field}
314
					FROM
315
						{$this->table}
316
						{$filter_statement}
317
					ORDER BY
318
						{$this->range_field} ASC
319
					LIMIT {$limit}
320
				) as ids_query
321
			";
322
		}
323
324
		$result = $wpdb->get_row( $query, ARRAY_A );
325
326
		if ( ! $result || ! is_array( $result ) ) {
327
			throw new Exception( 'Unable to get range edges' );
328
		}
329
330
		return $result;
331
	}
332
333
	public function prepare_results_for_output( &$results ) {
334
		// get the compound key
335
		// only return range and compound key for granular results
336
337
		foreach ( $results as &$result ) {
338
			// Working on reference to save memory here.
339
340
			$key = array();
341
			foreach ( $this->key_fields as $field ) {
342
				$key[] = $result[ $field ];
343
			}
344
345
			$result = array(
346
				'key'      => implode( '-', $key ),
347
				'checksum' => $result['checksum'],
348
			);
349
350
		}
351
	}
352
353
	public function calculate_checksum( $range_from = null, $range_to = null, $filter_values = null, $granular_result = false, $simple_return_value = true ) {
354
		try {
355
			$this->validate_input();
356
		} catch ( Exception $ex ) {
357
			return new WP_error( 'invalid_input', $ex->getMessage() );
0 ignored issues
show
Unused Code introduced by
The call to WP_Error::__construct() has too many arguments starting with 'invalid_input'.

This check compares calls to functions or methods with their respective definitions. If the call has more arguments than are defined, it raises an issue.

If a function is defined several times with a different number of parameters, the check may pick up the wrong definition and report false positives. One codebase where this has been known to happen is Wordpress.

In this case you can add the @ignore PhpDoc annotation to the duplicate definition and it will be ignored.

Loading history...
358
		}
359
360
		$query = $this->build_checksum_query( $range_from, $range_to, $filter_values, $granular_result );
361
362
		global $wpdb;
363
364
		if ( ! $granular_result ) {
365
			$result = $wpdb->get_row( $query, ARRAY_A );
366
367
			if ( ! is_array( $result ) ) {
368
				return new WP_Error( 'invalid_query', "Result wasn't an array" );
0 ignored issues
show
Unused Code introduced by
The call to WP_Error::__construct() has too many arguments starting with 'invalid_query'.

This check compares calls to functions or methods with their respective definitions. If the call has more arguments than are defined, it raises an issue.

If a function is defined several times with a different number of parameters, the check may pick up the wrong definition and report false positives. One codebase where this has been known to happen is Wordpress.

In this case you can add the @ignore PhpDoc annotation to the duplicate definition and it will be ignored.

Loading history...
369
			}
370
371
			if ( $simple_return_value ) {
372
				return $result['checksum'];
373
			}
374
375
			return array(
376
				'range'    => $range_from . '-' . $range_to,
377
				'checksum' => $result['checksum'],
378
			);
379
		} else {
380
			$result = $wpdb->get_results( $query, ARRAY_A );
381
			$this->prepare_results_for_output( $result );
382
383
			return $result;
384
		}
385
	}
386
}
387