Issues (946)

engine/classes/ElggBatch.php (3 issues)

<?php

use Elgg\BatchResult;

/**
 * A lazy-loading proxy for a result array from a fetching function
 *
 * A batch can be counted or iterated over via foreach, where the batch will
 * internally fetch results several rows at a time. This allows you to efficiently
 * work on large result sets without loading all results in memory.
 *
 * A batch can run operations for any function that supports an options array
 * and supports the keys "offset", "limit", and "count". This is usually used
 * with elgg_get_entities() and friends, elgg_get_annotations(), and
 * elgg_get_metadata(). In fact, those functions will return results as
 * batches by passing in "batch" as true.
 *
 * Unlike a real array, direct access of results is not supported.
 *
 * If you pass a valid PHP callback, all results will be run through that
 * callback. You can still foreach() through the result set after.  Valid
 * PHP callbacks can be a string, an array, or a closure.
 * {@link http://php.net/manual/en/language.pseudo-types.php}
 *
 * The callback function must accept 3 arguments: an entity, the getter
 * used, and the options used.
 *
 * Results from the callback are stored in callbackResult. If the callback
 * returns only booleans, callbackResult will be the logical AND of
 * all calls. If no entities are processed, callbackResult will be null.
 *
 * If the callback returns anything else, callbackResult will be an indexed
 * array of whatever the callback returns.  If returning error handling
 * information, you should include enough information to determine which
 * result you're referring to.
 *
 * Don't combine returning bools and returning something else.
 *
 * Note that returning false will not stop the foreach.
 *
 * @warning If your callback or foreach loop deletes or disables entities
 * you MUST call setIncrementOffset(false) or set that when instantiating.
 * This forces the offset to stay what it was in the $options array.
 *
 * @example
 * <code>
 * // using foreach
 * $batch = new \ElggBatch('elgg_get_entities', array());
 * $batch->setIncrementOffset(false);
 *
 * foreach ($batch as $entity) {
 *     $entity->disable();
 * }
 *
 * // using a callback
 * $callback = function($result, $getter, $options) {
 *     var_dump("Looking at annotation id: $result->id");
 *     return true;
 * };
 *
 * $batch = new \ElggBatch('elgg_get_annotations', array('guid' => 2), $callback);
 *
 * // get a batch from an Elgg getter function
 * $batch = elgg_get_entities([
 *     'batch' => true,
 * ]);
 * </code>
 *
 * @package    Elgg.Core
 * @subpackage DataModel
 * @since      1.8
 */
class ElggBatch implements BatchResult {

	/**
	 * The objects to iterate over.
	 *
	 * @var array
	 */
	private $results = [];

	/**
	 * The function used to get results.
	 *
	 * @var callable
	 */
	private $getter = null;

	/**
	 * The given $options to alter and pass to the getter.
	 *
	 * @var array
	 */
	private $options = [];

	/**
	 * The number of results to grab at a time.
	 *
	 * @var int
	 */
	private $chunkSize = 25;

	/**
	 * A callback function to pass results through.
	 *
	 * @var callable|null
	 */
	private $callback = null;

	/**
	 * Start after this many results.
	 *
	 * @var int
	 */
	private $offset = 0;

	/**
	 * Stop after this many results.
	 *
	 * @var int
	 */
	private $limit = 0;

	/**
	 * Number of results retrieved so far.
	 *
	 * @var int
	 */
	private $retrievedResults = 0;

	/**
	 * The index of the current result within the current chunk
	 *
	 * @var int
	 */
	private $resultIndex = 0;

	/**
	 * The index of the current chunk
	 *
	 * @var int
	 */
	private $chunkIndex = 0;

	/**
	 * The number of results iterated through
	 *
	 * @var int
	 */
	private $processedResults = 0;

	/**
	 * Whether the getter is a valid callback (null until first checked)
	 *
	 * @var bool|null
	 */
	private $validGetter = null;

	/**
	 * The result of running all entities through the callback function.
	 *
	 * @var mixed
	 */
	public $callbackResult = null;

	/**
	 * If false, offset will not be incremented. This is used for callbacks/loops that delete.
	 *
	 * @var bool
	 */
	private $incrementOffset = true;

	/**
	 * Entities that could not be instantiated during a fetch
	 *
	 * @var \stdClass[]
	 */
	private $incompleteEntities = [];

	/**
	 * Total number of incomplete entities fetched
	 *
	 * @var int
	 */
	private $totalIncompletes = 0;

	/**
	 * Batches operations on any elgg_get_*() or compatible function that supports
	 * an options array.
	 *
	 * Instead of returning all objects in memory, it goes through $chunk_size
	 * objects, then requests more from the server.  This avoids OOM errors.
	 *
	 * @param callable $getter     The function used to get objects.  Usually
	 *                             an elgg_get_*() function, but can be any valid PHP callback.
	 * @param array    $options    The options array to pass to the getter function. If limit is
	 *                             not set, the configured default_limit (10 by default) is used.
	 *                             In most cases that is not what you want.
	 * @param mixed    $callback   An optional callback function that all results will be passed
	 *                             to upon load.  The callback needs to accept $result, $getter,
	 *                             $options.
	 * @param int      $chunk_size The number of entities to pull in before requesting more.
	 *                             You have to balance this between running out of memory in PHP
	 *                             and hitting the db server too often.
	 * @param bool     $inc_offset Increment the offset on each fetch. This must be false for
	 *                             callbacks that delete rows. You can set this after the
	 *                             object is created with {@link \ElggBatch::setIncrementOffset()}.
	 */
	public function __construct(callable $getter, $options, $callback = null, $chunk_size = 25,
			$inc_offset = true) {

		$this->getter = $getter;
		$this->options = $options;
		$this->callback = $callback;
		$this->chunkSize = $chunk_size;
		$this->setIncrementOffset($inc_offset);

		if ($this->chunkSize <= 0) {
			$this->chunkSize = 25;
		}

		// store these so we can compare later
		$this->offset = elgg_extract('offset', $options, 0);
		$this->limit = elgg_extract('limit', $options, _elgg_config()->default_limit);

		// if passed a callback, create a new \ElggBatch with the same options
		// and pass each result to the callback.
		if ($callback && is_callable($callback)) {
			$batch = new \ElggBatch($getter, $options, null, $chunk_size, $inc_offset);

			$all_results = null;

			foreach ($batch as $result) {
				$result = call_user_func($callback, $result, $getter, $options);

				if (!isset($all_results)) {
					if ($result === true || $result === false || $result === null) {
						$all_results = $result;
					} else {
						$all_results = [];
					}
				}

				if (($result === true || $result === false || $result === null) && !is_array($all_results)) {
					$all_results = $result && $all_results;
				} else {
					$all_results[] = $result;
				}
			}

			$this->callbackResult = $all_results;
		}
	}

	/**
	 * Fetches the next chunk of results
	 *
	 * @return bool
	 */
	private function getNextResultsChunk() {
Private method name "ElggBatch::getNextResultsChunk" must be prefixed with an underscore

		// always reset results.
		$this->results = [];

		if (!isset($this->validGetter)) {
			$this->validGetter = is_callable($this->getter);
		}

		if (!$this->validGetter) {
			return false;
		}

		$limit = $this->chunkSize;

		// if someone passed limit = 0 they want everything.
		if ($this->limit != 0) {
			if ($this->retrievedResults >= $this->limit) {
				return false;
			}

			// if original limit < chunk size, set limit to original limit
			// else if the number of results we'll fetch is greater than the original limit
			if ($this->limit < $this->chunkSize) {
				$limit = $this->limit;
			} elseif ($this->retrievedResults + $this->chunkSize > $this->limit) {
				// set the limit to the number of results remaining in the original limit
				$limit = $this->limit - $this->retrievedResults;
			}
		}

		if ($this->incrementOffset) {
			$offset = $this->offset + $this->retrievedResults;
		} else {
			$offset = $this->offset + $this->totalIncompletes;
		}

		$current_options = [
			'limit' => $limit,
			'offset' => $offset,
			'__ElggBatch' => $this,
		];

		$options = array_merge($this->options, $current_options);

		$this->incompleteEntities = [];
		$this->results = call_user_func($this->getter, $options);

		// batch result sets tend to be large; we don't want to cache these.
		_elgg_services()->queryCache->disable();

		$num_results = count($this->results);
		$num_incomplete = count($this->incompleteEntities);

		$this->totalIncompletes += $num_incomplete;

		if (!empty($this->incompleteEntities)) {
			// pad the front of the results with nulls representing the incompletes
			array_splice($this->results, 0, 0, array_pad([], $num_incomplete, null));
			// ...and skip past them
			reset($this->results);
			for ($i = 0; $i < $num_incomplete; $i++) {
				next($this->results);
			}
		}

		if ($this->results) {
			$this->chunkIndex++;

			// let the system know we've jumped past the nulls
			$this->resultIndex = $num_incomplete;

			$this->retrievedResults += ($num_results + $num_incomplete);
			if ($num_results == 0) {
				// This fetch was *all* incompletes! We need to fetch until we can either
				// offer at least one row to iterate over, or give up.
				return $this->getNextResultsChunk();
			}
			_elgg_services()->queryCache->enable();
			return true;
		} else {
			_elgg_services()->queryCache->enable();
			return false;
		}
	}

	/**
	 * Increment the offset from the original options array? Setting to
	 * false is required for callbacks that delete rows.
	 *
	 * @param bool $increment Set to false when deleting data
	 * @return void
	 */
	public function setIncrementOffset($increment = true) {
		$this->incrementOffset = (bool) $increment;
	}

	/**
	 * Set chunk size
	 * @param int $size Size
	 * @return void
	 */
	public function setChunkSize($size = 25) {
		$this->chunkSize = $size;
	}

	/**
	 * Implements Iterator
	 */

	/**
	 * {@inheritdoc}
	 */
	public function rewind() {
		$this->resultIndex = 0;
		$this->retrievedResults = 0;
		$this->processedResults = 0;

		// only grab results if we haven't yet or we're crossing chunks
		if ($this->chunkIndex == 0 || $this->limit > $this->chunkSize) {
			$this->chunkIndex = 0;
			$this->getNextResultsChunk();
		}
	}

	/**
	 * {@inheritdoc}
	 */
	public function current() {
		return current($this->results);
	}

	/**
	 * {@inheritdoc}
	 */
	public function key() {
		return $this->processedResults;
	}

	/**
	 * {@inheritdoc}
	 */
	public function next() {
		// if we'll be at the end.
		if (($this->processedResults + 1) >= $this->limit && $this->limit > 0) {
			$this->results = [];
			return false;
Bug Best Practice introduced by Brett Profitt
The expression return false returns the type false which is incompatible with the return type mandated by Iterator::next() of void.

In the issue above, the returned value is violating the contract defined by the mentioned interface.

Let's take a look at an example:

interface HasName {
    /** @return string */
    public function getName();
}

class Name {
    public $name;
}

class User implements HasName {
    /** @return string|Name */
    public function getName() {
        return new Name('foo'); // This is a violation of the ``HasName`` interface
                                // which only allows a string value to be returned.
    }
}
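One way to satisfy the Iterator::next() contract is to have next() only advance internal state and let valid() report the end of the sequence. The sketch below illustrates that split with a hypothetical ChunkedRange class that pages through a plain in-memory array; it is not Elgg code, the class and property names are assumptions made for illustration, and it assumes PHP 8.0 or later for the mixed return type.

```php
<?php

/**
 * Hypothetical chunked iterator, NOT Elgg's ElggBatch.
 * Pages through a plain array to show a next() that returns nothing.
 */
class ChunkedRange implements Iterator {

	private array $source;
	private int $chunkSize;
	private array $chunk = [];
	private int $offset = 0; // rows that came before the current chunk
	private int $index = 0;  // position inside the current chunk

	public function __construct(array $source, int $chunkSize = 2) {
		$this->source = array_values($source);
		$this->chunkSize = max(1, $chunkSize);
	}

	// Loads the chunk that starts at the current offset.
	private function fetchChunk(): void {
		$this->chunk = array_slice($this->source, $this->offset, $this->chunkSize);
		$this->index = 0;
	}

	public function rewind(): void {
		$this->offset = 0;
		$this->fetchChunk();
	}

	public function current(): mixed {
		return $this->chunk[$this->index];
	}

	public function key(): int {
		return $this->offset + $this->index;
	}

	// Advances only; returns nothing, as Iterator::next() requires.
	public function next(): void {
		$this->index++;
		if ($this->index >= count($this->chunk)) {
			$this->offset += count($this->chunk);
			$this->fetchChunk();
		}
	}

	// The end of the sequence is reported here instead of by next().
	public function valid(): bool {
		return $this->index < count($this->chunk);
	}
}

$seen = [];
foreach (new ChunkedRange(['a', 'b', 'c', 'd', 'e'], 2) as $key => $value) {
	$seen[$key] = $value;
}
```

Because foreach never inspects the return value of next(), moving the end-of-data signal into valid() does not change how such a class behaves in a loop.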
		}

		// if we'll need new results.
		if (($this->resultIndex + 1) >= $this->chunkSize) {
			if (!$this->getNextResultsChunk()) {
				$this->results = [];
				return false;
Bug Best Practice introduced by Brett Profitt
The expression return false returns the type false which is incompatible with the return type mandated by Iterator::next() of void.

			}

			$result = current($this->results);
		} else {
			// the function above resets the indexes, so only inc if not
			// getting new set
			$this->resultIndex++;
			$result = next($this->results);
		}

		$this->processedResults++;
		return $result;
	}

	/**
	 * {@inheritdoc}
	 */
	public function valid() {
		if (!is_array($this->results)) {
			return false;
		}
		$key = key($this->results);
		return ($key !== null && $key !== false);
	}

	/**
	 * Count the total results available at this moment.
	 *
	 * As this performs a separate query, the count returned may not match the number of results you can
	 * fetch via iteration on a very active DB.
	 *
	 * @see Countable::count()
	 * @return int
	 */
	public function count() {
		if (!is_callable($this->getter)) {
			$inspector = new \Elgg\Debug\Inspector();
			throw new RuntimeException("Getter is not callable: " . $inspector->describeCallable($this->getter));
		}

		$options = array_merge($this->options, ['count' => true]);

		return call_user_func($this->getter, $options);
	}
}
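
The way __construct() folds callback return values into callbackResult can be shown as a standalone function. The sketch below mirrors the branch structure of that loop; combine_callback_results() is a hypothetical name used only for illustration and takes the list of values a callback would have returned.

```php
<?php

/**
 * Hypothetical helper, not part of Elgg: combines callback return values
 * the way the loop in ElggBatch::__construct() does.
 */
function combine_callback_results(array $returns) {
	$all = null;

	foreach ($returns as $result) {
		$is_boolish = ($result === true || $result === false || $result === null);

		// The first return value decides between boolean mode and array mode.
		if (!isset($all)) {
			$all = $is_boolish ? $result : [];
		}

		if ($is_boolish && !is_array($all)) {
			// Boolean mode: logical AND of everything seen so far.
			$all = $result && $all;
		} else {
			// Array mode: collect each return value in order.
			$all[] = $result;
		}
	}

	return $all;
}

$all_true = combine_callback_results([true, true, true]);
$one_false = combine_callback_results([true, false, true]);
$values = combine_callback_results(['a', 'b']);
$none = combine_callback_results([]);
```

With these inputs, $all_true is true, $one_false is false, $values is the indexed array of both strings, and $none stays null because no result was processed.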
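
The @warning in the class docblock about deleting while iterating can be reproduced without a database. The sketch below uses a plain array as a stand-in for a table and a hypothetical delete_in_chunks() helper (neither is Elgg code). It assumes every fetch re-queries the rows that still exist, as a real getter would, and it ignores the incomplete-entity handling that ElggBatch adds on top.

```php
<?php

/**
 * Hypothetical helper, not Elgg API: deletes every row it fetches,
 * paging through $table in chunks of $chunk_size.
 *
 * @return array the rows that survived
 */
function delete_in_chunks(array $table, int $chunk_size, bool $increment_offset): array {
	$offset = 0;

	while (true) {
		// Re-query the "table" as it looks right now.
		$chunk = array_slice($table, $offset, $chunk_size, true);
		if (!$chunk) {
			break;
		}

		foreach (array_keys($chunk) as $id) {
			unset($table[$id]); // the "delete"
		}

		if ($increment_offset) {
			// The remaining rows have shifted down to fill the gap,
			// so advancing the offset skips over some of them.
			$offset += count($chunk);
		}
	}

	return array_values($table);
}

$rows = ['r1', 'r2', 'r3', 'r4', 'r5', 'r6'];

$left_with_increment = delete_in_chunks($rows, 2, true);
$left_without_increment = delete_in_chunks($rows, 2, false);
```

With an incrementing offset, rows r3 and r4 are never fetched and survive the run; with the offset held in place, every row is deleted. That is the behaviour setIncrementOffset(false) is meant to provide.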