Issues (1401)

Security Analysis    no request data  

This project does not seem to handle request data directly; as such, no vulnerable execution paths were found.

  Cross-Site Scripting
Cross-Site Scripting enables an attacker to inject code into the response of a web request that is viewed by other users. It can, for example, be used to bypass access controls or even to take over other users' accounts.
  File Exposure
File Exposure allows an attacker to gain access to local files that they should not be able to access. These files can include, for example, database credentials or other configuration files.
  File Manipulation
File Manipulation enables an attacker to write custom data to files. This potentially leads to injection of arbitrary code on the server.
  Object Injection
Object Injection enables an attacker to inject an object into PHP code, and can lead to arbitrary code execution, file exposure, or file manipulation attacks.
  Code Injection
Code Injection enables an attacker to execute arbitrary code on the server.
  Response Splitting
Response Splitting can be used to send arbitrary responses.
  File Inclusion
File Inclusion enables an attacker to inject custom files into PHP's file loading mechanism, either through a path passed explicitly to include or, for example, via PHP's auto-loading mechanism.
  Command Injection
Command Injection enables an attacker to inject a shell command that is executed with the privileges of the web server. This can be used to expose sensitive data or to gain access to your server.
  SQL Injection
SQL Injection enables an attacker to execute arbitrary SQL code on your database server, gaining access to user data or manipulating it.
  XPath Injection
XPath Injection enables an attacker to modify which parts of an XML document are read. If that XML document is used, for example, for authentication, this can lead to further vulnerabilities similar to SQL Injection.
  LDAP Injection
LDAP Injection enables an attacker to inject LDAP statements potentially granting permission to run unauthorized queries, or modify content inside the LDAP tree.
  Header Injection
  Other Vulnerability
This category comprises other attack vectors such as manipulating the PHP runtime, loading custom extensions, freezing the runtime, or similar.
  Regex Injection
Regex Injection enables an attacker to execute arbitrary code in your PHP process.
  XML Injection
XML Injection enables an attacker to read files on your local filesystem including configuration files, or can be abused to freeze your web-server process.
  Variable Injection
Variable Injection enables an attacker to overwrite program variables with custom data, and can lead to further vulnerabilities.
Unfortunately, the security analysis is currently not available for your project. If you are a non-commercial open-source project, please contact support to gain access.

repo/includes/Dumpers/DumpGenerator.php (2 issues)

<?php

namespace Wikibase\Repo\Dumpers;

use InvalidArgumentException;
use LogicException;
use Onoi\MessageReporter\MessageReporter;
use Onoi\MessageReporter\NullMessageReporter;
use Wikibase\DataModel\Entity\EntityId;
use Wikibase\DataModel\Services\Entity\EntityPrefetcher;
use Wikibase\DataModel\Services\EntityId\EntityIdPager;
use Wikibase\DataModel\Services\Lookup\EntityLookupException;
use Wikibase\Lib\Reporting\ExceptionHandler;
use Wikibase\Lib\Reporting\RethrowingExceptionHandler;
use Wikibase\Lib\Store\StorageException;
use Wikibase\Repo\Store\Sql\SqlEntityIdPager;

/**
 * DumpGenerator generates a dump of a given set of entities, excluding
 * redirects.
 *
 * @license GPL-2.0-or-later
 * @author Daniel Kinzler
 */
abstract class DumpGenerator {

	/**
	 * @var int The max number of entities to process in a single batch.
	 *      Also controls the interval for progress reports.
	 */
	protected $batchSize = 100;

	/**
	 * @var resource File handle for output
	 */
	protected $out;

	/**
	 * @var int Total number of shards a request should be split into
	 */
	protected $shardingFactor = 1;

	/**
	 * @var int Number of the requested shard
	 */
	protected $shard = 0;

	/**
	 * @var MessageReporter
	 */
	protected $progressReporter;

	/**
	 * @var ExceptionHandler
	 */
	protected $exceptionHandler;

	/**
	 * @var EntityPrefetcher
	 */
	protected $entityPrefetcher;

	/**
	 * @var int[] String to int map of types to include.
	 */
	protected $entityTypes;

	/**
	 * Entity count limit - dump will generate this many
	 *
	 * @var int
	 */
	protected $limit = 0;

	/**
	 * @param resource $out
	 * @param EntityPrefetcher $entityPrefetcher
	 *
	 * @throws InvalidArgumentException
	 */
	public function __construct( $out, EntityPrefetcher $entityPrefetcher ) {
		if ( !is_resource( $out ) ) {
			throw new InvalidArgumentException( '$out must be a file handle!' );
		}

		$this->out = $out;

		$this->entityPrefetcher = $entityPrefetcher;
		$this->progressReporter = new NullMessageReporter();
		$this->exceptionHandler = new RethrowingExceptionHandler();
	}

	/**
	 * Set maximum number of entities produced
	 *
	 * @param int $limit
	 */
	public function setLimit( $limit ) {
		$this->limit = (int)$limit;
	}

	/**
	 * Sets the batch size for processing. The batch size is used as the limit
	 * when listing IDs via the EntityIdPager::getNextBatchOfIds() method, and
	 * also controls the interval of progress reports.
	 *
	 * @param int $batchSize
	 *
	 * @throws InvalidArgumentException
	 */
	public function setBatchSize( $batchSize ) {
		if ( !is_int( $batchSize ) || $batchSize < 1 ) {
			throw new InvalidArgumentException( '$batchSize must be an integer >= 1' );
		}

		$this->batchSize = $batchSize;
	}

	public function setProgressReporter( MessageReporter $progressReporter ) {
		$this->progressReporter = $progressReporter;
	}

	public function setExceptionHandler( ExceptionHandler $exceptionHandler ) {
		$this->exceptionHandler = $exceptionHandler;
	}

	/**
	 * Set the sharding factor and desired shard.
	 * For instance, to generate four dumps in parallel, use setShardingFilter( 4, 0 )
	 * for the first dump, setShardingFilter( 4, 1 ) for the second dump, etc.
	 *
	 * @param int $shardingFactor
	 * @param int $shard
	 *
	 * @throws InvalidArgumentException
	 */
	public function setShardingFilter( $shardingFactor, $shard ) {
		if ( !is_int( $shardingFactor ) || $shardingFactor < 1 ) {
			throw new InvalidArgumentException( '$shardingFactor must be a positive integer.' );
		}

		if ( !is_int( $shard ) || $shard < 0 ) {
			throw new InvalidArgumentException( '$shard must be a non-negative integer.' );
		}

		if ( $shard >= $shardingFactor ) {
			throw new InvalidArgumentException( '$shard must be smaller than $shardingFactor.' );
		}

		$this->shardingFactor = $shardingFactor;
		$this->shard = $shard;
	}

	/**
	 * Set the entity types to be included in the output.
	 *
	 * @param string[]|null $types The desired types (use null for any type).
	 */
	public function setEntityTypesFilter( $types ) {
		if ( is_array( $types ) ) {
			$types = array_flip( $types );
		}
		$this->entityTypes = $types;
	}

	private function idMatchesFilters( EntityId $entityId ) {
		return $this->idMatchesShard( $entityId ) && $this->idMatchesType( $entityId );
	}

	private function idMatchesShard( EntityId $entityId ) {
		// Short-circuit: with a single shard, every ID matches
		if ( $this->shardingFactor === 1 ) {
			return true;
		}

		$hash = sha1( $entityId->getSerialization() );
		$shard = (int)hexdec( substr( $hash, 0, 8 ) ); // 4 bytes of the hash
		$shard = abs( $shard ); // avoid negative numbers on 32 bit systems
		$shard %= $this->shardingFactor; // modulo number of shards

		return $shard === $this->shard;
	}

	private function idMatchesType( EntityId $entityId ) {
		return $this->entityTypes === null || ( array_key_exists( $entityId->getEntityType(), $this->entityTypes ) );
	}

	/**
	 * Writes the given string to the output provided to the constructor.
	 *
	 * @param string $data
	 */
	protected function writeToDump( $data ) {
		//TODO: use output stream object
		fwrite( $this->out, $data );
	}

	/**
	 * Do something before dumping data
	 */
	protected function preDump() {
		// Nothing by default
	}

	/**
	 * Do something after dumping data
	 */
	protected function postDump() {
		// Nothing by default
	}

	/**
	 * Do something before dumping a batch of entities
	 * @param EntityId[] $entities
	 */
	protected function preBatchDump( $entities ) {
		$this->entityPrefetcher->prefetch( $entities );
	}

	/**
	 * Do something before dumping entity
	 *
	 * @param int $dumpCount
	 */
	protected function preEntityDump( $dumpCount ) {
		// Nothing by default
	}

	/**
	 * Do something after dumping entity
	 *
	 * @param int $dumpCount
	 */
	protected function postEntityDump( $dumpCount ) {
0 ignored issues
show
The parameter $dumpCount is not used and could be removed.

This check looks from parameters that have been defined for a function or method, but which are not used in the method body.

Loading history...
235
		// Nothing by default
236
	}
237
238
	/**
239
	 * Generates a dump, writing to the file handle provided to the constructor.
240
	 *
241
	 * @param EntityIdPager $idPager
242
	 */
	public function generateDump( EntityIdPager $idPager ) {
		$dumpCount = 0;

		$this->preDump();

		// Iterate over batches of IDs, maintaining the current position of the pager in the $position variable.
		while ( true ) {
			if ( $this->limit && ( $dumpCount + $this->batchSize ) > $this->limit ) {
				// Try not to overrun $limit in order to make sure pager's position can be used for continuing.
				$limit = $this->limit - $dumpCount;
			} else {
				$limit = $this->batchSize;
			}

			$ids = $idPager->fetchIds( $limit );
			if ( !$ids ) {
				break;
			}

			$this->dumpEntities( $ids, $dumpCount );

			$this->progressReporter->reportMessage( 'Processed ' . $dumpCount . ' entities.' );

			if ( $this->limit && $dumpCount >= $this->limit ) {
				$this->progressReporter->reportMessage( 'Reached entity dump limit of ' . $this->limit . '.' );

				if ( $idPager instanceof SqlEntityIdPager ) {
					// This message is possibly being parsed for continuation purposes, thus avoid changing it.
					$this->progressReporter->reportMessage( 'Last SqlEntityIdPager position: ' . $idPager->getPosition() . '.' );
				}

				break;
			}
		}

		$this->postDump();
	}

	/**
	 * Dump list of entities
	 *
	 * @param EntityId[] $entityIds
	 * @param int &$dumpCount The number of entities already dumped (will be updated).
	 */
	private function dumpEntities( array $entityIds, &$dumpCount ) {
		$toLoad = [];
		foreach ( $entityIds as $entityId ) {
			if ( $this->idMatchesFilters( $entityId ) ) {
				$toLoad[] = $entityId;
			}
		}

		$this->preBatchDump( $toLoad );

		foreach ( $toLoad as $entityId ) {
			try {
				$data = $this->generateDumpForEntityId( $entityId );
				if ( !$data ) {
Bug, Best Practice
The expression $data of type string|null is loosely compared to false; this is ambiguous if the string can be empty. You might want to explicitly use === null instead.

In PHP, under loose comparison (like ==, or !=, or switch conditions), values of different types might be equal.

For string values, the empty string '' is a special case; in particular, the following results might be unexpected:

''   == false // true
''   == null  // true
'ab' == false // false
'ab' == null  // false

// It is often better to use strict comparison
'' === false // false
'' === null  // false
					continue;
				}

				$this->preEntityDump( $dumpCount );
				$this->writeToDump( $data );
				$this->postEntityDump( $dumpCount );

				$dumpCount++;
				if ( $this->limit && $dumpCount >= $this->limit ) {
					break;
				}
			} catch ( EntityLookupException $ex ) {
				$this->exceptionHandler->handleException( $ex, 'failed-to-dump', 'Failed to dump ' . $entityId );
			} catch ( StorageException $ex ) {
				$this->exceptionHandler->handleException( $ex, 'failed-to-dump', 'Failed to dump ' . $entityId );
			} catch ( LogicException $ex ) {
				$this->exceptionHandler->handleException( $ex, 'failed-to-dump', 'Failed to dump ' . $entityId );
			}
		}
	}

	/**
	 * Produce dump data for specific entity
	 *
	 * @param EntityId $entityId
	 *
	 * @throws EntityLookupException
	 * @throws StorageException
	 * @return string|null
	 */
	abstract protected function generateDumpForEntityId( EntityId $entityId );

}
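
The sharding filter documented on setShardingFilter() can be illustrated with a minimal sketch that mirrors the arithmetic in idMatchesShard(). The helper name shardFor() and the sample ID strings are assumptions made for illustration only; they are not part of DumpGenerator.

```php
<?php

/**
 * Hypothetical standalone helper (not part of DumpGenerator) that repeats
 * the shard computation from idMatchesShard() for a serialized entity ID.
 */
function shardFor( string $serializedId, int $shardingFactor ): int {
	$hash = sha1( $serializedId );
	// The first 8 hex digits are 4 bytes of the hash.
	$value = (int)hexdec( substr( $hash, 0, 8 ) );
	// abs() guards against negative values, as in the original code.
	return abs( $value ) % $shardingFactor;
}

// Sample IDs are made up for this sketch.
$ids = [ 'Q1', 'Q2', 'Q3', 'P1', 'P2' ];
$shardingFactor = 4;

// Group the IDs by the shard that would pick them up.
$byShard = array_fill( 0, $shardingFactor, [] );
foreach ( $ids as $id ) {
	$byShard[ shardFor( $id, $shardingFactor ) ][] = $id;
}

foreach ( $byShard as $shard => $members ) {
	echo 'shard ' . $shard . ': ' . implode( ', ', $members ) . "\n";
}
```

Because each ID maps to exactly one shard number, running one dump per shard value from 0 to $shardingFactor - 1 should cover every ID once. How evenly the IDs spread depends on the hash values.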
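
The loose comparison flagged in dumpEntities() can be illustrated with a small sketch. The two helper functions below are hypothetical stand-ins for the `!$data` check and the suggested `=== null` check; they are not part of DumpGenerator. Whether the class is meant to skip empty strings is not stated in the source, so this only shows where the two checks disagree.

```php
<?php

// Hypothetical helper: the loose check currently used in dumpEntities().
function isSkippedLoosely( ?string $data ): bool {
	return !$data;
}

// Hypothetical helper: the strict check suggested by the issue.
function isSkippedStrictly( ?string $data ): bool {
	return $data === null;
}

// null, '' and '0' are all falsy in PHP; only null is strictly null.
$samples = [ null, '', '0', '{"id":"Q1"}' ];

foreach ( $samples as $data ) {
	printf(
		"%-14s loose: %-5s strict: %s\n",
		var_export( $data, true ),
		var_export( isSkippedLoosely( $data ), true ),
		var_export( isSkippedStrictly( $data ), true )
	);
}
```

With the loose check, an entity whose dump data is the empty string or the string '0' would be skipped in the same way as one that returned null; the strict check skips only null.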