Completed
Branch master (939199)
by
unknown
39:35
created

maintenance/purgeChangedFiles.php (3 issues)

1
<?php
Coding Style, Compatibility
For compatibility and reusability of your code, PSR-1 recommends that a file either introduce new symbols (classes, functions, etc.) or cause side effects (producing output, including other files), but not both. Here, the first symbol (class PurgeChangedFiles) is defined on line 32 and the first side effect (the require_once of Maintenance.php) is on line 24.

The PSR-1: Basic Coding Standard recommends that a file should either introduce new symbols, that is classes, functions, constants or similar, or have side effects. Side effects are anything that executes logic, like for example printing output, changing ini settings or writing to a file.

The idea behind this recommendation is that merely auto-loading a class should not change the state of an application. It also promotes a cleaner style of programming and makes your code less prone to errors, because the logic is not spread out all over the place.

To learn more about the PSR-1, please see the PHP-FIG site on the PSR-1.
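
As a rough illustration of the separation PSR-1 describes, the sketch below shows a declaration-only file and a side-effect-only file together in one listing. The Greeter class and both file names are hypothetical and are not taken from the reviewed script.

```php
<?php
// Sketch only: two hypothetical files shown in one listing for brevity.

// --- File 1 (hypothetical: includes/Greeter.php): declares a symbol, no side effects ---
class Greeter {
	/**
	 * @param string $name
	 * @return string
	 */
	public function greet( $name ) {
		return "Hello, {$name}!";
	}
}

// --- File 2 (hypothetical: bin/greet.php): side effects only, no declarations ---
// In a real layout this file would start by loading the class, for example:
// require_once __DIR__ . '/../includes/Greeter.php';
$greeter = new Greeter();
echo $greeter->greet( 'world' ), "\n";
```

Loading the first file alone (for instance through an autoloader) changes no application state; only the second file produces output.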

2
/**
3
 * Scan the logging table and purge affected files within a timeframe.
4
 *
5
 * This program is free software; you can redistribute it and/or modify
6
 * it under the terms of the GNU General Public License as published by
7
 * the Free Software Foundation; either version 2 of the License, or
8
 * (at your option) any later version.
9
 *
10
 * This program is distributed in the hope that it will be useful,
11
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
12
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
13
 * GNU General Public License for more details.
14
 *
15
 * You should have received a copy of the GNU General Public License along
16
 * with this program; if not, write to the Free Software Foundation, Inc.,
17
 * 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
18
 * http://www.gnu.org/copyleft/gpl.html
19
 *
20
 * @file
21
 * @ingroup Maintenance
22
 */
23
24
require_once __DIR__ . '/Maintenance.php';
25
26
/**
27
 * Maintenance script that scans the deletion log and purges affected files
28
 * within a timeframe.
29
 *
30
 * @ingroup Maintenance
31
 */
32
class PurgeChangedFiles extends Maintenance {
33
	/**
34
	 * Mapping from type option to log type and actions.
35
	 * @var array
36
	 */
37
	private static $typeMappings = [
38
		'created' => [
39
			'upload' => [ 'upload' ],
40
			'import' => [ 'upload', 'interwiki' ],
41
		],
42
		'deleted' => [
43
			'delete' => [ 'delete', 'revision' ],
44
			'suppress' => [ 'delete', 'revision' ],
45
		],
46
		'modified' => [
47
			'upload' => [ 'overwrite', 'revert' ],
48
			'move' => [ 'move', 'move_redir' ],
49
		],
50
	];
51
52
	/**
53
	 * @var string
54
	 */
55
	private $startTimestamp;
56
57
	/**
58
	 * @var string
59
	 */
60
	private $endTimestamp;
61
62
	public function __construct() {
63
		parent::__construct();
64
		$this->addDescription( 'Scan the logging table and purge files and thumbnails.' );
65
		$this->addOption( 'starttime', 'Starting timestamp', true, true );
66
		$this->addOption( 'endtime', 'Ending timestamp', true, true );
67
		$this->addOption( 'type', 'Comma-separated list of types of changes to send purges for (' .
68
			implode( ',', array_keys( self::$typeMappings ) ) . ',all)', false, true );
69
		$this->addOption( 'htcp-dest', 'HTCP announcement destination (IP:port)', false, true );
70
		$this->addOption( 'dry-run', 'Do not send purge requests' );
71
		$this->addOption( 'sleep-per-batch', 'Milliseconds to sleep between batches', false, true );
72
		$this->addOption( 'verbose', 'Show more output', false, false, 'v' );
73
		$this->setBatchSize( 100 );
74
	}
75
76
	public function execute() {
77
		global $wgHTCPRouting;
78
79
		if ( $this->hasOption( 'htcp-dest' ) ) {
80
			$parts = explode( ':', $this->getOption( 'htcp-dest' ) );
81
			if ( count( $parts ) < 2 ) {
82
				// Add default htcp port
83
				$parts[] = '4827';
84
			}
85
86
			// Route all HTCP messages to provided host:port
87
			$wgHTCPRouting = [
88
				'' => [ 'host' => $parts[0], 'port' => $parts[1] ],
89
			];
90
			$this->verbose( "HTCP broadcasts to {$parts[0]}:{$parts[1]}\n" );
91
		}
92
93
		// Find out which actions we should be concerned with
94
		$typeOpt = $this->getOption( 'type', 'all' );
95
		$validTypes = array_keys( self::$typeMappings );
96
		if ( $typeOpt === 'all' ) {
97
			// Convert 'all' to all registered types
98
			$typeOpt = implode( ',', $validTypes );
99
		}
100
		$typeList = explode( ',', $typeOpt );
101
		foreach ( $typeList as $type ) {
102
			if ( !in_array( $type, $validTypes ) ) {
103
				$this->error( "\nERROR: Unknown type: {$type}\n" );
104
				$this->maybeHelp( true );
105
			}
106
		}
107
108
		// Validate the timestamps
109
		$dbr = $this->getDB( DB_REPLICA );
110
		$this->startTimestamp = $dbr->timestamp( $this->getOption( 'starttime' ) );
111
		$this->endTimestamp = $dbr->timestamp( $this->getOption( 'endtime' ) );
112
113
		if ( $this->startTimestamp > $this->endTimestamp ) {
114
			$this->error( "\nERROR: starttime after endtime\n" );
115
			$this->maybeHelp( true );
116
		}
117
118
		// Turn on verbose when dry-run is enabled
119
		if ( $this->hasOption( 'dry-run' ) ) {
120
			$this->mOptions['verbose'] = 1;
121
		}
122
123
		$this->verbose( 'Purging files that were: ' . implode( ', ', $typeList ) . "\n" );
124
		foreach ( $typeList as $type ) {
125
			$this->verbose( "Checking for {$type} files...\n" );
126
			$this->purgeFromLogType( $type );
127
			if ( !$this->hasOption( 'dry-run' ) ) {
128
				$this->verbose( "...{$type} files purged.\n\n" );
129
			}
130
		}
131
	}
132
133
	/**
134
	 * Purge cache and thumbnails for changes of the given type.
135
	 *
136
	 * @param string $type Type of change to find
137
	 */
138
	protected function purgeFromLogType( $type ) {
139
		$repo = RepoGroup::singleton()->getLocalRepo();
140
		$dbr = $this->getDB( DB_REPLICA );
141
142
		foreach ( self::$typeMappings[$type] as $logType => $logActions ) {
143
			$this->verbose( "Scanning for {$logType}/" . implode( ',', $logActions ) . "\n" );
144
145
			$res = $dbr->select(
146
				'logging',
147
				[ 'log_title', 'log_timestamp', 'log_params' ],
148
				[
149
					'log_namespace' => NS_FILE,
150
					'log_type' => $logType,
151
					'log_action' => $logActions,
152
					'log_timestamp >= ' . $dbr->addQuotes( $this->startTimestamp ),
153
					'log_timestamp <= ' . $dbr->addQuotes( $this->endTimestamp ),
154
				],
155
				__METHOD__
156
			);
157
158
			$bSize = 0;
159
			foreach ( $res as $row ) {
160
				$file = $repo->newFile( Title::makeTitle( NS_FILE, $row->log_title ) );
161
162
				if ( $this->hasOption( 'dry-run' ) ) {
163
					$this->verbose( "{$type}[{$row->log_timestamp}]: {$row->log_title}\n" );
164
					continue;
165
				}
166
167
				// Purge current version and its thumbnails
168
				$file->purgeCache();
169
				// Purge the old versions and their thumbnails
170
				foreach ( $file->getHistory() as $oldFile ) {
171
					$oldFile->purgeCache();
172
				}
173
174
				if ( $logType === 'delete' ) {
175
					// If there is an orphaned storage file... delete it
176
					if ( !$file->exists() && $repo->fileExists( $file->getPath() ) ) {
177
						$dpath = $this->getDeletedPath( $repo, $file );
178
						if ( $repo->fileExists( $dpath ) ) {
179
							// Sanity check to avoid data loss
180
							$repo->getBackend()->delete( [ 'src' => $file->getPath() ] );
181
							$this->verbose( "Deleted orphan file: {$file->getPath()}.\n" );
182
						} else {
183
							$this->error( "File was not deleted: {$file->getPath()}.\n" );
184
						}
185
					}
186
187
					// Purge items from filearchive table (rows are likely here)
188
					$this->purgeFromArchiveTable( $repo, $file );
189
				} elseif ( $logType === 'move' ) {
190
					// Purge the target file as well
191
192
					$params = unserialize( $row->log_params );
193
					if ( isset( $params['4::target'] ) ) {
194
						$target = $params['4::target'];
195
						$targetFile = $repo->newFile( Title::makeTitle( NS_FILE, $target ) );
196
						$targetFile->purgeCache();
197
						$this->verbose( "Purged file {$target}; move target @{$row->log_timestamp}.\n" );
198
					}
199
				}
200
201
				$this->verbose( "Purged file {$row->log_title}; {$type} @{$row->log_timestamp}.\n" );
202
203
				if ( $this->hasOption( 'sleep-per-batch' ) && ++$bSize >= $this->mBatchSize ) {
204
					$bSize = 0;
205
					// sleep-per-batch is in milliseconds; usleep() expects microseconds.
206
					usleep( 1000 * (int)$this->getOption( 'sleep-per-batch' ) );
207
				}
208
			}
209
		}
210
	}
211
212
	protected function purgeFromArchiveTable( LocalRepo $repo, LocalFile $file ) {
213
		$dbr = $repo->getSlaveDB();
214
		$res = $dbr->select(
215
			'filearchive',
216
			[ 'fa_archive_name' ],
217
			[ 'fa_name' => $file->getName() ],
218
			__METHOD__
219
		);
220
221
		foreach ( $res as $row ) {
222
			if ( $row->fa_archive_name === null ) {
223
				// Was not an old version (current version names checked already)
224
				continue;
225
			}
226
			$ofile = $repo->newFromArchiveName( $file->getTitle(), $row->fa_archive_name );
227
			// If there is an orphaned storage file still there...delete it
228
			if ( !$file->exists() && $repo->fileExists( $ofile->getPath() ) ) {
It seems that $ofile->getPath(), which resolves to File::getPath(), can also return a boolean, whereas FileRepo::fileExists() appears to accept only a string. Consider adding a type check before the call.

This check looks at variables that are passed out again to other methods.

If the outgoing method call has stricter type requirements than the method itself, an issue is raised.

An additional type check may prevent trouble.
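
One possible shape for such a type check is sketched below. The helper name pathExistsGuarded, the stub closure, and the example path are hypothetical; the closure stands in for a lookup such as FileRepo::fileExists() so that the sketch runs without MediaWiki.

```php
<?php
/**
 * Hypothetical helper (not part of MediaWiki): run an existence check only
 * when $path is a usable string.
 *
 * @param string|bool $path Value such as the result of File::getPath()
 * @param callable $exists Stand-in for a check like FileRepo::fileExists()
 * @return bool
 */
function pathExistsGuarded( $path, callable $exists ) {
	if ( !is_string( $path ) || $path === '' ) {
		// A boolean or empty path cannot name a stored file, so skip the lookup
		return false;
	}
	return (bool)$exists( $path );
}

// Stub used in place of the real repository lookup (assumption for this sketch)
$stubExists = function ( $path ) {
	return $path === 'local/a/ab/Example.png';
};

var_dump( pathExistsGuarded( false, $stubExists ) ); // bool(false)
var_dump( pathExistsGuarded( 'local/a/ab/Example.png', $stubExists ) ); // bool(true)
```

In the reviewed script, an equivalent is_string() test on the result of getPath() before calling the repository method would address the reported mismatch without changing behaviour for valid paths.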

229
				$dpath = $this->getDeletedPath( $repo, $ofile );
230
				if ( $repo->fileExists( $dpath ) ) {
231
					// Sanity check to avoid data loss
232
					$repo->getBackend()->delete( [ 'src' => $ofile->getPath() ] );
233
					$this->output( "Deleted orphan file: {$ofile->getPath()}.\n" );
234
				} else {
235
					$this->error( "File was not deleted: {$ofile->getPath()}.\n" );
236
				}
237
			}
238
			$file->purgeOldThumbnails( $row->fa_archive_name );
239
		}
240
	}
241
242
	protected function getDeletedPath( LocalRepo $repo, LocalFile $file ) {
243
		$hash = $repo->getFileSha1( $file->getPath() );
It seems that $file->getPath(), which resolves to File::getPath(), can also return a boolean, whereas FileRepo::getFileSha1() appears to accept only a string. Consider adding a type check before the call.

244
		$key = "{$hash}.{$file->getExtension()}";
245
246
		return $repo->getDeletedHashPath( $key ) . $key;
247
	}
248
249
	/**
250
	 * Send an output message iff the 'verbose' option has been provided.
251
	 *
252
	 * @param string $msg Message to output
253
	 */
254
	protected function verbose( $msg ) {
255
		if ( $this->hasOption( 'verbose' ) ) {
256
			$this->output( $msg );
257
		}
258
	}
259
}
260
261
$maintClass = "PurgeChangedFiles";
262
require_once RUN_MAINTENANCE_IF_MAIN;
263