FindOrphanedFiles::execute()   C

Complexity

Conditions 8
Paths 64

Size

Total Lines 37
Code Lines 23

Duplication

Lines 0
Ratio 0 %

Importance

Changes 0
Metric Value
cc 8
eloc 23
nc 64
nop 0
dl 0
loc 37
rs 5.3846
c 0
b 0
f 0
<?php
Coding Style, Compatibility
For compatibility and reusability of your code, PSR-1 recommends that a file either introduce new symbols (such as classes or functions) or cause side effects (such as producing output or including other files), but not both. In this file, the first symbol (class FindOrphanedFiles) is defined on line 24 and the first side effect (the require_once of Maintenance.php) is on line 22.

The PSR-1 Basic Coding Standard recommends that a file either introduce new symbols (classes, functions, constants, or similar) or have side effects. A side effect is anything that executes logic, for example printing output, changing ini settings, or writing to a file.

The idea behind this recommendation is that merely auto-loading a class should not change the state of an application. It also promotes a cleaner style of programming and makes your code less prone to errors, because the logic is not spread out all over the place.

To learn more, see the PSR-1 page on the PHP-FIG site.
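
One way to follow this recommendation is to keep the class in a declarations-only file and move the include and the run trigger into a separate entry script. The sketch below shows both halves in one listing so that it runs as a single file; the file names, the OrphanFinder class, and its findOrphans() method are hypothetical and are not part of the analyzed code.

```php
<?php
// --- src/OrphanFinder.php (hypothetical): declares a symbol, no side effects ---
class OrphanFinder {
	/** Return the names found on disk that have no registered counterpart. */
	public function findOrphans( array $onDisk, array $registered ): array {
		// array_diff() keeps the original keys, so reindex the result.
		return array_values( array_diff( $onDisk, $registered ) );
	}
}

// --- bin/find-orphans.php (hypothetical): side effects only ---
// In a real layout this part would live in its own file and begin with
// require_once __DIR__ . '/../src/OrphanFinder.php';
$finder = new OrphanFinder();
$orphans = $finder->findOrphans( [ 'Foo.png', 'Bar.png' ], [ 'Foo.png' ] );
echo implode( "\n", $orphans ), "\n";
```

Maintenance scripts like this one conventionally keep the include, the class, and the run trigger in one file, so marking the issue as a false positive may also be reasonable here.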

/**
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation; either version 2 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License along
 * with this program; if not, write to the Free Software Foundation, Inc.,
 * 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
 * http://www.gnu.org/copyleft/gpl.html
 *
 * @file
 * @author Aaron Schulz
 */

require_once __DIR__ . '/Maintenance.php';

class FindOrphanedFiles extends Maintenance {
	function __construct() {
		parent::__construct();

		$this->addDescription( "Find unregistered files in the 'public' repo zone." );
		$this->addOption( 'subdir',
			'Only scan files in this subdirectory (e.g. "a/a0")', false, true );
		$this->addOption( 'verbose', "Mention file paths checked" );
		$this->setBatchSize( 500 );
	}

	function execute() {
		$subdir = $this->getOption( 'subdir', '' );
		$verbose = $this->hasOption( 'verbose' );

		$repo = RepoGroup::singleton()->getLocalRepo();
		if ( $repo->hasSha1Storage() ) {
			$this->error( "Local repo uses SHA-1 file storage names; aborting.", 1 );
		}

		$directory = $repo->getZonePath( 'public' );
		if ( $subdir != '' ) {
			$directory .= "/$subdir/";
		}

		if ( $verbose ) {
			$this->output( "Scanning files under $directory:\n" );
		}

		$list = $repo->getBackend()->getFileList( [ 'dir' => $directory ] );
		if ( $list === null ) {
			$this->error( "Could not get file listing.", 1 );
		}

		$pathBatch = [];
		foreach ( $list as $path ) {
Bug
The expression $list of type object<Traversable>|array|null is not guaranteed to be traversable. How about adding an additional type check?

There are several ways to fix this problem.

  1. If you want to be on the safe side, you can add an additional type-check:

    $collection = json_decode($data, true);
    if ( ! is_array($collection)) {
        throw new \RuntimeException('$collection must be an array.');
    }
    
    foreach ($collection as $item) { /** ... */ }
    
  2. If you are sure that the expression is traversable, you might want to add a doc comment cast to improve IDE auto-completion and static analysis:

    /** @var array $collection */
    $collection = json_decode($data, true);
    
    foreach ($collection as $item) { /** .. */ }
    
  3. Mark the issue as a false positive: hover over the remove button in the top-right corner of this issue for more options.
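
Applied to the loop in this file, option 1 could look like the following sketch. The helper requireIterable() and the sample paths are hypothetical and are not part of the analyzed code. The script already calls error() with an exit code when $list is null, which presumably ends the run, but the analyzer cannot infer that, so an explicit check makes the guarantee visible.

```php
<?php
/**
 * Hypothetical helper, not part of the analyzed code: returns the listing
 * unchanged if it can be iterated, and throws otherwise.
 */
function requireIterable( $list ): iterable {
	if ( !is_iterable( $list ) ) {
		throw new RuntimeException( 'Could not get file listing.' );
	}
	return $list;
}

// Stand-in for the backend listing; the paths are made up for illustration.
$list = new ArrayIterator( [ 'a/a0/Foo.png', 'thumb/a/a0/Foo.png/120px-Foo.png' ] );

$pathBatch = [];
foreach ( requireIterable( $list ) as $path ) {
	if ( preg_match( '#^(thumb|deleted)/#', $path ) ) {
		continue; // skip nested thumb/deleted containers, as the script does
	}
	$pathBatch[] = $path;
}

print_r( $pathBatch );
```

Because the check sits in one place, both arrays and Traversable objects pass through, while null or any other value fails before the loop starts.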

			if ( preg_match( '#^(thumb|deleted)/#', $path ) ) {
				continue; // handle ugly nested containers on stock installs
			}

			$pathBatch[] = $path;
			if ( count( $pathBatch ) >= $this->mBatchSize ) {
				$this->checkFiles( $repo, $pathBatch, $verbose );
				$pathBatch = [];
			}
		}
		$this->checkFiles( $repo, $pathBatch, $verbose );
	}

	protected function checkFiles( LocalRepo $repo, array $paths, $verbose ) {
		if ( !count( $paths ) ) {
			return;
		}

		$dbr = $repo->getSlaveDB();

		$curNames = [];
		$oldNames = [];
		$imgIN = [];
		$oiWheres = [];
		foreach ( $paths as $path ) {
			$name = basename( $path );
			if ( preg_match( '#^archive/#', $path ) ) {
				if ( $verbose ) {
					$this->output( "Checking old file $name\n" );
				}

				$oldNames[] = $name;
				list( , $base ) = explode( '!', $name, 2 ); // <TS_MW>!<img_name>
				$oiWheres[] = $dbr->makeList(
					[ 'oi_name' => $base, 'oi_archive_name' => $name ],
					LIST_AND
				);
			} else {
				if ( $verbose ) {
					$this->output( "Checking current file $name\n" );
				}

				$curNames[] = $name;
				$imgIN[] = $name;
			}
		}

		$res = $dbr->query(
			$dbr->unionQueries(
				[
					$dbr->selectSQLText(
						'image',
						[ 'name' => 'img_name', 'old' => 0 ],
						$imgIN ? [ 'img_name' => $imgIN ] : '1=0'
Bug
It seems that $imgIN ? array('img_name' => $imgIN) : '1=0' can also be of type array<string,array,{"img_name":"array"}>; however, Database::selectSQLText() seems to accept only string. Maybe add an additional type check?

If a method or function can return values of several different types, and you are not sure that only one of them can occur in this context, we recommend adding an additional type check:

/**
 * @return array|string
 */
function returnsDifferentValues($x) {
    if ($x) {
        return 'foo';
    }

    return array();
}

$x = returnsDifferentValues($y);
if (is_array($x)) {
    // $x is an array.
}

If this is a common case that PHP Analyzer should handle natively, please let us know by opening an issue.
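
For the expression flagged here, one option is to make both branches of the ternary the same type. The sketch below does that with a hypothetical helper, buildNameConds(), which is not part of the analyzed code. It assumes the database layer accepts an array whose numerically keyed entries are raw SQL conditions; that assumption should be checked against the Database class before use.

```php
<?php
/**
 * Hypothetical helper: builds the conditions for a name lookup and always
 * returns an array, so the argument type no longer varies between branches.
 */
function buildNameConds( string $field, array $names ): array {
	// '1=0' is a raw condition that matches no rows; it is used when there
	// are no names to look up.
	return $names ? [ $field => $names ] : [ '1=0' ];
}

var_export( buildNameConds( 'img_name', [ 'Foo.png' ] ) );
echo "\n";
var_export( buildNameConds( 'img_name', [] ) );
echo "\n";
```

If the documented parameter type of selectSQLText() is simply too narrow, correcting that doc comment instead would be another way to resolve the report.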

					),
					$dbr->selectSQLText(
						'oldimage',
						[ 'name' => 'oi_archive_name', 'old' => 1 ],
						$oiWheres ? $dbr->makeList( $oiWheres, LIST_OR ) : '1=0'
					)
				],
				true // UNION ALL (performance)
			),
			__METHOD__
		);

		$curNamesFound = [];
		$oldNamesFound = [];
		foreach ( $res as $row ) {
			if ( $row->old ) {
				$oldNamesFound[] = $row->name;
			} else {
				$curNamesFound[] = $row->name;
			}
		}

		foreach ( array_diff( $curNames, $curNamesFound ) as $name ) {
			$file = $repo->newFile( $name );
			// Print name and public URL to ease recovery
			if ( $file ) {
				$this->output( $name . "\n" . $file->getCanonicalUrl() . "\n\n" );
			} else {
				$this->error( "Cannot get URL for bad file title '$name'" );
			}
		}

		foreach ( array_diff( $oldNames, $oldNamesFound ) as $name ) {
			list( , $base ) = explode( '!', $name, 2 ); // <TS_MW>!<img_name>
			$file = $repo->newFromArchiveName( Title::makeTitle( NS_FILE, $base ), $name );
			// Print name and public URL to ease recovery
			$this->output( $name . "\n" . $file->getCanonicalUrl() . "\n\n" );
		}
	}
}

$maintClass = 'FindOrphanedFiles';
require_once RUN_MAINTENANCE_IF_MAIN;