FindHooks   B
last analyzed

Complexity

Total Complexity 44

Size/Duplication

Total Lines 298
Duplicated Lines 3.36 %

Coupling/Cohesion

Components 1
Dependencies 3

Importance

Changes 0
Metric Value
c 0
b 0
f 0
dl 10
loc 298
rs 8.3396
wmc 44
lcom 1
cbo 3

11 Methods

Rating   Name   Duplication   Size   Complexity  
A __construct() 0 5 1
A getDbType() 0 3 1
C execute() 10 79 15
A getHooksFromDoc() 0 7 2
B getHooksFromLocalDoc() 0 24 4
A getHooksFromOnlineDoc() 0 5 1
B getHooksFromOnlineDocCategory() 0 30 5
B getHooksFromFile() 0 41 5
A getBadHooksFromFile() 0 12 2
A printArray() 0 7 2
B getHooksFromDir() 0 26 6

How to fix   Duplicated Code    Complexity   

Duplicated Code

Duplicate code is one of the most pungent code smells. A rule that is often used is to re-structure code once it is duplicated in three or more places.

Common duplication problems, and corresponding solutions are:

Complex Class

 Tip:   Before tackling complexity, make sure that you eliminate any duplication first. This often can reduce the size of classes significantly.

Complex classes like FindHooks often do a lot of different things. To break such a class down, we need to identify a cohesive component within that class. A common approach to find such a component is to look for fields/methods that share the same prefixes, or suffixes. You can also have a look at the cohesion graph to spot any un-connected, or weakly-connected components.

Once you have determined the fields that belong together, you can apply the Extract Class refactoring. If the component makes sense as a sub-class, Extract Subclass is also a candidate, and is often faster.

While breaking up the class, it is a good idea to analyze how other classes use FindHooks, and based on these observations, apply Extract Interface, too.

1
<?php
0 ignored issues
show
Coding Style Compatibility introduced by
For compatibility and reusability of your code, PSR1 recommends that a file should introduce either new symbols (like classes, functions, etc.) or have side-effects (like outputting something, or including other files), but not both at the same time. The first symbol is defined on line 44 and the first side effect is on line 37.

The PSR-1: Basic Coding Standard recommends that a file should either introduce new symbols, that is classes, functions, constants or similar, or have side effects. Side effects are anything that executes logic, like for example printing output, changing ini settings or writing to a file.

The idea behind this recommendation is that merely auto-loading a class should not change the state of an application. It also promotes a cleaner style of programming and makes your code less prone to errors, because the logic is not spread out all over the place.

To learn more about the PSR-1, please see the PHP-FIG site on the PSR-1.

Loading history...
2
/**
3
 * Simple script that try to find documented hook and hooks actually
4
 * in the code and show what's missing.
5
 *
6
 * This script assumes that:
7
 * - hooks names in hooks.txt are at the beginning of a line and single quoted.
8
 * - hooks names in code are the first parameter of wfRunHooks.
9
 *
10
 * if --online option is passed, the script will compare the hooks in the code
11
 * with the ones at https://www.mediawiki.org/wiki/Manual:Hooks
12
 *
13
 * Any instance of wfRunHooks that doesn't meet these parameters will be noted.
14
 *
15
 * Copyright © Antoine Musso
16
 *
17
 * This program is free software; you can redistribute it and/or modify
18
 * it under the terms of the GNU General Public License as published by
19
 * the Free Software Foundation; either version 2 of the License, or
20
 * (at your option) any later version.
21
 *
22
 * This program is distributed in the hope that it will be useful,
23
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
24
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
25
 * GNU General Public License for more details.
26
 *
27
 * You should have received a copy of the GNU General Public License along
28
 * with this program; if not, write to the Free Software Foundation, Inc.,
29
 * 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
30
 * http://www.gnu.org/copyleft/gpl.html
31
 *
32
 * @file
33
 * @ingroup Maintenance
34
 * @author Antoine Musso <hashar at free dot fr>
35
 */
36
37
require_once __DIR__ . '/Maintenance.php';
38
39
/**
40
 * Maintenance script that compares documented and actually present mismatches.
41
 *
42
 * @ingroup Maintenance
43
 */
44
class FindHooks extends Maintenance {
45
	const FIND_NON_RECURSIVE = 0;
46
	const FIND_RECURSIVE = 1;
47
48
	/*
49
	 * Hooks that are ignored
50
	 */
51
	protected static $ignore = [ 'testRunLegacyHooks', 'Test' ];
52
53
	public function __construct() {
54
		parent::__construct();
55
		$this->addDescription( 'Find hooks that are undocumented, missing, or just plain wrong' );
56
		$this->addOption( 'online', 'Check against MediaWiki.org hook documentation' );
57
	}
58
59
	public function getDbType() {
60
		return Maintenance::DB_NONE;
61
	}
62
63
	public function execute() {
64
		global $IP;
65
66
		$documentedHooks = $this->getHooksFromDoc( $IP . '/docs/hooks.txt' );
67
		$potentialHooks = [];
68
		$badHooks = [];
69
70
		$recurseDirs = [
71
			"$IP/includes/",
72
			"$IP/mw-config/",
73
			"$IP/languages/",
74
			"$IP/maintenance/",
75
			// Omit $IP/tests/phpunit as it contains hook tests that shouldn't be documented
76
			"$IP/tests/parser",
77
			"$IP/tests/phpunit/suites",
78
		];
79
		$nonRecurseDirs = [
80
			"$IP/",
81
		];
82
83 View Code Duplication
		foreach ( $recurseDirs as $dir ) {
84
			$ret = $this->getHooksFromDir( $dir, self::FIND_RECURSIVE );
85
			$potentialHooks = array_merge( $potentialHooks, $ret['good'] );
86
			$badHooks = array_merge( $badHooks, $ret['bad'] );
87
		}
88 View Code Duplication
		foreach ( $nonRecurseDirs as $dir ) {
89
			$ret = $this->getHooksFromDir( $dir );
90
			$potentialHooks = array_merge( $potentialHooks, $ret['good'] );
91
			$badHooks = array_merge( $badHooks, $ret['bad'] );
92
		}
93
94
		$documented = array_keys( $documentedHooks );
95
		$potential = array_keys( $potentialHooks );
96
		$potential = array_unique( $potential );
97
		$badHooks = array_diff( array_unique( $badHooks ), self::$ignore );
98
		$todo = array_diff( $potential, $documented, self::$ignore );
99
		$deprecated = array_diff( $documented, $potential, self::$ignore );
100
101
		// Check parameter count and references
102
		$badParameterCount = $badParameterReference = [];
103
		foreach ( $potentialHooks as $hook => $args ) {
104
			if ( !isset( $documentedHooks[$hook] ) ) {
105
				// Not documented, but that will also be in $todo
106
				continue;
107
			}
108
			$argsDoc = $documentedHooks[$hook];
109
			if ( $args === 'unknown' || $argsDoc === 'unknown' ) {
110
				// Could not get parameter information
111
				continue;
112
			}
113
			if ( count( $argsDoc ) !== count( $args ) ) {
114
				$badParameterCount[] = $hook . ': Doc: ' . count( $argsDoc ) . ' vs. Code: ' . count( $args );
115
			} else {
116
				// Check if & is equal
117
				foreach ( $argsDoc as $index => $argDoc ) {
118
					$arg = $args[$index];
119
					if ( ( $arg[0] === '&' ) !== ( $argDoc[0] === '&' ) ) {
120
						$badParameterReference[] = $hook . ': References different: Doc: ' . $argDoc .
121
							' vs. Code: ' . $arg;
122
					}
123
				}
124
			}
125
		}
126
127
		// Print the results
128
		$this->printArray( 'Undocumented', $todo );
129
		$this->printArray( 'Documented and not found', $deprecated );
130
		$this->printArray( 'Unclear hook calls', $badHooks );
131
		$this->printArray( 'Different parameter count', $badParameterCount );
132
		$this->printArray( 'Different parameter reference', $badParameterReference );
133
134
		if ( !$todo && !$deprecated && !$badHooks
135
			&& !$badParameterCount && !$badParameterReference
136
		) {
137
			$this->output( "Looks good!\n" );
138
		} else {
139
			$this->error( 'The script finished with errors.', 1 );
140
		}
141
	}
142
143
	/**
144
	 * Get the hook documentation, either locally or from MediaWiki.org
145
	 * @param string $doc
146
	 * @return array Array: key => hook name; value => array of arguments or string 'unknown'
147
	 */
148
	private function getHooksFromDoc( $doc ) {
149
		if ( $this->hasOption( 'online' ) ) {
150
			return $this->getHooksFromOnlineDoc();
151
		} else {
152
			return $this->getHooksFromLocalDoc( $doc );
153
		}
154
	}
155
156
	/**
157
	 * Get hooks from a local file (for example docs/hooks.txt)
158
	 * @param string $doc Filename to look in
159
	 * @return array Array: key => hook name; value => array of arguments or string 'unknown'
160
	 */
161
	private function getHooksFromLocalDoc( $doc ) {
162
		$m = [];
163
		$content = file_get_contents( $doc );
164
		preg_match_all(
165
			"/\n'(.*?)':.*((?:\n.+)*)/",
166
			$content,
167
			$m,
168
			PREG_SET_ORDER
169
		);
170
171
		// Extract the documented parameter
172
		$hooks = [];
173
		foreach ( $m as $match ) {
0 ignored issues
show
Bug introduced by
The expression $m of type null|array<integer,array<integer,string>> is not guaranteed to be traversable. How about adding an additional type check?

There are different options of fixing this problem.

  1. If you want to be on the safe side, you can add an additional type-check:

    $collection = json_decode($data, true);
    if ( ! is_array($collection)) {
        throw new \RuntimeException('$collection must be an array.');
    }
    
    foreach ($collection as $item) { /** ... */ }
    
  2. If you are sure that the expression is traversable, you might want to add a doc comment cast to improve IDE auto-completion and static analysis:

    /** @var array $collection */
    $collection = json_decode($data, true);
    
    foreach ($collection as $item) { /** .. */ }
    
  3. Mark the issue as a false-positive: Just hover the remove button, in the top-right corner of this issue for more options.

Loading history...
174
			$args = [];
175
			if ( isset( $match[2] ) ) {
176
				$n = [];
177
				if ( preg_match_all( "/\n(&?\\$\w+):.+/", $match[2], $n ) ) {
178
					$args = $n[1];
179
				}
180
			}
181
			$hooks[$match[1]] = $args;
182
		}
183
		return $hooks;
184
	}
185
186
	/**
187
	 * Get hooks from www.mediawiki.org using the API
188
	 * @return array Array: key => hook name; value => string 'unknown'
189
	 */
190
	private function getHooksFromOnlineDoc() {
191
		$allhooks = $this->getHooksFromOnlineDocCategory( 'MediaWiki_hooks' );
192
		$removed = $this->getHooksFromOnlineDocCategory( 'Removed_hooks' );
193
		return array_diff_key( $allhooks, $removed );
194
	}
195
196
	/**
197
	 * @param string $title
198
	 * @return array
199
	 */
200
	private function getHooksFromOnlineDocCategory( $title ) {
201
		$params = [
202
			'action' => 'query',
203
			'list' => 'categorymembers',
204
			'cmtitle' => "Category:$title",
205
			'cmlimit' => 500,
206
			'format' => 'json',
207
			'continue' => '',
208
		];
209
210
		$retval = [];
211
		while ( true ) {
212
			$json = Http::get(
213
				wfAppendQuery( 'http://www.mediawiki.org/w/api.php', $params ),
214
				[],
215
				__METHOD__
216
			);
217
			$data = FormatJson::decode( $json, true );
0 ignored issues
show
Bug introduced by
It seems like $json defined by \Http::get(wfAppendQuery...), array(), __METHOD__) on line 212 can also be of type boolean; however, FormatJson::decode() does only seem to accept string, maybe add an additional type check?

If a method or function can return multiple different values and unless you are sure that you only can receive a single value in this context, we recommend to add an additional type check:

/**
 * @return array|string
 */
function returnsDifferentValues($x) {
    if ($x) {
        return 'foo';
    }

    return array();
}

$x = returnsDifferentValues($y);
if (is_array($x)) {
    // $x is an array.
}

If this a common case that PHP Analyzer should handle natively, please let us know by opening an issue.

Loading history...
218
			foreach ( $data['query']['categorymembers'] as $page ) {
219
				if ( preg_match( '/Manual\:Hooks\/([a-zA-Z0-9- :]+)/', $page['title'], $m ) ) {
220
					// parameters are unknown, because that needs parsing of wikitext
221
					$retval[str_replace( ' ', '_', $m[1] )] = 'unknown';
222
				}
223
			}
224
			if ( !isset( $data['continue'] ) ) {
225
				return $retval;
226
			}
227
			$params = array_replace( $params, $data['continue'] );
228
		}
229
	}
230
231
	/**
232
	 * Get hooks from a PHP file
233
	 * @param string $filePath Full file path to the PHP file.
234
	 * @return array Array: key => hook name; value => array of arguments or string 'unknown'
235
	 */
236
	private function getHooksFromFile( $filePath ) {
237
		$content = file_get_contents( $filePath );
238
		$m = [];
239
		preg_match_all(
240
			// All functions which runs hooks
241
			'/(?:wfRunHooks|Hooks\:\:run|ContentHandler\:\:runLegacyHooks)\s*\(\s*' .
242
				// First argument is the hook name as string
243
				'([\'"])(.*?)\1' .
244
				// Comma for second argument
245
				'(?:\s*(,))?' .
246
				// Second argument must start with array to be processed
247
				'(?:\s*(?:array\s*\(|\[)' .
248
				// Matching inside array - allows one deep of brackets
249
				'((?:[^\(\)\[\]]|\((?-1)\)|\[(?-1)\])*)' .
250
				// End
251
				'[\)\]])?/',
252
			$content,
253
			$m,
254
			PREG_SET_ORDER
255
		);
256
257
		// Extract parameter
258
		$hooks = [];
259
		foreach ( $m as $match ) {
0 ignored issues
show
Bug introduced by
The expression $m of type null|array<integer,array<integer,string>> is not guaranteed to be traversable. How about adding an additional type check?

There are different options of fixing this problem.

  1. If you want to be on the safe side, you can add an additional type-check:

    $collection = json_decode($data, true);
    if ( ! is_array($collection)) {
        throw new \RuntimeException('$collection must be an array.');
    }
    
    foreach ($collection as $item) { /** ... */ }
    
  2. If you are sure that the expression is traversable, you might want to add a doc comment cast to improve IDE auto-completion and static analysis:

    /** @var array $collection */
    $collection = json_decode($data, true);
    
    foreach ($collection as $item) { /** .. */ }
    
  3. Mark the issue as a false-positive: Just hover the remove button, in the top-right corner of this issue for more options.

Loading history...
260
			$args = [];
261
			if ( isset( $match[4] ) ) {
262
				$n = [];
263
				if ( preg_match_all( '/((?:[^,\(\)]|\([^\(\)]*\))+)/', $match[4], $n ) ) {
264
					$args = array_map( 'trim', $n[1] );
265
				}
266
			} elseif ( isset( $match[3] ) ) {
267
				// Found a parameter for Hooks::run,
268
				// but could not extract the hooks argument,
269
				// because there are given by a variable
270
				$args = 'unknown';
271
			}
272
			$hooks[$match[2]] = $args;
273
		}
274
275
		return $hooks;
276
	}
277
278
	/**
279
	 * Get bad hooks (where the hook name could not be determined) from a PHP file
280
	 * @param string $filePath Full filename to the PHP file.
281
	 * @return array Array of bad wfRunHooks() lines
282
	 */
283
	private function getBadHooksFromFile( $filePath ) {
284
		$content = file_get_contents( $filePath );
285
		$m = [];
286
		// We want to skip the "function wfRunHooks()" one.  :)
287
		preg_match_all( '/(?<!function )wfRunHooks\(\s*[^\s\'"].*/', $content, $m );
288
		$list = [];
289
		foreach ( $m[0] as $match ) {
290
			$list[] = $match . "(" . $filePath . ")";
291
		}
292
293
		return $list;
294
	}
295
296
	/**
297
	 * Get hooks from a directory of PHP files.
298
	 * @param string $dir Directory path to start at
299
	 * @param int $recursive Pass self::FIND_RECURSIVE
0 ignored issues
show
Bug introduced by
There is no parameter named $recursive. Was it maybe removed?

This check looks for PHPDoc comments describing methods or function parameters that do not exist on the corresponding method or function.

Consider the following example. The parameter $italy is not defined by the method finale(...).

/**
 * @param array $germany
 * @param array $island
 * @param array $italy
 */
function finale($germany, $island) {
    return "2:1";
}

The most likely cause is that the parameter was removed, but the annotation was not.

Loading history...
300
	 * @return array Array: key => hook name; value => array of arguments or string 'unknown'
301
	 */
302
	private function getHooksFromDir( $dir, $recurse = 0 ) {
303
		$good = [];
304
		$bad = [];
305
306
		if ( $recurse === self::FIND_RECURSIVE ) {
307
			$iterator = new RecursiveIteratorIterator(
308
				new RecursiveDirectoryIterator( $dir, RecursiveDirectoryIterator::SKIP_DOTS ),
309
				RecursiveIteratorIterator::SELF_FIRST
310
			);
311
		} else {
312
			$iterator = new DirectoryIterator( $dir );
313
		}
314
315
		foreach ( $iterator as $info ) {
316
			// Ignore directories, work only on php files,
317
			if ( $info->isFile() && in_array( $info->getExtension(), [ 'php', 'inc' ] )
318
				// Skip this file as it contains text that looks like a bad wfRunHooks() call
319
				&& $info->getRealPath() !== __FILE__
320
			) {
321
				$good = array_merge( $good, $this->getHooksFromFile( $info->getRealPath() ) );
322
				$bad = array_merge( $bad, $this->getBadHooksFromFile( $info->getRealPath() ) );
323
			}
324
		}
325
326
		return [ 'good' => $good, 'bad' => $bad ];
327
	}
328
329
	/**
330
	 * Nicely sort an print an array
331
	 * @param string $msg A message to show before the value
332
	 * @param array $arr
333
	 */
334
	private function printArray( $msg, $arr ) {
335
		asort( $arr );
336
337
		foreach ( $arr as $v ) {
338
			$this->output( "$msg: $v\n" );
339
		}
340
	}
341
}
342
343
$maintClass = 'FindHooks';
344
require_once RUN_MAINTENANCE_IF_MAIN;
345