BenchmarkParse::onFetchTemplate()   A

Complexity

Conditions 3
Paths 4

Size

Total Lines 12
Code Lines 8

Duplication

Lines 0
Ratio 0 %

Importance

Changes 0
Metric Value
cc 3
eloc 8
nc 4
nop 4
dl 0
loc 12
rs 9.4285
c 0
b 0
f 0
1
<?php
Coding Style Compatibility introduced by
For compatibility and reusability of your code, PSR-1 recommends that a file should either introduce new symbols (like classes, functions, etc.) or have side effects (like outputting something, or including other files), but not both at the same time. The first symbol is defined on line 35 and the first side effect is on line 25.

The PSR-1: Basic Coding Standard recommends that a file should either introduce new symbols, that is classes, functions, constants or similar, or have side effects. Side effects are anything that executes logic, like for example printing output, changing ini settings or writing to a file.

The idea behind this recommendation is that merely auto-loading a class should not change the state of an application. It also promotes a cleaner style of programming and makes your code less prone to errors, because the logic is not spread out all over the place.

To learn more about PSR-1, see the PHP-FIG site.

2
/**
3
 * Benchmark script for parse operations
4
 *
5
 * This program is free software; you can redistribute it and/or modify
6
 * it under the terms of the GNU General Public License as published by
7
 * the Free Software Foundation; either version 2 of the License, or
8
 * (at your option) any later version.
9
 *
10
 * This program is distributed in the hope that it will be useful,
11
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
12
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
13
 * GNU General Public License for more details.
14
 *
15
 * You should have received a copy of the GNU General Public License along
16
 * with this program; if not, write to the Free Software Foundation, Inc.,
17
 * 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
18
 * http://www.gnu.org/copyleft/gpl.html
19
 *
20
 * @file
21
 * @author Tim Starling <[email protected]>
22
 * @ingroup Benchmark
23
 */
24
25
require __DIR__ . '/../Maintenance.php';
26
27
use MediaWiki\MediaWikiServices;
28
29
/**
30
 * Maintenance script to benchmark how long it takes to parse a given title at an optionally
31
 * specified timestamp
32
 *
33
 * @since 1.23
34
 */
35
class BenchmarkParse extends Maintenance {
36
	/** @var string MediaWiki concatenated string timestamp (YYYYMMDDHHMMSS) */
37
	private $templateTimestamp = null;
38
39
	private $clearLinkCache = false;
40
41
	/**
42
	 * @var LinkCache
43
	 */
44
	private $linkCache;
45
46
	/** @var array Cache that maps a Title DB key to revision ID for the requested timestamp */
47
	private $idCache = [];
48
49
	function __construct() {
50
		parent::__construct();
51
		$this->addDescription( 'Benchmark parse operation' );
52
		$this->addArg( 'title', 'The name of the page to parse' );
53
		$this->addOption( 'warmup', 'Repeat the parse operation this number of times to warm the cache',
54
			false, true );
55
		$this->addOption( 'loops', 'Number of times to repeat parse operation post-warmup',
56
			false, true );
57
		$this->addOption( 'page-time',
58
			'Use the version of the page which was current at the given time',
59
			false, true );
60
		$this->addOption( 'tpl-time',
61
			'Use templates which were current at the given time (except that moves and ' .
62
			'deletes are not handled properly)',
63
			false, true );
64
		$this->addOption( 'reset-linkcache', 'Reset the LinkCache after every parse.',
65
			false, false );
66
	}
67
68
	function execute() {
69
		if ( $this->hasOption( 'tpl-time' ) ) {
70
			$this->templateTimestamp = wfTimestamp( TS_MW, strtotime( $this->getOption( 'tpl-time' ) ) );
Documentation Bug introduced by
It seems like wfTimestamp(TS_MW, strto...getOption('tpl-time'))) can also be of type false. However, the property $templateTimestamp is declared as type string. Maybe add an additional type check?

Our type inference engine has found a suspicious assignment of a value to a property. This check raises an issue when a value that can be of a mixed type is assigned to a property that is type hinted more strictly.

For example, imagine you have a variable $accountId that can either hold an Id object or false (if there is no account id yet). Your code now assigns that value to the id property of an instance of the Account class. This class holds a proper account, so the id value must no longer be false.

Either this assignment is in error or a type check should be added for that assignment.

class Id
{
    public $id;

    public function __construct($id)
    {
        $this->id = $id;
    }

}

class Account
{
    /** @var  Id $id */
    public $id;
}

$account_id = false;

if (starsAreRight()) {
    $account_id = new Id(42);
}

$account = new Account();
if ($account_id instanceof Id)
{
    $account->id = $account_id;
}
71
			Hooks::register( 'BeforeParserFetchTemplateAndtitle', [ $this, 'onFetchTemplate' ] );
72
		}
73
74
		$this->clearLinkCache = $this->hasOption( 'reset-linkcache' );
75
		// Set as a member variable to avoid function calls when we're timing the parse
76
		$this->linkCache = MediaWikiServices::getInstance()->getLinkCache();
77
78
		$title = Title::newFromText( $this->getArg() );
79
		if ( !$title ) {
80
			$this->error( "Invalid title" );
81
			exit( 1 );
Coding Style Compatibility introduced by
The method execute() contains an exit expression.

An exit expression should only be used in rare cases. For example, if you write a short command line script.

In most cases, however, using an exit expression makes the code untestable and often causes incompatibilities with other libraries. Thus, unless you are absolutely sure it is required here, we recommend refactoring your code to avoid its usage.
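
One way to act on this recommendation is sketched below, under stated assumptions: failures are thrown as exceptions, and a single wrapper turns them into an exit status. The names `BenchmarkFailure`, `loadTitle()`, `runBenchmark()` and `benchmarkMain()` are hypothetical stand-ins for illustration; they are not part of MediaWiki or of the script under review.

```php
<?php
// Hypothetical exception type meaning "the script cannot continue".
class BenchmarkFailure extends RuntimeException {
}

// Hypothetical stand-in for Title::newFromText(): null means invalid input.
function loadTitle( string $text ): ?string {
	$text = trim( $text );
	return $text === '' ? null : $text;
}

// Core logic: throws instead of calling exit(), so a test can call it directly.
function runBenchmark( string $titleText ): string {
	$title = loadTitle( $titleText );
	if ( $title === null ) {
		throw new BenchmarkFailure( 'Invalid title' );
	}
	return "Parsed $title";
}

// Wrapper: the only place that maps a failure to an exit status.
function benchmarkMain( string $titleText ): int {
	try {
		runBenchmark( $titleText );
		return 0;
	} catch ( BenchmarkFailure $e ) {
		// Report the problem on stderr and signal failure to the caller.
		file_put_contents( 'php://stderr', $e->getMessage() . "\n" );
		return 1;
	}
}
```

A command-line script could then end with `exit( benchmarkMain( $argv[1] ?? '' ) );`, which keeps the single exit expression outside the code that tests need to call.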

82
		}
83
84
		if ( $this->hasOption( 'page-time' ) ) {
85
			$pageTimestamp = wfTimestamp( TS_MW, strtotime( $this->getOption( 'page-time' ) ) );
86
			$id = $this->getRevIdForTime( $title, $pageTimestamp );
Security Bug introduced by
It seems like $pageTimestamp defined by wfTimestamp(TS_MW, strto...etOption('page-time'))) on line 85 can also be of type false; however, BenchmarkParse::getRevIdForTime() only seems to accept string. Did you maybe forget to handle an error condition?

This check looks for type mismatches where the missing type is false. This is usually indicative of an error condition.

Consider the following example:

<?php

function getDate($date)
{
    if ($date !== null) {
        return new DateTime($date);
    }

    return false;
}

This function either returns a new DateTime object or false, if there was an error. This is a typical pattern in PHP programming to show that an error has occurred without raising an exception. The calling code should check for this returned false before passing on the value to another function or method that may not be able to handle a false.
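
Applied to the timestamp options in this script, such a check might look like the sketch below. `parseTimeOption()` is a hypothetical helper, and `gmdate( 'YmdHis', ... )` is an assumed stand-in for `wfTimestamp( TS_MW, ... )` so that the example runs without MediaWiki. The point is only the check on the result of `strtotime()`.

```php
<?php
/**
 * Hypothetical helper: convert a user-supplied time string to a
 * YYYYMMDDHHMMSS string, failing loudly instead of passing false along.
 *
 * @param string $input Anything strtotime() understands
 * @return string 14-digit UTC timestamp
 * @throws InvalidArgumentException If the input cannot be parsed
 */
function parseTimeOption( string $input ): string {
	$unix = strtotime( $input );
	if ( $unix === false ) {
		// strtotime() signals failure by returning false, not by throwing.
		throw new InvalidArgumentException( "Unrecognised time: $input" );
	}
	// Stand-in for wfTimestamp( TS_MW, $unix ).
	return gmdate( 'YmdHis', $unix );
}
```

With a helper of this shape, the caller receives either a usable string or an exception, so a false value never reaches a method that is documented to take a string.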

87
			if ( !$id ) {
88
				$this->error( "The page did not exist at that time" );
89
				exit( 1 );
Coding Style Compatibility introduced by
The method execute() contains an exit expression.

90
			}
91
92
			$revision = Revision::newFromId( $id );
93
		} else {
94
			$revision = Revision::newFromTitle( $title );
95
		}
96
97
		if ( !$revision ) {
98
			$this->error( "Unable to load revision, incorrect title?" );
99
			exit( 1 );
Coding Style Compatibility introduced by
The method execute() contains an exit expression.

100
		}
101
102
		$warmup = $this->getOption( 'warmup', 1 );
103
		for ( $i = 0; $i < $warmup; $i++ ) {
104
			$this->runParser( $revision );
105
		}
106
107
		$loops = $this->getOption( 'loops', 1 );
108
		if ( $loops < 1 ) {
109
			$this->error( 'Invalid number of loops specified', true );
110
		}
111
		$startUsage = getrusage();
112
		$startTime = microtime( true );
113
		for ( $i = 0; $i < $loops; $i++ ) {
114
			$this->runParser( $revision );
115
		}
116
		$endUsage = getrusage();
117
		$endTime = microtime( true );
118
119
		printf( "CPU time = %.3f s, wall clock time = %.3f s\n",
120
			// CPU time
121
			( $endUsage['ru_utime.tv_sec'] + $endUsage['ru_utime.tv_usec'] * 1e-6
122
			- $startUsage['ru_utime.tv_sec'] - $startUsage['ru_utime.tv_usec'] * 1e-6 ) / $loops,
123
			// Wall clock time
124
			( $endTime - $startTime ) / $loops
125
		);
126
	}
127
128
	/**
129
	 * Fetch the ID of the latest revision of a Title that occurred at or before the given timestamp
130
	 *
131
	 * @param Title $title
132
	 * @param string $timestamp
133
	 * @return bool|string Revision ID, or false if not found or error
134
	 */
135
	function getRevIdForTime( Title $title, $timestamp ) {
136
		$dbr = $this->getDB( DB_REPLICA );
137
138
		$id = $dbr->selectField(
139
			[ 'revision', 'page' ],
140
			'rev_id',
141
			[
142
				'page_namespace' => $title->getNamespace(),
143
				'page_title' => $title->getDBkey(),
144
				'rev_timestamp <= ' . $dbr->addQuotes( $timestamp )
145
			],
146
			__METHOD__,
147
			[ 'ORDER BY' => 'rev_timestamp DESC', 'LIMIT' => 1 ],
148
			[ 'revision' => [ 'INNER JOIN', 'rev_page=page_id' ] ]
149
		);
150
151
		return $id;
152
	}
153
154
	/**
155
	 * Parse the text from a given Revision
156
	 *
157
	 * @param Revision $revision
158
	 */
159
	function runParser( Revision $revision ) {
160
		$content = $revision->getContent();
161
		$content->getParserOutput( $revision->getTitle(), $revision->getId() );
Bug introduced by
It seems like $revision->getTitle() can be null; however, getParserOutput() does not accept null, maybe add an additional type check?

Unless you are absolutely sure that the expression can never be null because of other conditions, we strongly recommend adding an additional type check to your code:

/** @return stdClass|null */
function mayReturnNull() { }

function doesNotAcceptNull(stdClass $x) { }

// With potential error.
function withoutCheck() {
    $x = mayReturnNull();
    doesNotAcceptNull($x); // Potential error here.
}

// Safe - Alternative 1
function withCheck1() {
    $x = mayReturnNull();
    if ( ! $x instanceof stdClass) {
        throw new \LogicException('$x must be defined.');
    }
    doesNotAcceptNull($x);
}

// Safe - Alternative 2
function withCheck2() {
    $x = mayReturnNull();
    if ($x instanceof stdClass) {
        doesNotAcceptNull($x);
    }
}
162
		if ( $this->clearLinkCache ) {
163
			$this->linkCache->clear();
164
		}
165
	}
166
167
	/**
168
	 * Hook into the parser's revision ID fetcher. Make sure that the parser only
169
	 * uses revisions around the specified timestamp.
170
	 *
171
	 * @param Parser $parser
172
	 * @param Title $title
173
	 * @param bool &$skip
174
	 * @param string|bool &$id
175
	 * @return bool
176
	 */
177
	function onFetchTemplate( Parser $parser, Title $title, &$skip, &$id ) {
178
		$pdbk = $title->getPrefixedDBkey();
179
		if ( !isset( $this->idCache[$pdbk] ) ) {
180
			$proposedId = $this->getRevIdForTime( $title, $this->templateTimestamp );
181
			$this->idCache[$pdbk] = $proposedId;
182
		}
183
		if ( $this->idCache[$pdbk] !== false ) {
184
			$id = $this->idCache[$pdbk];
185
		}
186
187
		return true;
188
	}
189
}
190
191
$maintClass = 'BenchmarkParse';
192
require RUN_MAINTENANCE_IF_MAIN;
193