Completed
Branch master (dc3656) by unknown, 30:14

LinksDeletionUpdate::batchDeleteByPK()   B

Complexity

Conditions 4
Paths 6

Size

Total Lines 23
Code Lines 15

Duplication

Lines 0
Ratio 0 %

Importance

Changes 1
Bugs 0 Features 0
Metric Value
cc 4
eloc 15
c 1
b 0
f 0
nc 6
nop 4
dl 0
loc 23
rs 8.7972
<?php
/**
 * Updater for link tracking tables after a page edit.
 *
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation; either version 2 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License along
 * with this program; if not, write to the Free Software Foundation, Inc.,
 * 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
 * http://www.gnu.org/copyleft/gpl.html
 *
 * @file
 */
use MediaWiki\MediaWikiServices;

/**
 * Update object handling the cleanup of links tables after a page was deleted.
 **/
class LinksDeletionUpdate extends DataUpdate implements EnqueueableDataUpdate {
	/** @var WikiPage */
	protected $page;
	/** @var integer */
	protected $pageId;
	/** @var string */
	protected $timestamp;

	/** @var IDatabase */
	private $db;

	/**
	 * @param WikiPage $page Page we are updating
	 * @param integer|null $pageId ID of the page we are updating [optional]
	 * @param string|null $timestamp TS_MW timestamp of deletion
	 * @throws InvalidArgumentException
	 */
	function __construct( WikiPage $page, $pageId = null, $timestamp = null ) {
		$this->page = $page;
		if ( $pageId ) {
Issue (Bug, Best Practice):
The expression $pageId of type integer|null is loosely compared to true; this is ambiguous if the integer can be zero. You might want to explicitly use !== null instead.

In PHP, under loose comparison (such as ==, !=, or switch conditions), values of different types can compare as equal.

For integer values, zero is a special case; in particular, the following results might be unexpected:

0   == false // true
0   == null  // true
123 == false // false
123 == null  // false

// It is often better to use strict comparison
0 === false // false
0 === null  // false
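As a rough illustration of the suggestion above, the sketch below mirrors the constructor's fallback logic with a strict null check. The function `resolvePageId()` and its parameters are hypothetical and exist only for this example; they are not part of MediaWiki. Note that, unlike the truthiness test, a strict check accepts an explicit ID of 0 instead of falling back to the page's current ID, so the two variants are not behaviorally identical.

```php
<?php
/**
 * Hypothetical helper (illustration only, not MediaWiki code): pick the page
 * ID the way the constructor does, but compare against null strictly.
 *
 * @param int|null $pageId Explicit page ID, or null if none was given
 * @param bool $pageExists Whether the page currently exists
 * @param int $currentId ID reported by the existing page
 * @return int
 */
function resolvePageId( $pageId, $pageExists, $currentId ) {
	if ( $pageId !== null ) {
		// An explicit ID always wins, even when it is 0
		return $pageId;
	} elseif ( $pageExists ) {
		return $currentId;
	}

	throw new InvalidArgumentException( "Page ID not known. Page doesn't exist?" );
}
```

Whether an ID of 0 should be accepted or rejected is a design decision for the caller; the sketch only makes that decision explicit.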
			$this->pageId = $pageId; // page ID at time of deletion
		} elseif ( $page->exists() ) {
			$this->pageId = $page->getId();
		} else {
			throw new InvalidArgumentException( "Page ID not known. Page doesn't exist?" );
		}

		$this->timestamp = $timestamp ?: wfTimestampNow();
Issue (Documentation, Bug):
It seems like $timestamp ?: wfTimestampNow() can also be of type false. However, the property $timestamp is declared as type string. Maybe add an additional type check?

Our type inference engine has found a suspicious assignment of a value to a property. This check raises an issue when a value that can be of a mixed type is assigned to a property that is type hinted more strictly.

For example, imagine you have a variable $account_id that can hold either an Id object or false (if there is no account ID yet). Your code now assigns that value to the id property of an instance of the Account class. This class holds a proper account, so the id value must no longer be false.

Either this assignment is in error or a type check should be added for that assignment.

class Id
{
    public $id;

    public function __construct($id)
    {
        $this->id = $id;
    }
}

class Account
{
    /** @var  Id $id */
    public $id;
}

$account_id = false;

if (starsAreRight()) {
    $account_id = new Id(42);
}

$account = new Account();
if ($account_id instanceof Id)
{
    $account->id = $account_id;
}
	}

	public function doUpdate() {
		$services = MediaWikiServices::getInstance();
		$config = $services->getMainConfig();
		$lbFactory = $services->getDBLoadBalancerFactory();
		$batchSize = $config->get( 'UpdateRowsPerQuery' );

		// Page may already be deleted, so don't just getId()
		$id = $this->pageId;
		// Make sure all links update threads see the changes of each other.
		// This handles the case when updates have to be batched into several COMMITs.
		$scopedLock = LinksUpdate::acquirePageLock( $this->getDB(), $id );

		$title = $this->page->getTitle();
		$dbw = $this->getDB(); // convenience

		// Delete restrictions for it
		$dbw->delete( 'page_restrictions', [ 'pr_page' => $id ], __METHOD__ );

		// Fix category table counts
		$cats = $dbw->selectFieldValues(
			'categorylinks',
			'cl_to',
			[ 'cl_from' => $id ],
			__METHOD__
		);
		$catBatches = array_chunk( $cats, $batchSize );
		foreach ( $catBatches as $catBatch ) {
			$this->page->updateCategoryCounts( [], $catBatch, $id );
			if ( count( $catBatches ) > 1 ) {
				$lbFactory->commitAndWaitForReplication(
					__METHOD__, $this->ticket, [ 'wiki' => $dbw->getWikiID() ]
				);
			}
		}

		// Refresh the category table entry if it seems to have no pages. Check
		// master for the most up-to-date cat_pages count.
		if ( $title->getNamespace() === NS_CATEGORY ) {
			$row = $dbw->selectRow(
				'category',
				[ 'cat_id', 'cat_title', 'cat_pages', 'cat_subcats', 'cat_files' ],
				[ 'cat_title' => $title->getDBkey(), 'cat_pages <= 0' ],
				__METHOD__
			);
			if ( $row ) {
				Category::newFromRow( $row, $title )->refreshCounts();
Issue (Bug):
It seems like $row, defined by the $dbw->selectRow('categor...ges <= 0'), __METHOD__) call above, can also be of type boolean; however, Category::newFromRow() only seems to accept object. Maybe add an additional type check?

If a method or function can return multiple different types of values, and you cannot be sure that only a single type is possible in this context, we recommend adding a type check:

/**
 * @return array|string
 */
function returnsDifferentValues($x) {
    if ($x) {
        return 'foo';
    }

    return array();
}

$x = returnsDifferentValues($y);
if (is_array($x)) {
    // $x is an array.
}

If this is a common case that PHP Analyzer should handle natively, please let us know by opening an issue.
			}
		}

		// If using cascading deletes, we can skip some explicit deletes
		if ( !$dbw->cascadingDeletes() ) {
Issue (Bug):
The method cascadingDeletes() does not exist on IDatabase. Did you maybe mean delete()?

This check marks calls to methods that do not seem to exist on an object.

This is most likely the result of a method being renamed without all references to it being renamed as well.
			// Delete outgoing links
			$this->batchDeleteByPK(
				'pagelinks',
				[ 'pl_from' => $id ],
				[ 'pl_from', 'pl_namespace', 'pl_title' ],
				$batchSize
			);
			$this->batchDeleteByPK(
				'imagelinks',
				[ 'il_from' => $id ],
				[ 'il_from', 'il_to' ],
				$batchSize
			);
			$this->batchDeleteByPK(
				'categorylinks',
				[ 'cl_from' => $id ],
				[ 'cl_from', 'cl_to' ],
				$batchSize
			);
			$this->batchDeleteByPK(
				'templatelinks',
				[ 'tl_from' => $id ],
				[ 'tl_from', 'tl_namespace', 'tl_title' ],
				$batchSize
			);
			$this->batchDeleteByPK(
				'externallinks',
				[ 'el_from' => $id ],
				[ 'el_id' ],
				$batchSize
			);
			$this->batchDeleteByPK(
				'langlinks',
				[ 'll_from' => $id ],
				[ 'll_from', 'll_lang' ],
				$batchSize
			);
			$this->batchDeleteByPK(
				'iwlinks',
				[ 'iwl_from' => $id ],
				[ 'iwl_from', 'iwl_prefix', 'iwl_title' ],
				$batchSize
			);
			// Delete any redirect entry or page props entries
			$dbw->delete( 'redirect', [ 'rd_from' => $id ], __METHOD__ );
			$dbw->delete( 'page_props', [ 'pp_page' => $id ], __METHOD__ );
		}

		// If using cleanup triggers, we can skip some manual deletes
		if ( !$dbw->cleanupTriggers() ) {
			// Find recentchanges entries to clean up...
			$rcIdsForTitle = $dbw->selectFieldValues(
				'recentchanges',
				'rc_id',
				[
					'rc_type != ' . RC_LOG,
					'rc_namespace' => $title->getNamespace(),
					'rc_title' => $title->getDBkey(),
					'rc_timestamp < ' .
						$dbw->addQuotes( $dbw->timestamp( $this->timestamp ) )
				],
				__METHOD__
			);
			$rcIdsForPage = $dbw->selectFieldValues(
				'recentchanges',
				'rc_id',
				[ 'rc_type != ' . RC_LOG, 'rc_cur_id' => $id ],
				__METHOD__
			);

			// T98706: delete by PK to avoid lock contention with RC delete log insertions
			$rcIdBatches = array_chunk( array_merge( $rcIdsForTitle, $rcIdsForPage ), $batchSize );
			foreach ( $rcIdBatches as $rcIdBatch ) {
				$dbw->delete( 'recentchanges', [ 'rc_id' => $rcIdBatch ], __METHOD__ );
				if ( count( $rcIdBatches ) > 1 ) {
					$lbFactory->commitAndWaitForReplication(
						__METHOD__, $this->ticket, [ 'wiki' => $dbw->getWikiID() ]
					);
				}
			}
		}

		// Commit and release the lock
		ScopedCallback::consume( $scopedLock );
	}

	private function batchDeleteByPK( $table, array $conds, array $pk, $bSize ) {
		$services = MediaWikiServices::getInstance();
		$lbFactory = $services->getDBLoadBalancerFactory();
		$dbw = $this->getDB(); // convenience

		$res = $dbw->select( $table, $pk, $conds, __METHOD__ );

		$pkDeleteConds = [];
		foreach ( $res as $row ) {
Issue (Bug):
The expression $res of type object<ResultWrapper>|boolean is not guaranteed to be traversable. How about adding an additional type check?

There are several ways to fix this problem.

  1. If you want to be on the safe side, you can add an additional type check:

    $collection = json_decode($data, true);
    if ( ! is_array($collection)) {
        throw new \RuntimeException('$collection must be an array.');
    }

    foreach ($collection as $item) { /** ... */ }

  2. If you are sure that the expression is traversable, you might want to add a doc comment cast to improve IDE auto-completion and static analysis:

    /** @var array $collection */
    $collection = json_decode($data, true);

    foreach ($collection as $item) { /** ... */ }

  3. Mark the issue as a false positive.
			$pkDeleteConds[] = $dbw->makeList( (array)$row, LIST_AND );
			if ( count( $pkDeleteConds ) >= $bSize ) {
				$dbw->delete( $table, $dbw->makeList( $pkDeleteConds, LIST_OR ), __METHOD__ );
				$lbFactory->commitAndWaitForReplication(
					__METHOD__, $this->ticket, [ 'wiki' => $dbw->getWikiID() ]
				);
				$pkDeleteConds = [];
			}
		}

		if ( $pkDeleteConds ) {
			$dbw->delete( $table, $dbw->makeList( $pkDeleteConds, LIST_OR ), __METHOD__ );
		}
	}

	protected function getDB() {
		if ( !$this->db ) {
			$this->db = wfGetDB( DB_MASTER );
		}

		return $this->db;
	}

	public function getAsJobSpecification() {
		return [
			'wiki' => $this->getDB()->getWikiID(),
			'job'  => new JobSpecification(
				'deleteLinks',
				[ 'pageId' => $this->pageId, 'timestamp' => $this->timestamp ],
				[ 'removeDuplicates' => true ],
				$this->page->getTitle()
			)
		];
	}
}
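
The accumulate-and-flush pattern in batchDeleteByPK() can be sketched without a database. In the simplified version below, `flushInBatches()` and the `$flush` callback are hypothetical names introduced for illustration; the callback stands in for the delete call (and, in the real method, the replication wait that follows each full batch). It is a sketch of the batching logic only, not the MediaWiki implementation.

```php
<?php
/**
 * Hypothetical sketch: collect primary-key rows and hand them to $flush in
 * groups of at most $batchSize, flushing any remainder at the end.
 *
 * @param iterable $rows Rows keyed by primary-key column name
 * @param int $batchSize Maximum number of rows per flush
 * @param callable $flush Receives one array of rows per batch
 * @return int Number of batches flushed
 */
function flushInBatches( iterable $rows, int $batchSize, callable $flush ): int {
	$pending = [];
	$batches = 0;

	foreach ( $rows as $row ) {
		$pending[] = $row;
		if ( count( $pending ) >= $batchSize ) {
			// Full batch: flush it and start collecting again
			$flush( $pending );
			$pending = [];
			$batches++;
		}
	}

	if ( $pending ) {
		// Remainder smaller than a full batch
		$flush( $pending );
		$batches++;
	}

	return $batches;
}

// Usage: five rows with a batch size of two
$sizes = [];
$rows = [];
for ( $i = 1; $i <= 5; $i++ ) {
	$rows[] = [ 'il_from' => 42, 'il_to' => "File_$i.png" ];
}
$batchCount = flushInBatches( $rows, 2, function ( array $batch ) use ( &$sizes ) {
	$sizes[] = count( $batch );
} );
```

Deleting by explicit primary-key tuples in bounded batches keeps each statement small, which is the same motivation the T98706 comment gives for the recentchanges cleanup.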