Passed
Pull Request — main (#426)
by MusikAnimal
08:27 queued 04:14
created

GlobalContribsRepository::getRevisions()   C

Complexity

Conditions 12
Paths 67

Size

Total Lines 123
Code Lines 62

Duplication

Lines 0
Ratio 0 %

Importance

Changes 0
Metric  Value
cc      12
eloc    62
nc      67
nop     7
dl      0
loc     123
rs      6.4024
c       0
b       0
f       0

How to fix: Long Method, Complexity

Long Method

Small methods make your code easier to understand, in particular if combined with a good name. Besides, if your method is small, finding a good name is usually much easier.

For example, if you find yourself adding comments to a method's body, this is usually a good sign to extract the commented part to a new method, and use the comment as a starting point when coming up with a good name for this new method.

Commonly applied refactorings include:
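
As a minimal sketch of the advice above, the snippet below pulls one small step out of a long method and names it after what a comment would have said. The function name buildNamespaceCondition() is hypothetical and is not part of the inspected class; the logic mirrors the namespace condition built inline in getRevisions().

```php
<?php
declare(strict_types = 1);

/**
 * Hypothetical helper extracted from a long method. The name is what an
 * inline comment such as "// Build the namespace condition" would say.
 *
 * @param int|string $namespace Namespace ID or 'all' for all namespaces.
 * @return string SQL fragment; empty when no namespace filter is needed.
 */
function buildNamespaceCondition($namespace): string
{
    // The (int) cast keeps arbitrary input out of the SQL fragment.
    return 'all' === $namespace
        ? ''
        : 'AND page_namespace = '.(int)$namespace;
}
```

The caller then reads as a single line, `$namespaceCond = buildNamespaceCondition($namespace);`, and the comment becomes unnecessary because the name carries the intent.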

<?php
declare(strict_types = 1);

namespace App\Repository;

use App\Model\Project;
use App\Model\User;
use PDO;
use Symfony\Component\DependencyInjection\ContainerInterface;
use Wikimedia\IPUtils;

/**
 * A GlobalContribsRepository is responsible for retrieving information from the database for the GlobalContribs tool.
 * @codeCoverageIgnore
 */
class GlobalContribsRepository extends Repository
{
    /** @var Project CentralAuth project (meta.wikimedia for WMF installation). */
    protected $caProject;

    /**
     * Create Project and ProjectRepository once we have the container.
     * @param ContainerInterface $container
     */
    public function setContainer(ContainerInterface $container): void
    {
        parent::setContainer($container);

        $this->caProject = ProjectRepository::getProject(
            $this->container->getParameter('central_auth_project'),
            $this->container
        );
        $this->caProject->getRepository()
            ->setContainer($this->container);
    }

    /**
     * Get a user's edit count for each project.
     * @see GlobalContribsRepository::globalEditCountsFromCentralAuth()
     * @see GlobalContribsRepository::globalEditCountsFromDatabases()
     * @param User $user The user.
     * @return mixed[]|null Elements are arrays with 'dbName' (string), 'total' (int) and 'project' (Project).
     *   Null if anon (too slow).
     */
    public function globalEditCounts(User $user): ?array
    {
        if ($user->isAnon()) {
            return null;
        }

        // Get the edit counts from CentralAuth or database.
        $editCounts = $this->globalEditCountsFromCentralAuth($user);

        // Pre-populate all projects' metadata, to prevent each project call from fetching it.
        $this->caProject->getRepository()->getAll();

Bug: The method getAll() does not exist on App\Repository\Repository. It seems like you code against a sub-type of App\Repository\Repository such as App\Repository\ProjectRepository.

If this is a false positive, you can also ignore this issue in your code via the ignore-call annotation:

        $this->caProject->getRepository()->/** @scrutinizer ignore-call */ getAll();

        // Compile the output.
        $out = [];
        foreach ($editCounts as $editCount) {
            $out[] = [
                'dbName' => $editCount['dbName'],
                'total' => $editCount['total'],
                'project' => ProjectRepository::getProject($editCount['dbName'], $this->container),
            ];
        }
        return $out;
    }

    /**
     * Get a user's total edit count on one or more projects.
     * Requires the CentralAuth extension to be installed on the project.
     * @param User $user The user.
     * @return mixed[]|false Elements are arrays with 'dbName' (string), and 'total' (int). False for logged out users.
     */
    protected function globalEditCountsFromCentralAuth(User $user)
    {
        if (true === $user->isAnon()) {
            return false;
        }

        // Set up cache.
        $cacheKey = $this->getCacheKey(func_get_args(), 'gc_globaleditcounts');
        if ($this->cache->hasItem($cacheKey)) {
            return $this->cache->getItem($cacheKey)->get();
        }

        $params = [
            'meta' => 'globaluserinfo',
            'guiprop' => 'editcount|merged',
            'guiuser' => $user->getUsername(),
        ];
        $result = $this->executeApiRequest($this->caProject, $params);
        if (!isset($result['query']['globaluserinfo']['merged'])) {
            return [];
        }
        $out = [];
        // Use a distinct loop variable so the API response in $result is not shadowed.
        foreach ($result['query']['globaluserinfo']['merged'] as $wikiInfo) {
            $out[] = [
                'dbName' => $wikiInfo['wiki'],
                'total' => $wikiInfo['editcount'],
            ];
        }

        // Cache and return.
        return $this->setCache($cacheKey, $out);
    }

    /**
     * Loop through the given dbNames and create Project objects for each.
     * @param array $dbNames
     * @return Project[] Keyed by database name.
     */
    private function formatProjects(array $dbNames): array
    {
        $projects = [];

        foreach ($dbNames as $dbName) {
            $projects[$dbName] = ProjectRepository::getProject($dbName, $this->container);
        }

        return $projects;
    }

    /**
     * Get all Projects on which the user has made at least one edit.
     * @param User $user
     * @return Project[]
     */
    public function getProjectsWithEdits(User $user): array
    {
        if ($user->isAnon()) {
            $dbNames = array_keys($this->getDbNamesAndActorIds($user));
        } else {
            $dbNames = [];

            foreach ($this->globalEditCountsFromCentralAuth($user) as $projectMeta) {
                if ($projectMeta['total'] > 0) {
                    $dbNames[] = $projectMeta['dbName'];
                }
            }
        }

        return $this->formatProjects($dbNames);
    }

    /**
     * Get projects that the user has made at least one edit on, and the associated actor ID.
     * @param User $user
     * @param string[]|null $dbNames Loop over these projects instead of all of them.
     * @return int[] Keys are database names, values are actor IDs.
     */
    public function getDbNamesAndActorIds(User $user, ?array $dbNames = null): array
    {
        // Check cache.
        $cacheKey = $this->getCacheKey(func_get_args(), 'gc_db_names_actor_ids');
        if ($this->cache->hasItem($cacheKey)) {
            return $this->cache->getItem($cacheKey)->get();
        }

        if (!$dbNames) {
            $dbNames = array_column($this->caProject->getRepository()->getAll(), 'dbName');
        }

        if ($user->isIpRange()) {
            $username = $user->getIpSubstringFromCidr().'%';
            $whereClause = "actor_name LIKE :actor";
        } else {
            $username = $user->getUsername();
            $whereClause = "actor_name = :actor";
        }

        $queriesBySlice = [];

        foreach ($dbNames as $dbName) {
            $slice = $this->getDbList()[$dbName];
            // actor_revision table only includes users who have made at least one edit.
            $actorTable = $this->getTableName($dbName, 'actor', 'revision');
            $queriesBySlice[$slice][] = "SELECT '$dbName' AS `dbName`, actor_id " .
                "FROM $actorTable WHERE $whereClause";
        }

        $actorIds = [];

        foreach ($queriesBySlice as $slice => $queries) {
            $sql = implode(' UNION ', $queries);
            $resultQuery = $this->executeProjectsQuery($slice, $sql, [
                'actor' => $username,
            ]);

            while ($row = $resultQuery->fetch()) {

Deprecated code: The function Doctrine\DBAL\ForwardCompatibility\Result::fetch() has been deprecated: Use fetchNumeric(), fetchAssociative() or fetchOne() instead.

If this is a false positive, you can also ignore this issue in your code via the ignore-deprecated annotation:

            while ($row = /** @scrutinizer ignore-deprecated */ $resultQuery->fetch()) {

This function has been deprecated. The supplier of the function has supplied an explanatory message. The explanatory message should give you some clue as to whether and when the function will be removed and what other function to use instead.

                $actorIds[$row['dbName']] = (int)$row['actor_id'];
            }
        }

        return $this->setCache($cacheKey, $actorIds);
    }

    /**
     * Get revisions by this user across the given Projects.
     * @param string[] $dbNames Database names of projects to iterate over.
     * @param User $user The user.
     * @param int|string $namespace Namespace ID or 'all' for all namespaces.
     * @param int|false $start Unix timestamp or false.
     * @param int|false $end Unix timestamp or false.
     * @param int $limit The maximum number of revisions to fetch from each project.
     * @param int|false $offset Unix timestamp. Used for pagination.
     * @return array
     */
    public function getRevisions(
        array $dbNames,
        User $user,
        $namespace = 'all',
        $start = false,
        $end = false,
        int $limit = 31, // One extra to know whether there should be another page.
        $offset = false
    ): array {
        // Check cache.
        $cacheKey = $this->getCacheKey(func_get_args(), 'gc_revisions');
        if ($this->cache->hasItem($cacheKey)) {
            return $this->cache->getItem($cacheKey)->get();
        }

        // Just need any Connection to use the ->quote() method.
        $quoteConn = $this->getProjectsConnection('s1');
        $username = $quoteConn->quote($user->getUsername(), PDO::PARAM_STR);

        // IP range handling.
        $startIp = '';
        $endIp = '';
        if ($user->isIpRange()) {
            [$startIp, $endIp] = IPUtils::parseRange($user->getUsername());
            $startIp = $quoteConn->quote($startIp, PDO::PARAM_STR);
            $endIp = $quoteConn->quote($endIp, PDO::PARAM_STR);
        }

        // Fetch actor IDs (for IP ranges, it strips trailing zeros and uses a LIKE query).
        $actorIds = $this->getDbNamesAndActorIds($user, $dbNames);

        if (!$actorIds) {
            return [];
        }

        $namespaceCond = 'all' === $namespace
            ? ''
            : 'AND page_namespace = '.(int)$namespace;
        $revDateConditions = $this->getDateConditions($start, $end, $offset, 'revs.', 'rev_timestamp');

        // Assemble queries.
        $queriesBySlice = [];
        $projectRepo = $this->caProject->getRepository();
        foreach ($dbNames as $dbName) {
            if (isset($actorIds[$dbName])) {
                $revisionTable = $projectRepo->getTableName($dbName, 'revision');
                $pageTable = $projectRepo->getTableName($dbName, 'page');
                $commentTable = $projectRepo->getTableName($dbName, 'comment', 'revision');
                $actorTable = $projectRepo->getTableName($dbName, 'actor', 'revision');
                $tagTable = $projectRepo->getTableName($dbName, 'change_tag');
                $tagDefTable = $projectRepo->getTableName($dbName, 'change_tag_def');

                if ($user->isIpRange()) {
                    $ipcTable = $projectRepo->getTableName($dbName, 'ip_changes');
                    $ipcJoin = "JOIN $ipcTable ON revs.rev_id = ipc_rev_id";
                    $whereClause = "ipc_hex BETWEEN $startIp AND $endIp";
                    $username = 'actor_name';
                } else {
                    $ipcJoin = '';
                    $whereClause = 'revs.rev_actor = '.$actorIds[$dbName];
                }

                $slice = $this->getDbList()[$dbName];
                $queriesBySlice[$slice][] = "
                    SELECT
                        '$dbName' AS dbName,
                        revs.rev_id AS id,
                        revs.rev_timestamp AS timestamp,
                        UNIX_TIMESTAMP(revs.rev_timestamp) AS unix_timestamp,
                        revs.rev_minor_edit AS minor,
                        revs.rev_deleted AS deleted,
                        revs.rev_len AS length,
                        (CAST(revs.rev_len AS SIGNED) - IFNULL(parentrevs.rev_len, 0)) AS length_change,
                        revs.rev_parent_id AS parent_id,
                        $username AS username,
                        page.page_title,
                        page.page_namespace,
                        comment_text AS comment,
                        (
                            SELECT 1
                            FROM $tagTable
                            WHERE ct_rev_id = revs.rev_id
                            AND ct_tag_id = (
                                SELECT ctd_id
                                FROM $tagDefTable
                                WHERE ctd_name = 'mw-reverted'
                            )
                            LIMIT 1
                        ) AS reverted
                    FROM $revisionTable AS revs
                        $ipcJoin
                        JOIN $pageTable AS page ON (rev_page = page_id)
                        JOIN $actorTable ON (actor_id = revs.rev_actor)
                        LEFT JOIN $revisionTable AS parentrevs ON (revs.rev_parent_id = parentrevs.rev_id)
                        LEFT OUTER JOIN $commentTable ON revs.rev_comment_id = comment_id
                    WHERE $whereClause
                        $namespaceCond
                        $revDateConditions";
            }
        }

        // Re-assemble into UNIONed queries, executing as many per slice as possible.
        $revisions = [];
        foreach ($queriesBySlice as $slice => $queries) {
            $sql = "SELECT * FROM ((\n" . join("\n) UNION (\n", $queries) . ")) a ORDER BY timestamp DESC LIMIT $limit";
            $revisions = array_merge($revisions, $this->executeProjectsQuery($slice, $sql)->fetchAll());

Deprecated code: The function Doctrine\DBAL\ForwardCom...lity\Result::fetchAll() has been deprecated: Use fetchAllNumeric(), fetchAllAssociative() or fetchFirstColumn() instead.

If this is a false positive, you can also ignore this issue in your code via the ignore-deprecated annotation:

            $revisions = array_merge($revisions, /** @scrutinizer ignore-deprecated */ $this->executeProjectsQuery($slice, $sql)->fetchAll());

This function has been deprecated. The supplier of the function has supplied an explanatory message. The explanatory message should give you some clue as to whether and when the function will be removed and what other function to use instead.

        }

        // If there are more than $limit results, re-sort by timestamp.
        if (count($revisions) > $limit) {
            usort($revisions, function ($a, $b) {
                if ($a['unix_timestamp'] === $b['unix_timestamp']) {
                    return 0;
                }
                return $a['unix_timestamp'] > $b['unix_timestamp'] ? -1 : 1;
            });

            // Truncate size to $limit.
            $revisions = array_slice($revisions, 0, $limit);
        }

        // Cache and return.
        return $this->setCache($cacheKey, $revisions);
    }
}
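
The final step of getRevisions(), merging the per-slice rows, ordering them newest first and truncating to $limit, is another candidate for extraction. The sketch below is one possible shape for it; the function name newestRevisions() is hypothetical. Unlike the listing, it sorts unconditionally rather than only when more than $limit rows come back, which is an assumption about the desired behavior and not something the inspection report asks for.

```php
<?php
declare(strict_types = 1);

/**
 * Hypothetical extraction of the "re-sort and truncate" step.
 *
 * @param array<int, array<string, mixed>> $revisions Rows that each carry a 'unix_timestamp' key.
 * @param int $limit Maximum number of rows to keep.
 * @return array<int, array<string, mixed>> Rows ordered newest first, at most $limit long.
 */
function newestRevisions(array $revisions, int $limit): array
{
    // The spaceship operator replaces the three-way if/else comparison;
    // swapping $b and $a gives descending order.
    usort($revisions, static function (array $a, array $b): int {
        return $b['unix_timestamp'] <=> $a['unix_timestamp'];
    });

    return array_slice($revisions, 0, $limit);
}
```

With this helper in place, the tail of the long method could shrink to a single call followed by the cache write.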
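
The getAll() report in globalEditCounts() comes from calling a sub-type method through the base Repository type. One possible alternative to the ignore-call annotation is to narrow the type explicitly before the call. The sketch below uses hypothetical stand-in classes named after the real ones so that it runs on its own; the real classes live in App\Repository and have far more behavior, and the helper name allDbNames() is also an assumption.

```php
<?php
declare(strict_types = 1);

// Hypothetical stand-ins for App\Repository\Repository and
// App\Repository\ProjectRepository, reduced to what the example needs.
class Repository
{
}

class ProjectRepository extends Repository
{
    /** @return array<int, array<string, string>> */
    public function getAll(): array
    {
        return [['dbName' => 'enwiki'], ['dbName' => 'dewiki']];
    }
}

/**
 * Narrow the base type before calling a method that only the sub-type has.
 *
 * @param Repository $repo
 * @return string[] Database names.
 */
function allDbNames(Repository $repo): array
{
    if (!($repo instanceof ProjectRepository)) {
        throw new LogicException('Expected a ProjectRepository.');
    }

    // After the instanceof check, static analysers can see that getAll() exists.
    return array_column($repo->getAll(), 'dbName');
}
```

Whether a runtime check or the annotation is the better fit depends on how certain the caller is about the concrete type; the check trades a few lines for a clear failure if that assumption ever breaks.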