Completed. Push: master (ac3fb6...2d8d7f), by unknown, created 02:49

PagesRepository::getWikidataInfo()   B

Complexity

Conditions 2
Paths 2

Size

Total Lines 39
Code Lines 11

Duplication

Lines 0
Ratio 0 %

Importance

Changes 0
Metric Value
dl 0
loc 39
rs 8.8571
c 0
b 0
f 0
cc 2
eloc 11
nc 2
nop 1
<?php
/**
 * This file contains only the PagesRepository class.
 */

namespace Xtools;

use DateInterval;
use Mediawiki\Api\SimpleRequest;

/**
 * A PagesRepository fetches data about Pages, either singly or in batches.
 */
class PagesRepository extends Repository
{

    /**
     * Get metadata about a single page from the API.
     * @param Project $project The project to which the page belongs.
     * @param string $pageTitle Page title.
     * @return string[] Array with some of the following keys: pageid, title, missing, displaytitle,
     * url.
     */
    public function getPageInfo(Project $project, $pageTitle)
    {
        $info = $this->getPagesInfo($project, [$pageTitle]);
        return array_shift($info);
    }

    /**
     * Get metadata about a set of pages from the API.
     * @param Project $project The project to which the pages belong.
     * @param string[] $pageTitles Array of page titles.
     * @return string[] Array keyed by the page names, each element with some of the
     * following keys: pageid, title, missing, displaytitle, url.
     */
    public function getPagesInfo(Project $project, $pageTitles)
    {
        // @TODO: Also include 'extlinks' prop when we start checking for dead external links.
        $params = [
            'prop' => 'info|pageprops',
            'inprop' => 'protection|talkid|watched|watchers|notificationtimestamp|subjectid|url|readable|displaytitle',
            'converttitles' => '',
            // 'ellimit' => 20,
            // 'elexpandurl' => '',
            'titles' => join('|', $pageTitles),
            'formatversion' => 2
            // 'pageids' => $pageIds // FIXME: allow page IDs
        ];

        $query = new SimpleRequest('query', $params);
        $api = $this->getMediawikiApi($project);
        $res = $api->getRequest($query);
        $result = [];
        if (isset($res['query']['pages'])) {
            foreach ($res['query']['pages'] as $pageInfo) {
                $result[$pageInfo['title']] = $pageInfo;
            }
        }
        return $result;
    }

    /**
     * Get revisions of a single page.
     * @param Page $page The page.
     * @param User|null $user Specify to get only revisions by the given user.
     * @return string[] Each member with keys: id, timestamp, minor, length, length_change,
     * user_id, username, comment.
     */
    public function getRevisions(Page $page, User $user = null)
    {
        $cacheKey = 'revisions.'.$page->getId();
        if ($user) {
            $cacheKey .= '.'.$user->getCacheKey();
        }

        if ($this->cache->hasItem($cacheKey)) {
            return $this->cache->getItem($cacheKey)->get();
        }

        $this->stopwatch->start($cacheKey, 'XTools');

        $stmt = $this->getRevisionsStmt($page, $user);
        $result = $stmt->fetchAll();

        // Cache for 10 minutes, and return.
        $cacheItem = $this->cache->getItem($cacheKey)
            ->set($result)
            ->expiresAfter(new DateInterval('PT10M'));
        $this->cache->save($cacheItem);
        $this->stopwatch->stop($cacheKey);

        return $result;
    }

    /**
     * Get the statement for the revisions of a single page, so that you can iterate row by row.
     * @param Page $page The page.
     * @param User|null $user Specify to get only revisions by the given user.
     * @return \Doctrine\DBAL\Driver\PDOStatement
     */
    public function getRevisionsStmt(Page $page, User $user = null)
    {
        $revTable = $this->getTableName($page->getProject()->getDatabaseName(), 'revision');
        $userClause = $user ? "revs.rev_user_text in (:username) AND " : "";

        $sql = "SELECT
                    revs.rev_id AS id,
                    revs.rev_timestamp AS timestamp,
                    revs.rev_minor_edit AS minor,
                    revs.rev_len AS length,
                    (CAST(revs.rev_len AS SIGNED) - IFNULL(parentrevs.rev_len, 0)) AS length_change,
                    revs.rev_user AS user_id,
                    revs.rev_user_text AS username,
                    revs.rev_comment AS comment
                FROM $revTable AS revs
                LEFT JOIN $revTable AS parentrevs ON (revs.rev_parent_id = parentrevs.rev_id)
                WHERE $userClause revs.rev_page = :pageid
                ORDER BY revs.rev_timestamp ASC";

        $params = ['pageid' => $page->getId()];
        if ($user) {
            $params['username'] = $user->getUsername();
        }

        $conn = $this->getProjectsConnection();
        return $conn->executeQuery($sql, $params);
    }

    /**
     * Get a count of the number of revisions of a single page.
     * @param Page $page The page.
     * @param User|null $user Specify to only count revisions by the given user.
     * @return int
     */
    public function getNumRevisions(Page $page, User $user = null)
    {
        $revTable = $this->getTableName($page->getProject()->getDatabaseName(), 'revision');
        $userClause = $user ? "rev_user_text in (:username) AND " : "";

        $sql = "SELECT COUNT(*)
                FROM $revTable
                WHERE $userClause rev_page = :pageid";
        $params = ['pageid' => $page->getId()];
        if ($user) {
            $params['username'] = $user->getUsername();
        }
        $conn = $this->getProjectsConnection();
        // fetchColumn() returns the count as a string; cast to match the documented return type.
        return (int) $conn->executeQuery($sql, $params)->fetchColumn(0);
    }

    /**
     * Get assessment data for the given pages.
     * @param Project $project The project to which the pages belong.
     * @param int[] $pageIds Page IDs.
     * @return string[] Assessment data as retrieved from the database.
     */
    public function getAssessments(Project $project, $pageIds)
    {
        if (!$project->hasPageAssessments()) {
            return [];
        }
        $paTable = $this->getTableName($project->getDatabaseName(), 'page_assessments');
        $papTable = $this->getTableName($project->getDatabaseName(), 'page_assessments_projects');
        // Separator goes first in implode(); cast to int so only numeric IDs reach the SQL string.
        $pageIds = implode(',', array_map('intval', $pageIds));

        $query = "SELECT pap_project_title AS wikiproject, pa_class AS class, pa_importance AS importance
                  FROM $paTable
                  LEFT JOIN $papTable ON pa_project_id = pap_project_id
                  WHERE pa_page_id IN ($pageIds)";

        $conn = $this->getProjectsConnection();
        return $conn->executeQuery($query)->fetchAll();
    }

    /**
     * Get any CheckWiki errors of a single page.
     * @param Page $page
     * @return array Results from query
     */
    public function getCheckWikiErrors(Page $page)
    {
        // Only support mainspace on Labs installations
        if ($page->getNamespace() !== 0 || !$this->isLabs()) {
            return [];
        }

        $sql = "SELECT error, notice, found, name_trans AS name, prio, text_trans AS explanation
                FROM s51080__checkwiki_p.cw_error a
                JOIN s51080__checkwiki_p.cw_overview_errors b
                WHERE a.project = b.project
                AND a.project = :dbName
                AND a.title = :title
                AND a.error = b.id
                AND a.ok = 0";

        // remove _p if present
        $dbName = preg_replace('/_p$/', '', $page->getProject()->getDatabaseName());

        // Page title without underscores (str_replace just to be sure)
        $pageTitle = str_replace('_', ' ', $page->getTitle());

        $resultQuery = $this->getToolsConnection()->prepare($sql);
        $resultQuery->bindParam(':dbName', $dbName);
        $resultQuery->bindParam(':title', $pageTitle);
        $resultQuery->execute();

        return $resultQuery->fetchAll();
    }

    /**
     * Get basic Wikidata information about the page: label and description.
     * @param Page $page
     * @return string[] In the format:
     *    [[
     *         'term' => string such as 'label',
     *         'term_text' => string (value for 'label'),
     *     ], ... ]
     */
    public function getWikidataInfo(Page $page)
    {
        if (empty($page->getWikidataId())) {
            return [];
        }

        $wikidataId = ltrim($page->getWikidataId(), 'Q');
        $lang = $page->getProject()->getLang();

        // The page title is built with CONCAT() because a placeholder written inside
        // a quoted SQL literal ('Q:wikidataId') is never bound.
        $sql = "SELECT IF(term_type = 'label', 'label', 'description') AS term, term_text
                FROM wikidatawiki_p.wb_entity_per_page
                JOIN wikidatawiki_p.page ON epp_page_id = page_id
                JOIN wikidatawiki_p.wb_terms ON term_entity_id = epp_entity_id
                    AND term_language = :lang
                    AND term_type IN ('label', 'description')
                WHERE epp_entity_id = :wikidataId

                UNION

                SELECT pl_title AS term, wb_terms.term_text
                FROM wikidatawiki_p.pagelinks
                JOIN wikidatawiki_p.wb_terms ON term_entity_id = SUBSTRING(pl_title, 2)
                    AND term_entity_type = (IF(SUBSTRING(pl_title, 1, 1) = 'Q', 'item', 'property'))
                    AND term_language = :lang
                    AND term_type = 'label'
                WHERE pl_namespace IN (0, 120)
                    AND pl_from = (
                        SELECT page_id FROM wikidatawiki_p.page
                        WHERE page_namespace = 0
                            AND page_title = CONCAT('Q', :wikidataId)
                    )";

        $resultQuery = $this->getProjectsConnection()->prepare($sql);
        $resultQuery->bindParam(':lang', $lang);
        $resultQuery->bindParam(':wikidataId', $wikidataId);
        $resultQuery->execute();

        return $resultQuery->fetchAll();
    }

    /**
     * Get or count all wikidata items for the given page,
     *     not just languages of sister projects
     * @param Page $page
     * @param bool $count Set to true to get only a COUNT
     * @return string[]|int Records as returned by the DB,
     *                      or raw COUNT of the records.
     */
    public function getWikidataItems(Page $page, $count = false)
    {
        if (!$page->getWikidataId()) {
            return $count ? 0 : [];
        }

        $wikidataId = ltrim($page->getWikidataId(), 'Q');

        $sql = "SELECT " . ($count ? 'COUNT(*) AS count' : '*') . "
                FROM wikidatawiki_p.wb_items_per_site
                WHERE ips_item_id = :wikidataId";

        $resultQuery = $this->getProjectsConnection()->prepare($sql);
        $resultQuery->bindParam(':wikidataId', $wikidataId);
        $resultQuery->execute();

        $result = $resultQuery->fetchAll();

        return $count ? (int) $result[0]['count'] : $result;
    }

    /**
     * Get the number of incoming and outgoing links, and of redirects to the given page.
     * @param Page $page
     * @return string[] Counts with the keys 'links_ext_count', 'links_out_count',
     *                  'links_in_count' and 'redirects_count'
     */
    public function countLinksAndRedirects(Page $page)
    {
        $externalLinksTable = $this->getTableName($page->getProject()->getDatabaseName(), 'externallinks');
        $pageLinksTable = $this->getTableName($page->getProject()->getDatabaseName(), 'pagelinks');
        $redirectTable = $this->getTableName($page->getProject()->getDatabaseName(), 'redirect');

        $sql = "SELECT COUNT(*) AS value, 'links_ext' AS type
                FROM $externalLinksTable WHERE el_from = :id
                UNION
                SELECT COUNT(*) AS value, 'links_out' AS type
                FROM $pageLinksTable WHERE pl_from = :id
                UNION
                SELECT COUNT(*) AS value, 'links_in' AS type
                FROM $pageLinksTable WHERE pl_namespace = :namespace AND pl_title = :title
                UNION
                SELECT COUNT(*) AS value, 'redirects' AS type
                FROM $redirectTable WHERE rd_namespace = :namespace AND rd_title = :title";

        $params = [
            'id' => $page->getId(),
            'title' => str_replace(' ', '_', $page->getTitle()),
            'namespace' => $page->getNamespace(),
        ];

        $conn = $this->getProjectsConnection();
        $res = $conn->executeQuery($sql, $params);

        $data = [];

        // Transform to associative array by 'type'
        foreach ($res as $row) {
            $data[$row['type'] . '_count'] = $row['value'];
        }

        return $data;
    }

    /**
     * Count wikidata items for the given page, not just languages of sister projects
     * @param Page $page
     * @return int Number of records.
     */
    public function countWikidataItems(Page $page)
    {
        return $this->getWikidataItems($page, true);
    }
}
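
The subquery in getWikidataInfo() compares page_title against a quoted value that is meant to contain the bound Wikidata ID. As a rough illustration of how a placeholder behaves inside and outside a quoted SQL literal, here is a minimal sketch. It assumes the pdo_sqlite extension is available and uses an in-memory SQLite database instead of the replica databases, so it concatenates with SQLite's || operator where MySQL would use CONCAT().

```php
<?php
// Assumption: pdo_sqlite is installed. The queries need no tables; they only
// show how placeholder text is treated by the statement parser.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Placeholder text inside a quoted literal is plain text; there is nothing to bind.
$stmt = $pdo->prepare("SELECT 'Q:wikidataId' AS title");
$stmt->execute();
$literal = $stmt->fetchColumn();

// Placeholder outside the quotes: the value is bound, then concatenated in SQL.
$stmt = $pdo->prepare("SELECT 'Q' || :wikidataId AS title");
$stmt->bindValue(':wikidataId', '42');
$stmt->execute();
$bound = $stmt->fetchColumn();

echo $literal, "\n", $bound, "\n";
```

The first query should return the placeholder name as literal text, while the second should return the prefix followed by the bound value.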
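
getAssessments() interpolates the page IDs directly into the IN (...) list of its query. One possible alternative is to generate one positional placeholder per ID and pass the IDs as parameters. The helper below is a hypothetical sketch, not part of XTools; the name buildInPlaceholders and the simplified table name in the example query are assumptions made for illustration.

```php
<?php
/**
 * Hypothetical helper (not part of XTools): build a "?, ?, ?" placeholder list
 * plus the matching integer parameters for an SQL IN (...) clause.
 *
 * @param array $ids Raw page IDs, possibly numeric strings.
 * @return array Two elements: the placeholder string and the list of int parameters.
 */
function buildInPlaceholders(array $ids)
{
    // Cast every ID to int and reindex so the parameters form a plain list.
    $params = array_values(array_map('intval', $ids));
    // One "?" per ID; an empty input yields an empty string, which callers must handle.
    $placeholders = implode(', ', array_fill(0, count($params), '?'));
    return [$placeholders, $params];
}

list($placeholders, $params) = buildInPlaceholders([3, '7', 12]);
// Simplified table name, for illustration only.
$sql = "SELECT pa_class FROM page_assessments WHERE pa_page_id IN ($placeholders)";
```

The resulting $sql and $params could then be handed to a call such as executeQuery($sql, $params), so that no ID is ever written into the SQL text itself.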