Issues (196)

Security Analysis    6 potential vulnerabilities

This project does not seem to handle request data directly; as such, no vulnerable execution paths were found.

  File Inclusion
File Inclusion enables an attacker to inject custom files into PHP's file loading mechanism, either explicitly passed to include, or for example via PHP's auto-loading mechanism.
  Regex Injection
Regex Injection enables an attacker to execute arbitrary code in your PHP process.
  SQL Injection (4)
SQL Injection enables an attacker to execute arbitrary SQL code on your database server gaining access to user data, or manipulating user data.
  Response Splitting
Response Splitting can be used to send arbitrary responses.
  File Manipulation
File Manipulation enables an attacker to write custom data to files. This potentially leads to injection of arbitrary code on the server.
  Object Injection
Object Injection enables an attacker to inject an object into PHP code, and can lead to arbitrary code execution, file exposure, or file manipulation attacks.
  File Exposure
File Exposure allows an attacker to gain access to local files that they should not be able to access. These files can, for example, include database credentials or other configuration files.
  XML Injection
XML Injection enables an attacker to read files on your local filesystem including configuration files, or can be abused to freeze your web-server process.
  Code Injection
Code Injection enables an attacker to execute arbitrary code on the server.
  Variable Injection (1)
Variable Injection enables an attacker to overwrite program variables with custom data, and can lead to further vulnerabilities.
  XPath Injection
XPath Injection enables an attacker to modify which parts of an XML document are read. If that XML document is used, for example, for authentication, this can lead to further vulnerabilities similar to SQL Injection.
  Other Vulnerability
This category comprises other attack vectors such as manipulating the PHP runtime, loading custom extensions, freezing the runtime, or similar.
  Command Injection
Command Injection enables an attacker to inject a shell command that is executed with the privileges of the web-server. This can be used to expose sensitive data, or to gain access to your server.
  LDAP Injection
LDAP Injection enables an attacker to inject LDAP statements potentially granting permission to run unauthorized queries, or modify content inside the LDAP tree.
  Cross-Site Scripting
Cross-Site Scripting enables an attacker to inject code into the response of a web-request that is viewed by other users. It can for example be used to bypass access controls, or even to take over other users' accounts.
Unfortunately, the security analysis is currently not available for your project. If you are a non-commercial open-source project, please contact support to gain access.

src/Model/Authorship.php (3 issues)

<?php

declare(strict_types=1);

namespace App\Model;

use App\Repository\AuthorshipRepository;
use DateTime;
use GuzzleHttp\Exception\RequestException;

class Authorship extends Model
{
    /** @const string[] Domain names of wikis supported by WikiWho. */
    public const SUPPORTED_PROJECTS = [
        'ar.wikipedia.org',
        'de.wikipedia.org',
        'en.wikipedia.org',
        'es.wikipedia.org',
        'eu.wikipedia.org',
        'fr.wikipedia.org',
        'hu.wikipedia.org',
        'id.wikipedia.org',
        'it.wikipedia.org',
        'ja.wikipedia.org',
        'nl.wikipedia.org',
        'pl.wikipedia.org',
        'pt.wikipedia.org',
        'tr.wikipedia.org',
    ];

    /** @var int|null Target revision ID. Null for latest revision. */
    protected ?int $target;

    /** @var array List of editors and the percentage of the current content that they authored. */
    protected array $data;

    /** @var array Revision that the data pertains to, with keys 'id' and 'timestamp'. */
    protected array $revision;

    /**
     * Authorship constructor.
     * @param AuthorshipRepository $repository
     * @param Page $page The page to process.
     * @param string|null $target Either a revision ID or date in YYYY-MM-DD format. Null to use latest revision.
     * @param int|null $limit Max number of results.
     */
    public function __construct(
        AuthorshipRepository $repository,
        Page $page,
        ?string $target = null,
        ?int $limit = null
    ) {
        $this->repository = $repository;
        $this->page = $page;
        $this->limit = $limit;
        $this->target = $this->getTargetRevId($target);
    }

    private function getTargetRevId(?string $target): ?int
    {
        if (null === $target) {
            return null;
        }

        if (preg_match('/\d{4}-\d{2}-\d{2}/', $target)) {
            $date = DateTime::createFromFormat('Y-m-d', $target);
            return $this->page->getRevisionIdAtDate($date);
The method getRevisionIdAtDate() does not exist on null. ( Ignorable by Annotation )

If this is a false-positive, you can also ignore this issue in your code via the ignore-call annotation

            return $this->page->/** @scrutinizer ignore-call */ getRevisionIdAtDate($date);

This check looks for calls to methods that do not seem to exist on a given type. It looks for the method on the type itself as well as in inherited classes or implemented interfaces.

This is most likely a typographical error or the method has been renamed.
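
A warning of this kind usually means the analyzer believes the receiver may be null at the call site, presumably because the property is declared nullable in a parent class that this listing does not show. As an alternative to the ignore-call annotation, the property can be narrowed before the call. The following is a minimal sketch under that assumption; `PageStub`, `AuthorshipSketch`, `revisionIdAt()` and the placeholder return value are hypothetical and only the method name `getRevisionIdAtDate()` comes from the listing.

```php
<?php

declare(strict_types=1);

// Hypothetical stand-in for the real Page class; only the method name
// getRevisionIdAtDate() mirrors the listing.
final class PageStub
{
    public function getRevisionIdAtDate(DateTimeInterface $date): ?int
    {
        // Placeholder value; the real method would look up a revision.
        return 12345;
    }
}

// Hypothetical model with a nullable page property (an assumption about
// how the parent class declares it).
final class AuthorshipSketch
{
    private ?PageStub $page;

    public function __construct(?PageStub $page)
    {
        $this->page = $page;
    }

    public function revisionIdAt(string $target): ?int
    {
        // Narrow the nullable property before calling a method on it.
        if (null === $this->page) {
            return null;
        }

        // createFromFormat() returns false when the string cannot be parsed.
        $date = DateTimeImmutable::createFromFormat('Y-m-d', $target);
        if (false === $date) {
            return null;
        }

        return $this->page->getRevisionIdAtDate($date);
    }
}
```

Whether returning null is the right fallback depends on the caller; throwing an exception would be an equally reasonable choice if a missing page indicates a programming error.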

        }

        return (int)$target;
    }

    /**
     * Domains of supported wikis.
     * @return string[]
     */
    public function getSupportedWikis(): array
    {
        return self::SUPPORTED_PROJECTS;
    }

    /**
     * Get the target revision ID. Null for latest revision.
     * @return int|null
     */
    public function getTarget(): ?int
    {
        return $this->target;
    }

    /**
     * Authorship information for the top $this->limit authors.
     * @return array
     */
    public function getList(): array
    {
        return $this->data['list'] ?? [];
    }

    /**
     * Get error thrown when preparing the data, or null if no error occurred.
     * @return string|null
     */
    public function getError(): ?string
    {
        return $this->data['error'] ?? null;
    }

    /**
     * Get the total number of authors.
     * @return int
     */
    public function getTotalAuthors(): int
    {
        return $this->data['totalAuthors'];
    }

    /**
     * Get the total number of characters added.
     * @return int
     */
    public function getTotalCount(): int
    {
        return $this->data['totalCount'];
    }

    /**
     * Get summary data on the 'other' authors who are not in the top $this->limit.
     * @return array|null
     */
    public function getOthers(): ?array
    {
        return $this->data['others'] ?? null;
    }

    /**
     * Get the revision the authorship data pertains to, with keys 'id' and 'timestamp'.
     * @return array|null
     */
    public function getRevision(): ?array
    {
        return $this->revision;
    }

    /**
     * Is the given page supported by the Authorship tool?
     * @param Page $page
     * @return bool
     */
    public static function isSupportedPage(Page $page): bool
    {
        return in_array($page->getProject()->getDomain(), self::SUPPORTED_PROJECTS) &&
            0 === $page->getNamespace();
    }

    /**
     * Get the revision data from the WikiWho API and set $this->revision with basic info.
     * If there are errors, they are placed in $this->data['error'] and null will be returned.
     * @param bool $returnRevId Whether or not to include revision IDs in the response.
     * @return array|null null if there were errors.
     */
    protected function getRevisionData(bool $returnRevId = false): ?array
    {
        try {
            $ret = $this->repository->getData($this->page, $this->target, $returnRevId);
The method getData() does not exist on App\Repository\Repository. It seems like you code against a sub-type of App\Repository\Repository such as App\Repository\LargestPagesRepository or App\Repository\AuthorshipRepository. ( Ignorable by Annotation )

If this is a false-positive, you can also ignore this issue in your code via the ignore-call annotation

            /** @scrutinizer ignore-call */
            $ret = $this->repository->getData($this->page, $this->target, $returnRevId);
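
Another way to address a "coded against a sub-type" warning, without an annotation, is to narrow the general repository type to the concrete one before calling the sub-type's method. The sketch below illustrates that idea with hypothetical stand-in classes (`RepositoryStub`, `AuthorshipRepositoryStub`, `ModelStub`, `AuthorshipModelSketch`) and a simplified `getData()` signature; it assumes the base model declares the property with the general repository type, which the listing does not show.

```php
<?php

declare(strict_types=1);

// Hypothetical stand-ins; only the method name getData() comes from the listing.
abstract class RepositoryStub
{
}

final class AuthorshipRepositoryStub extends RepositoryStub
{
    // Simplified signature for illustration; the real method takes a Page object.
    public function getData(string $pageTitle, ?int $revId, bool $returnRevId): array
    {
        return ['page' => $pageTitle, 'revId' => $revId, 'withRevId' => $returnRevId];
    }
}

abstract class ModelStub
{
    // The base class only knows the general repository type (an assumption).
    protected RepositoryStub $repository;
}

final class AuthorshipModelSketch extends ModelStub
{
    public function __construct(AuthorshipRepositoryStub $repository)
    {
        $this->repository = $repository;
    }

    // Typed accessor that narrows the general type to the concrete sub-type,
    // so static analysis can see that getData() exists.
    private function authorshipRepository(): AuthorshipRepositoryStub
    {
        if (!$this->repository instanceof AuthorshipRepositoryStub) {
            throw new LogicException('Expected an AuthorshipRepositoryStub.');
        }

        return $this->repository;
    }

    public function fetch(string $pageTitle, ?int $revId = null): array
    {
        return $this->authorshipRepository()->getData($pageTitle, $revId, false);
    }
}
```

The accessor keeps the runtime check in one place; an alternative with the same effect on the analyzer would be a `@property` or `@var` docblock on the subclass, at the cost of no runtime verification.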
        } catch (RequestException $e) {
            $this->data = [
                'error' => 'unknown',
            ];
            return null;
        }

        // If revision can't be found, return error message.
        if (!isset($ret['revisions'][0])) {
            $this->data = [
                'error' => $ret['Error'] ?? 'Unknown',
            ];
            return null;
        }

        $revId = array_keys($ret['revisions'][0])[0];
        $revisionData = $ret['revisions'][0][$revId];

        $this->revision = [
            'id' => $revId,
            'timestamp' => $revisionData['time'],
        ];

        return $revisionData;
    }

    /**
     * Get authorship attribution from the WikiWho API.
     * @see https://www.f-squared.org/wikiwho/
     */
    public function prepareData(): void
    {
        if (isset($this->data)) {
            return;
        }

        // Set revision data. self::setRevisionData() returns null if there are errors.
        $revisionData = $this->getRevisionData();
        if (null === $revisionData) {
            return;
        }

        [$counts, $totalCount, $userIds] = $this->countTokens($revisionData['tokens']);
        $usernameMap = $this->getUsernameMap($userIds);

        if (null !== $this->limit) {
            $countsToProcess = array_slice($counts, 0, $this->limit, true);
        } else {
            $countsToProcess = $counts;
        }

        $data = [];

        // Used to get the character count and percentage of the remaining N editors, after the top $this->limit.
        $percentageSum = 0;
        $countSum = 0;
        $numEditors = 0;

        // Loop through once more, creating an array with the user names (or IP addresses)
        // as the key, and the count and percentage as the value.
        foreach ($countsToProcess as $editor => $count) {
            $index = $usernameMap[$editor] ?? $editor;

            $percentage = round(100 * ($count / $totalCount), 1);

            // If we are showing > 10 editors in the table, we still only want the top 10 for the chart.
            if ($numEditors < 10) {
                $percentageSum += $percentage;
                $countSum += $count;
                $numEditors++;
            }

            $data[$index] = [
                'count' => $count,
                'percentage' => $percentage,
            ];
        }

        $this->data = [
            'list' => $data,
            'totalAuthors' => count($counts),
            'totalCount' => $totalCount,
        ];

        // Record character count and percentage for the remaining editors.
        if ($percentageSum < 100) {
            $this->data['others'] = [
                'count' => $totalCount - $countSum,
                'percentage' => round(100 - $percentageSum, 1),
                'numEditors' => count($counts) - $numEditors,
            ];
        }
    }

    /**
     * Get a map of user IDs to usernames, given the IDs.
     * @param int[] $userIds
     * @return array IDs as keys, usernames as values.
     */
    private function getUsernameMap(array $userIds): array
    {
        if (empty($userIds)) {
            return [];
        }

        $userIdsNames = $this->repository->getUsernamesFromIds(
The method getUsernamesFromIds() does not exist on App\Repository\Repository. It seems like you code against a sub-type of App\Repository\Repository such as App\Repository\AuthorshipRepository. ( Ignorable by Annotation )

If this is a false-positive, you can also ignore this issue in your code via the ignore-call annotation

        /** @scrutinizer ignore-call */
        $userIdsNames = $this->repository->getUsernamesFromIds(
            $this->page->getProject(),
            $userIds
        );

        $usernameMap = [];
        foreach ($userIdsNames as $userIdName) {
            $usernameMap[$userIdName['user_id']] = $userIdName['user_name'];
        }

        return $usernameMap;
    }

    /**
     * Get counts of token lengths for each author. Used in self::prepareData()
     * @param array $tokens
     * @return array [counts by user, total count, IDs of accounts]
     */
    private function countTokens(array $tokens): array
    {
        $counts = [];
        $userIds = [];
        $totalCount = 0;

        // Loop through the tokens, keeping totals (token length) for each author.
        foreach ($tokens as $token) {
            $editor = $token['editor'];

            // IPs are prefixed with '0|', otherwise it's the user ID.
            if ('0|' === substr($editor, 0, 2)) {
                $editor = substr($editor, 2);
            } else {
                $userIds[] = $editor;
            }

            if (!isset($counts[$editor])) {
                $counts[$editor] = 0;
            }

            $counts[$editor] += strlen($token['str']);
            $totalCount += strlen($token['str']);
        }

        // Sort authors by count.
        arsort($counts);

        return [$counts, $totalCount, $userIds];
    }
}