Issues (4122)

Security Analysis    not enabled

This project does not seem to handle request data directly as such no vulnerable execution paths were found.

  Cross-Site Scripting
Cross-Site Scripting enables an attacker to inject code into the response of a web-request that is viewed by other users. It can for example be used to bypass access controls, or even to take over other users' accounts.
  File Exposure
File Exposure allows an attacker to gain access to local files that he should not be able to access. These files can for example include database credentials, or other configuration files.
  File Manipulation
File Manipulation enables an attacker to write custom data to files. This potentially leads to injection of arbitrary code on the server.
  Object Injection
Object Injection enables an attacker to inject an object into PHP code, and can lead to arbitrary code execution, file exposure, or file manipulation attacks.
  Code Injection
Code Injection enables an attacker to execute arbitrary code on the server.
  Response Splitting
Response Splitting can be used to send arbitrary responses.
  File Inclusion
File Inclusion enables an attacker to inject custom files into PHP's file loading mechanism, either explicitly passed to include, or for example via PHP's auto-loading mechanism.
  Command Injection
Command Injection enables an attacker to inject a shell command that is execute with the privileges of the web-server. This can be used to expose sensitive data, or gain access of your server.
  SQL Injection
SQL Injection enables an attacker to execute arbitrary SQL code on your database server gaining access to user data, or manipulating user data.
  XPath Injection
XPath Injection enables an attacker to modify the parts of XML document that are read. If that XML document is for example used for authentication, this can lead to further vulnerabilities similar to SQL Injection.
  LDAP Injection
LDAP Injection enables an attacker to inject LDAP statements potentially granting permission to run unauthorized queries, or modify content inside the LDAP tree.
  Header Injection
  Other Vulnerability
This category comprises other attack vectors such as manipulating the PHP runtime, loading custom extensions, freezing the runtime, or similar.
  Regex Injection
Regex Injection enables an attacker to execute arbitrary code in your PHP process.
  XML Injection
XML Injection enables an attacker to read files on your local filesystem including configuration files, or can be abused to freeze your web-server process.
  Variable Injection
Variable Injection enables an attacker to overwrite program variables with custom data, and can lead to further vulnerabilities.
Unfortunately, the security analysis is currently not available for your project. If you are a non-commercial open-source project, please contact support to gain access.

includes/site/MediaWikiPageNameNormalizer.php (2 issues)

Labels

Upgrade to new PHP Analysis Engine

These results are based on our legacy PHP analysis, consider migrating to our new PHP analysis engine instead. Learn more

1
<?php
2
3
namespace MediaWiki\Site;
4
5
use FormatJson;
6
use Http;
7
use UtfNormal\Validator;
8
9
/**
10
 * Service for normalizing a page name using a MediaWiki api.
11
 *
12
 * This program is free software; you can redistribute it and/or modify
13
 * it under the terms of the GNU General Public License as published by
14
 * the Free Software Foundation; either version 2 of the License, or
15
 * (at your option) any later version.
16
 *
17
 * This program is distributed in the hope that it will be useful,
18
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
19
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
20
 * GNU General Public License for more details.
21
 *
22
 * You should have received a copy of the GNU General Public License along
23
 * with this program; if not, write to the Free Software Foundation, Inc.,
24
 * 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
25
 * http://www.gnu.org/copyleft/gpl.html
26
 *
27
 * @since 1.27
28
 *
29
 * @license GNU GPL v2+
30
 * @author John Erling Blad < [email protected] >
31
 * @author Daniel Kinzler
32
 * @author Jeroen De Dauw < [email protected] >
33
 * @author Marius Hoch
34
 */
35
class MediaWikiPageNameNormalizer {
36
37
	/**
38
	 * @var Http
39
	 */
40
	private $http;
41
42
	/**
43
	 * @param Http|null $http
44
	 */
45
	public function __construct( Http $http = null ) {
46
		if ( !$http ) {
47
			$http = new Http();
48
		}
49
50
		$this->http = $http;
51
	}
52
53
	/**
54
	 * Returns the normalized form of the given page title, using the
55
	 * normalization rules of the given site. If the given title is a redirect,
56
	 * the redirect weill be resolved and the redirect target is returned.
57
	 *
58
	 * @note This actually makes an API request to the remote site, so beware
59
	 *   that this function is slow and depends on an external service.
60
	 *
61
	 * @see Site::normalizePageName
62
	 *
63
	 * @since 1.27
64
	 *
65
	 * @param string $pageName
66
	 * @param string $apiUrl
67
	 *
68
	 * @return string
69
	 * @throws \MWException
70
	 */
71
	public function normalizePageName( $pageName, $apiUrl ) {
72
73
		// Check if we have strings as arguments.
74
		if ( !is_string( $pageName ) ) {
75
			throw new \MWException( '$pageName must be a string' );
76
		}
77
78
		// Go on call the external site
79
80
		// Make sure the string is normalized into NFC (due to T42017)
81
		// but do nothing to the whitespaces, that should work appropriately.
82
		// @see https://phabricator.wikimedia.org/T42017
83
		$pageName = Validator::cleanUp( $pageName );
84
85
		// Build the args for the specific call
86
		$args = [
87
			'action' => 'query',
88
			'prop' => 'info',
89
			'redirects' => true,
90
			'converttitles' => true,
91
			'format' => 'json',
92
			'titles' => $pageName,
93
			// @todo options for maxlag and maxage
94
			// Note that maxlag will lead to a long delay before a reply is made,
95
			// but that maxage can avoid the extreme delay. On the other hand
96
			// maxage could be nice to use anyhow as it stops unnecessary requests.
97
			// Also consider smaxage if maxage is used.
98
		];
99
100
		$url = wfAppendQuery( $apiUrl, $args );
101
102
		// Go on call the external site
103
		// @todo we need a good way to specify a timeout here.
104
		$ret = $this->http->get( $url, [], __METHOD__ );
105
106
		if ( $ret === false ) {
107
			wfDebugLog( "MediaWikiSite", "call to external site failed: $url" );
108
			return false;
109
		}
110
111
		$data = FormatJson::decode( $ret, true );
0 ignored issues
show
It seems like $ret defined by $this->http->get($url, array(), __METHOD__) on line 104 can also be of type boolean; however, FormatJson::decode() does only seem to accept string, maybe add an additional type check?

If a method or function can return multiple different values and unless you are sure that you only can receive a single value in this context, we recommend to add an additional type check:

/**
 * @return array|string
 */
function returnsDifferentValues($x) {
    if ($x) {
        return 'foo';
    }

    return array();
}

$x = returnsDifferentValues($y);
if (is_array($x)) {
    // $x is an array.
}

If this a common case that PHP Analyzer should handle natively, please let us know by opening an issue.

Loading history...
112
113
		if ( !is_array( $data ) ) {
114
			wfDebugLog( "MediaWikiSite", "call to <$url> returned bad json: " . $ret );
115
			return false;
116
		}
117
118
		$page = static::extractPageRecord( $data, $pageName );
0 ignored issues
show
Since extractPageRecord() is declared private, calling it with static will lead to errors in possible sub-classes. You can either use self, or increase the visibility of extractPageRecord() to at least protected.

Let’s assume you have a class which uses late-static binding:

class YourClass
{
    private static function getTemperature() {
        return "3422 °C";
}

public static function getSomeVariable()
{
    return static::getTemperature();
}

}

The code above will run fine in your PHP runtime. However, if you now create a sub-class and call the getSomeVariable() on that sub-class, you will receive a runtime error:

class YourSubClass extends YourClass {
      private static function getTemperature() {
        return "-182 °C";
    }
}

print YourSubClass::getSomeVariable(); // Will cause an access error.

In the case above, it makes sense to update SomeClass to use self instead:

class YourClass
{
    private static function getTemperature() {
        return "3422 °C";
    }

    public static function getSomeVariable()
    {
        return self::getTemperature();
    }
}
Loading history...
119
120
		if ( isset( $page['missing'] ) ) {
121
			wfDebugLog( "MediaWikiSite", "call to <$url> returned a marker for a missing page title! "
122
				. $ret );
123
			return false;
124
		}
125
126
		if ( isset( $page['invalid'] ) ) {
127
			wfDebugLog( "MediaWikiSite", "call to <$url> returned a marker for an invalid page title! "
128
				. $ret );
129
			return false;
130
		}
131
132
		if ( !isset( $page['title'] ) ) {
133
			wfDebugLog( "MediaWikiSite", "call to <$url> did not return a page title! " . $ret );
134
			return false;
135
		}
136
137
		return $page['title'];
138
	}
139
140
	/**
141
	 * Get normalization record for a given page title from an API response.
142
	 *
143
	 * @param array $externalData A reply from the API on a external server.
144
	 * @param string $pageTitle Identifies the page at the external site, needing normalization.
145
	 *
146
	 * @return array|bool A 'page' structure representing the page identified by $pageTitle.
147
	 */
148
	private static function extractPageRecord( $externalData, $pageTitle ) {
149
		// If there is a special case with only one returned page
150
		// we can cheat, and only return
151
		// the single page in the "pages" substructure.
152
		if ( isset( $externalData['query']['pages'] ) ) {
153
			$pages = array_values( $externalData['query']['pages'] );
154
			if ( count( $pages ) === 1 ) {
155
				return $pages[0];
156
			}
157
		}
158
		// This is only used during internal testing, as it is assumed
159
		// a more optimal (and lossfree) storage.
160
		// Make initial checks and return if prerequisites are not meet.
161
		if ( !is_array( $externalData ) || !isset( $externalData['query'] ) ) {
162
			return false;
163
		}
164
		// Loop over the tree different named structures, that otherwise are similar
165
		$structs = [
166
			'normalized' => 'from',
167
			'converted' => 'from',
168
			'redirects' => 'from',
169
			'pages' => 'title'
170
		];
171
		foreach ( $structs as $listId => $fieldId ) {
172
			// Check if the substructure exist at all.
173
			if ( !isset( $externalData['query'][$listId] ) ) {
174
				continue;
175
			}
176
			// Filter the substructure down to what we actually are using.
177
			$collectedHits = array_filter(
178
				array_values( $externalData['query'][$listId] ),
179
				function ( $a ) use ( $fieldId, $pageTitle ) {
180
					return $a[$fieldId] === $pageTitle;
181
				}
182
			);
183
			// If still looping over normalization, conversion or redirects,
184
			// then we need to keep the new page title for later rounds.
185
			if ( $fieldId === 'from' && is_array( $collectedHits ) ) {
186
				switch ( count( $collectedHits ) ) {
187
					case 0:
188
						break;
189
					case 1:
190
						$pageTitle = $collectedHits[0]['to'];
191
						break;
192
					default:
193
						return false;
194
				}
195
			} elseif ( $fieldId === 'title' && is_array( $collectedHits ) ) {
196
				// If on the pages structure we should prepare for returning.
197
198
				switch ( count( $collectedHits ) ) {
199
					case 0:
200
						return false;
201
					case 1:
202
						return array_shift( $collectedHits );
203
					default:
204
						return false;
205
				}
206
			}
207
		}
208
		// should never be here
209
		return false;
210
	}
211
212
}
213