Test Setup Failed
Push — master ( 56d675...807e0e )
by unknown (34:34 created)

LingoParser::shouldParse()   B

Complexity

Conditions 7
Paths 5

Size

Total Lines 24
Code Lines 12

Duplication

Lines 0
Ratio 0 %

Code Coverage

Tests 2
CRAP Score 7

Importance

Changes 0
Metric Value
cc 7
eloc 12
nc 5
nop 1
dl 0
loc 24
ccs 2
cts 2
cp 1
crap 7
rs 8.8333
c 0
b 0
f 0
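
The CRAP score reported above can be cross-checked by hand. The following is a sketch under assumptions, since this page does not state the formula it uses: it takes the commonly cited definition of the CRAP metric, reads the complexity comp(m) from `cc` (7), and reads the coverage cov(m) from `cp` (1) as a fraction, i.e. full coverage.

```latex
\begin{align*}
\mathrm{CRAP}(m) &= \mathrm{comp}(m)^2 \,\bigl(1 - \mathrm{cov}(m)\bigr)^3 + \mathrm{comp}(m) \\
                 &= 7^2 \cdot (1 - 1)^3 + 7 \\
                 &= 7
\end{align*}
```

Under these assumptions the first term vanishes at full coverage, so the score equals the complexity, which agrees with the listed `crap 7`.
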
<?php

/**
 * File holding the Lingo\LingoParser class.
 *
 * This file is part of the MediaWiki extension Lingo.
 *
 * @copyright 2011 - 2018, Stephan Gambke
 * @license GPL-2.0-or-later
 *
 * The Lingo extension is free software: you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by the Free
 * Software Foundation; either version 2 of the License, or (at your option) any
 * later version.
 *
 * The Lingo extension is distributed in the hope that it will be useful, but
 * WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
 * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more
 * details.
 *
 * You should have received a copy of the GNU General Public License along
 * with this program. If not, see <http://www.gnu.org/licenses/>.
 *
 * @author Stephan Gambke
 *
 * @file
 * @ingroup Lingo
 */
namespace Lingo;

use DOMDocument;
use DOMXPath;
use ObjectCache;
Bug: The type ObjectCache was not found. Maybe you did not declare it correctly or list all dependencies?

The issue could also be caused by a filter entry in the build configuration. If the path has been excluded in your configuration, e.g. excluded_paths: ["lib/*"], you can move it to the dependency path list as follows:

filter:
    dependency_paths: ["lib/*"]

For further information see https://scrutinizer-ci.com/docs/tools/php/php-scrutinizer/#list-dependency-paths
use Parser;
Bug: The type Parser was not found. Maybe you did not declare it correctly or list all dependencies?
use StubObject;
Bug: The type StubObject was not found. Maybe you did not declare it correctly or list all dependencies?
use Title;
Bug: The type Title was not found. Maybe you did not declare it correctly or list all dependencies?

/**
 * This class parses the given text and enriches it with definitions for defined
 * terms.
 *
 * Contains a static function to initiate the parsing.
 *
 * @ingroup Lingo
 */
class LingoParser {

	const WORD_VALUE = 0;
	const WORD_OFFSET = 1;

	private $mLingoTree = null;

	/**
	 * @var Backend
	 */
	private $mLingoBackend = null;
	private static $parserSingleton = null;

	// Api params passed in from ApiMakeParserOptions Hook
	private $mApiParams = null;

	// The RegEx to split a chunk of text into words
	public $regex = null;

	/**
	 * Lingo\LingoParser constructor.
	 * @param MessageLog|null $messages
	 */
	public function __construct( MessageLog &$messages = null ) {
Unused Code: The parameter $messages is not used and could be removed. ( Ignorable by Annotation )

If this is a false-positive, you can also ignore this issue in your code via the ignore-unused annotation:

	public function __construct( /** @scrutinizer ignore-unused */ MessageLog &$messages = null ) {

This check looks for parameters that have been defined for a function or method, but which are not used in the method body.
		// The RegEx to split a chunk of text into words
		// Words are: placeholders for stripped items, sequences of letters and numbers, single characters that are neither letter nor number
		$this->regex = '/' . preg_quote( Parser::MARKER_PREFIX, '/' ) . '.*?' . preg_quote( Parser::MARKER_SUFFIX, '/' ) . '|[\p{L}\p{N}]+|[^\p{L}\p{N}]/u';
	}

	/**
	 * @param Parser $mwParser
	 *
	 * @return Boolean
	 */
	public function parse( $mwParser ) {
		if ( $this->shouldParse( $mwParser ) ) {
			$this->realParse( $mwParser );
		}

		return true;
	}

	/**
	 * @return LingoParser
	 * @since 2.0.1
	 */
	public static function getInstance() {
		if ( !self::$parserSingleton ) {
			self::$parserSingleton = new LingoParser();

		}

		return self::$parserSingleton;
	}

	/**
	 * @return string
	 */
	private function getCacheKey() {
		// FIXME: If Lingo ever stores the glossary tree per user, then the cache key also needs to include the user id (see T163608)
Coding Style: Comment refers to a FIXME task "If Lingo ever stores the glossary tree per user, then the cache key also needs to include the user id (see T163608)"
		return ObjectCache::getLocalClusterInstance()->makeKey( 'ext', 'lingo', 'lingotree', Tree::TREE_VERSION, get_class( $this->getBackend() ) );
	}

	/**
	 * @return Backend the backend used by the parser
	 * @throws \MWException
	 */
	public function getBackend() {
		if ( $this->mLingoBackend === null ) {
			throw new \MWException( 'No Lingo backend available!' );
Bug: The type MWException was not found. Maybe you did not declare it correctly or list all dependencies?
		}

		return $this->mLingoBackend;
	}

	/**
	 * Returns the list of terms in the glossary
	 *
	 * @return array an array mapping terms (keys) to descriptions (values)
	 */
	public function getLingoArray() {
		return $this->getLingoTree()->getTermList();
	}

	/**
	 * Returns the list of terms in the glossary as a Lingo\Tree
	 *
	 * @return Tree a Lingo\Tree mapping terms (keys) to descriptions (values)
	 */
	public function getLingoTree() {
		// build glossary array only once per request
		if ( !$this->mLingoTree ) {

			// use cache if enabled
			if ( $this->getBackend()->useCache() ) {

				// Try cache first
				global $wgexLingoCacheType;
				$cache = ( $wgexLingoCacheType !== null ) ? wfGetCache( $wgexLingoCacheType ) : wfGetMainCache();
Bug: The function wfGetCache was not found. Maybe you did not declare it correctly or list all dependencies? ( Ignorable by Annotation )

If this is a false-positive, you can also ignore this issue in your code via the ignore-call annotation:

				$cache = ( $wgexLingoCacheType !== null ) ? /** @scrutinizer ignore-call */ wfGetCache( $wgexLingoCacheType ) : wfGetMainCache();

Bug: The function wfGetMainCache was not found. Maybe you did not declare it correctly or list all dependencies? ( Ignorable by Annotation )
				$cachekey = $this->getCacheKey();
				$cachedLingoTree = $cache->get( $cachekey );

				// cache hit?
				if ( $cachedLingoTree !== false && $cachedLingoTree !== null ) {

					wfDebug( "Cache hit: Got lingo tree from cache.\n" );
Bug: The function wfDebug was not found. Maybe you did not declare it correctly or list all dependencies? ( Ignorable by Annotation )
					$this->mLingoTree = &$cachedLingoTree;

					wfDebug( "Re-cached lingo tree.\n" );
				} else {

					wfDebug( "Cache miss: Lingo tree not found in cache.\n" );
					$this->mLingoTree =& $this->buildLingo();
					wfDebug( "Cached lingo tree.\n" );
				}

				// Keep for one month
				// Limiting the cache validity will allow to purge stale cache
				// entries inserted by older versions after one month
				$cache->set( $cachekey, $this->mLingoTree, 60 * 60 * 24 * 30 );

			} else {
				wfDebug( "Caching of lingo tree disabled.\n" );
				$this->mLingoTree =& $this->buildLingo();
			}

		}

		return $this->mLingoTree;
	}

	/**
	 * @return Tree
	 */
	protected function &buildLingo() {
		$lingoTree = new Tree();
		$backend = &$this->mLingoBackend;

		// assemble the result array
		while ( $elementData = $backend->next() ) {
			$lingoTree->addTerm( $elementData[ Element::ELEMENT_TERM ], $elementData );
		}

		return $lingoTree;
	}

	/**
	 * Parses the given text and enriches applicable terms
	 *
	 * This method currently only recognizes terms consisting of max one word
	 *
	 * @param Parser $parser
	 *
	 * @return Boolean
	 */
	protected function realParse( &$parser ) {
		// Parse text identical to options used in includes/api/ApiParse.php
		$params = $this->mApiParams;
		$text = is_null( $params ) ? $parser->getOutput()->getText() : $parser->getOutput()->getText( [
			'allowTOC' => !$params['disabletoc'],
			'enableSectionEditLinks' => !$params['disableeditsection'],
			'wrapperDivClass' => $params['wrapoutputclass'],
			'deduplicateStyles' => !$params['disablestylededuplication'],
		] );

		if ( $text === null || $text === '' ) {
			return true;
		}

		// Get array of terms
		$glossary = $this->getLingoTree();

		if ( $glossary == null ) {
Coding Style: Operator == prohibited; use === instead
			return true;
		}

		// Parse HTML from page
		\MediaWiki\suppressWarnings();
Bug: The function suppressWarnings was not found. Maybe you did not declare it correctly or list all dependencies? ( Ignorable by Annotation )

		$doc = new DOMDocument( '1.0', 'utf-8' );
		$doc->loadHTML( '<html><head><meta http-equiv="content-type" content="charset=utf-8"/></head><body>' . $text . '</body></html>' );

		\MediaWiki\restoreWarnings();
Bug: The function restoreWarnings was not found. Maybe you did not declare it correctly or list all dependencies? ( Ignorable by Annotation )

		// Find all text in HTML.
		$xpath = new DOMXPath( $doc );
		$textElements = $xpath->query(
			"//*[not(ancestor-or-self::*[@class='noglossary'] or ancestor-or-self::a)][text()!=' ']/text()"
		);

		// Iterate all HTML text matches
		$numberOfTextElements = $textElements->length;

		$definitions = [];

		for ( $textElementIndex = 0; $textElementIndex < $numberOfTextElements; $textElementIndex++ ) {
			$textElement = $textElements->item( $textElementIndex );

			if ( strlen( $textElement->nodeValue ) < $glossary->getMinTermLength() ) {
				continue;
			}

			$matches = [];
			preg_match_all(
				$this->regex,
				$textElement->nodeValue,
				$matches,
				PREG_OFFSET_CAPTURE | PREG_PATTERN_ORDER
			);

			if ( count( $matches ) === 0 || count( $matches[ 0 ] ) === 0 ) {
				continue;
			}

			$wordDescriptors = &$matches[ 0 ]; // See __construct() for definition of "word"
			$numberOfWordDescriptors = count( $wordDescriptors );

			$parentNode = &$textElement->parentNode;

			$wordDescriptorIndex = 0;
			$changedElem = false;

			while ( $wordDescriptorIndex < $numberOfWordDescriptors ) {

				/** @var \Lingo\Element $definition */
				list( $skippedWords, $usedWords, $definition ) =
					$glossary->findNextTerm( $wordDescriptors, $wordDescriptorIndex, $numberOfWordDescriptors );

				if ( $usedWords > 0 ) { // found a term

					if ( $skippedWords > 0 ) { // skipped some text, insert it as is

						$start = $wordDescriptors[ $wordDescriptorIndex ][ self::WORD_OFFSET ];
						$length = $wordDescriptors[ $wordDescriptorIndex + $skippedWords ][ self::WORD_OFFSET ] - $start;

						$parentNode->insertBefore(
							$doc->createTextNode(
								substr( $textElement->nodeValue, $start, $length )
							),
							$textElement
						);
					}

					$parentNode->insertBefore( $definition->getFormattedTerm( $doc ), $textElement );

					$definitions[ $definition->getId() ] = $definition->getFormattedDefinitions();

					$changedElem = true;

				} else { // did not find any term, just use the rest of the text

					// If we found no term now and no term before, there was no
					// term in the whole element. Might as well not change the
					// element at all.

					// Only change element if found term before
					if ( $changedElem === true ) {

						$start = $wordDescriptors[ $wordDescriptorIndex ][ self::WORD_OFFSET ];

						$parentNode->insertBefore(
							$doc->createTextNode(
								substr( $textElement->nodeValue, $start )
							),
							$textElement
						);

					}

					// In principle superfluous, the loop would run out anyway. Might save a bit of time.
					break;
				}

				$wordDescriptorIndex += $usedWords + $skippedWords;
			}

			if ( $changedElem ) {
				$parentNode->removeChild( $textElement );
			}
		}

		if ( count( $definitions ) > 0 ) {

			$this->loadModules( $parser );

			// U - Ungreedy, D - dollar matches only end of string, s - dot matches newlines
			$text = preg_replace( '%(^.*<body>)|(</body>.*$)%UDs', '', $doc->saveHTML() );
			$text .= $parser->recursiveTagParseFully( implode( $definitions ) );

			$parser->getOutput()->setText( $text );
		}

		return true;
	}

	/**
	 * @param Parser $parser
	 */
	protected function loadModules( &$parser ) {
		global $wgOut;

		$parserOutput = $parser->getOutput();

		// load scripts
		$parserOutput->addModules( 'ext.Lingo.Scripts' );

		if ( !$wgOut->isArticle() ) {
			$wgOut->addModules( 'ext.Lingo.Scripts' );
		}

		// load styles
		$parserOutput->addModuleStyles( 'ext.Lingo.Styles' );

		if ( !$wgOut->isArticle() ) {
			$wgOut->addModuleStyles( 'ext.Lingo.Styles' );
		}
	}

	/**
	 * Purges the lingo tree from the cache.
	 *
	 * @deprecated 2.0.2
	 */
	public static function purgeCache() {
		self::getInstance()->purgeGlossaryFromCache();
	}

	/**
	 * Purges the lingo tree from the cache.
	 *
	 * @since 2.0.2
	 */
	public function purgeGlossaryFromCache() {
		global $wgexLingoCacheType;
		$cache = ( $wgexLingoCacheType !== null ) ? wfGetCache( $wgexLingoCacheType ) : wfGetMainCache();
Bug: The function wfGetCache was not found. Maybe you did not declare it correctly or list all dependencies? ( Ignorable by Annotation )
Bug: The function wfGetMainCache was not found. Maybe you did not declare it correctly or list all dependencies? ( Ignorable by Annotation )
		$cache->delete( $this->getCacheKey() );
	}

	/**
	 * @since 2.0.1
	 * @param Backend $backend
	 */
	public function setBackend( Backend $backend ) {
		$this->mLingoBackend = $backend;
		$backend->setLingoParser( $this );
	}

	/**
	 * Set parser options from API
	 *
	 * @param array $params
	 */
	public function setApiParams( array $params ) {
		$this->mApiParams = $params;
	}

	/**
	 * @param Parser $parser
	 * @return bool
	 */
	protected function shouldParse( &$parser ) {
		global $wgexLingoUseNamespaces;

		if ( !( $parser instanceof Parser || $parser instanceof StubObject) ) {
			return false;
		}

		if ( isset( $parser->mDoubleUnderscores[ 'noglossary' ] ) ) { // __NOGLOSSARY__ found in wikitext
			return false;
		}

		$title = $parser->getTitle();

		if ( !( $title instanceof Title ) ) {
			return false;
		}

		$namespace = $title->getNamespace();

		if ( isset( $wgexLingoUseNamespaces[ $namespace ] ) && $wgexLingoUseNamespaces[ $namespace ] === false ) {
			return false;
		};

		return true;
	}
}
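
The word-splitting pattern assembled in __construct() can be tried in isolation. The following is a minimal sketch, not part of Lingo: the marker strings are stand-ins (the real values come from Parser::MARKER_PREFIX and Parser::MARKER_SUFFIX, which are not shown on this page), and buildWordRegex() and splitIntoWords() are hypothetical helper names introduced only for illustration.

```php
<?php

// Stand-in strip markers; hypothetical values, not MediaWiki's actual constants.
const DEMO_MARKER_PREFIX = "\x7fUNIQ-";
const DEMO_MARKER_SUFFIX = "-QINU\x7f";

/**
 * Builds a pattern of the same shape as the one in LingoParser::__construct(),
 * but from the demo markers above.
 */
function buildWordRegex(): string {
	return '/' . preg_quote( DEMO_MARKER_PREFIX, '/' ) . '.*?' . preg_quote( DEMO_MARKER_SUFFIX, '/' )
		. '|[\p{L}\p{N}]+|[^\p{L}\p{N}]/u';
}

/**
 * Hypothetical helper: returns a list of [ word, byte offset ] pairs,
 * i.e. the $matches[ 0 ] array that realParse() iterates over.
 */
function splitIntoWords( string $text ): array {
	$matches = [];
	preg_match_all( buildWordRegex(), $text, $matches, PREG_OFFSET_CAPTURE | PREG_PATTERN_ORDER );
	return $matches[ 0 ];
}

$marker = DEMO_MARKER_PREFIX . 'nowiki-0' . DEMO_MARKER_SUFFIX;

foreach ( splitIntoWords( 'HTML5 is ' . $marker . '!' ) as $descriptor ) {
	// Index 0 is the word (WORD_VALUE), index 1 its offset (WORD_OFFSET).
	printf( "%3d  %s\n", $descriptor[ 1 ], json_encode( $descriptor[ 0 ] ) );
}
```

A whole marker comes back as a single word because its alternative is listed first in the pattern. The offsets are byte offsets, which is what the substr() calls in realParse() expect.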