Completed
Push — master ( 85d1e0...3dc773 )
by
unknown
04:22
created

LingoParser::realParse()   D

Complexity

Conditions 17
Paths 16

Size

Total Lines 121
Code Lines 67

Duplication

Lines 0
Ratio 0 %

Code Coverage

Tests 0
CRAP Score 306

Metric  Value
dl      0
loc     121
ccs     0
cts     75
cp      0
rs      4.8361
cc      17
eloc    67
nc      16
nop     2
crap    306
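
The CRAP score can be reproduced from the other metrics. Assuming the commonly cited Change Risk Anti-Patterns formula, cc^2 * (1 - coverage)^3 + cc, a method with cyclomatic complexity 17 and no coverage scores 17^2 + 17 = 306. A minimal sketch follows; the function name `crapScore` is made up for illustration, and reading `cp` as the covered fraction is an assumption:

```php
<?php

/**
 * Computes a CRAP score from cyclomatic complexity and coverage.
 * Assumes the commonly cited formula: cc^2 * (1 - coverage)^3 + cc.
 *
 * @param int   $complexity cyclomatic complexity (the "cc" metric)
 * @param float $coverage   covered fraction between 0.0 and 1.0
 * @return float
 */
function crapScore( int $complexity, float $coverage ): float {
	return $complexity ** 2 * ( 1.0 - $coverage ) ** 3 + $complexity;
}

echo crapScore( 17, 0.0 ), "\n"; // 306
```

With full coverage the cubic term vanishes and the score falls back to the complexity itself, which is why adding tests lowers the score even before any refactoring.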

How to fix

Long Method

Small methods make your code easier to understand, in particular if combined with a good name. Besides, if your method is small, finding a good name is usually much easier.

For example, if you find yourself adding comments to a method's body, this is usually a good sign to extract the commented part to a new method, and use the comment as a starting point when coming up with a good name for this new method.

Commonly applied refactorings include:
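
Extract Method is one such refactoring. As a minimal sketch, the guard clause at the top of realParse() could move into a small, well-named function. The name `shouldSkipParsing` and its parameters are hypothetical; they do not exist in the Lingo code base:

```php
<?php

/**
 * Hypothetical helper extracted from the guard clause of realParse().
 * Every input arrives as a parameter, so no global state is read here.
 *
 * @param string|null $text      the text that would be parsed
 * @param string      $action    the request action, e.g. 'view' or 'edit'
 * @param bool        $isPreview whether the request is a preview
 * @return bool true if parsing should be skipped
 */
function shouldSkipParsing( ?string $text, string $action, bool $isPreview ): bool {
	return $text === null
		|| $text === ''
		|| $action === 'edit'
		|| $action === 'ajax'
		|| $isPreview;
}

var_dump( shouldSkipParsing( 'Some wiki text', 'view', false ) ); // bool(false)
var_dump( shouldSkipParsing( 'Some wiki text', 'edit', false ) ); // bool(true)
```

The condition itself is unchanged; it only gains a name and becomes testable without a request object.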

<?php

/**
 * File holding the Lingo\LingoParser class.
 *
 * This file is part of the MediaWiki extension Lingo.
 *
 * @copyright 2011 - 2016, Stephan Gambke
 * @license   GNU General Public License, version 2 (or any later version)
 *
 * The Lingo extension is free software: you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by the Free
 * Software Foundation; either version 2 of the License, or (at your option) any
 * later version.
 *
 * The Lingo extension is distributed in the hope that it will be useful, but
 * WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
 * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more
 * details.
 *
 * You should have received a copy of the GNU General Public License along
 * with this program. If not, see <http://www.gnu.org/licenses/>.
 *
 * @author Stephan Gambke
 *
 * @file
 * @ingroup Lingo
 */
namespace Lingo;

use DOMDocument;
use DOMXPath;
use Parser;

/**
 * This class parses the given text and enriches it with definitions for defined
 * terms.
 *
 * Contains a static function to initiate the parsing.
 *
 * @ingroup Lingo
 */
class LingoParser {

	private $mLingoTree = null;

	/**
	 * @var Backend
	 */
	private $mLingoBackend = null;
	private static $parserSingleton = null;

	// The RegEx to split a chunk of text into words
	public static $regex = null;

	/**
	 * Lingo\LingoParser constructor.
	 * @param MessageLog|null $messages
	 */
	public function __construct( MessageLog &$messages = null ) {
		global $wgexLingoBackend;
Compatibility, Best Practice:
Use of global functionality is not recommended; it makes your code harder to test, and less reusable.

Instead of relying on global state, we recommend one of these alternatives:

1. Pass all data via parameters

function myFunction($a, $b) {
    // Do something
}

2. Create a class that maintains your state

class MyClass {
    private $a;
    private $b;

    public function __construct($a, $b) {
        $this->a = $a;
        $this->b = $b;
    }

    public function myFunction() {
        // Do something
    }
}
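
Applied to the constructor above, the first alternative could look roughly like the following. `ExampleBackend` and `ExampleParser` are stand-ins invented for this sketch; the real Lingo classes are not reproduced here:

```php
<?php

// Stand-in for a Lingo backend; hypothetical, for illustration only.
class ExampleBackend {
	public $messages;

	public function __construct( $messages = null ) {
		$this->messages = $messages;
	}
}

// Sketch of a parser that receives the backend class name as a parameter
// instead of reading it from the global $wgexLingoBackend.
class ExampleParser {
	private $backend;

	public function __construct( string $backendClass, $messages = null ) {
		$this->backend = new $backendClass( $messages );
	}

	public function getBackend() {
		return $this->backend;
	}
}

$parser = new ExampleParser( ExampleBackend::class );
echo get_class( $parser->getBackend() ), "\n"; // ExampleBackend
```

A test can then pass in a fake backend class without touching any global variable.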

		$this->mLingoBackend = new $wgexLingoBackend( $messages );
	}

	/**
	 *
	 * @param Parser $parser
	 * @param string $text
	 * @return Boolean
	 */
	public static function parse( Parser &$parser, &$text ) {

		if ( !self::$parserSingleton ) {
			self::$parserSingleton = new LingoParser();

			// The RegEx to split a chunk of text into words
			// Words are: placeholders for stripped items, sequences of letters and numbers, single characters that are neither letter nor number
			self::$regex = '/' . preg_quote( Parser::MARKER_PREFIX, '/' ) . '.*?' . preg_quote( Parser::MARKER_SUFFIX, '/' ) . '|[\p{L}\p{N}]+|[^\p{L}\p{N}]/u';
		}

		self::$parserSingleton->realParse( $parser, $text );

		return true;
	}
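
To see what the word-splitting regex built in parse() produces, here is a small standalone sketch. The two marker constants are placeholders for Parser::MARKER_PREFIX and Parser::MARKER_SUFFIX; their real values come from MediaWiki and are only assumed here:

```php
<?php

// Placeholder values; the real ones are Parser::MARKER_PREFIX and
// Parser::MARKER_SUFFIX from MediaWiki (assumed, not copied from there).
const EXAMPLE_MARKER_PREFIX = "\x7fUNIQ-";
const EXAMPLE_MARKER_SUFFIX = "-QINU\x7f";

// Same structure as the regex built in parse(): a strip marker, or a run of
// letters and digits, or any single other character.
$regex = '/' . preg_quote( EXAMPLE_MARKER_PREFIX, '/' ) . '.*?'
	. preg_quote( EXAMPLE_MARKER_SUFFIX, '/' ) . '|[\p{L}\p{N}]+|[^\p{L}\p{N}]/u';

$marker = EXAMPLE_MARKER_PREFIX . 'nowiki-0' . EXAMPLE_MARKER_SUFFIX;
$text = 'HTML5, ' . $marker . ' ok';

$matches = array();
preg_match_all( $regex, $text, $matches, PREG_OFFSET_CAPTURE | PREG_PATTERN_ORDER );

// Each entry of $matches[ 0 ] is a pair: [ lexeme, byte offset in $text ].
$lexemes = array_column( $matches[ 0 ], 0 );
$offsets = array_column( $matches[ 0 ], 1 );

echo count( $lexemes ), "\n"; // 6
```

The six lexemes are `HTML5`, the comma, a space, the whole marker as one unit, another space, and `ok`. Keeping the marker in one piece is what prevents glossary terms from being matched inside stripped content.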

	/**
	 *
	 * @return Backend the backend used by the parser
	 */
	public function getBackend() {
		return $this->mLingoBackend;
	}

	/**
	 * Returns the list of terms in the glossary
	 *
	 * @return Array an array mapping terms (keys) to descriptions (values)
	 */
	public function getLingoArray() {

		// build glossary array only once per request
		if ( !$this->mLingoTree ) {
			// buildLingo() only returns the tree, so store the result
			$this->mLingoTree = $this->buildLingo();
		}

		return $this->mLingoTree->getTermList();
	}

	/**
	 * Returns the list of terms in the glossary as a Lingo\Tree
	 *
	 * @return Tree a Lingo\Tree mapping terms (keys) to descriptions (values)
	 */
	public function getLingoTree() {

		// build glossary array only once per request
		if ( !$this->mLingoTree ) {

			// use cache if enabled
			if ( $this->mLingoBackend->useCache() ) {

				// Try cache first
				global $wgexLingoCacheType;
Compatibility, Best Practice: Use of global functionality is not recommended; it makes your code harder to test, and less reusable.
				$cache = ( $wgexLingoCacheType !== null ) ? wfGetCache( $wgexLingoCacheType ) : wfGetMainCache();
				$cachekey = wfMemcKey( 'ext', 'lingo', 'lingotree' );
				$cachedLingoTree = $cache->get( $cachekey );

				// cache hit?
				if ( $cachedLingoTree !== false && $cachedLingoTree !== null ) {

					wfDebug( "Cache hit: Got lingo tree from cache.\n" );
					$this->mLingoTree = &$cachedLingoTree;

				} else {

					wfDebug( "Cache miss: Lingo tree not found in cache.\n" );
					$this->mLingoTree =& $this->buildLingo();
					$cache->set( $cachekey, $this->mLingoTree );
					wfDebug( "Cached lingo tree.\n" );
				}
			} else {
				wfDebug( "Caching of lingo tree disabled.\n" );
				$this->mLingoTree =& $this->buildLingo();
			}

		}

		return $this->mLingoTree;
	}
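
The cache handling in getLingoTree() follows a get-or-build pattern: look the value up, and on a miss build it and store it. A minimal standalone sketch of that pattern is shown below. `ArrayCache` and `getOrBuild` are hypothetical names for this illustration and are not part of MediaWiki or Lingo:

```php
<?php

// Minimal in-memory stand-in for a cache object with get() and set();
// hypothetical, not the MediaWiki cache class. get() returns false on a miss.
class ArrayCache {
	private $data = [];

	public function get( $key ) {
		return array_key_exists( $key, $this->data ) ? $this->data[ $key ] : false;
	}

	public function set( $key, $value ) {
		$this->data[ $key ] = $value;
	}
}

// Returns the cached value for $key, or builds, stores and returns it.
function getOrBuild( ArrayCache $cache, string $key, callable $build ) {
	$cached = $cache->get( $key );

	if ( $cached !== false && $cached !== null ) {
		return $cached; // cache hit
	}

	$value = $build(); // cache miss: build once, then store
	$cache->set( $key, $value );
	return $value;
}

$cache = new ArrayCache();
$builds = 0;
$build = function () use ( &$builds ) {
	$builds++;
	return [ 'term' => 'definition' ];
};

getOrBuild( $cache, 'lingotree', $build );
getOrBuild( $cache, 'lingotree', $build );
echo $builds, "\n"; // 1
```

Pulling the lookup into one function of this shape would also remove two levels of nesting from getLingoTree().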

	/**
	 * @return Tree
	 */
	protected function &buildLingo() {

		$lingoTree = new Tree();
		$backend = &$this->mLingoBackend;

		// assemble the result array
		while ( $elementData = $backend->next() ) {
			$lingoTree->addTerm( $elementData[ Element::ELEMENT_TERM ], $elementData );
		}

		return $lingoTree;
	}

	/**
	 * Parses the given text and enriches applicable terms
	 *
	 * This method currently only recognizes terms consisting of max one word
	 *
	 * @param $parser
	 * @param $text
	 * @return Boolean
	 */
	protected function realParse( &$parser, &$text ) {
Coding Style:
realParse uses the super-global variable $_POST which is generally not recommended.

Instead of super-globals, we recommend explicitly injecting the dependencies of your class. This makes your code less dependent on global state and generally more testable:

// Bad
class Router
{
    public function generate($path)
    {
        return $_SERVER['HOST'].$path;
    }
}

// Better
class Router
{
    private $host;

    public function __construct($host)
    {
        $this->host = $host;
    }

    public function generate($path)
    {
        return $this->host.$path;
    }
}

class Controller
{
    public function myAction(Request $request)
    {
        // Instead of
        $page = isset($_GET['page']) ? intval($_GET['page']) : 1;

        // Better (assuming you use the Symfony2 request)
        $page = $request->query->get('page', 1);
    }
}
		global $wgRequest;
Compatibility, Best Practice: Use of global functionality is not recommended; it makes your code harder to test, and less reusable.

		$action = $wgRequest->getVal( 'action', 'view' );

		if ( $text === null ||
			$text === '' ||
			$action === 'edit' ||
			$action === 'ajax' ||
			isset( $_POST[ 'wpPreview' ] )
		) {

			return true;
		}

		// Get array of terms
		$glossary = $this->getLingoTree();

		if ( $glossary == null ) {
			return true;
		}

		// Parse HTML from page
		wfSuppressWarnings();

		$doc = new DOMDocument( '1.0', 'utf-8' );
		$doc->loadHTML( '<html><head><meta http-equiv="content-type" content="charset=utf-8"/></head><body>' . $text . '</body></html>' );

		wfRestoreWarnings();

		// Find all text in HTML.
		$xpath = new DOMXpath( $doc );
		$elements = $xpath->query(
			"//*[not(ancestor-or-self::*[@class='noglossary'] or ancestor-or-self::a)][text()!=' ']/text()"
		);

		// Iterate all HTML text matches
		$nb = $elements->length;
		$changedDoc = false;

		for ( $pos = 0; $pos < $nb; $pos++ ) {
			$el = $elements->item( $pos );

			if ( strlen( $el->nodeValue ) < $glossary->getMinTermLength() ) {
				continue;
			}

			$matches = array();
			preg_match_all(
				self::$regex,
				$el->nodeValue,
				$matches,
				PREG_OFFSET_CAPTURE | PREG_PATTERN_ORDER
			);

			if ( count( $matches ) == 0 || count( $matches[ 0 ] ) == 0 ) {
				continue;
			}

			$lexemes = &$matches[ 0 ];
			$countLexemes = count( $lexemes );
			$parent = &$el->parentNode;
			$index = 0;
			$changedElem = false;

			while ( $index < $countLexemes ) {
				list( $skipped, $used, $definition ) =
					$glossary->findNextTerm( $lexemes, $index, $countLexemes );

				if ( $used > 0 ) { // found a term
					if ( $skipped > 0 ) { // skipped some text, insert it as is
						$parent->insertBefore(
							$doc->createTextNode(
								substr( $el->nodeValue,
									$currLexIndex = $lexemes[ $index ][ 1 ],
									$lexemes[ $index + $skipped ][ 1 ] - $currLexIndex )
							),
							$el
						);
					}

					$parent->insertBefore( $definition->getFullDefinition( $doc ), $el );

					$changedElem = true;
				} else { // did not find term, just use the rest of the text
					// If we found no term now and no term before, there was no
					// term in the whole element. Might as well not change the
					// element at all.
					// Only change element if found term before
					if ( $changedElem ) {
						$parent->insertBefore(
							$doc->createTextNode(
								substr( $el->nodeValue, $lexemes[ $index ][ 1 ] )
							),
							$el
						);
					} else {
						// In principle superfluous, the loop would run out
						// anyway. Might save a bit of time.
						break;
					}
				}

				$index += $used + $skipped;
			}

			if ( $changedElem ) {
				$parent->removeChild( $el );
				$changedDoc = true;
			}
		}

		if ( $changedDoc ) {
			$this->loadModules( $parser );

			// U - Ungreedy, D - dollar matches only end of string, s - dot matches newlines
			$text = preg_replace( '%(^.*<body>)|(</body>.*$)%UDs', '', $doc->saveHTML() );
		}

		return true;
	}
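
The final preg_replace() in realParse() strips the wrapper that was added before loadHTML(), leaving only the contents of the body element. A small standalone sketch of that step follows; the `$html` string is made up to resemble what DOMDocument::saveHTML() might return:

```php
<?php

// Example serialisation of a document; invented for illustration.
$html = "<!DOCTYPE html>\n<html><head></head><body><p>Some <b>text</b></p></body></html>\n";

// U inverts greediness, so ".*" stops at the first "<body>".
// D makes "$" match only at the very end of the string.
// s lets "." match newlines, so the doctype line and trailing newline go too.
$text = preg_replace( '%(^.*<body>)|(</body>.*$)%UDs', '', $html );

echo $text, "\n"; // <p>Some <b>text</b></p>
```

Everything up to and including the opening body tag matches the first alternative, and everything from the closing body tag to the end matches the second, so both are replaced by the empty string in one call.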

	/**
	 * @param $parser
	 */
	protected function loadModules( &$parser ) {
		global $wgOut, $wgScriptPath;
Compatibility, Best Practice: Use of global functionality is not recommended; it makes your code harder to test, and less reusable.

		$parserOutput = $parser->getOutput();

		// load scripts
		if ( defined( 'MW_SUPPORTS_RESOURCE_MODULES' ) ) {
			$parserOutput->addModules( 'ext.Lingo.Scripts' );

			if ( !$wgOut->isArticle() ) {
				$wgOut->addModules( 'ext.Lingo.Scripts' );
			}
		} else {
			global $wgStylePath;
Compatibility, Best Practice: Use of global functionality is not recommended; it makes your code harder to test, and less reusable.
			$parserOutput->addHeadItem( "<script src='$wgStylePath/common/jquery.min.js'></script>\n", 'ext.Lingo.jq' );
Coding Style, Best Practice:
As per coding-style, please use concatenation or sprintf for the variable $wgStylePath instead of interpolation.

It is generally a best practice as it is often more readable to use concatenation instead of interpolation for variables inside strings.

// Instead of
$x = "foo $bar $baz";

// Better use either
$x = "foo " . $bar . " " . $baz;
$x = sprintf("foo %s %s", $bar, $baz);
			$parserOutput->addHeadItem( "<script src='$wgScriptPath/extensions/Lingo/libs/Lingo.js'></script>\n", 'ext.Lingo.Scripts' );
Coding Style, Best Practice: As per coding-style, please use concatenation or sprintf for the variable $wgScriptPath instead of interpolation.

			if ( !$wgOut->isArticle() ) {
				$wgOut->addHeadItem( 'ext.Lingo.jq', "<script src='$wgStylePath/common/jquery.min.js'></script>\n" );
Coding Style, Best Practice: As per coding-style, please use concatenation or sprintf for the variable $wgStylePath instead of interpolation.
				$wgOut->addHeadItem( 'ext.Lingo.Scripts', "<script src='$wgScriptPath/extensions/Lingo/libs/Lingo.js'></script>\n" );
Coding Style, Best Practice: As per coding-style, please use concatenation or sprintf for the variable $wgScriptPath instead of interpolation.
			}
		}

		// load styles
		if ( method_exists( $parserOutput, 'addModuleStyles' ) ) {
			$parserOutput->addModuleStyles( 'ext.Lingo.Styles' );
			if ( !$wgOut->isArticle() ) {
				$wgOut->addModuleStyles( 'ext.Lingo.Styles' );
			}
		} else {
			$parserOutput->addHeadItem( "<link rel='stylesheet' href='$wgScriptPath/extensions/Lingo/styles/Lingo.css' />\n", 'ext.Lingo.Styles' );
Coding Style, Best Practice: As per coding-style, please use concatenation or sprintf for the variable $wgScriptPath instead of interpolation.
			if ( !$wgOut->isArticle() ) {
				$wgOut->addHeadItem( 'ext.Lingo.Styles', "<link rel='stylesheet' href='$wgScriptPath/extensions/Lingo/styles/Lingo.css' />\n" );
Coding Style, Best Practice: As per coding-style, please use concatenation or sprintf for the variable $wgScriptPath instead of interpolation.
			}
		}
	}

	/**
	 * Purges the lingo tree from the cache.
	 */
	public static function purgeCache() {

		global $wgexLingoCacheType;
Compatibility, Best Practice: Use of global functionality is not recommended; it makes your code harder to test, and less reusable.
		$cache = ( $wgexLingoCacheType !== null ) ? wfGetCache( $wgexLingoCacheType ) : wfGetMainCache();
		$cache->delete( wfMemcKey( 'ext', 'lingo', 'lingotree' ) );

	}
}