Completed
Push — dev ( 12acf4...3bce75 )
by Auke
13:27
created

htmlcleaner   F

Complexity

Total Complexity 69

Size/Duplication

Total Lines 241
Duplicated Lines 12.45 %

Coupling/Cohesion

Components 0
Dependencies 1

Test Coverage

Coverage 42.33%
Metric Value
wmc 69
lcom 0
cbo 1
dl 30
loc 241
ccs 80
cts 189
cp 0.4233
rs 2.8302

3 Methods

Rating   Name          Duplication   Size   Complexity
A        version()     0             4      1
C        dessicate()   0             65     16
F        cleanup()     30            165    52

Duplicated Code

Duplicate code is one of the most pungent code smells. A rule that is often used is to re-structure code once it is duplicated in three or more places.
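In the class reviewed here, the duplication sits in cleanup(): two near-identical branches unset an attribute and, when the tag has no attributes left, replace the top of $delete_stack. A minimal sketch of pulling that logic into one helper follows. The function name removeAttribute and its parameter list are assumptions made for illustration and do not exist in the original source; the unset($part) and break statements would stay with the caller.

```php
<?php
/**
 * Hypothetical helper standing in for both duplicated branches.
 * Removes one attribute and returns true when no attributes are left.
 * If the tag is also listed in $deleteEmptied, the top of $deleteStack
 * is replaced with an entry that marks the tag for deletion.
 */
function removeAttribute(
    array &$attributes,
    string $key,
    string $nodeName,
    array $deleteEmptied,
    array &$deleteStack
): bool {
    unset($attributes[$key]);
    if (count($attributes) > 0) {
        return false;
    }
    if (isset($deleteEmptied[$nodeName])) {
        array_pop($deleteStack); // remove previous config
        $deleteStack[] = ['tag' => $nodeName, 'delete' => true];
    }
    return true;
}

// Usage: a <span> whose only attribute is stripped.
$attributes  = ['style' => 'mso-bidi: x'];
$deleteStack = [['tag' => 'span']];
$emptied = removeAttribute($attributes, 'style', 'span', ['span' => 0], $deleteStack);
var_dump($emptied); // bool(true)
```

Both call sites could then share the helper and differ only in how many loop levels they break out of.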

Common duplication problems, and corresponding solutions are:

Complex Class

Tip: Before tackling complexity, eliminate any duplication first. This can often reduce the size of classes significantly.

Complex classes like htmlcleaner often do many different things. To break such a class down, first identify a cohesive component within it. A common approach is to look for fields and methods that share the same prefix or suffix. You can also inspect the cohesion graph to spot unconnected or weakly connected components.

Once you have determined the fields that belong together, you can apply the Extract Class refactoring. If the component makes sense as a sub-class, Extract Subclass is also a candidate, and is often faster.
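As one possible Extract Class candidate in this file, the part of cleanup() that escapes '/' in the rewrite-rule keys is self-contained. The sketch below is an assumption about how it could be split out: the class name RewriteRuleEscaper and the method escapeKeys are hypothetical, and the recursion replaces the three hand-written nesting levels of the original.

```php
<?php
// Hypothetical class extracted from htmlcleaner::cleanup(); the names are
// illustrative and not part of the reviewed source.
final class RewriteRuleEscaper
{
    /**
     * Escape '/' in every array key, at every nesting level, so the keys
     * can be embedded in '/.../is' patterns. Non-array values are kept as-is.
     */
    public static function escapeKeys(array $rules): array
    {
        $escaped = [];
        foreach ($rules as $key => $value) {
            $newKey = str_replace('/', '\/', (string) $key);
            $escaped[$newKey] = is_array($value) ? self::escapeKeys($value) : $value;
        }
        return $escaped;
    }
}

// Usage with a made-up rule set.
$rules = [
    'o:p' => false,
    'a'   => ['href' => ['http://old/(.*)' => 'http://new/$1']],
];
print_r(RewriteRuleEscaper::escapeKeys($rules));
```

Unlike the original loop, this version builds a new array instead of modifying the one being iterated, so the order of the rules is preserved.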

While breaking up the class, it is a good idea to analyze how other classes use htmlcleaner, and based on these observations, apply Extract Interface, too.
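A rough illustration of Extract Interface, assuming callers only ever need the cleanup operation: the interface name HtmlCleaning, the stand-in class PassThroughCleaner, and the function sanitizePost are all hypothetical and only show the shape of the refactoring.

```php
<?php
// Hypothetical interface capturing what callers are assumed to need.
interface HtmlCleaning
{
    public function cleanup(string $body, array $config): string;
}

// Stand-in implementation; a real adapter would delegate to
// htmlcleaner::cleanup($body, $config).
final class PassThroughCleaner implements HtmlCleaning
{
    public function cleanup(string $body, array $config): string
    {
        return $body;
    }
}

// Callers depend on the interface, not on the concrete class.
function sanitizePost(HtmlCleaning $cleaner, string $html): string
{
    return $cleaner->cleanup($html, ['rewrite' => [], 'delete_emptied' => []]);
}

echo sanitizePost(new PassThroughCleaner(), '<p>hi</p>'), "\n";
```

With callers typed against the interface, the implementation can be split or replaced without touching them.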

<?php
/*
 *    changed : 10. oct. 03
 *    author  : [email protected]
 *    additional : Martin B. Vestergaard, Adrian Cope
 *    download: http://www.phpclasses.org/browse.html/package/1020.html
 *
 *    description :
 *        a script aimed at cleaning up after mshtml. use it in your wysiwyg html-editor,
 *        to strip messy code resulting from a copy-paste from word.
 *        this script doesnt come anything near htmltidy, but its pure php. if you have
 *        access to install binaries on your server, you might want to try using htmltidy.
 *    note :
 *        you might want to allow fonttags or even style tags. in that case, modify the
 *        function htmlcleaner::cleanup()
 *    usage :
 *        $body = htmlcleaner::cleanup($_POST['htmlCode']);
 *
 *    disclaimer :
 *        this piece of code is freely usable by anyone. if it makes your life better,
 *        remember me in your eveningprayer. if it makes your life worse, try doing it any
 *        better yourself.
 *
 *    todo/bugs :
 *        the script seems to remove textnodes in the root area. (eg. with no enclosing tags)
 */
define ('HTML_CLEANER_NODE_CLOSINGSTYLE_NORMAL',0);
define ('HTML_CLEANER_NODE_CLOSINGSTYLE_NONE',1);
define ('HTML_CLEANER_NODE_CLOSINGSTYLE_XHTMLSINGLE',2);
define ('HTML_CLEANER_NODE_CLOSINGSTYLE_HTMLSINGLE',3);
define ('HTML_CLEANER_NODE_NODETYPE_NODE',0);
define ('HTML_CLEANER_NODE_NODETYPE_CLOSINGNODE',1);
define ('HTML_CLEANER_NODE_NODETYPE_TEXT',2);
define ('HTML_CLEANER_NODE_NODETYPE_SPECIAL',3);
class htmlcleanertag {
	public $nodeType;
	public $nodeName;
	public $nodeValue;
	public $attributes = array();
	public $closingStyle;

	public function __construct($str)
	{
		if ($str[0]=='<') {
			$this->nodeType = HTML_CLEANER_NODE_NODETYPE_NODE;
			if (isset($str[1]) && ($str[1]=='?' || $str[1]=='!')) {
				$this->nodeType = HTML_CLEANER_NODE_NODETYPE_SPECIAL;
				$this->nodeValue = $str;
			} else {
				$this->parseFromString($str);
			}
		} else {
			$this->nodeType = HTML_CLEANER_NODE_NODETYPE_TEXT;
			$this->nodeValue = $str;
		}

	}

	function parseFromString($str)
	{
		$str = str_replace("\n"," ", $str);
		$offset=1;
		$endset=strlen($str)-2;
		if ($str[0] != '<' || $str[$endset+1] !== '>'){
			trigger_error('tag syntax error', E_USER_ERROR);
		}
		if ($str[$endset]=='/') {
			$endset--;
			$this->closingStyle = HTML_CLEANER_NODE_CLOSINGSTYLE_XHTMLSINGLE;
		}
		if ($str[1]=='/') {
			$offset=2;
			$this->nodeType = HTML_CLEANER_NODE_NODETYPE_CLOSINGNODE;
		}

		preg_match("|</?([a-zA-Z0-9:]+)|",$str,$matches);
		$tagname = $matches[1];
		$offset += strlen($tagname);

		$tagattr = substr($str,$offset,$endset-$offset+1);

		$this->nodeName = strtolower($tagname);
		$this->attributes = $this->parseAttributes($tagattr);
	}

	function parseAttributes($str)
	{
		$str = trim($str);
		if(strlen($str) == 0) {
			return array();
		}

		//echo "{{".$str."}}\n";
Unused Code Comprehensibility introduced by
63% of this comment could be valid code. Did you maybe forget this after debugging?

Sometimes obsolete code just ends up commented out instead of removed. In this case it is better to remove the code once you have checked you do not need it.

The code might also have been commented out for debugging purposes. In this case it is vital that someone uncomments it again or your project may behave in very unexpected ways in production.

This check looks for comments that seem to be mostly valid code and reports them.

		$i=0;
		$return = array();
		$_state = -1;
		$_name = '';
		$_quote = '';
Unused Code introduced by
$_quote is not used, you could remove the assignment.

This check looks for variable assignments that are either overwritten by other assignments or never used afterwards.

$myVar = 'Value';
$higher = false;

if (rand(1, 6) > 3) {
    $higher = true;
} else {
    $higher = false;
}

Both the $myVar assignment in line 1 and the $higher assignment in line 2 are dead: the first because $myVar is never used, and the second because $higher is overwritten on every possible execution path.

		$_value = '';
		$strlen = strlen($str);

		while ($i<$strlen) {
			$chr = $str[$i];

			if ($_state == -1) {		// reset buffers
				$_name = '';
				$_quote = '';
Unused Code introduced by
$_quote is not used, you could remove the assignment.

				$_value = '';
				$_state = 0;		// parse from here
			}
			if ($_state == 0) {		// state 0 : looking for name
				if (ctype_space($chr)) { // whitespace, NEXT
					$i++;
					continue;
				}
				preg_match("/([a-zA-Z][a-zA-Z0-9_:.-]*)/",$str,$matches,0,$i);

				$_name = $matches[1];
				$i += strlen($_name);
				$chr = $str[$i];

				if ($chr == '=') {
					$_state = 3;
				} else {
					$_state = 2;
				}
			} else if ($_state == 2) { // state 2: looking for equal
				if (!ctype_space($chr)) {
					if ($chr == '=') {
						$_state = 3;
					} else {
						// end of attribute
						$return[] = $_name;
						$_state = -1;
						continue; // Don't up the counter, this char is the first char for the next attribute.
					}
				}
			} else if ($_state == 3) {	// state 3 : looking for quote
				if ($chr == '"' || $chr == "'" ) {
					// fastforward til next quot
					$regexp = '|^'.$chr.'(.*?)'.$chr.'|';
					$skip = 1;
				} else if (!ctype_space($chr)) {
					// fastforward til next space
					$regexp = '|^(.*?) ?|';
					$skip = 0;
				}

				preg_match($regexp,substr($str,$i),$matches);
Bug introduced by
The variable $regexp does not seem to be defined for all execution paths leading up to this point.

If you define a variable conditionally, it can happen that it is not defined for all execution paths.

Let’s take a look at an example:

function myFunction($a) {
    switch ($a) {
        case 'foo':
            $x = 1;
            break;

        case 'bar':
            $x = 2;
            break;
    }

    // $x is potentially undefined here.
    echo $x;
}

In the above example, the variable $x is defined if you pass “foo” or “bar” as argument for $a. However, since the switch statement has no default case statement, if you pass any other value, the variable $x would be undefined.

Available Fixes

  1. Check for existence of the variable explicitly:

    function myFunction($a) {
        switch ($a) {
            case 'foo':
                $x = 1;
                break;
    
            case 'bar':
                $x = 2;
                break;
        }
    
        if (isset($x)) { // Make sure it's always set.
            echo $x;
        }
    }
    
  2. Define a default value for the variable:

    function myFunction($a) {
        $x = ''; // Set a default which gets overridden for certain paths.
        switch ($a) {
            case 'foo':
                $x = 1;
                break;
    
            case 'bar':
                $x = 2;
                break;
        }
    
        echo $x;
    }
    
  3. Add a value for the missing path:

    function myFunction($a) {
        switch ($a) {
            case 'foo':
                $x = 1;
                break;
    
            case 'bar':
                $x = 2;
                break;
    
            // We add support for the missing case.
            default:
                $x = '';
                break;
        }
    
        echo $x;
    }
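Applied to the attribute parser under review, the same idea could look like the sketch below, which makes every path either define its variables or return early. The function readAttributeValue is hypothetical and not part of the reviewed class. It also swaps the unquoted-value pattern for `\S*`, which is a behavioural change and an assumption about the intent: the original `(.*?) ?` can match the empty string.

```php
<?php
/**
 * Hypothetical helper: reads an attribute value starting at offset $i.
 * Returns [value, charsConsumed], or null when nothing can be read
 * (whitespace at $i, the path on which $regexp and $skip were undefined,
 * or an unterminated quote).
 */
function readAttributeValue(string $str, int $i): ?array
{
    $chr = $str[$i];
    if (ctype_space($chr)) {
        return null; // nothing to match yet; the caller advances one character
    }
    if ($chr === '"' || $chr === "'") {
        $regexp = '|^' . $chr . '(.*?)' . $chr . '|';
        $quotes = 2; // opening and closing quote
    } else {
        $regexp = '|^(\S*)|'; // assumption: an unquoted value ends at whitespace
        $quotes = 0;
    }
    if (!preg_match($regexp, substr($str, $i), $matches)) {
        return null;
    }
    return [$matches[1], strlen($matches[1]) + $quotes];
}

var_dump(readAttributeValue('"a b" x', 0));
var_dump(readAttributeValue(' x', 0));
```

Returning null for the whitespace case keeps the decision about how far to advance in the calling loop.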
    
				$_value = $matches[1];
				$i += strlen($_value) + $skip ;
Bug introduced by
The variable $skip does not seem to be defined for all execution paths leading up to this point.


				$return[strtolower($_name)] = $_value;
				$_state = -1;

			}
			$i++;
		}
		if($_state != -1 ) {
			if ($_value!='') {
				$return[strtolower($_name)] = $_value;
			} else if ($_name!='') {
				$return[] = $_name;
			}
		}

		return $return;
	}

	public function _toString() {
		return $this->toString();
	}

	public function toString()
	{
		$src = '';
Unused Code introduced by
$src is not used, you could remove the assignment.

		if ( ($this->nodeName == 'link' ||
			$this->nodeName == 'img' ||
			$this->nodeName == 'br' ||
			$this->nodeName == 'hr')
			&& $this->closingStyle != HTML_CLEANER_NODE_CLOSINGSTYLE_XHTMLSINGLE
		) {
			$this->closingStyle = HTML_CLEANER_NODE_CLOSINGSTYLE_HTMLSINGLE;
		}
		if ($this->nodeType == HTML_CLEANER_NODE_NODETYPE_TEXT || $this->nodeType == HTML_CLEANER_NODE_NODETYPE_SPECIAL) {
			return $this->nodeValue;
		}
		if ($this->nodeType == HTML_CLEANER_NODE_NODETYPE_NODE) {
			$str = '<'.$this->nodeName;
		} else if ($this->nodeType == HTML_CLEANER_NODE_NODETYPE_CLOSINGNODE) {
			return '</'.$this->nodeName.">";
		}
		foreach ($this->attributes as $attkey => $attvalue) {
			if (is_numeric($attkey)) {
				$str .= ' '.$attvalue;
Bug introduced by
The variable $str does not seem to be defined for all execution paths leading up to this point.

			} else {
				$str .= ' '.$attkey.'="'.str_replace('"','&quot;',$attvalue).'"';
			}
		}
		if ($this->closingStyle == HTML_CLEANER_NODE_CLOSINGSTYLE_XHTMLSINGLE) {
			$str .= ' />';
		} else {
			$str .= '>';
		}
		return $str;
	}

}

class htmlcleaner
Coding Style Compatibility introduced by
PSR1 recommends that each class should be in its own file to aid autoloaders.

Having each class in a dedicated file usually plays nice with PSR autoloaders and is therefore a well established practice. If you use other autoloaders, you might not want to follow this rule.

{
	public static function version()
	{
		return 'mshtml cleanup v.0.9.2 by [email protected]';
	}

	public static function dessicate($str)
	{
		$i=0;
		$parts = array();
		$_state = 0;
		$_buffer = '';
		$_quote = '';
Unused Code introduced by
$_quote is not used, you could remove the assignment.

		$str_len = strlen($str);
		while ($i<$str_len) {
			$chr = $str[$i];
			if ($_state == -1) {	// reset buffers
				$_buffer = '';
				$_quote = '';
Unused Code introduced by
$_quote is not used, you could remove the assignment.

				$_state = 0;
			}
			if ($_state == 0) {	// state 0 : looking for <
				$pos = strpos($str,'<',$i);
				if( $pos === false) {
					// no more
					$_buffer = substr($str,$i);
					$i = $str_len;
				} else if($str[$pos] === '<') {
					$chr = '<';
Unused Code introduced by
$chr is not used, you could remove the assignment.

					$_buffer = substr($str,$i,$pos-$i);
					if ($_buffer!='') {
						// store part
						array_push($parts,new htmlcleanertag($_buffer));
					}
					$_buffer = '<';
					$i = $pos;
					if (($i+3 < $str_len) && $str[$i+1] == '!' && $str[$i+2] == '-' && $str[$i+3] == '-') {

						// cheating, fast forward to end of comment
						$end = strpos($str,'-->',$i+3); // start looking 3 steps ahead
						if($end !== false) {
							$comment = substr($str,$i,$end-$i+3);
							array_push($parts,new htmlcleanertag($comment)); // Remove this line to make the cleaner leave out HTML comments from the parts.
							$_state = -1;
							$i = $end+2;
						} else {
							$_buffer = substr($str,$i);
							$i = $str_len;
						}
					} else {
						$_state = 1;
					}
				}
			} else if ($_state == 1) {	// state 1 : in tag looking for >
				$_buffer .= $chr;
				if ($chr == '"' || $chr == "'") {

					$regexp = '|'.$chr.'(.*?)'.$chr.'|';
					preg_match($regexp,$str,$matches,0,$i);

					$_buffer .= $matches[1] . $chr;
					$i += strlen($matches[1]) + 1 ;
				} else if ($chr == '>') {
					array_push($parts,new htmlcleanertag($_buffer));
					$_state = -1;
				}
			}
			$i++;
		}
		return $parts;
	}


	// removes the worst mess from word.
	public static function cleanup($body, $config)
	{

		$scriptParts = array();

		do {
			$prefix = md5(rand());
		} while (strpos($body, $prefix) !== false);

		$callback = function($matches) use ($prefix, &$scriptParts) {
			$scriptPartKey = '----'.$prefix . '-' . count($scriptParts).'----';
			$scriptParts[$scriptPartKey] = $matches[0];
			return $scriptPartKey;
		};

		$newbody = preg_replace_callback('!<script[^>]*>(.|[\r\n])*?</[^>]*script[^>]*>!i', $callback, $body);

		if($newbody) {
			$body = $newbody;
		}

		$body = "<htmlcleaner>$body</htmlcleaner>";
		$rewrite_rules = $config["rewrite"];
		$return = '';
		$parts = htmlcleaner::dessicate($body);

		// flip emtied rules so we can use it as indexes
		if (is_array($config["delete_emptied"])) {
			$config["delete_emptied"] = array_flip($config["delete_emptied"]);
		}
		if (isset($config["delete_empty_containers"]) && is_array($config["delete_empty_containers"])) {
			$config["delete_empty_containers"] = array_flip($config["delete_empty_containers"]);
		}
		$delete_stack = Array();
		$skipNodes = 0;
		if(is_array($rewrite_rules)) {
			foreach ($rewrite_rules as $tag_rule=> $attrib_rules) {
				$escaped_rule = str_replace('/','\/',$tag_rule);
				if($tag_rule !== $escaped_rule) {
					$rewrite_rules[$escaped_rule] = $attrib_rules;
					unset($rewrite_rules[$tag_rule]);
					$tag_rule = $escaped_rule;
				}

				if (is_array($attrib_rules)) {
					foreach ($attrib_rules as $attrib_rule=> $value_rules) {
						$escaped_rule = str_replace('/','\/',$attrib_rule);
						if ($attrib_rule !== $escaped_rule) {
							$rewrite_rules[$tag_rule][$escaped_rule] = $value_rules;
							unset($rewrite_rules[$tag_rule][$attrib_rule]);
							$attrib_rule = $escaped_rule;
						}

						if (is_array($value_rules)) {
							foreach ($value_rules as $value_rule=>$value) {
								$escaped_rule = str_replace('/','\/',$value_rule);
								if ($value_rule !== $escaped_rule) {
									$rewrite_rules[$tag_rule][$attrib_rule][$escaped_rule] = $value;
									unset($rewrite_rules[$tag_rule][$attrib_rule][$value_rule]);
								}
							}
						} 
					}
				}
			}
		}

		foreach ($parts as $i => $part) {
			if ($skipNodes > 0) {
				$skipNodes--;
				continue;
			}
			if ($part->nodeType == HTML_CLEANER_NODE_CLOSINGSTYLE_NONE) {
				if (isset($config["delete_emptied"][$part->nodeName])
						&& count($delete_stack)) {
					do {
						$closed = array_pop($delete_stack);
					} while ($closed["tag"] && $closed["tag"] != $part->nodeName);
					if ($closed["delete"]) {
						unset($part);
					}
				}
			} else
			if ($part->nodeType == HTML_CLEANER_NODE_NODETYPE_NODE) {
				if (isset($config["delete_emptied"][$part->nodeName])
					&& count($delete_stack)) {
						array_push($delete_stack, Array("tag" => $part->nodeName));
				} else if (isset($config["delete_empty_containers"][$part->nodeName])) {
					if ($part->nodeName != 'a' || !$part->attributes['name']) {	// named anchor objects are not containers
						if (isset($parts[$i+1]) && $parts[$i+1]->nodeName == $part->nodeName && $parts[$i+1]->nodeType == HTML_CLEANER_NODE_NODETYPE_CLOSINGNODE) {
							$skipNodes = 1;
							continue;
						}
					}
				}
			}


			if ($part && is_array($rewrite_rules)) {
				foreach ($rewrite_rules as $tag_rule=>$attrib_rules) {
					if (preg_match('/'.$tag_rule.'/is', $part->nodeName)) {
						if (is_array($attrib_rules)) {
							foreach ($attrib_rules as $attrib_rule=>$value_rules) {
								foreach ($part->attributes as $attrib_key=>$attrib_val) {
									if (preg_match('/'.$attrib_rule.'/is', $attrib_key)) {
										if (is_array($value_rules)) {
											foreach ($value_rules as $value_rule=>$value) {
												if (preg_match('/'.$value_rule.'/is', $attrib_val)) {
Code Duplication
													if ($value === false) {
														unset($part->attributes[$attrib_key]);
														if (!count($part->attributes)) {
															if (isset($config["delete_emptied"][$part->nodeName])) {
																// remove previous config
																@array_pop($delete_stack);
Security Best Practice introduced by
It seems like you do not handle an error condition here. This can introduce security issues, and is generally not recommended.

If you suppress an error, we recommend checking for the error condition explicitly:

// For example instead of
@mkdir($dir);

// Better use
if (@mkdir($dir) === false) {
    throw new \RuntimeException('The directory '.$dir.' could not be created.');
}
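For the array_pop() call flagged here, one way to drop the @ operator is to state the empty-stack case explicitly. This is only a sketch: the helper name replaceTop is hypothetical and does not exist in the reviewed code. Note that array_pop() on an empty array returns null without raising a warning, so the explicit check mainly documents intent.

```php
<?php
// Hypothetical helper; replaces the "pop, then push" pair in cleanup().
function replaceTop(array &$stack, array $entry): void
{
    if ($stack !== []) {
        array_pop($stack); // drop the entry that is being superseded
    }
    $stack[] = $entry;
}

// Usage: works the same on an empty and a non-empty stack.
$deleteStack = [];
replaceTop($deleteStack, ['tag' => 'span', 'delete' => true]);
print_r($deleteStack);
```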
																array_push($delete_stack, Array("tag" => $part->nodeName, "delete" => true));
																unset($part);
															}
															break 3;
														}
													} else {
														$part->attributes[$attrib_key] = preg_replace('/^'.$value_rule.'$/is', $value, $part->attributes[$attrib_key]);
													}
												}
											}
Code Duplication
										} else
										if ($value_rules === false) {
											unset($part->attributes[$attrib_key]);
											if (!count($part->attributes)) {
												if (isset($config["delete_emptied"][$part->nodeName])) {
													// remove previous config
													@array_pop($delete_stack);
Security Best Practice introduced by
It seems like you do not handle an error condition here. This can introduce security issues, and is generally not recommended.

													array_push($delete_stack, Array("tag" => $part->nodeName, "delete" => true));
													unset($part);
												}
												break 2;
											}
										} else {
											$part->attributes[preg_replace('/^'.$attrib_rule.'$/is', $value_rules, $attrib_key)] = $part->attributes[$attrib_key];
											unset($part->attributes[$attrib_key]);
										}
									}
								}
							}
						} else if ($attrib_rules === false) {
							unset($part);
						} else {
							$part->nodeName = $attrib_rules;
						}
						break; // tag matched, so skip next rules.
					}
				}
			}
			if ($part && strstr($part->nodeValue,'<?xml:namespace')===false) {
				$return .= $part->toString();
			}
		}

		$return = str_replace(array_keys($scriptParts), array_values($scriptParts), $return);

		//FIXME: htmlcleaner removes the '<' in '</htmlcleaner>' if the html code is broken
Coding Style introduced by
Comment refers to a FIXME task "htmlcleaner removes the '<' in '</htmlcleaner>' if the html code is broken"
		// ie: if the last tag in the input isn't properly closed... it should instead
		// close any broken tag properly (add quotes and a '>')

		return str_replace('<htmlcleaner>', '', str_replace('</htmlcleaner>', '', $return));
	}
}

class pinp_htmlcleaner extends htmlcleaner {
Coding Style Compatibility introduced by
PSR1 recommends that each class should be in its own file to aid autoloaders.


	public static function _dessicate($str) {
		return parent::dessicate($str);
Comprehensibility Bug introduced by
It seems like you call parent on a different method (dessicate() instead of _dessicate()). Are you sure this is correct? If so, you might want to change this to $this->dessicate().

This check looks for a call to a parent method whose name is different than the method from which it is called.

Consider the following code:

class Daddy
{
    protected function getFirstName()
    {
        return "Eidur";
    }

    protected function getSurName()
    {
        return "Gudjohnsen";
    }
}

class Son extends Daddy
{
    public function getFirstName()
    {
        return parent::getSurname();
    }
}

The getFirstName() method in the Son calls the wrong method in the parent class.
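Because _dessicate() in the reviewed class is static, $this is not available there, so the static counterpart of the suggested change would be a call through static:: or self::. The sketch below uses hypothetical class names (BaseCleaner, PinpCleaner) and a stand-in tokenizer to show the shape of that call.

```php
<?php
class BaseCleaner
{
    public static function dessicate(string $str): array
    {
        return explode(' ', $str); // stand-in for the real tokenizer
    }
}

class PinpCleaner extends BaseCleaner
{
    // Wrapper with a different name: calling through static:: (late static
    // binding) instead of parent:: means an override of dessicate() in this
    // class or a subclass is respected.
    public static function _dessicate(string $str): array
    {
        return static::dessicate($str);
    }
}

print_r(PinpCleaner::_dessicate('a b'));
```

If the wrapper is meant to bypass any override on purpose, keeping parent:: is a legitimate choice and the warning can be ignored.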

	}
	public static function _cleanup($str,$config) {
		return parent::cleanup($str,$config);
Comprehensibility Bug introduced by
It seems like you call parent on a different method (cleanup() instead of _cleanup()). Are you sure this is correct? If so, you might want to change this to $this->cleanup().

	}

}
462