Issues (381)

Security Analysis    not enabled

This project does not seem to handle request data directly; as such, no vulnerable execution paths were found.

  Cross-Site Scripting
Cross-Site Scripting enables an attacker to inject code into the response of a web-request that is viewed by other users. It can for example be used to bypass access controls, or even to take over other users' accounts.
  File Exposure
File Exposure allows an attacker to gain access to local files that they should not be able to access. These files can, for example, include database credentials or other configuration files.
  File Manipulation
File Manipulation enables an attacker to write custom data to files. This potentially leads to injection of arbitrary code on the server.
  Object Injection
Object Injection enables an attacker to inject an object into PHP code, and can lead to arbitrary code execution, file exposure, or file manipulation attacks.
  Code Injection
Code Injection enables an attacker to execute arbitrary code on the server.
  Response Splitting
Response Splitting can be used to send arbitrary responses.
  File Inclusion
File Inclusion enables an attacker to inject custom files into PHP's file loading mechanism, either explicitly passed to include, or for example via PHP's auto-loading mechanism.
  Command Injection
Command Injection enables an attacker to inject a shell command that is executed with the privileges of the web-server. This can be used to expose sensitive data, or to gain access to your server.
  SQL Injection
SQL Injection enables an attacker to execute arbitrary SQL code on your database server gaining access to user data, or manipulating user data.
  XPath Injection
XPath Injection enables an attacker to modify which parts of an XML document are read. If that XML document is, for example, used for authentication, this can lead to further vulnerabilities similar to SQL Injection.
  LDAP Injection
LDAP Injection enables an attacker to inject LDAP statements potentially granting permission to run unauthorized queries, or modify content inside the LDAP tree.
  Header Injection
  Other Vulnerability
This category comprises other attack vectors such as manipulating the PHP runtime, loading custom extensions, freezing the runtime, or similar.
  Regex Injection
Regex Injection enables an attacker to execute arbitrary code in your PHP process.
  XML Injection
XML Injection enables an attacker to read files on your local filesystem including configuration files, or can be abused to freeze your web-server process.
  Variable Injection
Variable Injection enables an attacker to overwrite program variables with custom data, and can lead to further vulnerabilities.
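As a minimal illustration of the first category in the list above, the sketch below shows output escaping with PHP's built-in `htmlspecialchars`, one common mitigation for Cross-Site Scripting. The helper name `render_comment` and the sample input are hypothetical and are not part of the analysed project.

```php
<?php
// Hypothetical helper (assumption, not project code): escapes user-supplied
// text before it is embedded in an HTML response.
function render_comment(string $userInput): string
{
    // ENT_QUOTES escapes both single and double quotes;
    // ENT_SUBSTITUTE replaces invalid byte sequences instead of returning ''.
    $safe = htmlspecialchars($userInput, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');

    return '<p>' . $safe . '</p>';
}

// The markup characters are neutralised, so the browser shows them as text.
echo render_comment('<script>alert(1)</script>'), "\n";
// prints: <p>&lt;script&gt;alert(1)&lt;/script&gt;</p>
```

Escaping at the point of output, rather than when the data is stored, keeps the stored value usable for other contexts such as plain-text e-mail.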
Unfortunately, the security analysis is currently not available for your project. If you are a non-commercial open-source project, please contact support to gain access.

include/html2text/html2text.php (3 issues)

<?php
/******************************************************************************
 * Copyright (c) 2010 Jevon Wright and others.
 * All rights reserved. This program and the accompanying materials
 * are made available under the terms of the Eclipse Public License v1.0
 * which accompanies this distribution, and is available at
 * http://www.eclipse.org/legal/epl-v10.html
 *
 * Contributors:
 *    Jevon Wright - initial API and implementation
 ****************************************************************************/

/**
 * Tries to convert the given HTML into a plain text format - best suited for
 * e-mail display, etc.
 *
 * <p>In particular, it tries to maintain the following features:
 * <ul>
 *   <li>Links are maintained, with the 'href' copied over
 *   <li>Information in the &lt;head&gt; is lost
 * </ul>
 *
 * @param string $html the input HTML
 *
 * @throws Html2TextException
 * @return string HTML converted, as best as possible, to text
 */
function convert_html_to_text($html)
{
    $html = fix_newlines($html);

    $doc = new DOMDocument();
    if (!$doc->loadHTML($html)) {
        throw new Html2TextException('Could not load HTML - badly formed?', $html);
    }

    $output = iterate_over_node($doc);

    // remove leading and trailing spaces on each line
    $output = preg_replace("/[ \t]*\n[ \t]*/im", "\n", $output);

    // remove leading and trailing whitespace
    $output = trim($output);

    return $output;
}

/**
 * Unify newlines; in particular, \r\n becomes \n, and
 * then \r becomes \n. This means that all newlines (Unix, Windows, Mac)
 * all become \ns.
 *
 * @param mixed $text
 *
 * @return string fixed text
 */
function fix_newlines($text)
{
    // replace \r\n with \n
    $text = str_replace("\r\n", "\n", $text);
    // replace any remaining \r with \n
    $text = str_replace("\r", "\n", $text);

    return $text;
}

/**
 * @param $node
 *
 * @return null|string
 */
function next_child_name($node)

This function seems to be duplicated in your project.

Duplicated code is one of the most pungent code smells. If you need to duplicate the same code in three or more different places, we strongly encourage you to look into extracting the code into a single class or operation.

You can also find more detailed suggestions in the “Code” section of your repository.

{
    // get the next child
    $nextNode = $node->nextSibling;
    while (null !== $nextNode) {
        if ($nextNode instanceof DOMElement) {
            break;
        }
        $nextNode = $nextNode->nextSibling;
    }
    $nextName = null;
    if ($nextNode instanceof DOMElement && null !== $nextNode) {
        $nextName = mb_strtolower($nextNode->nodeName);
    }

    return $nextName;
}

/**
 * @param $node
 *
 * @return null|string
 */
function prev_child_name($node)

This function seems to be duplicated in your project.

{
    // get the previous child
    $nextNode = $node->previousSibling;
    while (null !== $nextNode) {
        if ($nextNode instanceof DOMElement) {
            break;
        }
        $nextNode = $nextNode->previousSibling;
    }
    $nextName = null;
    if ($nextNode instanceof DOMElement && null !== $nextNode) {
        $nextName = mb_strtolower($nextNode->nodeName);
    }

    return $nextName;
}
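The two functions flagged above as duplicates differ only in the sibling property they follow. One possible way to remove the duplication is sketched below; `sibling_element_name` is a hypothetical helper name that does not exist in the analysed file, and the two original functions are kept as thin wrappers so that callers would not need to change.

```php
<?php
/**
 * Hypothetical shared helper (assumption, not project code): returns the
 * lower-cased name of the nearest sibling element reached by repeatedly
 * following $property ('nextSibling' or 'previousSibling'), or null.
 */
function sibling_element_name(DOMNode $node, string $property): ?string
{
    $sibling = $node->$property;
    // skip text nodes, comments and other non-element siblings
    while (null !== $sibling && !($sibling instanceof DOMElement)) {
        $sibling = $sibling->$property;
    }

    return null === $sibling ? null : mb_strtolower($sibling->nodeName);
}

function next_child_name($node)
{
    return sibling_element_name($node, 'nextSibling');
}

function prev_child_name($node)
{
    return sibling_element_name($node, 'previousSibling');
}

// Usage: the text node between <p> and <h1> is skipped.
$doc = new DOMDocument();
$doc->loadHTML('<div><p>a</p> text <h1>b</h1></div>');
$p = $doc->getElementsByTagName('p')->item(0);
var_dump(next_child_name($p)); // string(2) "h1"
var_dump(prev_child_name($p)); // NULL
```

Passing the property name as a string keeps the helper short; two small closures would work equally well if dynamic property access is considered too loose.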

/**
 * @param DOMNode $node
 *
 * @return string
 */
function iterate_over_node($node)
{
    if ($node instanceof DOMText) {
        return preg_replace('/\\s+/im', ' ', $node->wholeText);
    }
    if ($node instanceof DOMDocumentType) {
        // ignore
        return '';
    }

    $nextName = next_child_name($node);
    $prevName = prev_child_name($node);

$prevName is not used, you could remove the assignment.

This check looks for variable assignments that are either overwritten by other assignments or where the variable is not used subsequently.

$myVar = 'Value';
$higher = false;

if (rand(1, 6) > 3) {
    $higher = true;
} else {
    $higher = false;
}

Both the $myVar assignment in line 1 and the $higher assignment in line 2 are dead: the first because $myVar is never used, and the second because $higher is overwritten on every possible execution path.

    $name = mb_strtolower($node->nodeName);

    // start whitespace
    switch ($name) {
        case 'hr':
            return "------\n";
        case 'style':
        case 'head':
        case 'title':
        case 'meta':
        case 'script':
            // ignore these tags
            return '';
        case 'h1':
        case 'h2':
        case 'h3':
        case 'h4':
        case 'h5':
        case 'h6':
            // add two newlines
            $output = "\n";
            break;
        case 'p':
        case 'div':
            // add one line
            $output = "\n";
            break;
        default:
            // print out contents of unknown tags
            $output = '';
            break;
    }

    // debug
    //$output .= "[$name,$nextName]";

    for ($i = 0; $i < $node->childNodes->length; ++$i) {
        $n = $node->childNodes->item($i);

        $text = iterate_over_node($n);

        $output .= $text;
    }

    // end whitespace
    switch ($name) {
        case 'style':
        case 'head':
        case 'title':
        case 'meta':
        case 'script':
            // ignore these tags
            return '';
        case 'h1':
        case 'h2':
        case 'h3':
        case 'h4':
        case 'h5':
        case 'h6':
            $output .= "\n";
            break;
        case 'p':
        case 'br':
            // add one line
            if ('div' !== $nextName) {
                $output .= "\n";
            }
            break;
        case 'div':
            // add one line only if the next child isn't a div
            if ('div' !== $nextName && null !== $nextName) {
                $output .= "\n";
            }
            break;
        case 'a':
            // links are returned in [text](link) format
            $href = $node->getAttribute('href');
            if ('' === $href) {
                // getAttribute() returns an empty string (not null) when the
                // attribute is missing, so this anchor doesn't link anywhere
                if ($node->hasAttribute('name')) {
                    $output = "[$output]";
                }
            } elseif ($href !== $output) {
                // link text differs from the address: use [text](link);
                // otherwise the bare link is kept unchanged
                $output = "[$output]($href)";
            }

            // does the next node require additional whitespace?
            switch ($nextName) {
                case 'h1':
                case 'h2':
                case 'h3':
                case 'h4':
                case 'h5':
                case 'h6':
                    $output .= "\n";
                    break;
            }

        // no break
        default:
            // do nothing
    }

    return $output;
}

/**
 * Class Html2TextException
 */
class Html2TextException extends Exception
{
    public $more_info;

    /**
     * @param string $message
     * @param string $more_info
     */
    public function __construct($message = '', $more_info = '')
    {
        parent::__construct($message);
        $this->more_info = $more_info;
    }
}
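
A short usage sketch for the exception class above, showing how a caller could read the offending input back from `more_info`. The class is repeated so that the snippet runs on its own; `parse_or_fail` and `describe_failure` are hypothetical stand-ins for illustration and are not functions from the analysed file.

```php
<?php
// Copy of the class from the listing, repeated to keep the sketch runnable.
class Html2TextException extends Exception
{
    public $more_info;

    public function __construct($message = '', $more_info = '')
    {
        parent::__construct($message);
        $this->more_info = $more_info;
    }
}

// Hypothetical stand-in for convert_html_to_text(): fails on blank input.
function parse_or_fail(string $html): string
{
    if ('' === trim($html)) {
        throw new Html2TextException('Could not load HTML - badly formed?', $html);
    }

    return strip_tags($html);
}

// Hypothetical caller: turns a failure into a short diagnostic string.
function describe_failure(string $html): string
{
    try {
        return parse_or_fail($html);
    } catch (Html2TextException $e) {
        // more_info carries the input that could not be converted
        return sprintf('%s (input length: %d)', $e->getMessage(), strlen($e->more_info));
    }
}

echo describe_failure('<b>ok</b>'), "\n"; // prints: ok
echo describe_failure('   '), "\n";       // prints: Could not load HTML - badly formed? (input length: 3)
```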