Passed
Push — master ( d9b37c...9a8532 )
by Johnny
02:05
created

Node   F

Complexity

Total Complexity 66

Size/Duplication

Total Lines 356
Duplicated Lines 0 %

Importance

Changes 2
Bugs 0 Features 0
Metric Value
eloc 112
dl 0
loc 356
rs 3.12
c 2
b 0
f 0
wmc 66

16 Methods

Rating   Name   Duplication   Size   Complexity  
A isEmpty() 0 3 1
A determineCommand() 0 9 2
A isComment() 0 3 1
A command() 0 3 1
A determineEmpty() 0 3 1
A source() 0 3 1
A value() 0 3 1
B escape() 0 23 7
D checkSyntax() 0 107 39
A number() 0 3 1
A __construct() 0 11 1
A matchesPattern() 0 5 1
A determineValue() 0 3 1
A setAllowUtf8() 0 3 1
A determineComment() 0 14 6
A isInterrupted() 0 3 1

How to fix   Complexity   

Complex Class

Complex classes like Node often do a lot of different things. To break such a class down, we need to identify a cohesive component within that class. A common approach to find such a component is to look for fields/methods that share the same prefixes, or suffixes.

Once you have determined the fields that belong together, you can apply the Extract Class refactoring. If the component makes sense as a sub-class, Extract Subclass is also a candidate, and is often faster.

While breaking up the class, it is a good idea to analyze how other classes use Node, and based on these observations, apply Extract Interface, too.
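As an illustration of the Extract Class advice above, the sketch below pulls the bracket-counting part of checkSyntax() into a class of its own. The class name BracketBalanceChecker and its check() method are hypothetical and not part of Rivescript-php; the error messages mirror the ones returned by checkSyntax(). It is a minimal sketch of one possible split, not the project's actual design.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper extracted from Node::checkSyntax(): it only knows how
 * to verify that the brackets in a trigger are balanced.
 */
final class BracketBalanceChecker
{
    /** Opening bracket => [closing bracket, label used in the error message]. */
    private const PAIRS = [
        '(' => [')', 'parenthesis bracket ()'],
        '[' => [']', 'square bracket []'],
        '{' => ['}', 'curly bracket {}'],
        '<' => ['>', 'angled bracket <>'],
    ];

    /**
     * Returns an error message, or an empty string when all brackets match.
     */
    public function check(string $value): string
    {
        foreach (self::PAIRS as $open => [$close, $label]) {
            // Same counting rule as the original loop: +1 per opener, -1 per closer.
            $balance = substr_count($value, $open) - substr_count($value, $close);

            if ($balance !== 0) {
                return 'Unmatched ' . ($balance > 0 ? 'left' : 'right') . ' ' . $label;
            }
        }

        return '';
    }
}

// Usage: an unbalanced trigger yields a message, a balanced one an empty string.
$checker = new BracketBalanceChecker();
echo $checker->check('hello (world'), PHP_EOL; // prints "Unmatched left parenthesis bracket ()"
```

With a helper like this, the trigger branch of checkSyntax() could delegate to a single call, which would remove most of the branching that drives its complexity score.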

<?php
/*
 * This file is part of Rivescript-php
 *
 * (c) Shea Lewis <[email protected]>
 *
 * For the full copyright and license information, please view the LICENSE
 * file that was distributed with this source code.
 */

namespace Axiom\Rivescript\Cortex;

/**
 * Node class
 *
 * The Node class stores information about a parsed
 * line from the script.
 *
 * PHP version 7.4 and higher.
 *
 * @category Core
 * @package  Cortex
 * @author   Shea Lewis <[email protected]>
 * @license  https://opensource.org/licenses/MIT MIT
 * @link     https://github.com/axiom-labs/rivescript-php
 * @since    0.3.0
 */
class Node
{
    /**
     * The source string for the node.
     *
     * @var string
     */
    protected string $source = '';

    /**
     * The line number of the node source.
     *
     * @var int
     */
    protected int $number = 0;

    /**
     * The command symbol.
     *
     * @var string
     */
    protected string $command = '';

    /**
     * The source without the command symbol.
     *
     * @var string
     */
    protected string $value = '';

    /**
     * Is this line part of a docblock.
     *
     * @var bool
     */
    protected bool $isInterrupted = false;

    /**
     * Is this node a comment.
     *
     * @var bool
     */
    protected bool $isComment = false;

    /**
     * Is this an empty line.
     *
     * @var bool
     */
    public bool $isEmpty = false;

    /**
     * Is UTF-8 mode enabled.
     *
     * @var bool
     */
    protected bool $allowUtf8 = false;

    /**
     * Create a new Node instance.
     *
     * @param string $source The source line.
     * @param int    $number The line number in the script.
     */
    public function __construct(string $source, int $number)
    {
        $this->source = remove_whitespace($source);
        $this->number = $number;

        // $this->source = $this->escape($this->source);

        $this->determineEmpty();
        $this->determineComment();
        $this->determineCommand();
        $this->determineValue();
    }

    /**
     * Replace tags that are not in the list of known tags with a placeholder.
     *
     * Scrutinizer (Unused Code): the method escape() is not used, and could be
     * removed. This check looks for private methods that have been defined, but
     * are not used inside the class.
     *
     * @param string $source The source line.
     *
     * @return string
     */
    private function escape(string $source): string
    {
        $knownTags = synapse()->memory->tags()->keys()->all();

        $pattern = '/<(\S*?)*>.*?<\/\1>|<^.*?\/>/';

        preg_match_all($pattern, $source, $matches);

        $index = 0;
        if (is_array($matches[$index]) && isset($matches[$index][0]) && is_null($knownTags) === false && count($matches) == 2) {
            $matches = $matches[$index];

            foreach ($matches as $match) {
                $str = str_replace(['<', '>'], "", $match);
                $parts = explode(' ', $str);
                $tag = $parts[0] ?? "";

                // Scrutinizer (Bug): $knownTags of type null is incompatible with the type array
                // expected by parameter $haystack of in_array(). If this is a false positive, it
                // can be ignored by placing /** @scrutinizer ignore-type */ before $knownTags.
                if (in_array($tag, $knownTags, true) === false) {
                    $source = str_replace($match, "\x00{$tag}\x01", $source);
                }
            }
        }

        return $source;
    }

    /**
     * Returns the node's command trigger.
     *
     * @return string
     */
    public function command(): string
    {
        return $this->command;
    }

    /**
     * Returns the node's value.
     *
     * @return string
     */
    public function value(): string
    {
        return $this->value;
    }

    /**
     * Returns the node's source.
     *
     * @return string
     */
    public function source(): string
    {
        return $this->source;
    }

    /**
     * Returns the node's line number.
     *
     * @return int
     */
    public function number(): int
    {
        return $this->number;
    }

    /**
     * Returns true if the node is an empty line.
     *
     * @return bool
     */
    public function isEmpty(): bool
    {
        return $this->isEmpty;
    }

    /**
     * Returns true if the node is a comment.
     *
     * @return bool
     */
    public function isComment(): bool
    {
        return $this->isComment;
    }

    /**
     * Returns true if the node has been interrupted.
     *
     * @return bool
     */
    public function isInterrupted(): bool
    {
        return $this->isInterrupted;
    }

    /**
     * Determine the command type of the node.
     *
     * @return void
     */
    protected function determineCommand(): void
    {
        if ($this->source === '') {
            $this->isInterrupted = true;

            return;
        }

        $this->command = mb_substr($this->source, 0, 1);
    }

    /**
     * Determine if the current node source is an empty line.
     *
     * @return void
     */
    protected function determineEmpty(): void
    {
        // Compare against '' rather than using empty(), which would also treat "0" as empty.
        $this->isEmpty = trim($this->source) === '';
    }

    /**
     * Determine if the current node source is a comment.
     *
     * @return void
     */
    protected function determineComment(): void
    {
        if (starts_with($this->source, '//')) {
            $this->isInterrupted = true;
        } elseif (starts_with($this->source, '#')) {
            log_warning('Using the # symbol for comments is deprecated');
            $this->isInterrupted = true;
        } elseif (starts_with($this->source, '/*')) {
            if (ends_with($this->source, '*/')) {
                return;
            }
            $this->isComment = true;
        } elseif (ends_with($this->source, '*/')) {
            $this->isComment = false;
        }
    }

    /**
     * Determine the value of the node.
     *
     * @return void
     */
    protected function determineValue(): void
    {
        $this->value = trim(mb_substr($this->source, 1));
    }

    /**
     * Enable the UTF-8 mode.
     *
     * @param bool $allowUtf8 True or false.
     *
     * @return void
     */
    public function setAllowUtf8(bool $allowUtf8): void
    {
        $this->allowUtf8 = $allowUtf8;
    }

    /**
     * Check the syntax of the node.
     *
     * @return string An error message, or an empty string if the syntax is valid.
     */
    public function checkSyntax(): string
    {
        if (starts_with($this->source, '!')) {
            # ! Definition
            #   - Must be formatted like this:
            #     ! type name = value
            #     OR
            #     ! type = value
            #   - Type options are NOT enforceable, for future compatibility; if RiveScript
            #     encounters a new type that it can't handle, it can safely warn and skip it.
            if ($this->matchesPattern("/^.+(?:\s+.+|)\s*=\s*.+?$/", $this->source) === false) {
                return "Invalid format for !Definition line: must be '! type name = value' OR '! type = value'";
            }
        } elseif (starts_with($this->source, '>')) {
            # > Label
            #   - The "begin" label must have only one argument ("begin")
            #   - "topic" labels must be lowercase but can inherit other topics ([A-Za-z0-9_\s])
            #   - "object" labels follow the same rules as "topic" labels, but don't need to be lowercase
            if ($this->matchesPattern("/^begin/", $this->value) === true
                && $this->matchesPattern("/^begin$/", $this->value) === false) {
                return "The 'begin' label takes no additional arguments, should be verbatim '> begin'";
            } elseif ($this->matchesPattern("/^topic/", $this->value) === true
                && $this->matchesPattern("/[^a-z0-9_\-\s]/", $this->value) === true) {
                return "Topics should be lowercased and contain only numbers and letters!";
            } elseif ($this->matchesPattern("/^object/", $this->value) === true
                && $this->matchesPattern("/[^a-z0-9_\-\s]/", $this->value) === true) {
                return "Objects can only contain numbers and lowercase letters!";
            }
        } elseif (starts_with($this->source, '+') || starts_with($this->source, '%')
            || starts_with($this->source, '@')) {
            # + Trigger, % Previous, @ Redirect
            #   This one is strict. The triggers are to be run through Perl's regular expression
            #   engine. Therefore, it should be acceptable by the regexp engine.
            #   - Entirely lowercase
            #   - No symbols except: ( | ) [ ] * _ # @ { } < > =
            #   - All brackets should be matched

            if ($this->allowUtf8 === true) {
                # Four backslashes in a double-quoted PHP string yield a literal backslash
                # inside the character class, so the check matches what the message says.
                if ($this->matchesPattern("/[A-Z\\\\.]/", $this->value) === true) {
                    return "Triggers can't contain uppercase letters, backslashes or dots in UTF-8 mode.";
                }
            } elseif ($this->matchesPattern("/[^a-z0-9(\|)\[\]*_#\@{}<>=\s]/", $this->value) === true) {
                return "Triggers may only contain lowercase letters, numbers, and these symbols: ( | ) [ ] * _ # @ { } < > =";
            }

            $parens = 0; # Open parenthesis
            $square = 0; # Open square brackets
            $curly = 0; # Open curly brackets
            $chevron = 0; # Open angled brackets
            $len = strlen($this->value);

            for ($i = 0; $i < $len; $i++) {
                $chr = $this->value[$i];

                # Count brackets.
                if ($chr === '(') {
                    $parens++;
                }
                if ($chr === ')') {
                    $parens--;
                }
                if ($chr === '[') {
                    $square++;
                }
                if ($chr === ']') {
                    $square--;
                }
                if ($chr === '{') {
                    $curly++;
                }
                if ($chr === '}') {
                    $curly--;
                }
                if ($chr === '<') {
                    $chevron++;
                }
                if ($chr === '>') {
                    $chevron--;
                }
            }

            if ($parens) {
                return "Unmatched " . ($parens > 0 ? "left" : "right") . " parenthesis bracket ()";
            }
            if ($square) {
                return "Unmatched " . ($square > 0 ? "left" : "right") . " square bracket []";
            }
            if ($curly) {
                return "Unmatched " . ($curly > 0 ? "left" : "right") . " curly bracket {}";
            }
            if ($chevron) {
                return "Unmatched " . ($chevron > 0 ? "left" : "right") . " angled bracket <>";
            }
        } elseif (starts_with($this->source, '-') || starts_with($this->source, '^')
            || starts_with($this->source, '/')) {
            # - Response, ^ Continue, / Comment
            # These commands take verbatim arguments, so their syntax is loose.
        } elseif (starts_with($this->source, '*') === true && $this->isComment() === false) {
            # * Condition
            #   Syntax for a conditional is as follows:
            #   * value symbol value => response
            if ($this->matchesPattern("/.+?\s(==|eq|!=|ne|<>|<|<=|>|>=)\s.+?=>.+?$/", $this->value) === false) {
                return "Invalid format for !Condition: should be like `* value symbol value => response`";
            }
        }

        return "";
    }

    /**
     * Check for patterns in a given string.
     *
     * @param string $regex  The pattern to detect.
     * @param string $string The string that could contain the pattern.
     *
     * @return bool
     */
    private function matchesPattern(string $regex = '', string $string = ''): bool
    {
        // preg_match() returns 1 as soon as the pattern matches once.
        return preg_match($regex, $string) === 1;
    }
}