Completed
Pull Request — master (#179)
by Jaap
08:55
created

DescriptionFactory::lex()   A

Complexity

Conditions 2
Paths 2

Size

Total Lines 41

Duplication

Lines 0
Ratio 0 %

Code Coverage

Tests 12
CRAP Score 2

Importance

Changes 0
Metric Value
dl 0
loc 41
ccs 12
cts 12
cp 1
rs 9.264
c 0
b 0
f 0
cc 2
nc 2
nop 1
crap 2
<?php

declare(strict_types=1);

/**
 * This file is part of phpDocumentor.
 *
 * For the full copyright and license information, please view the LICENSE
 * file that was distributed with this source code.
 *
 * @link      http://phpdoc.org
 */

namespace phpDocumentor\Reflection\DocBlock;

use phpDocumentor\Reflection\Types\Context as TypeContext;
use Webmozart\Assert\Assert;
use const PREG_SPLIT_DELIM_CAPTURE;
use function count;
use function explode;
use function implode;
use function ltrim;
use function min;
use function preg_split;
use function str_replace;
use function strlen;
use function strpos;
use function substr;
use function trim;

/**
 * Creates a new Description object given a body of text.
 *
 * Descriptions in phpDocumentor are somewhat complex entities as they can contain one or more tags inside their
 * body that can be replaced with a readable output. The replacing is done by passing a Formatter object to the
 * Description object's `render` method.
 *
 * In addition to the above, a Description supports two types of escape sequences:
 *
 * 1. `{@}` to escape the `@` character to prevent it from being interpreted as part of a tag, i.e. `{{@}link}`
 * 2. `{}` to escape the `}` character; this can be used if you want to use the `}` character in the description
 *    of an inline tag.
 *
 * If a body consists of multiple lines then this factory will also remove any superfluous whitespace at the beginning
 * of each line while maintaining any indentation that is used. This will prevent formatting parsers from tripping
 * over unexpected spaces as can be observed with tag descriptions.
 */
class DescriptionFactory
{
    /** @var TagFactory */
    private $tagFactory;

    /**
     * Initializes this factory with the means to construct (inline) tags.
     */
    public function __construct(TagFactory $tagFactory)
    {
        $this->tagFactory = $tagFactory;
    }

    /**
     * Returns the parsed text of this description.
     */
    public function create(string $contents, ?TypeContext $context = null) : Description
    {
        $tokens   = $this->lex($contents);
        $count    = count($tokens);
        $tagCount = 0;
        $tags     = [];

        for ($i = 1; $i < $count; $i += 2) {
            $tag = $this->tagFactory->create($tokens[$i], $context);
            if ($tag !== null) {
                $tags[] = $tag;
            }
            $tokens[$i] = '%' . ++$tagCount . '$s';
        }

        //In order to allow "literal" inline tags, the otherwise invalid
        //sequence "{@}" is changed to "@", and "{}" is changed to "}".
        //"%" is escaped to "%%" because of vsprintf.
        //See unit tests for examples.
        for ($i = 0; $i < $count; $i += 2) {
            $tokens[$i] = str_replace(['{@}', '{}', '%'], ['@', '}', '%%'], $tokens[$i]);
        }

        return new Description(implode('', $tokens), $tags);
    }

    /**
     * Strips superfluous whitespace from the contents and splits the description into a series of tokens.
     *
     * @return string[] A series of tokens of which the description text is composed.
     */
    private function lex(string $contents) : array
    {
        $contents = $this->removeSuperfluousStartingWhitespace($contents);

        // performance optimization; if there is no inline tag, don't bother splitting it up.
        if (strpos($contents, '{@') === false) {
            return [$contents];
        }

        $parts = preg_split(
            '/\{
                # "{@}" is not a valid inline tag. This ensures that we do not treat it as one, but treat it literally.
                (?!@\})
                # We want to capture the whole tag line, but without the inline tag delimiters.
                (\@
                    # Match everything up to the next delimiter.
                    [^{}]*
                    # Nested inline tag content should not be captured, or it will appear in the result separately.
                    (?:
                        # Match nested inline tags.
                        (?:
                            # Because we did not catch the tag delimiters earlier, we must be explicit with them here.
                            # Notice that this also matches "{}", as a way to later introduce it as an escape sequence.
                            \{(?1)?\}
                            |
                            # Make sure we match hanging "{".
                            \{
                        )
                        # Match content after the nested inline tag.
                        [^{}]*
                    )* # If there are more inline tags, match them as well. We use "*" since there may not be any
                       # nested inline tags.
                )
            \}/Sux',
            $contents,
            0,
            PREG_SPLIT_DELIM_CAPTURE
        );
        Assert::isArray($parts);
        return $parts;
    }

    /**
     * Removes the superfluous whitespace from a multi-line description.
     *
     * When a description has more than one line then it can happen that the second and subsequent lines have an
     * additional indentation. This is commonly seen with tags like this:
     *
     *     {@}since 1.1.0 This is an example
     *         description where we have an
     *         indentation in the second and
     *         subsequent lines.
     *
     * If we do not normalize the indentation then we have superfluous whitespace on the second and subsequent
     * lines and this may cause rendering issues when, for example, using a Markdown converter.
     */
    private function removeSuperfluousStartingWhitespace(string $contents) : string
    {
        $lines = explode("\n", $contents);

        // if there is only one line then we don't have lines with superfluous whitespace and
        // can use the contents as-is
        if (count($lines) <= 1) {
            return $contents;
        }

        // determine how many whitespace characters need to be stripped
        $startingSpaceCount = 9999999;
        for ($i = 1; $i < count($lines); ++$i) {
Performance Best Practice
It seems like you are calling the size function count() as part of the test condition. You might want to compute the size beforehand, and not on each iteration.

If the size of the collection does not change during the iteration, it is generally a good practice to compute it beforehand, and not on each iteration:

for ($i=0; $i<count($array); $i++) { // calls count() on each iteration
}

// Better
for ($i=0, $c=count($array); $i<$c; $i++) { // calls count() just once
}
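Applied to the loop flagged here, the suggestion could look like the following sketch. It is only an illustration: `minimumIndentation` is a hypothetical standalone function that does not exist in DescriptionFactory, and it mirrors the flagged loop with the size computed once in the loop initializer.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper (not part of DescriptionFactory): finds the smallest
 * indentation among all non-blank lines after the first one.
 *
 * @param string[] $lines
 */
function minimumIndentation(array $lines) : int
{
    $startingSpaceCount = 9999999;

    // count() is evaluated once, in the loop initializer, instead of on every iteration
    for ($i = 1, $lineCount = count($lines); $i < $lineCount; ++$i) {
        // blank lines carry no indentation information
        if (trim($lines[$i]) === '') {
            continue;
        }

        $startingSpaceCount = min($startingSpaceCount, strlen($lines[$i]) - strlen(ltrim($lines[$i])));
    }

    return $startingSpaceCount;
}
```

Because `$lines` is not modified inside the loop, hoisting the call does not change the result.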
            // lines with no length do not count as they are not indented at all
            if (strlen(trim($lines[$i])) === 0) {
                continue;
            }

            // determine the number of prefixing spaces by checking the difference in line length before and after
            // an ltrim
            $startingSpaceCount = min($startingSpaceCount, strlen($lines[$i]) - strlen(ltrim($lines[$i])));
        }

        // strip the number of spaces from each line
        if ($startingSpaceCount > 0) {
            for ($i = 1; $i < count($lines); ++$i) {
Performance Best Practice
count() is called as part of the test condition of this loop as well. Since the size of $lines does not change during the iteration, compute it once before the loop rather than on each iteration.
                $lines[$i] = substr($lines[$i], $startingSpaceCount);
            }
        }

        return implode("\n", $lines);
    }
}
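To make the token layout that `create()` relies on more concrete, here is a minimal standalone sketch. It assumes a simplified pattern that handles neither nested inline tags nor a hanging `{`, and it keeps the raw tag text where the real factory would build Tag objects through the TagFactory. The function name `sketchTemplate` is hypothetical and not part of phpDocumentor.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: turns a description into a vsprintf template plus the raw inline tag bodies.
 *
 * @return array{0: string, 1: string[]}
 */
function sketchTemplate(string $contents) : array
{
    // Simplified pattern (assumption): no nested tags; "{@}" is skipped by the negative lookahead.
    $tokens = preg_split('/\{(?!@\})(@[^{}]*)\}/', $contents, -1, PREG_SPLIT_DELIM_CAPTURE);
    if ($tokens === false) {
        return [$contents, []];
    }

    $tags = [];
    foreach ($tokens as $i => $token) {
        if ($i % 2 === 1) {
            // Odd indexes hold the captured tag bodies; swap each for a numbered placeholder.
            $tags[]     = $token;
            $tokens[$i] = '%' . count($tags) . '$s';
            continue;
        }

        // Even indexes hold plain text: resolve the escapes and protect "%" from vsprintf.
        $tokens[$i] = str_replace(['{@}', '{}', '%'], ['@', '}', '%%'], $token);
    }

    return [implode('', $tokens), $tags];
}

[$template, $tags] = sketchTemplate('See {@link Foo} for 100% of {@}details');

echo $template, "\n";                      // See %1$s for 100%% of @details
echo vsprintf($template, ['<Foo>']), "\n"; // See <Foo> for 100% of @details
```

The split with `PREG_SPLIT_DELIM_CAPTURE` is what puts plain text at even indexes and tag bodies at odd indexes, which is why the two loops in `create()` can step through the tokens by two.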