DescriptionFactory::lex() - Code Metrics - Inspection of "Cleanup" - phpDocumentor/ReflectionDocBlock

Completed

Pull Request — master (#179)

by Jaap

created 2019-12-13 14:32 UTC

DescriptionFactory::lex() A

↳ Parent: DescriptionFactory

Complexity

Conditions	2
Paths	2

Size

Total Lines

Duplication

Lines	0
Ratio	0 %

Code Coverage

Tests	12
CRAP Score	2

Importance

Changes

Metric	Value
dl	0
loc	41
ccs	12
cts	12
cp	1
rs	9.264
c	0
b	0
f	0
cc	2
nc	2
nop	1
crap	2

<?php

declare(strict_types=1);

/**
 * This file is part of phpDocumentor.
 *
 * For the full copyright and license information, please view the LICENSE
 * file that was distributed with this source code.
 *
 * @link      http://phpdoc.org
 */

namespace phpDocumentor\Reflection\DocBlock;

use phpDocumentor\Reflection\Types\Context as TypeContext;
use Webmozart\Assert\Assert;
use const PREG_SPLIT_DELIM_CAPTURE;
use function count;
use function explode;
use function implode;
use function ltrim;
use function min;
use function preg_split;
use function str_replace;
use function strlen;
use function strpos;
use function substr;
use function trim;

/**
 * Creates a new Description object given a body of text.
 *
 * Descriptions in phpDocumentor are somewhat complex entities as they can contain one or more tags inside their
 * body that can be replaced with a readable output. The replacing is done by passing a Formatter object to the
 * Description object's `render` method.
 *
 * In addition to the above, a Description supports two types of escape sequences:
 *
 * 1. `{@}` to escape the `@` character to prevent it from being interpreted as part of a tag, e.g. `{{@}link}`
 * 2. `{}` to escape the `}` character; this can be used if you want to use the `}` character in the description
 *    of an inline tag.
 *
 * If a body consists of multiple lines then this factory will also remove any superfluous whitespace at the beginning
 * of each line while maintaining any indentation that is used. This will prevent formatting parsers from tripping
 * over unexpected spaces as can be observed with tag descriptions.
 */
class DescriptionFactory
{
    /** @var TagFactory */
    private $tagFactory;

    /**
     * Initializes this factory with the means to construct (inline) tags.
     */
    public function __construct(TagFactory $tagFactory)
    {
        $this->tagFactory = $tagFactory;
    }

    /**
     * Returns the parsed text of this description.
     */
    public function create(string $contents, ?TypeContext $context = null) : Description
    {
        $tokens   = $this->lex($contents);
        $count    = count($tokens);
        $tagCount = 0;
        $tags     = [];

        for ($i = 1; $i < $count; $i += 2) {
            $tag = $this->tagFactory->create($tokens[$i], $context);
            if ($tag !== null) {
                $tags[] = $tag;
            }
            $tokens[$i] = '%' . ++$tagCount . '$s';
        }

        //In order to allow "literal" inline tags, the otherwise invalid
        //sequence "{@}" is changed to "@", and "{}" is changed to "}".
        //"%" is escaped to "%%" because of vsprintf.
        //See unit tests for examples.
        for ($i = 0; $i < $count; $i += 2) {
            $tokens[$i] = str_replace(['{@}', '{}', '%'], ['@', '}', '%%'], $tokens[$i]);
        }

        return new Description(implode('', $tokens), $tags);
    }

    /**
     * Strips superfluous whitespace from the contents and splits the description into a series of tokens.
     *
     * @return string[] A series of tokens of which the description text is composed.
     */
    private function lex(string $contents) : array
    {
        $contents = $this->removeSuperfluousStartingWhitespace($contents);

        // performance optimization; if there is no inline tag, don't bother splitting it up.
        if (strpos($contents, '{@') === false) {
            return [$contents];
        }

        $parts = preg_split(
            '/\{
                # "{@}" is not a valid inline tag. This ensures that we do not treat it as one, but treat it literally.
                (?!@\})
                # We want to capture the whole tag line, but without the inline tag delimiters.
                (\@
                    # Match everything up to the next delimiter.
                    [^{}]*
                    # Nested inline tag content should not be captured, or it will appear in the result separately.
                    (?:
                        # Match nested inline tags.
                        (?:
                            # Because we did not catch the tag delimiters earlier, we must be explicit with them here.
                            # Notice that this also matches "{}", as a way to later introduce it as an escape sequence.
                            \{(?1)?\}
                            |
                            # Make sure we match hanging "{".
                            \{
                        )
                        # Match content after the nested inline tag.
                        [^{}]*
                    )* # If there are more inline tags, match them as well. We use "*" since there may not be any
                       # nested inline tags.
                )
            \}/Sux',
            $contents,
            0,
            PREG_SPLIT_DELIM_CAPTURE
        );
        Assert::isArray($parts);
        return $parts;
    }

    /**
     * Removes the superfluous whitespace from a multi-line description.
     *
     * When a description has more than one line then it can happen that the second and subsequent lines have an
     * additional indentation. This is commonly in use with tags like this:
     *
     *     {@}since 1.1.0 This is an example
     *         description where we have an
     *         indentation in the second and
     *         subsequent lines.
     *
     * If we do not normalize the indentation then we have superfluous whitespace on the second and subsequent
     * lines and this may cause rendering issues when, for example, using a Markdown converter.
     */
    private function removeSuperfluousStartingWhitespace(string $contents) : string
    {
        $lines = explode("\n", $contents);

        // if there is only one line then we don't have lines with superfluous whitespace and
        // can use the contents as-is
        if (count($lines) <= 1) {
            return $contents;
        }

        // determine how many whitespace characters need to be stripped
        $startingSpaceCount = 9999999;
        for ($i = 1; $i < count($lines); ++$i) {
            // lines with no length do not count as they are not indented at all
            if (strlen(trim($lines[$i])) === 0) {
                continue;
            }

            // determine the number of prefixing spaces by checking the difference in line length before and after
            // an ltrim
            $startingSpaceCount = min($startingSpaceCount, strlen($lines[$i]) - strlen(ltrim($lines[$i])));
        }

        // strip the number of spaces from each line
        if ($startingSpaceCount > 0) {
            for ($i = 1; $i < count($lines); ++$i) {
                $lines[$i] = substr($lines[$i], $startingSpaceCount);
            }
        }

        return implode("\n", $lines);
    }
}
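
The token handling in create() and lex() can be hard to picture from the code alone. The following is a minimal sketch, not the library's implementation: `sketchTemplate()` is a hypothetical stand-alone helper that uses a compacted copy of the pattern above (whitespace and comments removed) and keeps the raw tag strings instead of passing them to a TagFactory. It shows how the split produces alternating text and tag tokens, and how the escape sequences and `%` are rewritten for vsprintf().

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper for illustration only (not part of ReflectionDocBlock).
 * Splits a description into a vsprintf() template and the raw inline tag strings.
 *
 * @return array{0: string, 1: string[]}
 */
function sketchTemplate(string $contents): array
{
    // Compacted copy of the pattern used in lex(); group 1 captures the tag without its braces.
    $pattern = '/\{(?!@\})(\@[^{}]*(?:(?:\{(?1)?\}|\{)[^{}]*)*)\}/u';

    $tokens = preg_split($pattern, $contents, 0, PREG_SPLIT_DELIM_CAPTURE);
    if ($tokens === false) {
        // Assumption for this sketch: on a regex error, fall back to the unmodified text.
        return [$contents, []];
    }

    $tags = [];
    foreach ($tokens as $i => $token) {
        if ($i % 2 === 1) {
            // Odd positions hold captured inline tags; replace each with a numbered placeholder.
            $tags[]     = $token;
            $tokens[$i] = '%' . count($tags) . '$s';
        } else {
            // Even positions hold plain text; unescape "{@}" and "{}" and escape "%" for vsprintf().
            $tokens[$i] = str_replace(['{@}', '{}', '%'], ['@', '}', '%%'], $token);
        }
    }

    return [implode('', $tokens), $tags];
}

[$template, $tags] = sketchTemplate('100% sure, see {@link http://example.com} or write {@}link');
// $template is '100%% sure, see %1$s or write @link'
// $tags is ['@link http://example.com']
echo vsprintf($template, ['<link>']), "\n";
```

Unlike create(), this sketch numbers placeholders by the tags it collected and never drops a tag, so it only approximates the behaviour when the TagFactory returns null.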


Issue: Performance / Best Practice (introduced 2016-08-23 06:20 UTC)
Location: first loop in removeSuperfluousStartingWhitespace(), `for ($i = 1; $i < count($lines); ++$i)`

It seems like you are calling the size function `count()` as part of the test condition. You might want to compute the size beforehand, and not on each iteration.

If the size of the collection does not change during the iteration, it is generally a good practice to compute it beforehand, and not on each iteration:

for ($i=0; $i<count($array); $i++) { // calls count() on each iteration
}

// Better
for ($i=0, $c=count($array); $i<$c; $i++) { // calls count() just once
}

Issue: Performance / Best Practice (introduced 2016-08-23 06:20 UTC)
Location: second loop in removeSuperfluousStartingWhitespace(), `for ($i = 1; $i < count($lines); ++$i)`

The size function `count()` is called as part of the test condition of this loop. Since the size of `$lines` does not change during the iteration, it is generally a good practice to compute it once before the loop, and not on each iteration.
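
One way to address the `count()` findings is to compute the line count once and reuse it in both loops. The following is a sketch under that assumption, not the change made in the pull request: `normalizeIndentationSketch()` is a hypothetical stand-alone function that mirrors removeSuperfluousStartingWhitespace() and otherwise keeps its logic unchanged.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical stand-alone variant of removeSuperfluousStartingWhitespace()
 * that calls count() once instead of on every loop iteration.
 */
function normalizeIndentationSketch(string $contents): string
{
    $lines     = explode("\n", $contents);
    $lineCount = count($lines); // computed once; $lines never changes size below

    // A single line has no continuation lines to normalize.
    if ($lineCount <= 1) {
        return $contents;
    }

    // Find the smallest indentation among the non-blank continuation lines.
    $startingSpaceCount = 9999999;
    for ($i = 1; $i < $lineCount; ++$i) {
        if (strlen(trim($lines[$i])) === 0) {
            continue;
        }

        $startingSpaceCount = min($startingSpaceCount, strlen($lines[$i]) - strlen(ltrim($lines[$i])));
    }

    // Strip that many leading characters from every continuation line.
    if ($startingSpaceCount > 0) {
        for ($i = 1; $i < $lineCount; ++$i) {
            $lines[$i] = substr($lines[$i], $startingSpaceCount);
        }
    }

    return implode("\n", $lines);
}

$input  = "@since 1.1.0 This is an example\n    description where we have an\n      indentation.";
$output = normalizeIndentationSketch($input);
echo $output, "\n";
```

In this example the smallest indentation among the continuation lines is four spaces, so four characters are removed from each of them and the relative indentation of the last line (two extra spaces) is preserved.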
