Completed
Push: renovate/css-loader-5.x (70942c...c43e3d) by unknown
124:26 queued 114:47 created

KeepAChangelogParser::parse()   F

Complexity
  Conditions: 26
  Paths: 124

Size
  Total Lines: 156

Duplication
  Lines: 0
  Ratio: 0 %

Importance
  Changes: 0

Metric  Value
cc      26
nc      124
nop     1
dl      0
loc     156
rs      3.1732
c       0
b       0
f       0

How to fix

Long Method

Small methods make your code easier to understand, in particular if combined with a good name. Besides, if your method is small, finding a good name is usually much easier.

For example, if you find yourself adding comments to a method's body, that is usually a good sign that the commented part should be extracted into a new method, with the comment as a starting point for the new method's name.
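As a minimal sketch of that advice applied to this method: near its end, parse() contains a block introduced by the comment "Append any unused links to the epilogue." The comment can become the name of an extracted helper. The function name `append_unused_links`, its signature, and its standalone (non-method) form are assumptions made for illustration; they are not part of the original class.

```php
<?php
/**
 * Append any unused link definitions to an epilogue string.
 *
 * Hypothetical helper: the name and signature are illustrative only.
 *
 * @param string   $epilogue  Existing epilogue text.
 * @param string[] $links     Link targets keyed by link ID.
 * @param bool[]   $usedlinks Whether each link ID was consumed by a heading.
 * @return string Epilogue with one "[id]: target" line per unused link.
 */
function append_unused_links( string $epilogue, array $links, array $usedlinks ): string {
	foreach ( $links as $id => $target ) {
		// A link counts as unused when its flag is missing or false.
		if ( empty( $usedlinks[ $id ] ) ) {
			$epilogue .= "\n[$id]: $target";
		}
	}
	return $epilogue;
}
```

Inside parse(), the extracted block would then shrink to a single call that passes the current epilogue, `$links`, and `$usedlinks`, and the comment would no longer be needed.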

Commonly applied refactorings include:

<?php // phpcs:ignore WordPress.Files.FileName.NotHyphenatedLowercase
/**
 * Parser for a keepachangelog.com format changelog.
 *
 * @package automattic/jetpack-changelogger
 */

// phpcs:disable WordPress.WP.AlternativeFunctions, WordPress.NamingConventions.ValidFunctionName.MethodNameInvalid, WordPress.NamingConventions.ValidVariableName

namespace Automattic\Jetpack\Changelog;

use DateTime;
use DateTimeZone;
use InvalidArgumentException;

/**
 * Parser for a keepachangelog.com format changelog.
 */
class KeepAChangelogParser extends Parser {

	/**
	 * Bullet for changes.
	 *
	 * @var string
	 */
	private $bullet = '-';

	/**
	 * Output date format.
	 *
	 * @var string
	 */
	private $dateFormat = 'Y-m-d';

	/**
	 * If true, try to parse authors from entries.
	 *
	 * @var bool
	 */
	private $parseAuthors = false;

	/**
	 * Constructor.
	 *
	 * @param array $config Configuration.
	 *  - bullet: (string) Bullet for changes. Default '-'.
	 *  - dateFormat: (string) Date format to use in output. Default 'Y-m-d'.
	 *  - parseAuthors: (bool) Try to parse authors out of change entries. Default false.
	 */
	public function __construct( array $config = array() ) {
		if ( ! empty( $config['bullet'] ) ) {
			$this->bullet = $config['bullet'];
		}
		if ( ! empty( $config['dateFormat'] ) ) {
			$this->dateFormat = $config['dateFormat'];
		}
		$this->parseAuthors = ! empty( $config['parseAuthors'] );
	}

	/**
	 * Test if there's a link at the end of a string.
	 *
	 * @param string $s String.
	 * @return array|null Match data.
	 */
	private function endsInLink( $s ) {
		if ( preg_match( '/^\[([^]]+)\]: *(\S+(?: +(?:"(?:[^"]|\\\\.)*"|\'(?:[^\']|\\\\.)*\'|\((?:[^()]|\\\\.)*\)))?)\s*\Z/m', $s, $m ) ) {
			return array(
				'match' => $m[0],
				'id'    => $m[1],
				'link'  => $m[2],
			);
		}
		return null;
	}

	/**
	 * Split a string in two at the first occurrence of a substring.
	 *
	 * @param string   $haystack String to split.
	 * @param string[] ...$needles Strings to split on. Earliest match in $haystack wins.
	 * @return string[] Two elements: The part before $needles and the part after, both trimmed.
	 */
Issue (Documentation): Should the type for parameter $needles not be string[][]?

This check looks for @param annotations where the type inferred by our type inference engine differs from the declared type.

It makes a suggestion as to what type it considers more descriptive.

Most often this is a case of a parameter that can be null in addition to its declared types.
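For a variadic parameter, phpDocumentor-style annotations conventionally give the type of each individual argument, written `string ...$needles`, rather than the array type that the function body sees. The sketch below assumes that convention; `earliest_match` is a hypothetical standalone function loosely modeled on the search loop in split(), not code from the class.

```php
<?php
/**
 * Find the earliest offset at which any of the needles occurs.
 *
 * @param string $haystack   String to search.
 * @param string ...$needles Each argument is one string; inside the function, $needles is an array of strings.
 * @return int|false Offset of the earliest match, or false if no needle matches.
 */
function earliest_match( string $haystack, string ...$needles ) {
	$best = false;
	foreach ( $needles as $needle ) {
		$pos = strpos( $haystack, $needle );
		// Keep the smallest offset seen so far.
		if ( false !== $pos && ( false === $best || $pos < $best ) ) {
			$best = $pos;
		}
	}
	return $best;
}
```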
	private function split( $haystack, ...$needles ) {
		$i = false;
		foreach ( $needles as $needle ) {
			$j = strpos( $haystack, $needle );
			if ( false !== $j ) {
				$i = false === $i ? $j : min( $i, $j );
			}
		}
		if ( false === $i ) {
			return array( trim( $haystack, "\n" ), '' );
		}
		return array(
			trim( substr( $haystack, 0, $i ), "\n" ),
			trim( substr( $haystack, $i ), "\n" ),
		);
	}

	/**
	 * Parse changelog data into a Changelog object.
	 *
	 * This does not handle all markdown! In particular, it makes the following assumptions:
	 *
	 * - All level-2 ATX headings with no indentation are changelog entry headings.
	 * - Changelog entry headings consist of either a bare version number or a version number as
	 *   link text with no destination or title, followed by a spaced ASCII hyphen, followed by a timestamp.
	 * - All level-3 ATX headings with no indentation are changelog entry subheadings.
	 * - All change entries are formatted as lists starting with the configured bullet followed by a space,
	 *   and do not make use of lazy continuation. Indentation of continued
	 *   lines is equal to the length of the bullet plus the space.
	 * - All link definitions come at the end of the document, with no intervening blank lines or
	 *   other content, and are not indented and do not contain newlines. Link definitions for
	 *   changelog entries have no titles.
	 *
	 * @param string $changelog Changelog contents.
	 * @return Changelog
	 * @throws InvalidArgumentException If the changelog data cannot be parsed.
	 */
	public function parse( $changelog ) {
		$ret = new Changelog();

		$bullet = $this->bullet . ' ';
		$len    = strlen( $bullet );
		$indent = str_repeat( ' ', $len );

		// Fix newlines and expand tabs.
		$changelog = strtr( $changelog, array( "\r\n" => "\n" ) );
		$changelog = strtr( $changelog, array( "\r" => "\n" ) );
		while ( strpos( $changelog, "\t" ) !== false ) {
			$changelog = preg_replace_callback(
				'/^([^\t\n]*)\t/m',
				function ( $m ) {
					return $m[1] . str_repeat( ' ', 4 - ( mb_strlen( $m[1] ) % 4 ) );
				},
				$changelog
			);
		}

		// Extract link definitions.
		$links     = array();
		$usedlinks = array();
		while ( ( $m = $this->endsInLink( $changelog ) ) ) { // phpcs:ignore WordPress.CodeAnalysis.AssignmentInCondition.FoundInWhileCondition
			$links[ $m['id'] ]     = $m['link'];
			$usedlinks[ $m['id'] ] = false;
			$changelog             = (string) substr( $changelog, 0, -strlen( $m['match'] ) );
		}
		$links = array_reverse( $links );

		// Everything up to the first level-2 ATX heading is the prologue.
		list( $prologue, $changelog ) = $this->split( "\n$changelog", "\n## " );
		$ret->setPrologue( $prologue );

		// Entries make up the rest of the document.
		$entries = array();
		while ( '' !== $changelog ) {
			// Extract the first entry from the changelog file, then extract the heading from it.
			list( $content, $changelog ) = $this->split( $changelog, "\n## " );
			list( $heading, $content )   = $this->split( $content, "\n" );

			// Parse the heading and create a ChangelogEntry for it.
			if ( ! preg_match( '/^## +(\[?[^] ]+\]?) - (.+?) *$/', $heading, $m ) ) {
				throw new InvalidArgumentException( "Invalid heading: $heading" );
			}
			$link      = null;
			$version   = $m[1];
			$timestamp = $m[2];
			if ( '[' === $version[0] && ']' === substr( $version, -1 ) ) {
				$version = substr( $version, 1, -1 );
				if ( ! isset( $links[ $version ] ) ) {
					throw new InvalidArgumentException( "Heading seems to have a linked version, but link was not found: $heading" );
				}
				$link                  = $links[ $version ];
				$usedlinks[ $version ] = true;
			}
			try {
				$timestamp = new DateTime( $timestamp, new DateTimeZone( 'UTC' ) );
			} catch ( \Exception $ex ) {
				throw new InvalidArgumentException( "Heading has an invalid timestamp: $heading", 0, $ex );
			}
			if ( strtotime( $m[2], 0 ) !== strtotime( $m[2], 1000000000 ) ) {
				throw new InvalidArgumentException( "Heading has a relative timestamp: $heading" );
			}
			$entry     = $this->newChangelogEntry(
				$version,
				array(
					'link'      => $link,
					'timestamp' => $timestamp,
				)
			);
			$entries[] = $entry;

			// Extract the prologue, if any.
			list( $prologue, $content ) = $this->split( "\n$content", "\n### ", "\n$bullet" );
			$entry->setPrologue( $prologue );

			if ( '' === $content ) {
				// Huh, no changes.
				continue;
			}

			// Inject an empty heading if necessary so the change parsing can be more straightforward.
			if ( '#' !== $content[0] ) {
				$content = "### \n$content";
			}

			// Now parse all the subheadings and changes.
			while ( '' !== $content ) {
				list( $section, $content )    = $this->split( $content, "\n### " );
				list( $subheading, $section ) = $this->split( $section, "\n" );
				$subheading                   = trim( substr( $subheading, 4 ) );
				$changes                      = array();
				$cur                          = '';
				$section                      = explode( "\n", $section );
				while ( $section ) {
Issue (Bug, Best Practice): The expression $section of type array is implicitly converted to a boolean; are you sure this is intended? If so, consider using ! empty($expr) instead to make it clear that you intend to check for an array without elements.

This check marks implicit conversions of arrays to boolean values in a comparison. While in PHP an empty array is considered to be equal (but not identical) to false, this is not always apparent.

Consider making the comparison explicit by using empty(..) or ! empty(...) instead.
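A minimal sketch of the explicit form this check suggests. `drain_lines` is a hypothetical function that mirrors the shape of the flagged loop (consume an array with array_shift() until it is empty); it is not part of the parser.

```php
<?php
/**
 * Consume an array of lines one at a time until none are left.
 *
 * @param string[] $lines Lines to consume.
 * @return string[] The lines in the order they were consumed.
 */
function drain_lines( array $lines ): array {
	$seen = array();
	// Explicit emptiness check instead of relying on array-to-bool conversion.
	while ( ! empty( $lines ) ) {
		$seen[] = array_shift( $lines );
	}
	return $seen;
}
```

The behavior is the same as `while ( $lines )`: an array that still holds elements, even empty strings, is not empty, so blank lines are consumed like any other.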
					$line   = array_shift( $section );
					$prefix = substr( $line, 0, $len );
					if ( $prefix === $bullet ) {
						$cur = trim( $cur );
						if ( '' !== $cur ) {
							$changes[] = $cur;
						}
						$cur = substr( $line, $len ) . "\n";
					} elseif ( $prefix === $indent ) {
						$cur .= substr( $line, $len ) . "\n";
					} elseif ( '' === $line ) {
						$cur .= "\n";
					} else {
						// If there are no more subsections and the rest of the lines don't contain
						// bullets, assume it's an epilogue. Otherwise, assume it's an error.
						$section = $line . "\n" . implode( "\n", $section );
						if ( '' === $content && strpos( $section, "\n$bullet" ) === false ) {
							$entry->setEpilogue( $section );
							break;
						} else {
							throw new InvalidArgumentException( "Malformatted changes list near \"$line\"" );
						}
					}
				}
				$cur = trim( $cur );
				if ( '' !== $cur ) {
					$changes[] = $cur;
				}
				foreach ( $changes as $change ) {
					$author = '';
					if ( $this->parseAuthors && preg_match( '/ \(([^()\n]+)\)$/', $change, $m ) ) {
						$author = $m[1];
						$change = substr( $change, 0, -strlen( $m[0] ) );
					}
					$entry->appendChange(
						$this->newChangeEntry(
							array(
								'subheading' => $subheading,
								'author'     => $author,
								'content'    => $change,
								'timestamp'  => $timestamp,
							)
						)
					);
				}
			}
		}
		$ret->setEntries( $entries );

		// Append any unused links to the epilogue.
		$epilogue = $ret->getEpilogue();
		foreach ( $links as $id => $content ) {
			if ( empty( $usedlinks[ $id ] ) ) {
				$epilogue .= "\n[$id]: $content";
			}
		}
		$ret->setEpilogue( $epilogue );

		return $ret;
	}

	/**
	 * Write a Changelog object to a string.
	 *
	 * @param Changelog $changelog Changelog object.
	 * @return string
	 */
	public function format( Changelog $changelog ) {
		$ret = '';

		$bullet = $this->bullet . ' ';
		$indent = str_repeat( ' ', strlen( $bullet ) );

		$prologue = trim( $changelog->getPrologue() );
		if ( '' !== $prologue ) {
			$ret .= "$prologue\n\n";
		}

		$links = array();
		foreach ( $changelog->getEntries() as $entry ) {
			$ret .= '## ';
			if ( $entry->getLink() !== null ) {
				$links[ $entry->getVersion() ] = $entry->getLink();
				$ret                          .= "[{$entry->getVersion()}]";
			} else {
				$ret .= $entry->getVersion();
			}
			$ret .= ' - ' . $entry->getTimestamp()->format( $this->dateFormat ) . "\n";

			$prologue = trim( $entry->getPrologue() );
			if ( '' !== $prologue ) {
				$ret .= "\n$prologue\n\n";
			}

			foreach ( $entry->getChangesBySubheading() as $heading => $changes ) {
				if ( '' !== $heading ) {
Issue (Unused Code, Bug): The strict comparison !== seems to always evaluate to true as the types of '' (string) and $heading (integer) can never be identical. Maybe you want to use a loose comparison != instead?
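One way this situation can arise in PHP: array keys that look like decimal integers are stored as int, so a foreach key is not always a string even when the array was built from strings. The sketch below assumes, without having seen getChangesBySubheading(), that subheadings are used as array keys; the sample data and the `heading_line` helper are hypothetical. Casting the key to string keeps a strict comparison against '' meaningful for both key types.

```php
<?php
// Hypothetical data: subheading => list of change texts.
$changesBySubheading = array(
	''      => array( 'change without a subheading' ),
	'Added' => array( 'new feature' ),
	'2024'  => array( 'change under a numeric-looking subheading' ),
);

// PHP stores the key '2024' as the integer 2024; the other keys stay strings.
$keyTypes = array();
foreach ( $changesBySubheading as $heading => $changes ) {
	$keyTypes[] = gettype( $heading );
}

/**
 * Build a level-3 heading line, or '' when the subheading is empty.
 *
 * @param string|int $heading Subheading as it comes out of an array key.
 * @return string
 */
function heading_line( $heading ): string {
	// Normalize int keys back to strings before the strict comparison.
	$heading = (string) $heading;
	return '' !== $heading ? "### $heading\n" : '';
}
```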
					$heading = "### $heading\n";
				} else {
					$heading = substr( $ret, -2 ) === "\n\n" ? '' : "\n";
				}
				foreach ( $changes as $change ) {
Issue (Bug): The expression $changes of type object<Automattic\Jetpac...Changelog\ChangeEntry>> is not guaranteed to be traversable. How about adding an additional type check?

There are different options of fixing this problem.

  1. If you want to be on the safe side, you can add an additional type-check:

    $collection = json_decode($data, true);
    if ( ! is_array($collection)) {
        throw new \RuntimeException('$collection must be an array.');
    }

    foreach ($collection as $item) { /** ... */ }

  2. If you are sure that the expression is traversable, you might want to add a doc comment cast to improve IDE auto-completion and static analysis:

    /** @var array $collection */
    $collection = json_decode($data, true);

    foreach ($collection as $item) { /** ... */ }

  3. Mark the issue as a false-positive.
					$text = trim( $change->getContent() );
					if ( '' !== $text ) {
						if ( $change->getAuthor() !== '' ) {
							$text .= " ({$change->getAuthor()})";
						}
						$ret    .= $heading . $bullet . str_replace( "\n", "\n$indent", $text ) . "\n";
						$heading = '';
					}
				}
				$ret .= "\n";
			}

			$epilogue = trim( $entry->getEpilogue() );
			if ( '' !== $epilogue ) {
				$ret = trim( $ret ) . "\n\n$epilogue\n";
			}
			$ret = trim( $ret ) . "\n\n";
		}

		$epilogue = trim( $changelog->getEpilogue() );
		if ( '' !== $epilogue ) {
			$ret .= "$epilogue\n";
		}

		$ret = trim( $ret ) . "\n";

		if ( $links ) {
Issue (Bug, Best Practice): The expression $links of type array is implicitly converted to a boolean; are you sure this is intended? If so, consider using ! empty($expr) instead to make it clear that you intend to check for an array without elements.

This check marks implicit conversions of arrays to boolean values in a comparison. While in PHP an empty array is considered to be equal (but not identical) to false, this is not always apparent.

Consider making the comparison explicit by using empty(..) or ! empty(...) instead.
			if ( ! $this->endsInLink( $ret ) ) {
				$ret .= "\n";
			}
			foreach ( $links as $k => $v ) {
				$ret .= "[$k]: $v\n";
			}
		}

		return $ret;
	}

}
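
The heading assumptions listed in the parse() docblock can be tried out in isolation. The sketch below reuses the heading regular expression and the relative-timestamp check from the listing in a hypothetical standalone function, `parse_entry_heading`. The function name, its return shape, and the sample headings are assumptions; link lookup and DateTime construction are deliberately left out, so a timestamp that strtotime() cannot parse at all is not rejected here.

```php
<?php
/**
 * Split a changelog entry heading into version and timestamp text.
 *
 * @param string $heading A level-2 heading line such as "## [1.2.3] - 2021-02-03".
 * @return array Keys: 'version' (string), 'linked' (bool), 'timestamp' (string).
 * @throws InvalidArgumentException If the heading is malformed or the timestamp is relative.
 */
function parse_entry_heading( string $heading ): array {
	// Same pattern as in the listing: version (optionally bracketed), spaced hyphen, timestamp.
	if ( ! preg_match( '/^## +(\[?[^] ]+\]?) - (.+?) *$/', $heading, $m ) ) {
		throw new InvalidArgumentException( "Invalid heading: $heading" );
	}

	$version = $m[1];
	$linked  = '[' === $version[0] && ']' === substr( $version, -1 );
	if ( $linked ) {
		// Strip the brackets to get the bare version, which doubles as the link ID.
		$version = substr( $version, 1, -1 );
	}

	// A relative timestamp ("yesterday") resolves differently against two base times.
	if ( strtotime( $m[2], 0 ) !== strtotime( $m[2], 1000000000 ) ) {
		throw new InvalidArgumentException( "Heading has a relative timestamp: $heading" );
	}

	return array(
		'version'   => $version,
		'linked'    => $linked,
		'timestamp' => $m[2],
	);
}
```

For example, `parse_entry_heading( '## [1.2.3] - 2021-02-03' )` reports version `1.2.3` as linked, while a heading whose timestamp is `yesterday` raises an InvalidArgumentException.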