Completed
Push — add/contact-form-tests ( 2304f9...9404d5 )
by
unknown
208:00 queued 197:03
created

KeepAChangelogParser::parse()   F

Complexity

Conditions 27
Paths 212

Size

Total Lines 162

Duplication

Lines 0
Ratio 0 %

Importance

Changes 0
Metric Value
cc 27
nc 212
nop 1
dl 0
loc 162
rs 2.5466
c 0
b 0
f 0

How to fix

Long Method

Small methods make your code easier to understand, in particular if combined with a good name. Besides, if your method is small, finding a good name is usually much easier.

For example, if you find yourself adding comments to a method's body, this is usually a good sign to extract the commented part to a new method, and use the comment as a starting point when coming up with a good name for this new method.

Commonly applied refactorings include Extract Method, the technique described above.
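A minimal sketch of that advice, assuming the step commented `// Fix newlines and expand tabs.` in `parse()` were moved into its own function. The name `normalize_whitespace` is hypothetical and does not exist in the reviewed class; the body mirrors the statements in the listing, and the comment becomes the starting point for the name.

```php
<?php
/**
 * Normalize line endings to "\n" and expand tabs to 4-column tab stops.
 *
 * Hypothetical helper: the name is illustrative only. The body mirrors the
 * "Fix newlines and expand tabs" step of parse().
 *
 * @param string $text Raw changelog text.
 * @return string Normalized text.
 */
function normalize_whitespace( $text ) {
	// Convert "\r\n" first, then any remaining lone "\r".
	$text = strtr( $text, array( "\r\n" => "\n" ) );
	$text = strtr( $text, array( "\r" => "\n" ) );

	// Each pass expands the first tab on every line; repeat until none remain.
	while ( strpos( $text, "\t" ) !== false ) {
		$text = preg_replace_callback(
			'/^([^\t\n]*)\t/m',
			function ( $m ) {
				return $m[1] . str_repeat( ' ', 4 - ( mb_strlen( $m[1] ) % 4 ) );
			},
			$text
		);
	}
	return $text;
}
```

With a helper like this, the corresponding lines in `parse()` would shrink to a single call, which removes one loop and one closure from the method's body.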

<?php // phpcs:ignore WordPress.Files.FileName.NotHyphenatedLowercase
/**
 * Parser for a keepachangelog.com format changelog.
 *
 * @package automattic/jetpack-changelogger
 */

// phpcs:disable WordPress.WP.AlternativeFunctions, WordPress.NamingConventions.ValidFunctionName.MethodNameInvalid, WordPress.NamingConventions.ValidVariableName

namespace Automattic\Jetpack\Changelog;

use DateTime;
use DateTimeZone;
use InvalidArgumentException;

/**
 * Parser for a keepachangelog.com format changelog.
 */
class KeepAChangelogParser extends Parser {

	/**
	 * Bullet for changes.
	 *
	 * @var string
	 */
	private $bullet = '-';

	/**
	 * Output date format.
	 *
	 * @var string
	 */
	private $dateFormat = 'Y-m-d';

	/**
	 * String used as the date for an unreleased version.
	 *
	 * @var string
	 */
	private $unreleased = 'unreleased';

	/**
	 * If true, try to parse authors from entries.
	 *
	 * @var bool
	 */
	private $parseAuthors = false;

	/**
	 * Constructor.
	 *
	 * @param array $config Configuration.
	 *  - bullet: (string) Bullet for changes. Default '-'.
	 *  - dateFormat: (string) Date format to use in output. Default 'Y-m-d'.
	 *  - parseAuthors: (bool) Try to parse authors out of change entries. Default false.
	 *  - unreleased: (string) String used as the date for an unreleased version.
	 */
	public function __construct( array $config = array() ) {
		if ( ! empty( $config['bullet'] ) ) {
			$this->bullet = $config['bullet'];
		}
		if ( ! empty( $config['dateFormat'] ) ) {
			$this->dateFormat = $config['dateFormat'];
		}
		$this->parseAuthors = ! empty( $config['parseAuthors'] );
		if ( ! empty( $config['unreleased'] ) ) {
			$this->unreleased = $config['unreleased'];
		}
	}

	/**
	 * Test if there's a link at the end of a string.
	 *
	 * @param string $s String.
	 * @return array|null Match data.
	 */
	private function endsInLink( $s ) {
		if ( preg_match( '/^\[([^]]+)\]: *(\S+(?: +(?:"(?:[^"]|\\\\.)*"|\'(?:[^\']|\\\\.)*\'|\((?:[^()]|\\\\.)*\)))?)\s*\Z/m', $s, $m ) ) {
			return array(
				'match' => $m[0],
				'id'    => $m[1],
				'link'  => $m[2],
			);
		}
		return null;
	}

	/**
	 * Split a string in two at the first occurrence of a substring.
	 *
	 * @param string   $haystack String to split.
	 * @param string[] ...$needles Strings to split on. Earliest match in $haystack wins.
	 * @return string[] Two elements: The part before $needles and the part after, both trimmed.
	 */
Issue (Documentation): Should the type for parameter $needles not be string[][]?

This check looks for @param annotations where the type inferred by our type inference engine differs from the declared type.

It makes a suggestion as to what type it considers more descriptive.

Most often this is a case of a parameter that can be null in addition to its declared types.
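For a variadic parameter, docblock conventions generally annotate the type of each individual argument, so `string ...$needles` may be closer to the intent here than either `string[]` or the suggested `string[][]`. The sketch below illustrates that annotation on a standalone function; the name `first_match_offset` is hypothetical and the body is a reduced version of the search loop in `split()`.

```php
<?php
/**
 * Find the earliest offset at which any of the needles occurs.
 *
 * Hypothetical helper, shown only to illustrate the variadic annotation.
 *
 * @param string $haystack   String to search.
 * @param string ...$needles Each argument is a single string; inside the
 *                           function, $needles is an array of strings.
 * @return int|false Earliest offset, or false if no needle occurs.
 */
function first_match_offset( $haystack, ...$needles ) {
	$i = false;
	foreach ( $needles as $needle ) {
		$j = strpos( $haystack, $needle );
		if ( false !== $j ) {
			// Keep the smallest offset seen so far.
			$i = false === $i ? $j : min( $i, $j );
		}
	}
	return $i;
}
```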
	private function split( $haystack, ...$needles ) {
		$i = false;
		foreach ( $needles as $needle ) {
			$j = strpos( $haystack, $needle );
			if ( false !== $j ) {
				$i = false === $i ? $j : min( $i, $j );
			}
		}
		if ( false === $i ) {
			return array( trim( $haystack, "\n" ), '' );
		}
		return array(
			trim( substr( $haystack, 0, $i ), "\n" ),
			trim( substr( $haystack, $i ), "\n" ),
		);
	}

	/**
	 * Parse changelog data into a Changelog object.
	 *
	 * This does not handle all markdown! In particular, it makes the following assumptions:
	 *
	 * - All level-2 ATX headings with no indentation are changelog entry headings.
	 * - Changelog entry headings consist of either a bare version number or a version number as
	 *   link text with no destination or title, followed by a spaced ASCII hyphen, followed by a timestamp.
	 * - All level-3 ATX headings with no indentation are changelog entry subheadings.
	 * - All change entries are formatted as lists starting with the configured bullet followed by a space,
	 *   and do not make use of lazy continuation. Indentation of continued
	 *   lines is equal to the length of the bullet plus the space.
	 * - All link definitions come at the end of the document, with no intervening blank lines or
	 *   other content, and are not indented and do not contain newlines. Link definitions for
	 *   changelog entries have no titles.
	 *
	 * @param string $changelog Changelog contents.
	 * @return Changelog
	 * @throws InvalidArgumentException If the changelog data cannot be parsed.
	 */
	public function parse( $changelog ) {
		$ret = new Changelog();

		$bullet = $this->bullet . ' ';
		$len    = strlen( $bullet );
		$indent = str_repeat( ' ', $len );

		// Fix newlines and expand tabs.
		$changelog = strtr( $changelog, array( "\r\n" => "\n" ) );
		$changelog = strtr( $changelog, array( "\r" => "\n" ) );
		while ( strpos( $changelog, "\t" ) !== false ) {
			$changelog = preg_replace_callback(
				'/^([^\t\n]*)\t/m',
				function ( $m ) {
					return $m[1] . str_repeat( ' ', 4 - ( mb_strlen( $m[1] ) % 4 ) );
				},
				$changelog
			);
		}

		// Extract link definitions.
		$links     = array();
		$usedlinks = array();
		while ( ( $m = $this->endsInLink( $changelog ) ) ) { // phpcs:ignore WordPress.CodeAnalysis.AssignmentInCondition.FoundInWhileCondition
			$links[ $m['id'] ]     = $m['link'];
			$usedlinks[ $m['id'] ] = false;
			$changelog             = (string) substr( $changelog, 0, -strlen( $m['match'] ) );
		}
		$links = array_reverse( $links );

		// Everything up to the first level-2 ATX heading is the prologue.
		list( $prologue, $changelog ) = $this->split( "\n$changelog", "\n## " );
		$ret->setPrologue( $prologue );

		// Entries make up the rest of the document.
		$entries = array();
		while ( '' !== $changelog ) {
			// Extract the first entry from the changelog file, then extract the heading from it.
			list( $content, $changelog ) = $this->split( $changelog, "\n## " );
			list( $heading, $content )   = $this->split( $content, "\n" );

			// Parse the heading and create a ChangelogEntry for it.
			if ( ! preg_match( '/^## +(\[?[^] ]+\]?) - (.+?) *$/', $heading, $m ) ) {
				throw new InvalidArgumentException( "Invalid heading: $heading" );
			}
			$link      = null;
			$version   = $m[1];
			$timestamp = $m[2];
			if ( '[' === $version[0] && ']' === substr( $version, -1 ) ) {
				$version = substr( $version, 1, -1 );
				if ( ! isset( $links[ $version ] ) ) {
					throw new InvalidArgumentException( "Heading seems to have a linked version, but link was not found: $heading" );
				}
				$link                  = $links[ $version ];
				$usedlinks[ $version ] = true;
			}
			if ( $timestamp === $this->unreleased ) {
				$timestamp      = null;
				$entryTimestamp = new DateTime( 'now', new DateTimeZone( 'UTC' ) );
			} else {
				try {
					$timestamp = new DateTime( $timestamp, new DateTimeZone( 'UTC' ) );
				} catch ( \Exception $ex ) {
					throw new InvalidArgumentException( "Heading has an invalid timestamp: $heading", 0, $ex );
				}
				if ( strtotime( $m[2], 0 ) !== strtotime( $m[2], 1000000000 ) ) {
					throw new InvalidArgumentException( "Heading has a relative timestamp: $heading" );
				}
				$entryTimestamp = $timestamp;
			}
			$entry     = $this->newChangelogEntry(
				$version,
				array(
					'link'      => $link,
					'timestamp' => $timestamp,
				)
			);
			$entries[] = $entry;

			// Extract the prologue, if any.
			list( $prologue, $content ) = $this->split( "\n$content", "\n### ", "\n$bullet" );
			$entry->setPrologue( $prologue );

			if ( '' === $content ) {
				// Huh, no changes.
				continue;
			}

			// Inject an empty heading if necessary so the change parsing can be more straightforward.
			if ( '#' !== $content[0] ) {
				$content = "### \n$content";
			}

			// Now parse all the subheadings and changes.
			while ( '' !== $content ) {
				list( $section, $content )    = $this->split( $content, "\n### " );
				list( $subheading, $section ) = $this->split( $section, "\n" );
				$subheading                   = trim( substr( $subheading, 4 ) );
				$changes                      = array();
				$cur                          = '';
				$section                      = explode( "\n", $section );
				while ( $section ) {
Issue (Bug, Best Practice): The expression $section of type array is implicitly converted to a boolean; are you sure this is intended? If so, consider using ! empty($expr) instead to make it clear that you intend to check for an array without elements.

This check marks implicit conversions of arrays to boolean values in a comparison. While in PHP an empty array is considered to be equal (but not identical) to false, this is not always apparent.

Consider making the comparison explicit by using empty(...) or ! empty(...) instead.
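A minimal sketch of the explicit form suggested above, using a standalone loop rather than the reviewed method. The sample input and the `$lines` accumulator are assumptions made for illustration.

```php
<?php
// Sample input, assumed for illustration only.
$section = explode( "\n", "- first\n- second" );
$lines   = array();

// Explicit emptiness check instead of relying on array-to-boolean conversion.
while ( ! empty( $section ) ) {
	$lines[] = array_shift( $section );
}
```

Both forms stop when the array has no elements left; the explicit version only makes that intent visible to readers and to static analysis.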
					$line   = array_shift( $section );
					$prefix = substr( $line, 0, $len );
					if ( $prefix === $bullet ) {
						$cur = trim( $cur );
						if ( '' !== $cur ) {
							$changes[] = $cur;
						}
						$cur = substr( $line, $len ) . "\n";
					} elseif ( $prefix === $indent ) {
						$cur .= substr( $line, $len ) . "\n";
					} elseif ( '' === $line ) {
						$cur .= "\n";
					} else {
						// If there are no more subsections and the rest of the lines don't contain
						// bullets, assume it's an epilogue. Otherwise, assume it's an error.
						$section = $line . "\n" . implode( "\n", $section );
						if ( '' === $content && strpos( $section, "\n$bullet" ) === false ) {
							$entry->setEpilogue( $section );
							break;
						} else {
							throw new InvalidArgumentException( "Malformatted changes list near \"$line\"" );
						}
					}
				}
				$cur = trim( $cur );
				if ( '' !== $cur ) {
					$changes[] = $cur;
				}
				foreach ( $changes as $change ) {
					$author = '';
					if ( $this->parseAuthors && preg_match( '/ \(([^()\n]+)\)$/', $change, $m ) ) {
						$author = $m[1];
						$change = substr( $change, 0, -strlen( $m[0] ) );
					}
					$entry->appendChange(
						$this->newChangeEntry(
							array(
								'subheading' => $subheading,
								'author'     => $author,
								'content'    => $change,
								'timestamp'  => $entryTimestamp,
							)
						)
					);
				}
			}
		}
		$ret->setEntries( $entries );

		// Append any unused links to the epilogue.
		$epilogue = $ret->getEpilogue();
		foreach ( $links as $id => $content ) {
			if ( empty( $usedlinks[ $id ] ) ) {
				$epilogue .= "\n[$id]: $content";
			}
		}
		$ret->setEpilogue( $epilogue );

		return $ret;
	}

	/**
	 * Write a Changelog object to a string.
	 *
	 * @param Changelog $changelog Changelog object.
	 * @return string
	 */
	public function format( Changelog $changelog ) {
		$ret = '';

		$bullet = $this->bullet . ' ';
		$indent = str_repeat( ' ', strlen( $bullet ) );

		$prologue = trim( $changelog->getPrologue() );
		if ( '' !== $prologue ) {
			$ret .= "$prologue\n\n";
		}

		$links = array();
		foreach ( $changelog->getEntries() as $entry ) {
			$ret .= '## ';
			if ( $entry->getLink() !== null ) {
				$links[ $entry->getVersion() ] = $entry->getLink();
				$ret                          .= "[{$entry->getVersion()}]";
			} else {
				$ret .= $entry->getVersion();
			}
			$timestamp = $entry->getTimestamp();
			$ret      .= ' - ' . ( null === $timestamp ? $this->unreleased : $timestamp->format( $this->dateFormat ) ) . "\n";

			$prologue = trim( $entry->getPrologue() );
			if ( '' !== $prologue ) {
				$ret .= "\n$prologue\n\n";
			}

			foreach ( $entry->getChangesBySubheading() as $heading => $changes ) {
				if ( '' !== $heading ) {
Issue (Unused Code, Bug): The strict comparison !== seems to always evaluate to true as the types of '' (string) and $heading (integer) can never be identical. Maybe you want to use a loose comparison != instead?
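Whether this warning applies depends on what `getChangesBySubheading()` returns, which is not shown in this listing. As a sketch under the assumption that it returns a plain array keyed by subheading, the following shows how PHP types such keys: an empty string stays a string, and only keys that look like decimal integers are converted to `int`. The sample data is hypothetical.

```php
<?php
// Hypothetical data: changes grouped by subheading, with '' meaning "no subheading".
$bySubheading = array(
	''      => array( 'Change without a subheading' ),
	'Added' => array( 'New feature' ),
	'123'   => array( 'Numeric-looking subheading' ),
);

$keyTypes = array();
foreach ( $bySubheading as $heading => $changes ) {
	// gettype() reports the type PHP actually stored for each key.
	$keyTypes[] = gettype( $heading );
}
```

Under that assumption, `'' !== $heading` is false for the first group and true for the others, so the strict comparison would not always evaluate to true.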
					$heading = "### $heading\n";
				} else {
					$heading = substr( $ret, -2 ) === "\n\n" ? '' : "\n";
				}
				foreach ( $changes as $change ) {
Issue (Bug): The expression $changes of type object<Automattic\Jetpac...Changelog\ChangeEntry>> is not guaranteed to be traversable. How about adding an additional type check?

There are different options for fixing this problem.

  1. If you want to be on the safe side, you can add an additional type check:

    $collection = json_decode($data, true);
    if ( ! is_array($collection)) {
        throw new \RuntimeException('$collection must be an array.');
    }

    foreach ($collection as $item) { /** ... */ }

  2. If you are sure that the expression is traversable, you might want to add a doc comment cast to improve IDE auto-completion and static analysis:

    /** @var array $collection */
    $collection = json_decode($data, true);

    foreach ($collection as $item) { /** ... */ }

  3. Mark the issue as a false positive: hover over the remove button in the top-right corner of this issue for more options.
					$text = trim( $change->getContent() );
					if ( '' !== $text ) {
						if ( $change->getAuthor() !== '' ) {
							$text .= " ({$change->getAuthor()})";
						}
						$ret    .= $heading . $bullet . str_replace( "\n", "\n$indent", $text ) . "\n";
						$heading = '';
					}
				}
				$ret .= "\n";
			}

			$epilogue = trim( $entry->getEpilogue() );
			if ( '' !== $epilogue ) {
				$ret = trim( $ret ) . "\n\n$epilogue\n";
			}
			$ret = trim( $ret ) . "\n\n";
		}

		$epilogue = trim( $changelog->getEpilogue() );
		if ( '' !== $epilogue ) {
			$ret .= "$epilogue\n";
		}

		$ret = trim( $ret ) . "\n";

		if ( $links ) {
Issue (Bug, Best Practice): The expression $links of type array is implicitly converted to a boolean; are you sure this is intended? If so, consider using ! empty($expr) instead to make it clear that you intend to check for an array without elements.

This check marks implicit conversions of arrays to boolean values in a comparison. While in PHP an empty array is considered to be equal (but not identical) to false, this is not always apparent.

Consider making the comparison explicit by using empty(...) or ! empty(...) instead.
			if ( ! $this->endsInLink( $ret ) ) {
				$ret .= "\n";
			}
			foreach ( $links as $k => $v ) {
				$ret .= "[$k]: $v\n";
			}
		}

		return $ret;
	}

}