Completed
Push — kraftbj-patch-2 ( 82c983...ae9d16 )
by
unknown
513:58 queued 503:20
created

sitemaps.php ➔ jetpack_sitemap_initialize()   C

Complexity

Conditions 17
Paths 81

Size

Total Lines 76
Code Lines 31

Duplication

Lines 48
Ratio 63.16 %

Importance

Changes 0
Metric  Value
cc      17
eloc    31
nc      81
nop     0
dl      48
loc     76
rs      5.2392
c       0
b       0
f       0

How to fix

Long Method

Small methods make your code easier to understand, in particular if combined with a good name. Besides, if your method is small, finding a good name is usually much easier.

For example, if you find yourself adding comments to a method's body, this is usually a good sign to extract the commented part to a new method, and use the comment as a starting point when coming up with a good name for this new method.
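As a minimal sketch of that idea, the example below moves each commented step of a method into its own small method and uses the comment as the basis for the name. The class and method names (Report_Printer, summary_before, summary_after, sum_prices, format_total) are hypothetical and are not taken from the reviewed file.

```php
<?php
// Hypothetical example: names are illustrative, not from sitemaps.php.

class Report_Printer {

	// Before: one method, with comments marking its separate steps.
	public function summary_before( array $prices ) {
		// Sum the prices.
		$total = 0;
		foreach ( $prices as $price ) {
			$total += $price;
		}

		// Format the total as a label.
		return 'Total: ' . number_format( $total, 2 );
	}

	// After: each commented step becomes a small, named method.
	public function summary_after( array $prices ) {
		return $this->format_total( $this->sum_prices( $prices ) );
	}

	// Extracted from the "Sum the prices" comment.
	private function sum_prices( array $prices ) {
		$total = 0;
		foreach ( $prices as $price ) {
			$total += $price;
		}
		return $total;
	}

	// Extracted from the "Format the total as a label" comment.
	private function format_total( $total ) {
		return 'Total: ' . number_format( $total, 2 );
	}
}
```

Both versions return the same string; the second simply reads as a list of named steps.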

Commonly applied refactorings include:

<?php
/**
 * Generate sitemap files in base XML as well as some namespace extensions.
 *
 * This module generates two different base sitemaps.
 *
 * 1. sitemap.xml
 *    The basic sitemap is updated regularly by wp-cron. It is stored in the
 *    database and retrieved when requested. This sitemap aims to include canonical
 *    URLs for all published content and abide by the sitemap spec. This is the root
 *    of a tree of sitemap and sitemap index xml files, depending on the number of URLs.
 *
 *    By default the sitemap contains published posts of type 'post' and 'page', as
 *    well as the home url. To include other post types use the 'jetpack_sitemap_post_types'
 *    filter.
 *
 * @link http://sitemaps.org/protocol.php Base sitemaps protocol.
 * @link https://support.google.com/webmasters/answer/178636 Image sitemap extension.
 * @link https://developers.google.com/webmasters/videosearch/sitemaps Video sitemap extension.
 *
 * 2. news-sitemap.xml
 *    The news sitemap is generated on the fly when requested. It does not aim for
 *    completeness, instead including at most 1000 of the most recent published posts
 *    from the previous 2 days, per the news-sitemap spec.
 *
 * @link http://www.google.com/support/webmasters/bin/answer.py?answer=74288 News sitemap extension.
 *
 * @package Jetpack
 * @since 3.9.0
 * @since 4.7.0 Remove 1000 post limit.
 * @author Automattic
 */

require_once dirname( __FILE__ ) . '/sitemap-constants.php';
require_once dirname( __FILE__ ) . '/sitemap-buffer.php';
require_once dirname( __FILE__ ) . '/sitemap-stylist.php';
require_once dirname( __FILE__ ) . '/sitemap-librarian.php';
require_once dirname( __FILE__ ) . '/sitemap-finder.php';
require_once dirname( __FILE__ ) . '/sitemap-builder.php';

if ( defined( 'WP_DEBUG' ) && ( true === WP_DEBUG ) ) {
	require_once dirname( __FILE__ ) . '/sitemap-logger.php';
}

/**
 * Governs the generation, storage, and serving of sitemaps.
 *
 * @since 4.7.0
 */
class Jetpack_Sitemap_Manager {

	/**
	 * @see Jetpack_Sitemap_Librarian
	 * @since 4.7.0
	 * @var Jetpack_Sitemap_Librarian $librarian Librarian object for storing and retrieving sitemap data.
	 */
	private $librarian;

	/**
	 * @see Jetpack_Sitemap_Logger
	 * @since 4.7.0
	 * @var Jetpack_Sitemap_Logger $logger Logger object for reporting debug messages.
	 */
	private $logger;

	/**
	 * @see Jetpack_Sitemap_Finder
	 * @since 4.7.0
	 * @var Jetpack_Sitemap_Finder $finder Finder object for dealing with sitemap URIs.
	 */
	private $finder;

	/**
	 * Construct a new Jetpack_Sitemap_Manager.
	 *
	 * @access public
	 * @since 4.7.0
	 */
	public function __construct() {
		$this->librarian = new Jetpack_Sitemap_Librarian();
		$this->finder = new Jetpack_Sitemap_Finder();

		if ( defined( 'WP_DEBUG' ) && ( true === WP_DEBUG ) ) {
			$this->logger = new Jetpack_Sitemap_Logger();
		}

		// Add callback for sitemap URL handler.
		add_action(
			'init',
			array( $this, 'callback_action_catch_sitemap_urls' )
		);

		// Add generator to wp_cron task list.
		$this->schedule_sitemap_generation();

		// Add sitemap to robots.txt.
		add_action(
			'do_robotstxt',
			array( $this, 'callback_action_do_robotstxt' ),
			20
		);

		// The news sitemap is cached; here we add a callback to
		// flush the cached news sitemap when a post is published.
		add_action(
			'publish_post',
			array( $this, 'callback_action_flush_news_sitemap_cache' ),
			10
		);

		/*
		 * Module parameters are stored as options in the database.
		 * This allows us to avoid having to process all of init
		 * before serving the sitemap data. The following actions
		 * process and store these filters.
		 */

		// Process filters and store location string for sitemap.
		add_action(
			'init',
			array( $this, 'callback_action_filter_sitemap_location' ),
			999
		);

		return;
	}

	/**
	 * Echo a raw string of given content-type.
	 *
	 * @access private
	 * @since 4.7.0
	 *
	 * @param string $the_content_type The content type to be served.
	 * @param string $the_content The string to be echoed.
	 */
	private function serve_raw_and_die( $the_content_type, $the_content ) {
		header( 'Content-Type: ' . $the_content_type . '; charset=UTF-8' );

		if ( '' === $the_content ) {
			http_response_code( 404 );
		}

		echo $the_content;

		die();
Coding Style / Compatibility: The method serve_raw_and_die() contains an exit expression.

An exit expression should be used only in rare cases, for example in a short command-line script.

In most cases, however, an exit expression makes the code untestable and often causes incompatibilities with other libraries. Unless you are absolutely sure it is required here, we recommend refactoring your code to avoid it.
	}
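
One way to address the exit-expression issue reported for serve_raw_and_die() is to separate building the response from sending it, so that only a thin wrapper terminates the request and everything else can be tested. The sketch below is an assumption about how that could look, not the plugin's code: the functions jp_sketch_build_response() and jp_sketch_send_and_exit() are hypothetical.

```php
<?php
// Hypothetical sketch; these functions are not part of Jetpack.

/**
 * Build the response as plain data, with no side effects,
 * so the status and header logic can be unit tested.
 *
 * @param string $content_type The content type to be served.
 * @param string $content      The body to be served.
 * @return array Response with 'status', 'headers' and 'body' keys.
 */
function jp_sketch_build_response( $content_type, $content ) {
	return array(
		// Mirror the original rule: an empty body is served as a 404.
		'status'  => ( '' === $content ) ? 404 : 200,
		'headers' => array( 'Content-Type: ' . $content_type . '; charset=UTF-8' ),
		'body'    => $content,
	);
}

/**
 * Thin wrapper that performs the side effects.
 * The only exit expression lives here.
 *
 * @param array $response Response built by jp_sketch_build_response().
 */
function jp_sketch_send_and_exit( array $response ) {
	http_response_code( $response['status'] );
	foreach ( $response['headers'] as $header ) {
		header( $header );
	}
	echo $response['body'];
	exit;
}
```

A test can then assert on the array returned by jp_sketch_build_response() without the process ending.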

	/**
	 * Callback to intercept sitemap url requests and serve sitemap files.
	 *
	 * @access public
	 * @since 4.7.0
	 */
	public function callback_action_catch_sitemap_urls() {
		// Regular expressions for sitemap URL routing.
		$regex = array(
			'master'        => '/^sitemap\.xml$/',
			'sitemap'       => '/^sitemap-[1-9][0-9]*\.xml$/',
			'index'         => '/^sitemap-index-[1-9][0-9]*\.xml$/',
			'sitemap-style' => '/^sitemap\.xsl$/',
			'index-style'   => '/^sitemap-index\.xsl$/',
			'image'         => '/^image-sitemap-[1-9][0-9]*\.xml$/',
			'image-index'   => '/^image-sitemap-index-[1-9][0-9]*\.xml$/',
			'image-style'   => '/^image-sitemap\.xsl$/',
			'video'         => '/^video-sitemap-[1-9][0-9]*\.xml$/',
			'video-index'   => '/^video-sitemap-index-[1-9][0-9]*\.xml$/',
			'video-style'   => '/^video-sitemap\.xsl$/',
			'news'          => '/^news-sitemap\.xml$/',
			'news-style'    => '/^news-sitemap\.xsl$/',
		);

		// The raw path(+query) of the requested URI.
		if ( isset( $_SERVER['REQUEST_URI'] ) ) { // WPCS: Input var okay.
			$raw_uri = sanitize_text_field(
				wp_unslash( $_SERVER['REQUEST_URI'] ) // WPCS: Input var okay.
			);
		} else {
			$raw_uri = '';
		}

		$request = $this->finder->recognize_sitemap_uri( $raw_uri );

		if ( isset( $request['sitemap_name'] ) ) {

			/**
			 * Filter the content type used to serve the sitemap XML files.
			 *
			 * @module sitemaps
			 *
			 * @since 3.9.0
			 *
			 * @param string $xml_content_type By default, it's 'text/xml'.
			 */
			$xml_content_type = apply_filters( 'jetpack_sitemap_content_type', 'text/xml' );

			// Catch master sitemap xml.
			if ( preg_match( $regex['master'], $request['sitemap_name'] ) ) {
				$this->serve_raw_and_die(
					$xml_content_type,
					$this->librarian->get_sitemap_text(
						jp_sitemap_filename( JP_MASTER_SITEMAP_TYPE, 0 ),
						JP_MASTER_SITEMAP_TYPE
					)
				);
			}

			// Catch sitemap xml.
			if ( preg_match( $regex['sitemap'], $request['sitemap_name'] ) ) {
				$this->serve_raw_and_die(
					$xml_content_type,
					$this->librarian->get_sitemap_text(
						$request['sitemap_name'],
						JP_PAGE_SITEMAP_TYPE
					)
				);
			}

			// Catch sitemap index xml.
			if ( preg_match( $regex['index'], $request['sitemap_name'] ) ) {
				$this->serve_raw_and_die(
					$xml_content_type,
					$this->librarian->get_sitemap_text(
						$request['sitemap_name'],
						JP_PAGE_SITEMAP_INDEX_TYPE
					)
				);
			}

			// Catch sitemap xsl.
			if ( preg_match( $regex['sitemap-style'], $request['sitemap_name'] ) ) {
				$this->serve_raw_and_die(
					'application/xml',
					Jetpack_Sitemap_Stylist::sitemap_xsl()
				);
			}

			// Catch sitemap index xsl.
			if ( preg_match( $regex['index-style'], $request['sitemap_name'] ) ) {
				$this->serve_raw_and_die(
					'application/xml',
					Jetpack_Sitemap_Stylist::sitemap_index_xsl()
				);
			}

			// Catch image sitemap xml.
			if ( preg_match( $regex['image'], $request['sitemap_name'] ) ) {
				$this->serve_raw_and_die(
					$xml_content_type,
					$this->librarian->get_sitemap_text(
						$request['sitemap_name'],
						JP_IMAGE_SITEMAP_TYPE
					)
				);
			}

			// Catch image sitemap index xml.
			if ( preg_match( $regex['image-index'], $request['sitemap_name'] ) ) {
				$this->serve_raw_and_die(
					$xml_content_type,
					$this->librarian->get_sitemap_text(
						$request['sitemap_name'],
						JP_IMAGE_SITEMAP_INDEX_TYPE
					)
				);
			}

			// Catch image sitemap xsl.
			if ( preg_match( $regex['image-style'], $request['sitemap_name'] ) ) {
				$this->serve_raw_and_die(
					'application/xml',
					Jetpack_Sitemap_Stylist::image_sitemap_xsl()
				);
			}

			// Catch video sitemap xml.
			if ( preg_match( $regex['video'], $request['sitemap_name'] ) ) {
				$this->serve_raw_and_die(
					$xml_content_type,
					$this->librarian->get_sitemap_text(
						$request['sitemap_name'],
						JP_VIDEO_SITEMAP_TYPE
					)
				);
			}

			// Catch video sitemap index xml.
			if ( preg_match( $regex['video-index'], $request['sitemap_name'] ) ) {
				$this->serve_raw_and_die(
					$xml_content_type,
					$this->librarian->get_sitemap_text(
						$request['sitemap_name'],
						JP_VIDEO_SITEMAP_INDEX_TYPE
					)
				);
			}

			// Catch video sitemap xsl.
			if ( preg_match( $regex['video-style'], $request['sitemap_name'] ) ) {
				$this->serve_raw_and_die(
					'application/xml',
					Jetpack_Sitemap_Stylist::video_sitemap_xsl()
				);
			}

			// Catch news sitemap xml.
			if ( preg_match( $regex['news'], $request['sitemap_name'] ) ) {
				$sitemap_builder = new Jetpack_Sitemap_Builder();
				$this->serve_raw_and_die(
					$xml_content_type,
					$sitemap_builder->news_sitemap_xml()
				);
			}

			// Catch news sitemap xsl.
			if ( preg_match( $regex['news-style'], $request['sitemap_name'] ) ) {
				$this->serve_raw_and_die(
					'application/xml',
					Jetpack_Sitemap_Stylist::news_sitemap_xsl()
				);
			}
		}

		// URL did not match any sitemap patterns.
		return;
	}

	/**
	 * Callback for adding sitemap-interval to the list of schedules.
	 *
	 * @access public
	 * @since 4.7.0
	 *
	 * @param array $schedules The array of WP_Cron schedules.
	 *
	 * @return array The updated array of WP_Cron schedules.
	 */
	public function callback_add_sitemap_schedule( $schedules ) {
		$schedules['sitemap-interval'] = array(
			'interval' => JP_SITEMAP_INTERVAL,
			'display'  => __( 'Sitemap Interval', 'jetpack' ),
		);
		return $schedules;
	}

	/**
	 * Add actions to schedule sitemap generation.
	 * Should only be called once, in the constructor.
	 *
	 * @access private
	 * @since 4.7.0
	 */
	private function schedule_sitemap_generation() {
		// Add cron schedule.
		add_filter( 'cron_schedules', array( $this, 'callback_add_sitemap_schedule' ) );

		$sitemap_builder = new Jetpack_Sitemap_Builder();

		add_action(
			'jp_sitemap_cron_hook',
			array( $sitemap_builder, 'update_sitemap' )
		);

		if ( ! wp_next_scheduled( 'jp_sitemap_cron_hook' ) ) {
			wp_schedule_event(
				time(),
				'sitemap-interval',
				'jp_sitemap_cron_hook'
			);
		}

		return;
	}

	/**
	 * Callback to add sitemap to robots.txt.
	 *
	 * @access public
	 * @since 4.7.0
	 */
	public function callback_action_do_robotstxt() {

		/**
		 * Filter whether to make the default sitemap discoverable to robots or not. Default true.
		 *
		 * @module sitemaps
		 * @since 3.9.0
		 *
		 * @param bool $discover_sitemap Make default sitemap discoverable to robots.
		 */
		$discover_sitemap = apply_filters( 'jetpack_sitemap_generate', true );

		if ( true === $discover_sitemap ) {
			$sitemap_url      = $this->finder->construct_sitemap_url( 'sitemap.xml' );
Coding Style: Equals sign not aligned correctly; expected 1 space but found 6 spaces

This check looks for improperly formatted assignments.

Every assignment must have exactly one space before and one space after the equals operator.

To illustrate:

$a = "a";
$ab = "ab";
$abc = "abc";

will have no issues, while

$a   = "a";
$ab  = "ab";
$abc = "abc";

will report issues in lines 1 and 2.
			echo 'Sitemap: ' . esc_url( $sitemap_url ) . "\n";
		}

		/**
		 * Filter whether to make the news sitemap discoverable to robots or not. Default true.
		 *
		 * @module sitemaps
		 * @since 3.9.0
		 *
		 * @param bool $discover_news_sitemap Make default news sitemap discoverable to robots.
		 */
		$discover_news_sitemap = apply_filters( 'jetpack_news_sitemap_generate', true );

		if ( true === $discover_news_sitemap ) {
			$news_sitemap_url = $this->finder->construct_sitemap_url( 'news-sitemap.xml' );
			echo 'Sitemap: ' . esc_url( $news_sitemap_url ) . "\n";
		}

		return;
	}

	/**
	 * Callback to delete the news sitemap cache.
	 *
	 * @access public
	 * @since 4.7.0
	 */
	public function callback_action_flush_news_sitemap_cache() {
		delete_transient( 'jetpack_news_sitemap_xml' );
	}

	/**
	 * Callback to set the sitemap location.
	 *
	 * @access public
	 * @since 4.7.0
	 */
	public function callback_action_filter_sitemap_location() {
		update_option(
			'jetpack_sitemap_location',
			/**
			 * Additional path for sitemap URIs. Default value is empty.
			 *
			 * This string is any additional path fragment you want included between
			 * the home URL and the sitemap filenames. Exactly how this fragment is
			 * interpreted depends on your permalink settings. For example:
			 *
			 *   Pretty permalinks:
			 *     home_url() . jetpack_sitemap_location . '/sitemap.xml'
			 *
			 *   Plain ("ugly") permalinks:
			 *     home_url() . jetpack_sitemap_location . '/?jetpack-sitemap=sitemap.xml'
			 *
			 *   PATHINFO permalinks:
			 *     home_url() . '/index.php' . jetpack_sitemap_location . '/sitemap.xml'
			 *
			 * where 'sitemap.xml' is the name of a specific sitemap file.
			 * The value of this filter must be a valid path fragment per RFC 3986;
			 * in particular it must either be empty or begin with a '/'.
			 * Also take care that any restrictions on sitemap location imposed by
			 * the sitemap protocol are satisfied.
			 *
			 * The result of this filter is stored in an option, 'jetpack_sitemap_location';
			 * that option is what gets read when the sitemap location is needed.
			 * This way we don't have to wait for init to finish before building sitemaps.
			 *
			 * @link https://tools.ietf.org/html/rfc3986#section-3.3 RFC 3986
			 * @link http://www.sitemaps.org/ The sitemap protocol
			 *
			 * @since 4.7.0
			 */
			apply_filters(
				'jetpack_sitemap_location',
				''
			)
		);

		return;
	}

} // End Jetpack_Sitemap_Manager class.

new Jetpack_Sitemap_Manager();
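
The duplication flagged in callback_action_catch_sitemap_urls() comes from the repeated preg_match()/serve pairs for the stored sitemap types. A minimal sketch of one way to collapse them is a lookup table from pattern to sitemap type. The function jp_sketch_match_sitemap_type() is hypothetical, and the type strings in $routes are placeholders standing in for the JP_*_SITEMAP_TYPE constants, whose values live in sitemap-constants.php and are not shown here. The patterns themselves are copied from the $regex array in the listing.

```php
<?php
// Hypothetical sketch; not part of Jetpack.

/**
 * Return the sitemap type whose pattern matches the file name.
 *
 * @param string $sitemap_name File name such as 'sitemap-1.xml'.
 * @param array  $routes       Map of regex pattern => sitemap type.
 * @return string|null The matching type, or null if nothing matches.
 */
function jp_sketch_match_sitemap_type( $sitemap_name, array $routes ) {
	foreach ( $routes as $pattern => $type ) {
		if ( 1 === preg_match( $pattern, $sitemap_name ) ) {
			return $type;
		}
	}
	return null;
}

// Placeholder type strings (assumptions) stand in for the real constants.
$routes = array(
	'/^sitemap-[1-9][0-9]*\.xml$/'             => 'page-type',
	'/^sitemap-index-[1-9][0-9]*\.xml$/'       => 'page-index-type',
	'/^image-sitemap-[1-9][0-9]*\.xml$/'       => 'image-type',
	'/^image-sitemap-index-[1-9][0-9]*\.xml$/' => 'image-index-type',
	'/^video-sitemap-[1-9][0-9]*\.xml$/'       => 'video-type',
	'/^video-sitemap-index-[1-9][0-9]*\.xml$/' => 'video-index-type',
);
```

With a table like this, the six near-identical blocks could shrink to one lookup followed by a single call to get_sitemap_text() with the request's sitemap name and the returned type, made only when that type is not null. The master sitemap, the news sitemap, and the XSL stylesheets use different calls and would still need their own branches.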