Completed
Push — master ( 470825...143e87 )
by Joram van den
06:11
created

markdown.php ➔ Markdown()   A

Complexity

Conditions 2
Paths 2

Size

Total Lines 15
Code Lines 6

Duplication

Lines 0
Ratio 0 %

Importance

Changes 0
Metric Value
cc 2
eloc 6
nc 2
nop 1
dl 0
loc 15
rs 9.4285
c 0
b 0
f 0
<?php

//
// Markdown  -  A text-to-HTML conversion tool for web writers
//
// PHP Markdown
// Copyright (c) 2004-2013 Michel Fortin
// <http://michelf.ca/projects/php-markdown/>
//
// Original Markdown
// Copyright (c) 2004-2006 John Gruber
// <http://daringfireball.net/projects/markdown/>
//

define('MARKDOWN_VERSION', '1.0.1q'); // 11 Apr 2013

//
// Global default settings:
//

// Change to ">" for HTML output
@define('MARKDOWN_EMPTY_ELEMENT_SUFFIX', ' />');
Security Best Practice
It seems like you do not handle an error condition here. This can introduce security issues, and is generally not recommended.

If you suppress an error, we recommend checking for the error condition explicitly:

// For example instead of
@mkdir($dir);

// Better use
if (@mkdir($dir) === false) {
    throw new \RuntimeException('The directory '.$dir.' could not be created.');
}
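
For the `@define` calls flagged here, the condition being suppressed is most likely a constant that has already been defined. A minimal sketch of checking for that explicitly, assuming the intent is to keep whatever value the host application set first; `define_default` and the `EXAMPLE_*` constants are hypothetical names, not part of PHP Markdown:

```php
<?php

// Hypothetical helper (not part of PHP Markdown): define a constant only
// when it is not defined yet, instead of silencing the error with "@".
function define_default($name, $value)
{
    if (defined($name)) {
        return false; // keep the value that was set earlier
    }

    return define($name, $value);
}

// Illustrative constant name, chosen so it cannot clash with the real settings.
define_default('EXAMPLE_TAB_WIDTH', 4);
define_default('EXAMPLE_TAB_WIDTH', 8); // ignored: already defined

echo EXAMPLE_TAB_WIDTH, "\n";
```

The return value tells the caller whether the default was applied, which the `@` operator hides.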

// Define the width of a tab for code blocks.
@define('MARKDOWN_TAB_WIDTH', 4);
Security Best Practice
It seems like you do not handle an error condition here. This can introduce security issues, and is generally not recommended.

//
// WordPress settings:
//

// Change to false to remove Markdown from posts and/or comments.
@define('MARKDOWN_WP_POSTS', true);
Security Best Practice
It seems like you do not handle an error condition here. This can introduce security issues, and is generally not recommended.
@define('MARKDOWN_WP_COMMENTS', true);
Security Best Practice
It seems like you do not handle an error condition here. This can introduce security issues, and is generally not recommended.

//## Standard Function Interface ###

@define('MARKDOWN_PARSER_CLASS', 'Markdown_Parser');
Security Best Practice
It seems like you do not handle an error condition here. This can introduce security issues, and is generally not recommended.

function Markdown($text)
{
    //
    // Initialize the parser and return the result of its transform method.
    //
    // Set up static parser variable.
    static $parser;
    if (!isset($parser)) {
        $parser_class = MARKDOWN_PARSER_CLASS;
        $parser = new $parser_class();
    }

    // Transform text using parser.
    return $parser->transform($text);
}

//## WordPress Plugin Interface ###

/*
Plugin Name: Markdown
Plugin URI: http://michelf.ca/projects/php-markdown/
Description: <a href="http://daringfireball.net/projects/markdown/syntax">Markdown syntax</a> allows you to write using an easy-to-read, easy-to-write plain text format. Based on the original Perl version by <a href="http://daringfireball.net/">John Gruber</a>. <a href="http://michelf.ca/projects/php-markdown/">More...</a>
Version: 1.0.1q
Author: Michel Fortin
Author URI: http://michelf.ca/
*/
66
if (isset($wp_version)) {
67
    // More details about how it works here:
68
    // <http://michelf.ca/weblog/2005/wordpress-text-flow-vs-markdown/>
69
70
    // Post content and excerpts
71
    // - Remove WordPress paragraph generator.
72
    // - Run Markdown on excerpt, then remove all tags.
73
    // - Add paragraph tag around the excerpt, but remove it for the excerpt rss.
74
    if (MARKDOWN_WP_POSTS) {
75
        remove_filter('the_content', 'wpautop');
76
        remove_filter('the_content_rss', 'wpautop');
77
        remove_filter('the_excerpt', 'wpautop');
78
        add_filter('the_content', 'Markdown', 6);
79
        add_filter('the_content_rss', 'Markdown', 6);
80
        add_filter('get_the_excerpt', 'Markdown', 6);
81
        add_filter('get_the_excerpt', 'trim', 7);
82
        add_filter('the_excerpt', 'mdwp_add_p');
83
        add_filter('the_excerpt_rss', 'mdwp_strip_p');
84
85
        remove_filter('content_save_pre', 'balanceTags', 50);
86
        remove_filter('excerpt_save_pre', 'balanceTags', 50);
87
        add_filter('the_content', 'balanceTags', 50);
88
        add_filter('get_the_excerpt', 'balanceTags', 9);
89
    }
90
91
    // Comments
92
    // - Remove WordPress paragraph generator.
93
    // - Remove WordPress auto-link generator.
94
    // - Scramble important tags before passing them to the kses filter.
95
    // - Run Markdown on excerpt then remove paragraph tags.
96
    if (MARKDOWN_WP_COMMENTS) {
97
        remove_filter('comment_text', 'wpautop', 30);
98
        remove_filter('comment_text', 'make_clickable');
99
        add_filter('pre_comment_content', 'Markdown', 6);
100
        add_filter('pre_comment_content', 'mdwp_hide_tags', 8);
101
        add_filter('pre_comment_content', 'mdwp_show_tags', 12);
102
        add_filter('get_comment_text', 'Markdown', 6);
103
        add_filter('get_comment_excerpt', 'Markdown', 6);
104
        add_filter('get_comment_excerpt', 'mdwp_strip_p', 7);
105
106
        global $mdwp_hidden_tags, $mdwp_placeholders;
107
        $mdwp_hidden_tags = explode(' ',
108
            '<p> </p> <pre> </pre> <ol> </ol> <ul> </ul> <li> </li>');
109
        $mdwp_placeholders = explode(' ', str_rot13(
110
            'pEj07ZbbBZ U1kqgh4w4p pre2zmeN6K QTi31t9pre ol0MP1jzJR '.
111
            'ML5IjmbRol ulANi1NsGY J7zRLJqPul liA8ctl16T K9nhooUHli'));
112
    }
113
114
    function mdwp_add_p($text)
115
    {
116
        if (!preg_match('{^$|^<(p|ul|ol|dl|pre|blockquote)>}i', $text)) {
117
            $text = '<p>'.$text.'</p>';
118
            $text = preg_replace('{\n{2,}}', "</p>\n\n<p>", $text);
119
        }
120
121
        return $text;
122
    }
123
124
    function mdwp_strip_p($t)
125
    {
126
        return preg_replace('{</?p>}i', '', $t);
127
    }
128
129
    function mdwp_hide_tags($text)
130
    {
131
        global $mdwp_hidden_tags, $mdwp_placeholders;
132
133
        return str_replace($mdwp_hidden_tags, $mdwp_placeholders, $text);
134
    }
135
136
    function mdwp_show_tags($text)
137
    {
138
        global $mdwp_hidden_tags, $mdwp_placeholders;
139
140
        return str_replace($mdwp_placeholders, $mdwp_hidden_tags, $text);
141
    }
142
}

//## bBlog Plugin Info ###

function identify_modifier_markdown()
{
    return [
        'name'        => 'markdown',
        'type'        => 'modifier',
        'nicename'    => 'Markdown',
        'description' => 'A text-to-HTML conversion tool for web writers',
        'authors'     => 'Michel Fortin and John Gruber',
        'licence'     => 'BSD-like',
        'version'     => MARKDOWN_VERSION,
        'help'        => '<a href="http://daringfireball.net/projects/markdown/syntax">Markdown syntax</a> allows you to write using an easy-to-read, easy-to-write plain text format. Based on the original Perl version by <a href="http://daringfireball.net/">John Gruber</a>. <a href="http://michelf.ca/projects/php-markdown/">More...</a>',
    ];
}

//## Smarty Modifier Interface ###

function smarty_modifier_markdown($text)
{
    return Markdown($text);
}

//## Textile Compatibility Mode ###

// Rename this file to "classTextile.php" and it can replace Textile everywhere.

if (strcasecmp(substr(__FILE__, -16), 'classTextile.php') == 0) {
    // Try to include PHP SmartyPants. Should be in the same directory.
    @include_once 'smartypants.php';
Security Best Practice
It seems like you do not handle an error condition here. This can introduce security issues, and is generally not recommended.
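
For the suppressed `@include_once` flagged here, one way to make the condition explicit is to test for the file before including it. This is only a sketch: `include_optional` is a hypothetical helper, and it assumes the optional file sits next to the including script rather than somewhere on the include path.

```php
<?php

// Hypothetical helper (not part of PHP Markdown): include an optional file
// only when it exists and is readable, and report whether it was loaded.
function include_optional($path)
{
    if (!is_file($path) || !is_readable($path)) {
        return false;
    }

    include_once $path;

    return true;
}

// Assumes smartypants.php, if present, sits in the same directory.
$loaded = include_optional(__DIR__.'/smartypants.php');
```

Unlike a bare `include_once 'smartypants.php'`, this does not search the configured include path, so it is a narrower lookup than the original line performs.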

    // Fake Textile class. It calls Markdown instead.
    class Textile
    {
        public function TextileThis($text, $lite = '', $encode = '')
        {
            if ($lite == '' && $encode == '') {
                $text = Markdown($text);
            }
            if (function_exists('SmartyPants')) {
                $text = SmartyPants($text);
            }

            return $text;
        }

        // Fake restricted version: restrictions are not supported for now.
        public function TextileRestricted($text, $lite = '', $noimage = '')
Unused Code
The parameter $noimage is not used and could be removed.

This check looks for parameters that have been defined for a function or method, but which are not used in the method body.
        {
            return $this->TextileThis($text, $lite);
        }

        // Workaround to ensure compatibility with TextPattern 4.0.3.
        public function blockLite($text)
        {
            return $text;
        }
    }
}

//
// Markdown Parser Class
//

class Markdown_Parser
{
    //## Configuration Variables ###

    // Change to ">" for HTML output.
    public $empty_element_suffix = MARKDOWN_EMPTY_ELEMENT_SUFFIX;
    public $tab_width = MARKDOWN_TAB_WIDTH;

    // Change to `true` to disallow markup or entities.
    public $no_markup = false;
    public $no_entities = false;

    // Predefined urls and titles for reference links and images.
    public $predef_urls = [];
    public $predef_titles = [];

    //## Parser Implementation ###

    // Regex to match balanced [brackets].
    // Needed to insert a maximum bracket depth while converting to PHP.
    public $nested_brackets_depth = 6;
    public $nested_brackets_re;

    public $nested_url_parenthesis_depth = 4;
    public $nested_url_parenthesis_re;

    // Table of hash values for escaped characters:
    public $escape_chars = '\`*_{}[]()>#+-.!';
    public $escape_chars_re;

    // Declared as __construct(): PHP 8 no longer treats a method named
    // after its class as the constructor.
    public function __construct()
    {
        //
        // Constructor function. Initialize appropriate member variables.
        //
        $this->_initDetab();
        $this->prepareItalicsAndBold();

        $this->nested_brackets_re =
            str_repeat('(?>[^\[\]]+|\[', $this->nested_brackets_depth).
            str_repeat('\])*', $this->nested_brackets_depth);

        $this->nested_url_parenthesis_re =
            str_repeat('(?>[^()\s]+|\(', $this->nested_url_parenthesis_depth).
            str_repeat('(?>\)))*', $this->nested_url_parenthesis_depth);

        $this->escape_chars_re = '['.preg_quote($this->escape_chars).']';

        // Sort document, block, and span gamut in ascending priority order.
        asort($this->document_gamut);
        asort($this->block_gamut);
        asort($this->span_gamut);
    }

    // Internal hashes used during transformation.
    public $urls = [];
    public $titles = [];
    public $html_hashes = [];

    // Status flag to avoid invalid nesting.
    public $in_anchor = false;

    public function setup()
    {
        //
        // Called before the transformation process starts to set up parser
        // states.
        //
        // Clear global hashes.
        $this->urls = $this->predef_urls;
        $this->titles = $this->predef_titles;
        $this->html_hashes = [];

        $this->in_anchor = false;
    }

    public function teardown()
    {
        //
        // Called after the transformation process to clear any variable
        // which may be taking up memory unnecessarily.
        //
        $this->urls = [];
        $this->titles = [];
        $this->html_hashes = [];
    }

    public function transform($text)
    {
        //
        // Main function. Performs some preprocessing on the input text
        // and passes it through the document gamut.
        //
        $this->setup();

        // Remove UTF-8 BOM and marker character in input, if present.
        $text = preg_replace('{^\xEF\xBB\xBF|\x1A}', '', $text);

        // Standardize line endings:
        //   DOS to Unix and Mac to Unix
        $text = preg_replace('{\r\n?}', "\n", $text);

        // Make sure $text ends with a couple of newlines:
        $text .= "\n\n";

        // Convert all tabs to spaces.
        $text = $this->detab($text);

        // Turn block-level HTML blocks into hash entries
        $text = $this->hashHTMLBlocks($text);

        // Strip any lines consisting only of spaces and tabs.
        // This makes subsequent regexen easier to write, because we can
        // match consecutive blank lines with /\n+/ instead of something
        // contorted like /[ ]*\n+/ .
        $text = preg_replace('/^[ ]+$/m', '', $text);

        // Run document gamut methods.
        foreach ($this->document_gamut as $method => $priority) {
            $text = $this->$method($text);
        }

        $this->teardown();

        return $text."\n";
    }

    public $document_gamut = [
        // Strip link definitions, store in hashes.
        'stripLinkDefinitions' => 20,
        'runBasicBlockGamut'   => 30,
    ];

    public function stripLinkDefinitions($text)
    {
        //
        // Strips link definitions from text, stores the URLs and titles in
        // hash references.
        //
        $less_than_tab = $this->tab_width - 1;

        // Link defs are in the form: ^[id]: url "optional title"
        $text = preg_replace_callback('{
							^[ ]{0,'.$less_than_tab.'}\[(.+)\][ ]?:	# id = $1
							  [ ]*
							  \n?				# maybe *one* newline
							  [ ]*
							(?:
							  <(.+?)>			# url = $2
							|
							  (\S+?)			# url = $3
							)
							  [ ]*
							  \n?				# maybe one newline
							  [ ]*
							(?:
								(?<=\s)			# lookbehind for whitespace
								["(]
								(.*?)			# title = $4
								[")]
								[ ]*
							)?	# title is optional
							(?:\n+|\Z)
			}xm',
            [&$this, '_stripLinkDefinitions_callback'],
            $text);

        return $text;
    }

    public function _stripLinkDefinitions_callback($matches)
    {
        $link_id = strtolower($matches[1]);
        $url = $matches[2] == '' ? $matches[3] : $matches[2];
        $this->urls[$link_id] = $url;
        $this->titles[$link_id] = &$matches[4];

        return ''; // String that will replace the block
    }

    public function hashHTMLBlocks($text)
    {
        if ($this->no_markup) {
            return $text;
        }

        $less_than_tab = $this->tab_width - 1;

        // Hashify HTML blocks:
        // We only want to do this for block-level HTML tags, such as headers,
        // lists, and tables. That's because we still want to wrap <p>s around
        // "paragraphs" that are wrapped in non-block-level tags, such as anchors,
        // phrase emphasis, and spans. The list of tags we're looking for is
        // hard-coded:
        //
        // *  List "a" is made of tags which can be both inline or block-level.
        //    These will be treated block-level when the start tag is alone on
        //    its line, otherwise they're not matched here and will be taken as
        //    inline later.
        // *  List "b" is made of tags which are always block-level;
        //
        $block_tags_a_re = 'ins|del';
        $block_tags_b_re = 'p|div|h[1-6]|blockquote|pre|table|dl|ol|ul|address|'.
            'script|noscript|form|fieldset|iframe|math|svg|'.
            'article|section|nav|aside|hgroup|header|footer|'.
            'figure';

        // Regular expression for the content of a block tag.
        $nested_tags_level = 4;
        $attr = '
			(?>				# optional tag attributes
			  \s			# starts with whitespace
			  (?>
				[^>"/]+		# text outside quotes
			  |
				/+(?!>)		# slash not followed by ">"
			  |
				"[^"]*"		# text inside double quotes (tolerate ">")
			  |
				\'[^\']*\'	# text inside single quotes (tolerate ">")
			  )*
			)?
			';
        $content =
            str_repeat('
				(?>
				  [^<]+			# content without tag
				|
				  <\2			# nested opening tag
					'.$attr.'	# attributes
					(?>
					  />
					|
					  >', $nested_tags_level).    // end of opening tag
            '.*?'.                    // last level nested tag content
            str_repeat('
					  </\2\s*>	# closing nested tag
					)
				  |
					<(?!/\2\s*>	# other tags with a different name
				  )
				)*',
                $nested_tags_level);
        $content2 = str_replace('\2', '\3', $content);
453
        // First, look for nested blocks, e.g.:
454
        // 	<div>
455
        // 		<div>
456
        // 		tags for inner block must be indented.
457
        // 		</div>
458
        // 	</div>
459
        //
460
        // The outermost tags must start at the left margin for this to match, and
461
        // the inner nested divs must be indented.
462
        // We need to do this before the next, more liberal match, because the next
463
        // match will start at the first `<div>` and stop at the first `</div>`.
464
        $text = preg_replace_callback('{(?>
465
			(?>
466
				(?<=\n\n)		# Starting after a blank line
467
				|				# or
468
				\A\n?			# the beginning of the doc
469
			)
470
			(						# save in $1
471
472
			  # Match from `\n<tag>` to `</tag>\n`, handling nested tags
473
			  # in between.
474
475
						[ ]{0,'.$less_than_tab.'}
476
						<('.$block_tags_b_re.')# start tag = $2
477
						'.$attr.'>			# attributes followed by > and \n
478
						'.$content.'		# content, support nesting
479
						</\2>				# the matching end tag
480
						[ ]*				# trailing spaces/tabs
481
						(?=\n+|\Z)	# followed by a newline or end of document
482
483
			| # Special version for tags of group a.
484
485
						[ ]{0,'.$less_than_tab.'}
486
						<('.$block_tags_a_re.')# start tag = $3
487
						'.$attr.'>[ ]*\n	# attributes followed by >
488
						'.$content2.'		# content, support nesting
489
						</\3>				# the matching end tag
490
						[ ]*				# trailing spaces/tabs
491
						(?=\n+|\Z)	# followed by a newline or end of document
492
493
			| # Special case just for <hr />. It was easier to make a special
494
			  # case than to make the other regex more complicated.
495
496
						[ ]{0,'.$less_than_tab.'}
497
						<(hr)				# start tag = $2
498
						'.$attr.'			# attributes
499
						/?>					# the matching end tag
500
						[ ]*
501
						(?=\n{2,}|\Z)		# followed by a blank line or end of document
502
503
			| # Special case for standalone HTML comments:
504
505
					[ ]{0,'.$less_than_tab.'}
506
					(?s:
507
						<!-- .*? -->
508
					)
509
					[ ]*
510
					(?=\n{2,}|\Z)		# followed by a blank line or end of document
511
512
			| # PHP and ASP-style processor instructions (<? and <%)
513
514
					[ ]{0,'.$less_than_tab.'}
515
					(?s:
516
						<([?%])			# $2
517
						.*?
518
						\2>
519
					)
520
					[ ]*
521
					(?=\n{2,}|\Z)		# followed by a blank line or end of document
522
523
			)
524
			)}Sxmi',
525
            [&$this, '_hashHTMLBlocks_callback'],
526
            $text);
527
528
        return $text;
529
    }

    public function _hashHTMLBlocks_callback($matches)
    {
        $text = $matches[1];
        $key = $this->hashBlock($text);

        return "\n\n$key\n\n";
    }

    public function hashPart($text, $boundary = 'X')
    {
        //
        // Called whenever a tag must be hashed when a function inserts an atomic
        // element in the text stream. Passing $text through this function gives
        // a unique text-token which will be reverted back when calling unhash.
        //
        // The $boundary argument specifies what character should be used to surround
        // the token. By convention, "B" is used for block elements that need not
        // be wrapped into paragraph tags at the end, ":" is used for elements
        // that are word separators and "X" is used in the general case.
        //
        // Swap back any tag hash found in $text so we do not have to `unhash`
        // multiple times at the end.
        $text = $this->unhash($text);

        // Then hash the block.
        static $i = 0;
        $key = "$boundary\x1A".++$i.$boundary;
        $this->html_hashes[$key] = $text;

        return $key; // String that will replace the tag.
    }

    public function hashBlock($text)
    {
        //
        // Shortcut function for hashPart with block-level boundaries.
        //
        return $this->hashPart($text, 'B');
    }

    public $block_gamut = [
        //
        // These are all the transformations that form block-level
        // tags like paragraphs, headers, and list items.
        //
        'doHeaders'         => 10,
        'doHorizontalRules' => 20,
        'doLists'           => 40,
        'doCodeBlocks'      => 50,
        'doBlockQuotes'     => 60,
    ];
583
    public function runBlockGamut($text)
584
    {
585
        //
586
        // Run block gamut tranformations.
587
        //
588
        // We need to escape raw HTML in Markdown source before doing anything
589
        // else. This need to be done for each block, and not only at the
590
        // begining in the Markdown function since hashed blocks can be part of
591
        // list items and could have been indented. Indented blocks would have
592
        // been seen as a code block in a previous pass of hashHTMLBlocks.
593
        $text = $this->hashHTMLBlocks($text);
594
595
        return $this->runBasicBlockGamut($text);
596
    }
597
598
    public function runBasicBlockGamut($text)
599
    {
600
        //
601
        // Run block gamut tranformations, without hashing HTML blocks. This is
602
        // useful when HTML blocks are known to be already hashed, like in the first
603
        // whole-document pass.
604
        //
605
        foreach ($this->block_gamut as $method => $priority) {
606
            $text = $this->$method($text);
607
        }
608
609
        // Finally form paragraph and restore hashed blocks.
610
        $text = $this->formParagraphs($text);
611
612
        return $text;
613
    }
614
615
    public function doHorizontalRules($text)
616
    {
617
        // Do Horizontal Rules:
618
        return preg_replace(
619
            '{
620
				^[ ]{0,3}	# Leading space
621
				([-*_])		# $1: First marker
622
				(?>			# Repeated marker group
623
					[ ]{0,2}	# Zero, one, or two spaces.
624
					\1			# Marker character
625
				){2,}		# Group repeated at least twice
626
				[ ]*		# Tailing spaces
627
				$			# End of line.
628
			}mx',
629
            "\n".$this->hashBlock("<hr$this->empty_element_suffix")."\n",
630
            $text);
631
    }
632
633
    public $span_gamut = [
634
        //
635
        // These are all the transformations that occur *within* block-level
636
        // tags like paragraphs, headers, and list items.
637
        //
638
        // Process character escapes, code spans, and inline HTML
639
        // in one shot.
640
        'parseSpan'           => -30,
641
        // Process anchor and image tags. Images must come first,
642
        // because ![foo][f] looks like an anchor.
643
        'doImages'            => 10,
644
        'doAnchors'           => 20,
645
        // Make links out of things like `<http://example.com/>`
646
        // Must come after doAnchors, because you can use < and >
647
        // delimiters in inline links like [this](<url>).
648
        'doAutoLinks'         => 30,
649
        'encodeAmpsAndAngles' => 40,
650
        'doItalicsAndBold'    => 50,
651
        'doHardBreaks'        => 60,
652
    ];
653
654
    public function runSpanGamut($text)
655
    {
656
        //
657
        // Run span gamut tranformations.
658
        //
659
        foreach ($this->span_gamut as $method => $priority) {
660
            $text = $this->$method($text);
661
        }
662
663
        return $text;
664
    }
665
666
    public function doHardBreaks($text)
667
    {
668
        // Do hard breaks:
669
        return preg_replace_callback('/ {2,}\n/',
670
            [&$this, '_doHardBreaks_callback'], $text);
671
    }
672
673
    public function _doHardBreaks_callback($matches)
Unused Code
The parameter $matches is not used and could be removed.

This check looks for parameters that have been defined for a function or method, but which are not used in the method body.
    {
        return $this->hashPart("<br$this->empty_element_suffix\n");
    }

    public function doAnchors($text)
    {
        //
        // Turn Markdown link shortcuts into XHTML <a> tags.
        //
        if ($this->in_anchor) {
            return $text;
        }
        $this->in_anchor = true;

        //
        // First, handle reference-style links: [link text] [id]
        //
        $text = preg_replace_callback('{
			(					# wrap whole match in $1
			  \[
				('.$this->nested_brackets_re.')	# link text = $2
			  \]

			  [ ]?				# one optional space
			  (?:\n[ ]*)?		# one optional newline followed by spaces

			  \[
				(.*?)		# id = $3
			  \]
			)
			}xs',
            [&$this, '_doAnchors_reference_callback'], $text);

        //
        // Next, inline-style links: [link text](url "optional title")
        //
        $text = preg_replace_callback('{
			(				# wrap whole match in $1
			  \[
				('.$this->nested_brackets_re.')	# link text = $2
			  \]
			  \(			# literal paren
				[ \n]*
				(?:
					<(.+?)>	# href = $3
				|
					('.$this->nested_url_parenthesis_re.')	# href = $4
				)
				[ \n]*
				(			# $5
				  ([\'"])	# quote char = $6
				  (.*?)		# Title = $7
				  \6		# matching quote
				  [ \n]*	# ignore any spaces/tabs between closing quote and )
				)?			# title is optional
			  \)
			)
			}xs',
            [&$this, '_doAnchors_inline_callback'], $text);

        //
        // Last, handle reference-style shortcuts: [link text]
        // These must come last in case you've also got [link text][1]
        // or [link text](/foo)
        //
        $text = preg_replace_callback('{
			(					# wrap whole match in $1
			  \[
				([^\[\]]+)		# link text = $2; can\'t contain [ or ]
			  \]
			)
			}xs',
            [&$this, '_doAnchors_reference_callback'], $text);

        $this->in_anchor = false;

        return $text;
    }

    public function _doAnchors_reference_callback($matches)
    {
        $whole_match = $matches[1];
        $link_text = $matches[2];
        $link_id = &$matches[3];

        if ($link_id == '') {
            // for shortcut links like [this][] or [this].
            $link_id = $link_text;
        }

        // lower-case and turn embedded newlines into spaces
        $link_id = strtolower($link_id);
        $link_id = preg_replace('{[ ]?\n}', ' ', $link_id);

        if (isset($this->urls[$link_id])) {
            $url = $this->urls[$link_id];
            $url = $this->encodeAttribute($url);

            $result = "<a href=\"$url\"";
            if (isset($this->titles[$link_id])) {
Duplication
This code seems to be duplicated across your project.

Duplicated code is one of the most pungent code smells. If you need to duplicate the same code in three or more different places, we strongly encourage you to look into extracting the code into a single class or operation.

You can also find more detailed suggestions in the “Code” section of your repository.
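
One possible shape for such an extraction, sketched under assumptions: `build_anchor_tag` is a hypothetical standalone function, and `htmlspecialchars()` stands in for the parser's own `encodeAttribute()` method, whose body is not shown in this section.

```php
<?php

// Hypothetical helper (not part of PHP Markdown): assembles the <a> tag that
// the reference and inline callbacks each build by hand.
// htmlspecialchars() is a stand-in for the parser's encodeAttribute().
function build_anchor_tag($url, $title, $link_html)
{
    $result = '<a href="'.htmlspecialchars($url, ENT_QUOTES, 'UTF-8').'"';
    if ($title !== null) {
        $result .= ' title="'.htmlspecialchars($title, ENT_QUOTES, 'UTF-8').'"';
    }

    // $link_html is expected to be already-rendered span content.
    return $result.'>'.$link_html.'</a>';
}

echo build_anchor_tag('/a?x=1&y=2', 'A "title"', '<em>text</em>'), "\n";
```

Both callbacks could then pass their URL, optional title, and rendered link text to the one function and hash the result, instead of repeating the string concatenation.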
                $title = $this->titles[$link_id];
                $title = $this->encodeAttribute($title);
                $result .= " title=\"$title\"";
            }

            $link_text = $this->runSpanGamut($link_text);
            $result .= ">$link_text</a>";
            $result = $this->hashPart($result);
        } else {
            $result = $whole_match;
        }

        return $result;
    }

    public function _doAnchors_inline_callback($matches)
Duplication
This method seems to be duplicated in your project.
    {
        $whole_match = $matches[1];
Unused Code
$whole_match is not used, you could remove the assignment.

This check looks for variable assignments that are either overwritten by other assignments or where the variable is not used subsequently.

$myVar = 'Value';
$higher = false;

if (rand(1, 6) > 3) {
    $higher = true;
} else {
    $higher = false;
}

Both the $myVar assignment in line 1 and the $higher assignment in line 2 are dead. The first because $myVar is never used and the second because $higher is overwritten on every possible execution path.
792
        $link_text = $this->runSpanGamut($matches[2]);
793
        $url = $matches[3] == '' ? $matches[4] : $matches[3];
794
        $title = &$matches[7];
795
796
        $url = $this->encodeAttribute($url);
797
798
        $result = "<a href=\"$url\"";
799
        if (isset($title)) {
800
            $title = $this->encodeAttribute($title);
801
            $result .= " title=\"$title\"";
802
        }
803
804
        $link_text = $this->runSpanGamut($link_text);
805
        $result .= ">$link_text</a>";
806
807
        return $this->hashPart($result);
808
    }
809
810
    public function doImages($text)
811
    {
812
        //
813
        // Turn Markdown image shortcuts into <img> tags.
814
        //
815
        //
816
        // First, handle reference-style labeled images: ![alt text][id]
817
        //
818
        $text = preg_replace_callback('{
819
			(				# wrap whole match in $1
820
			  !\[
821
				('.$this->nested_brackets_re.')		# alt text = $2
822
			  \]
823
824
			  [ ]?				# one optional space
825
			  (?:\n[ ]*)?		# one optional newline followed by spaces
826
827
			  \[
828
				(.*?)		# id = $3
829
			  \]
830
831
			)
832
			}xs',
833
            [&$this, '_doImages_reference_callback'], $text);
834
835
        //
836
        // Next, handle inline images:  ![alt text](url "optional title")
837
        // Don't forget: encode * and _
838
        //
839
        $text = preg_replace_callback('{
840
			(				# wrap whole match in $1
841
			  !\[
842
				('.$this->nested_brackets_re.')		# alt text = $2
843
			  \]
844
			  \s?			# One optional whitespace character
845
			  \(			# literal paren
846
				[ \n]*
847
				(?:
848
					<(\S*)>	# src url = $3
849
				|
850
					('.$this->nested_url_parenthesis_re.')	# src url = $4
851
				)
852
				[ \n]*
853
				(			# $5
854
				  ([\'"])	# quote char = $6
855
				  (.*?)		# title = $7
856
				  \6		# matching quote
857
				  [ \n]*
858
				)?			# title is optional
859
			  \)
860
			)
861
			}xs',
862
            [&$this, '_doImages_inline_callback'], $text);
863
864
        return $text;
865
    }
866
867
    public function _doImages_reference_callback($matches)
868
    {
869
        $whole_match = $matches[1];
870
        $alt_text = $matches[2];
871
        $link_id = strtolower($matches[3]);
872
873
        if ($link_id == '') {
874
            $link_id = strtolower($alt_text); // for shortcut links like ![this][].
875
        }
876
877
        $alt_text = $this->encodeAttribute($alt_text);
878
        if (isset($this->urls[$link_id])) {
879
            $url = $this->encodeAttribute($this->urls[$link_id]);
880
            $result = "<img src=\"$url\" alt=\"$alt_text\"";
881
            if (isset($this->titles[$link_id])) {
Duplication introduced by
This code seems to be duplicated across your project.
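One possible way to remove this duplication, sketched under assumptions: the optional title attribute could be built by a small helper shared by the anchor and image callbacks. The name `titleAttribute` and the `$encode` callable are hypothetical; `$encode` stands in for `$this->encodeAttribute()` and uses `htmlspecialchars` here only so the sketch runs on its own.

```php
<?php
// Hypothetical helper, not part of PHP Markdown: returns the optional
// title attribute, or an empty string when no title is set.
function titleAttribute(?string $title, callable $encode): string
{
    if ($title === null) {
        return '';
    }

    return ' title="' . $encode($title) . '"';
}

// Stand-in for $this->encodeAttribute(); an assumption for illustration.
$encode = function (string $text): string {
    return htmlspecialchars($text, ENT_COMPAT);
};

// Usage: append the attribute while assembling the tag.
$result = '<img src="a.png" alt="A"' . titleAttribute('Say "hi"', $encode) . ' />';
```

Inside the class, the same idea would likely be a method that calls `$this->encodeAttribute()` directly, so each callback reduces to a single concatenation.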
882
                $title = $this->titles[$link_id];
883
                $title = $this->encodeAttribute($title);
884
                $result .= " title=\"$title\"";
885
            }
886
            $result .= $this->empty_element_suffix;
887
            $result = $this->hashPart($result);
888
        } else {
889
            // If there's no such link ID, leave intact:
890
            $result = $whole_match;
891
        }
892
893
        return $result;
894
    }
895
896
    public function _doImages_inline_callback($matches)
Duplication introduced by
This method seems to be duplicated in your project.
897
    {
898
        $whole_match = $matches[1];
Unused Code introduced by
$whole_match is not used, you could remove the assignment.
899
        $alt_text = $matches[2];
900
        $url = $matches[3] == '' ? $matches[4] : $matches[3];
901
        $title = &$matches[7];
902
903
        $alt_text = $this->encodeAttribute($alt_text);
904
        $url = $this->encodeAttribute($url);
905
        $result = "<img src=\"$url\" alt=\"$alt_text\"";
906
        if (isset($title)) {
907
            $title = $this->encodeAttribute($title);
908
            $result .= " title=\"$title\""; // $title already quoted
909
        }
910
        $result .= $this->empty_element_suffix;
911
912
        return $this->hashPart($result);
913
    }
914
915
    public function doHeaders($text)
916
    {
917
        // Setext-style headers:
918
        //	  Header 1
919
        //	  ========
920
        //
921
        //	  Header 2
922
        //	  --------
923
        //
924
        $text = preg_replace_callback('{ ^(.+?)[ ]*\n(=+|-+)[ ]*\n+ }mx',
925
            [&$this, '_doHeaders_callback_setext'], $text);
926
927
        // atx-style headers:
928
        //	# Header 1
929
        //	## Header 2
930
        //	## Header 2 with closing hashes ##
931
        //	...
932
        //	###### Header 6
933
        //
934
        $text = preg_replace_callback('{
935
				^(\#{1,6})	# $1 = string of #\'s
936
				[ ]*
937
				(.+?)		# $2 = Header text
938
				[ ]*
939
				\#*			# optional closing #\'s (not counted)
940
				\n+
941
			}xm',
942
            [&$this, '_doHeaders_callback_atx'], $text);
943
944
        return $text;
945
    }
946
947
    public function _doHeaders_callback_setext($matches)
948
    {
949
        // Terrible hack to check we haven't found an empty list item.
950
        if ($matches[2] == '-' && preg_match('{^-(?: |$)}', $matches[1])) {
951
            return $matches[0];
952
        }
953
954
        $level = $matches[2][0] == '=' ? 1 : 2;
955
        $block = "<h$level>".$this->runSpanGamut($matches[1])."</h$level>";
956
957
        return "\n".$this->hashBlock($block)."\n\n";
958
    }
959
960
    public function _doHeaders_callback_atx($matches)
961
    {
962
        $level = strlen($matches[1]);
963
        $block = "<h$level>".$this->runSpanGamut($matches[2])."</h$level>";
964
965
        return "\n".$this->hashBlock($block)."\n\n";
966
    }
967
968
    public function doLists($text)
969
    {
970
        //
971
        // Form HTML ordered (numbered) and unordered (bulleted) lists.
972
        //
973
        $less_than_tab = $this->tab_width - 1;
974
975
        // Re-usable patterns to match list item bullets and number markers:
976
        $marker_ul_re = '[*+-]';
977
        $marker_ol_re = '\d+[\.]';
978
        $marker_any_re = "(?:$marker_ul_re|$marker_ol_re)";
Unused Code introduced by
$marker_any_re is not used, you could remove the assignment.
979
980
        $markers_relist = [
981
            $marker_ul_re => $marker_ol_re,
982
            $marker_ol_re => $marker_ul_re,
983
        ];
984
985
        foreach ($markers_relist as $marker_re => $other_marker_re) {
986
            // Re-usable pattern to match any entire ul or ol list:
987
            $whole_list_re = '
988
				(								# $1 = whole list
989
				  (								# $2
990
					([ ]{0,'.$less_than_tab.'})	# $3 = number of spaces
991
					('.$marker_re.')			# $4 = first list item marker
992
					[ ]+
993
				  )
994
				  (?s:.+?)
995
				  (								# $5
996
					  \z
997
					|
998
					  \n{2,}
999
					  (?=\S)
1000
					  (?!						# Negative lookahead for another list item marker
1001
						[ ]*
1002
						'.$marker_re.'[ ]+
1003
					  )
1004
					|
1005
					  (?=						# Lookahead for another kind of list
1006
					    \n
1007
						\3						# Must have the same indentation
1008
						'.$other_marker_re.'[ ]+
1009
					  )
1010
				  )
1011
				)
1012
			'; // mx
1013
1014
            // We use a different prefix before nested lists than top-level lists.
1015
            // See extended comment in processListItems().
1016
1017
            if ($this->list_level) {
1018
                $text = preg_replace_callback('{
1019
						^
1020
						'.$whole_list_re.'
1021
					}mx',
1022
                    [&$this, '_doLists_callback'], $text);
1023
            } else {
1024
                $text = preg_replace_callback('{
1025
						(?:(?<=\n)\n|\A\n?) # Must eat the newline
1026
						'.$whole_list_re.'
1027
					}mx',
1028
                    [&$this, '_doLists_callback'], $text);
1029
            }
1030
        }
1031
1032
        return $text;
1033
    }
1034
1035
    public function _doLists_callback($matches)
1036
    {
1037
        // Re-usable patterns to match list item bullets and number markers:
1038
        $marker_ul_re = '[*+-]';
1039
        $marker_ol_re = '\d+[\.]';
1040
        $marker_any_re = "(?:$marker_ul_re|$marker_ol_re)";
Unused Code introduced by
$marker_any_re is not used, you could remove the assignment.
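A minimal sketch of how the dead assignment could be avoided, assuming the marker choice is made once after the list type is known. The free function `listTypeAndMarker` is hypothetical and only mirrors the selection logic of `_doLists_callback()`; it is not part of the class.

```php
<?php
// Hypothetical function, for illustration only: picks the list type and
// the matching marker pattern without an assignment that is later overwritten.
function listTypeAndMarker(string $firstMarker): array
{
    $marker_ul_re = '[*+-]';
    $marker_ol_re = '\d+[\.]';

    // A bullet character means an unordered list; anything else is ordered.
    $list_type = preg_match("/$marker_ul_re/", $firstMarker) ? 'ul' : 'ol';
    $marker_re = $list_type === 'ul' ? $marker_ul_re : $marker_ol_re;

    return [$list_type, $marker_re];
}

[$type, $marker] = listTypeAndMarker('*');
```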
1041
1042
        $list = $matches[1];
1043
        $list_type = preg_match("/$marker_ul_re/", $matches[4]) ? 'ul' : 'ol';
1044
1045
        $marker_any_re = ($list_type == 'ul' ? $marker_ul_re : $marker_ol_re);
1046
1047
        $list .= "\n";
1048
        $result = $this->processListItems($list, $marker_any_re);
1049
1050
        $result = $this->hashBlock("<$list_type>\n".$result."</$list_type>");
1051
1052
        return "\n".$result."\n\n";
1053
    }
1054
1055
    public $list_level = 0;
1056
1057
    public function processListItems($list_str, $marker_any_re)
1058
    {
1059
        //
1060
        //	Process the contents of a single ordered or unordered list, splitting it
1061
        //	into individual list items.
1062
        //
1063
        // The $this->list_level global keeps track of when we're inside a list.
1064
        // Each time we enter a list, we increment it; when we leave a list,
1065
        // we decrement. If it's zero, we're not in a list anymore.
1066
        //
1067
        // We do this because when we're not inside a list, we want to treat
1068
        // something like this:
1069
        //
1070
        //		I recommend upgrading to version
1071
        //		8. Oops, now this line is treated
1072
        //		as a sub-list.
1073
        //
1074
        // As a single paragraph, despite the fact that the second line starts
1075
        // with a digit-period-space sequence.
1076
        //
1077
        // Whereas when we're inside a list (or sub-list), that line will be
1078
        // treated as the start of a sub-list. What a kludge, huh? This is
1079
        // an aspect of Markdown's syntax that's hard to parse perfectly
1080
        // without resorting to mind-reading. Perhaps the solution is to
1081
        // change the syntax rules such that sub-lists must start with a
1082
        // starting cardinal number; e.g. "1." or "a.".
1083
1084
        $this->list_level++;
1085
1086
        // trim trailing blank lines:
1087
        $list_str = preg_replace("/\n{2,}\\z/", "\n", $list_str);
1088
1089
        $list_str = preg_replace_callback('{
1090
			(\n)?							# leading line = $1
1091
			(^[ ]*)							# leading whitespace = $2
1092
			('.$marker_any_re.'				# list marker and space = $3
1093
				(?:[ ]+|(?=\n))	# space only required if item is not empty
1094
			)
1095
			((?s:.*?))						# list item text   = $4
1096
			(?:(\n+(?=\n))|\n)				# tailing blank line = $5
1097
			(?= \n* (\z | \2 ('.$marker_any_re.') (?:[ ]+|(?=\n))))
1098
			}xm',
1099
            [&$this, '_processListItems_callback'], $list_str);
1100
1101
        $this->list_level--;
1102
1103
        return $list_str;
1104
    }
1105
1106
    public function _processListItems_callback($matches)
1107
    {
1108
        $item = $matches[4];
1109
        $leading_line = &$matches[1];
1110
        $leading_space = &$matches[2];
1111
        $marker_space = $matches[3];
1112
        $tailing_blank_line = &$matches[5];
1113
1114
        if ($leading_line || $tailing_blank_line ||
1115
            preg_match('/\n{2,}/', $item)
1116
        ) {
1117
            // Replace marker with the appropriate whitespace indentation
1118
            $item = $leading_space.str_repeat(' ', strlen($marker_space)).$item;
1119
            $item = $this->runBlockGamut($this->outdent($item)."\n");
1120
        } else {
1121
            // Recursion for sub-lists:
1122
            $item = $this->doLists($this->outdent($item));
1123
            $item = preg_replace('/\n+$/', '', $item);
1124
            $item = $this->runSpanGamut($item);
1125
        }
1126
1127
        return '<li>'.$item."</li>\n";
1128
    }
1129
1130
    public function doCodeBlocks($text)
1131
    {
1132
        //
1133
        //	Process Markdown `<pre><code>` blocks.
1134
        //
1135
        $text = preg_replace_callback('{
1136
				(?:\n\n|\A\n?)
1137
				(	            # $1 = the code block -- one or more lines, starting with a space/tab
1138
				  (?>
1139
					[ ]{'.$this->tab_width.'}  # Lines must start with a tab or a tab-width of spaces
1140
					.*\n+
1141
				  )+
1142
				)
1143
				((?=^[ ]{0,'.$this->tab_width.'}\S)|\Z)	# Lookahead for non-space at line-start, or end of doc
1144
			}xm',
1145
            [&$this, '_doCodeBlocks_callback'], $text);
1146
1147
        return $text;
1148
    }
1149
1150
    public function _doCodeBlocks_callback($matches)
1151
    {
1152
        $codeblock = $matches[1];
1153
1154
        $codeblock = $this->outdent($codeblock);
1155
        $codeblock = htmlspecialchars($codeblock, ENT_NOQUOTES);
1156
1157
        // trim leading newlines and trailing newlines
1158
        $codeblock = preg_replace('/\A\n+|\n+\z/', '', $codeblock);
1159
1160
        $codeblock = "<pre><code>$codeblock\n</code></pre>";
1161
1162
        return "\n\n".$this->hashBlock($codeblock)."\n\n";
1163
    }
1164
1165
    public function makeCodeSpan($code)
1166
    {
1167
        //
1168
        // Create a code span markup for $code. Called from handleSpanToken.
1169
        //
1170
        $code = htmlspecialchars(trim($code), ENT_NOQUOTES);
1171
1172
        return $this->hashPart("<code>$code</code>");
1173
    }
1174
1175
    public $em_relist = [
1176
        ''  => '(?:(?<!\*)\*(?!\*)|(?<!_)_(?!_))(?=\S|$)(?![\.,:;]\s)',
1177
        '*' => '(?<=\S|^)(?<!\*)\*(?!\*)',
1178
        '_' => '(?<=\S|^)(?<!_)_(?!_)',
1179
    ];
1180
    public $strong_relist = [
1181
        ''   => '(?:(?<!\*)\*\*(?!\*)|(?<!_)__(?!_))(?=\S|$)(?![\.,:;]\s)',
1182
        '**' => '(?<=\S|^)(?<!\*)\*\*(?!\*)',
1183
        '__' => '(?<=\S|^)(?<!_)__(?!_)',
1184
    ];
1185
    public $em_strong_relist = [
1186
        ''    => '(?:(?<!\*)\*\*\*(?!\*)|(?<!_)___(?!_))(?=\S|$)(?![\.,:;]\s)',
1187
        '***' => '(?<=\S|^)(?<!\*)\*\*\*(?!\*)',
1188
        '___' => '(?<=\S|^)(?<!_)___(?!_)',
1189
    ];
1190
    public $em_strong_prepared_relist;
1191
1192
    public function prepareItalicsAndBold()
1193
    {
1194
        //
1195
        // Prepare regular expressions for searching emphasis tokens in any
1196
        // context.
1197
        //
1198
        foreach ($this->em_relist as $em => $em_re) {
1199
            foreach ($this->strong_relist as $strong => $strong_re) {
1200
                // Construct list of allowed token expressions.
1201
                $token_relist = [];
1202
                if (isset($this->em_strong_relist["$em$strong"])) {
1203
                    $token_relist[] = $this->em_strong_relist["$em$strong"];
1204
                }
1205
                $token_relist[] = $em_re;
1206
                $token_relist[] = $strong_re;
1207
1208
                // Construct master expression from list.
1209
                $token_re = '{('.implode('|', $token_relist).')}';
1210
                $this->em_strong_prepared_relist["$em$strong"] = $token_re;
1211
            }
1212
        }
1213
    }
1214
1215
    public function doItalicsAndBold($text)
1216
    {
1217
        $token_stack = [''];
1218
        $text_stack = [''];
1219
        $em = '';
1220
        $strong = '';
1221
        $tree_char_em = false;
1222
1223
        while (1) {
1224
            //
1225
            // Get prepared regular expression for searching emphasis tokens
1226
            // in current context.
1227
            //
1228
            $token_re = $this->em_strong_prepared_relist["$em$strong"];
1229
1230
            //
1231
            // Each loop iteration searches for the next emphasis token.
1232
            // Each token is then passed to handleSpanToken.
1233
            //
1234
            $parts = preg_split($token_re, $text, 2, PREG_SPLIT_DELIM_CAPTURE);
1235
            $text_stack[0] .= $parts[0];
1236
            $token = &$parts[1];
1237
            $text = &$parts[2];
1238
1239
            if (empty($token)) {
Duplication introduced by
This code seems to be duplicated across your project.
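The two statements that unwind a dangling marker appear both here and in the two-character branch of the same method. A sketch of a shared helper, under the assumption that both stacks are plain arrays with the innermost entry at index 0; the name `unwindTop` is hypothetical and not part of the original class.

```php
<?php
// Hypothetical helper, for illustration only: folds the top (unclosed)
// token and the text collected after it back into the parent text entry.
// Assumes $textStack holds at least two entries, as the original code does.
function unwindTop(array &$tokenStack, array &$textStack): void
{
    $token = array_shift($tokenStack); // dangling marker, e.g. '*'
    $text = array_shift($textStack);   // text collected after that marker
    $textStack[0] .= $token . $text;   // re-emit both as plain text
}

$tokenStack = ['*', ''];
$textStack = ['bar', 'foo '];
unwindTop($tokenStack, $textStack);
```

With such a helper, the end-of-text loop would call it while `$token_stack[0]` is non-empty, and the strong-closing branch would call it once when the top token has length 1.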
1240
                // Reached end of text span: empty stack without emitting
1241
                // any more emphasis.
1242
                while ($token_stack[0]) {
1243
                    $text_stack[1] .= array_shift($token_stack);
1244
                    $text_stack[0] .= array_shift($text_stack);
1245
                }
1246
                break;
1247
            }
1248
1249
            $token_len = strlen($token);
1250
            if ($tree_char_em) {
1251
                // Reached closing marker while inside a three-char emphasis.
1252
                if ($token_len == 3) {
1253
                    // Three-char closing marker, close em and strong.
1254
                    array_shift($token_stack);
1255
                    $span = array_shift($text_stack);
1256
                    $span = $this->runSpanGamut($span);
1257
                    $span = "<strong><em>$span</em></strong>";
1258
                    $text_stack[0] .= $this->hashPart($span);
1259
                    $em = '';
1260
                    $strong = '';
1261
                } else {
1262
                    // Other closing marker: close one em or strong and
1263
                    // change current token state to match the other
1264
                    $token_stack[0] = str_repeat($token[0], 3 - $token_len);
1265
                    $tag = $token_len == 2 ? 'strong' : 'em';
1266
                    $span = $text_stack[0];
1267
                    $span = $this->runSpanGamut($span);
1268
                    $span = "<$tag>$span</$tag>";
1269
                    $text_stack[0] = $this->hashPart($span);
1270
                    $$tag = ''; // $$tag stands for $em or $strong
1271
                }
1272
                $tree_char_em = false;
1273
            } else {
1274
                if ($token_len == 3) {
1275
                    if ($em) {
1276
                        // Reached closing marker for both em and strong.
1277
                        // Closing strong marker:
1278
                        for ($i = 0; $i < 2; ++$i) {
1279
                            $shifted_token = array_shift($token_stack);
1280
                            $tag = strlen($shifted_token) == 2 ? 'strong' : 'em';
1281
                            $span = array_shift($text_stack);
1282
                            $span = $this->runSpanGamut($span);
1283
                            $span = "<$tag>$span</$tag>";
1284
                            $text_stack[0] .= $this->hashPart($span);
1285
                            $$tag = ''; // $$tag stands for $em or $strong
1286
                        }
1287
                    } else {
1288
                        // Reached opening three-char emphasis marker. Push on token
1289
                        // stack; will be handled by the special condition above.
1290
                        $em = $token[0];
1291
                        $strong = "$em$em";
1292
                        array_unshift($token_stack, $token);
1293
                        array_unshift($text_stack, '');
1294
                        $tree_char_em = true;
1295
                    }
1296
                } else {
1297
                    if ($token_len == 2) {
1298
                        if ($strong) {
1299
                            // Unwind any dangling emphasis marker:
1300
                            if (strlen($token_stack[0]) == 1) {
Duplication introduced by
This code seems to be duplicated across your project.
1301
                                $text_stack[1] .= array_shift($token_stack);
1302
                                $text_stack[0] .= array_shift($text_stack);
1303
                            }
1304
                            // Closing strong marker:
1305
                            array_shift($token_stack);
1306
                            $span = array_shift($text_stack);
1307
                            $span = $this->runSpanGamut($span);
1308
                            $span = "<strong>$span</strong>";
1309
                            $text_stack[0] .= $this->hashPart($span);
1310
                            $strong = '';
1311
                        } else {
1312
                            array_unshift($token_stack, $token);
1313
                            array_unshift($text_stack, '');
1314
                            $strong = $token;
1315
                        }
1316
                    } else {
1317
                        // Here $token_len == 1
1318
                        if ($em) {
1319
                            if (strlen($token_stack[0]) == 1) {
1320
                                // Closing emphasis marker:
1321
                                array_shift($token_stack);
1322
                                $span = array_shift($text_stack);
1323
                                $span = $this->runSpanGamut($span);
1324
                                $span = "<em>$span</em>";
1325
                                $text_stack[0] .= $this->hashPart($span);
1326
                                $em = '';
1327
                            } else {
1328
                                $text_stack[0] .= $token;
1329
                            }
1330
                        } else {
1331
                            array_unshift($token_stack, $token);
1332
                            array_unshift($text_stack, '');
1333
                            $em = $token;
1334
                        }
1335
                    }
1336
                }
1337
            }
1338
        }
1339
1340
        return $text_stack[0];
1341
    }
1342
1343
    public function doBlockQuotes($text)
1344
    {
1345
        $text = preg_replace_callback('/
1346
			  (								# Wrap whole match in $1
1347
				(?>
1348
				  ^[ ]*>[ ]?			# ">" at the start of a line
1349
					.+\n					# rest of the first line
1350
				  (.+\n)*					# subsequent consecutive lines
1351
				  \n*						# blanks
1352
				)+
1353
			  )
1354
			/xm',
1355
            [&$this, '_doBlockQuotes_callback'], $text);
1356
1357
        return $text;
1358
    }
1359
1360
    public function _doBlockQuotes_callback($matches)
1361
    {
1362
        $bq = $matches[1];
1363
        // trim one level of quoting - trim whitespace-only lines
1364
        $bq = preg_replace('/^[ ]*>[ ]?|^[ ]+$/m', '', $bq);
1365
        $bq = $this->runBlockGamut($bq);        // recurse
1366
1367
        $bq = preg_replace('/^/m', '  ', $bq);
1368
        // These leading spaces cause problems with <pre> content,
1369
        // so we need to fix that:
1370
        $bq = preg_replace_callback('{(\s*<pre>.+?</pre>)}sx',
1371
            [&$this, '_doBlockQuotes_callback2'], $bq);
1372
1373
        return "\n".$this->hashBlock("<blockquote>\n$bq\n</blockquote>")."\n\n";
1374
    }
1375
1376
    public function _doBlockQuotes_callback2($matches)
1377
    {
1378
        $pre = $matches[1];
1379
        $pre = preg_replace('/^  /m', '', $pre);
1380
1381
        return $pre;
1382
    }
1383
1384
    public function formParagraphs($text)
1385
    {
1386
        //
1387
        //	Params:
1388
        //		$text - string to process with html <p> tags
1389
        //
1390
        // Strip leading and trailing lines:
1391
        $text = preg_replace('/\A\n+|\n+\z/', '', $text);
1392
1393
        $grafs = preg_split('/\n{2,}/', $text, -1, PREG_SPLIT_NO_EMPTY);
1394
1395
        //
1396
        // Wrap <p> tags and unhashify HTML blocks
1397
        //
1398
        foreach ($grafs as $key => $value) {
1399
            if (!preg_match('/^B\x1A[0-9]+B$/', $value)) {
1400
                // Is a paragraph.
1401
                $value = $this->runSpanGamut($value);
1402
                $value = preg_replace('/^([ ]*)/', '<p>', $value);
1403
                $value .= '</p>';
1404
                $grafs[$key] = $this->unhash($value);
1405
            } else {
1406
                // Is a block.
1407
                // Modify elements of $grafs in-place...
1408
                $graf = $value;
1409
                $block = $this->html_hashes[$graf];
1410
                $graf = $block;
1411
                //				if (preg_match('{
1412
                //					\A
1413
                //					(							# $1 = <div> tag
1414
                //					  <div  \s+
1415
                //					  [^>]*
1416
                //					  \b
1417
                //					  markdown\s*=\s*  ([\'"])	#	$2 = attr quote char
1418
                //					  1
1419
                //					  \2
1420
                //					  [^>]*
1421
                //					  >
1422
                //					)
1423
                //					(							# $3 = contents
1424
                //					.*
1425
                //					)
1426
                //					(</div>)					# $4 = closing tag
1427
                //					\z
1428
                //					}xs', $block, $matches))
1429
                //				{
1430
                //					list(, $div_open, , $div_content, $div_close) = $matches;
1431
                //
1432
                //					# We can't call Markdown(), because that resets the hash;
1433
                //					# that initialization code should be pulled into its own sub, though.
1434
                //					$div_content = $this->hashHTMLBlocks($div_content);
1435
                //
1436
                //					# Run document gamut methods on the content.
1437
                //					foreach ($this->document_gamut as $method => $priority) {
1438
                //						$div_content = $this->$method($div_content);
1439
                //					}
1440
                //
1441
                //					$div_open = preg_replace(
1442
                //						'{\smarkdown\s*=\s*([\'"]).+?\1}', '', $div_open);
1443
                //
1444
                //					$graf = $div_open . "\n" . $div_content . "\n" . $div_close;
1445
                //				}
1446
                $grafs[$key] = $graf;
1447
            }
1448
        }
1449
1450
        return implode("\n\n", $grafs);
1451
    }
1452
1453
    public function encodeAttribute($text)
1454
    {
1455
        //
1456
        // Encode text for a double-quoted HTML attribute. This function
1457
        // is *not* suitable for attributes enclosed in single quotes.
1458
        //
1459
        $text = $this->encodeAmpsAndAngles($text);
1460
        $text = str_replace('"', '&quot;', $text);
1461
1462
        return $text;
1463
    }
1464
1465
    public function encodeAmpsAndAngles($text)
1466
    {
1467
        //
1468
        // Smart processing for ampersands and angle brackets that need to
1469
        // be encoded. Valid character entities are left alone unless the
1470
        // no-entities mode is set.
1471
        //
1472
        if ($this->no_entities) {
1473
            $text = str_replace('&', '&amp;', $text);
1474
        } else {
1475
            // Ampersand-encoding based entirely on Nat Irons's Amputator
1476
            // MT plugin: <http://bumppo.net/projects/amputator/>
1477
            $text = preg_replace('/&(?!#?[xX]?(?:[0-9a-fA-F]+|\w+);)/',
1478
                '&amp;', $text);
1479
        }
1480
        // Encode remaining <'s
1481
        $text = str_replace('<', '&lt;', $text);
1482
1483
        return $text;
1484
    }
1485
1486
    public function doAutoLinks($text)
1487
    {
1488
        $text = preg_replace_callback('{<((https?|ftp|dict):[^\'">\s]+)>}i',
1489
            [&$this, '_doAutoLinks_url_callback'], $text);
1490
1491
        // Email addresses: <address@domain.foo>
1492
        $text = preg_replace_callback('{
1493
			<
1494
			(?:mailto:)?
1495
			(
1496
				(?:
1497
					[-!#$%&\'*+/=?^_`.{|}~\w\x80-\xFF]+
1498
				|
1499
					".*?"
1500
				)
1501
				\@
1502
				(?:
1503
					[-a-z0-9\x80-\xFF]+(\.[-a-z0-9\x80-\xFF]+)*\.[a-z]+
1504
				|
1505
					\[[\d.a-fA-F:]+\]	# IPv4 & IPv6
1506
				)
1507
			)
1508
			>
1509
			}xi',
1510
            [&$this, '_doAutoLinks_email_callback'], $text);
1511
1512
        return $text;
1513
    }
1514
1515
    public function _doAutoLinks_url_callback($matches)
1516
    {
1517
        $url = $this->encodeAttribute($matches[1]);
1518
        $link = "<a href=\"$url\">$url</a>";
1519
1520
        return $this->hashPart($link);
1521
    }
1522
1523
    public function _doAutoLinks_email_callback($matches)
1524
    {
1525
        $address = $matches[1];
1526
        $link = $this->encodeEmailAddress($address);
1527
1528
        return $this->hashPart($link);
1529
    }
1530
1531
    public function encodeEmailAddress($addr)
1532
    {
1533
        //
1534
        //	Input: an email address, e.g. "foo@example.com"
1535
        //
1536
        //	Output: the email address as a mailto link, with each character
1537
        //		of the address encoded as either a decimal or hex entity, in
1538
        //		the hopes of foiling most address harvesting spam bots. E.g.:
1539
        //
1540
        //	  <p><a href="&#109;&#x61;&#105;&#x6c;&#116;&#x6f;&#58;&#x66;o&#111;
1541
        //        &#x40;&#101;&#x78;&#97;&#x6d;&#112;&#x6c;&#101;&#46;&#x63;&#111;
1542
        //        &#x6d;">&#x66;o&#111;&#x40;&#101;&#x78;&#97;&#x6d;&#112;&#x6c;
1543
        //        &#101;&#46;&#x63;&#111;&#x6d;</a></p>
1544
        //
1545
        //	Based by a filter by Matthew Wickline, posted to BBEdit-Talk.
1546
        //   With some optimizations by Milian Wolff.
1547
        //
1548
        $addr = 'mailto:'.$addr;
1549
        $chars = preg_split('/(?<!^)(?!$)/', $addr);
1550
        $seed = (int) abs(crc32($addr) / strlen($addr)); // Deterministic seed.
1551
1552
        foreach ($chars as $key => $char) {
1553
            $ord = ord($char);
1554
            // Ignore non-ascii chars.
1555
            if ($ord < 128) {
1556
                $r = ($seed * (1 + $key)) % 100; // Pseudo-random function.
1557
                // roughly 10% raw, 45% hex, 45% dec
1558
                // '@' *must* be encoded. I insist.
1559
                if ($r > 90 && $char != '@') /* do nothing */ {
1560
1561
                } else {
1562
                    if ($r < 45) {
1563
                        $chars[$key] = '&#x'.dechex($ord).';';
1564
                    } else {
1565
                        $chars[$key] = '&#'.$ord.';';
1566
                    }
1567
                }
1568
            }
1569
        }
1570
1571
        $addr = implode('', $chars);
1572
        $text = implode('', array_slice($chars, 7)); // text without `mailto:`
1573
        $addr = "<a href=\"$addr\">$text</a>";
1574
1575
        return $addr;
1576
    }
1577
    public function parseSpan($str)
    {
        //
        // Take the string $str and parse it into tokens, hashing embedded HTML,
        // escaped characters and handling code spans.
        //
        $output = '';

        $span_re = '{
				(
					\\\\'.$this->escape_chars_re.'
				|
					(?<![`\\\\])
					`+						# code span marker
			'.($this->no_markup ? '' : '
				|
					<!--    .*?     -->		# comment
				|
					<\?.*?\?> | <%.*?%>		# processing instruction
				|
					<[!$]?[-a-zA-Z0-9:_]+	# regular tags
					(?>
						\s
						(?>[^"\'>]+|"[^"]*"|\'[^\']*\')*
					)?
					>
				|
					<[-a-zA-Z0-9:_]+\s*/> # xml-style empty tag
				|
					</[-a-zA-Z0-9:_]+\s*> # closing tag
			').'
				)
				}xs';

        while (1) {
            //
            // Each loop iteration searches for either the next tag, the next
            // opening code span marker, or the next escaped character.
            // Each token is then passed to handleSpanToken.
            //
            $parts = preg_split($span_re, $str, 2, PREG_SPLIT_DELIM_CAPTURE);

            // Create token from text preceding tag.
            if ($parts[0] != '') {
                $output .= $parts[0];
            }

            // Check if we reach the end.
            if (isset($parts[1])) {
                $output .= $this->handleSpanToken($parts[1], $parts[2]);
                $str = $parts[2];
            } else {
                break;
            }
        }

        return $output;
    }

    public function handleSpanToken($token, &$str)
    {
        //
        // Handle $token provided by parseSpan by determining its nature and
        // returning the corresponding value that should replace it.
        //
        switch ($token[0]) {
            case '\\':
                return $this->hashPart('&#'.ord($token[1]).';');
            case '`':
                // Search for end marker in remaining text.
                if (preg_match('/^(.*?[^`])'.preg_quote($token).'(?!`)(.*)$/sm',
                    $str, $matches)) {
                    $str = $matches[2];
                    $codespan = $this->makeCodeSpan($matches[1]);

                    return $this->hashPart($codespan);
                }

                return $token; // return as text since no ending marker found.
            default:
                return $this->hashPart($token);
        }
    }

    public function outdent($text)
    {
        //
        // Remove one level of line-leading tabs or spaces
        //
        return preg_replace('/^(\t|[ ]{1,'.$this->tab_width.'})/m', '', $text);
    }

    // String length function for detab. `_initDetab` will create a function to
    // handle UTF-8 if the default function does not exist.
    public $utf8_strlen = 'mb_strlen';

    public function detab($text)
    {
        //
        // Replace tabs with the appropriate amount of space.
        //
        // For each line we separate the line in blocks delimited by
        // tab characters. Then we reconstruct every line by adding the
        // appropriate number of spaces between each block.

        $text = preg_replace_callback('/^.*\t.*$/m',
            [$this, '_detab_callback'], $text);

        return $text;
    }

    public function _detab_callback($matches)
    {
        $line = $matches[0];
        $strlen = $this->utf8_strlen; // strlen function for UTF-8.

        // Split in blocks.
        $blocks = explode("\t", $line);
        // Add each block to the line.
        $line = $blocks[0];
        unset($blocks[0]); // Do not add first block twice.
        foreach ($blocks as $block) {
            // Calculate amount of space, insert spaces, insert block.
            $amount = $this->tab_width -
                $strlen($line, 'UTF-8') % $this->tab_width;
            $line .= str_repeat(' ', $amount).$block;
        }

        return $line;
    }

    public function _initDetab()
    {
        //
        // Check for the availability of the function in the `utf8_strlen` property
        // (initially `mb_strlen`). If the function is not available, create a
        // function that will loosely count the number of UTF-8 characters with a
        // regular expression.
        //
        if (function_exists($this->utf8_strlen)) {
            return;
        }
        // create_function() was removed in PHP 8.0, so use a closure instead.
        $this->utf8_strlen = function ($text) {
            return preg_match_all(
                '/[\x00-\xBF]|[\xC0-\xFF][\x80-\xBF]*/',
                $text, $m);
        };
    }

    public function unhash($text)
    {
        //
        // Swap back in all the tags hashed by _HashHTMLBlocks.
        //
        return preg_replace_callback('/(.)\x1A[0-9]+\1/',
            [$this, '_unhash_callback'], $text);
    }

    public function _unhash_callback($matches)
    {
        return $this->html_hashes[$matches[0]];
    }
}

/*

PHP Markdown
============

Description
-----------

This is a PHP translation of the original Markdown formatter written in
Perl by John Gruber.

Markdown is a text-to-HTML filter; it translates an easy-to-read /
easy-to-write structured text format into HTML. Markdown's text format
is mostly similar to that of plain text email, and supports features such
as headers, *emphasis*, code blocks, blockquotes, and links.
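
Code blocks are recognized by their indentation, so tab handling matters.
The `detab` method in the class above expands each tab to the next tab
stop. The following is a minimal standalone sketch of that arithmetic, not
the library's API: the name `detab_sketch` is hypothetical, a tab width of
4 is assumed, and plain `strlen` is used, so it is only accurate for ASCII
input.

```php
<?php
// Hypothetical helper: expands tabs to the next tab stop (ASCII only).
function detab_sketch($text, $tab_width = 4)
{
    $lines = explode("\n", $text);
    foreach ($lines as $i => $line) {
        $blocks = explode("\t", $line);
        $out = array_shift($blocks); // text before the first tab
        foreach ($blocks as $block) {
            // Pad to the next multiple of $tab_width, then append the block.
            $amount = $tab_width - strlen($out) % $tab_width;
            $out .= str_repeat(' ', $amount) . $block;
        }
        $lines[$i] = $out;
    }

    return implode("\n", $lines);
}

echo detab_sketch("a\tb\tc"), "\n"; // prints "a   b   c"
```

A tab that already sits on a tab stop still advances a full tab width,
which matches what a text editor does.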

Markdown's syntax is designed not as a generic markup language, but
specifically to serve as a front-end to (X)HTML. You can use span-level
HTML tags anywhere in a Markdown document, and you can use block level
HTML tags (like <div> and <table>) as well.
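
Because entities may already appear in the source, a literal `&` has to be
encoded without touching entities that are already there. The following is
a minimal standalone sketch of that step, reusing the regular expression
from the ampersand-encoding code in the class above. The function name
`encode_amps_and_angles_sketch` is hypothetical, and unlike the class this
sketch encodes every "<" it sees.

```php
<?php
// Hypothetical helper: encodes bare ampersands and every "<" character.
function encode_amps_and_angles_sketch($text)
{
    // Leave "&" alone when it already starts an entity such as &copy;.
    $text = preg_replace('/&(?!#?[xX]?(?:[0-9a-fA-F]+|\w+);)/', '&amp;', $text);

    // Encode every "<" in the input.
    return str_replace('<', '&lt;', $text);
}

echo encode_amps_and_angles_sketch('AT&T &copy; 4 < 5'), "\n";
// prints "AT&amp;T &copy; 4 &lt; 5"
```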

For more information about Markdown's syntax, see:

<http://daringfireball.net/projects/markdown/>
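
One detail of the link handling in the class above is that e-mail
autolinks are obfuscated: each ASCII character of the `mailto:` link is
emitted raw, as a hex entity, or as a decimal entity, chosen by a
deterministic seed. The following is a minimal standalone sketch of that
idea. The name `obfuscate_email_sketch` is hypothetical, ASCII input is
assumed, and `str_split` stands in for the `preg_split` call used by the
class.

```php
<?php
// Hypothetical helper mirroring the obfuscation idea (ASCII input assumed).
function obfuscate_email_sketch($addr)
{
    $full = 'mailto:' . $addr;
    $seed = (int) abs(crc32($full) / strlen($full)); // deterministic seed
    $chars = str_split($full);

    foreach ($chars as $key => $char) {
        $ord = ord($char);
        if ($ord >= 128) {
            continue; // leave non-ASCII bytes untouched
        }
        $r = ($seed * (1 + $key)) % 100;
        if ($r > 90 && $char != '@') {
            continue; // a small share of characters stays raw, never "@"
        }
        $chars[$key] = $r < 45 ? '&#x' . dechex($ord) . ';' : '&#' . $ord . ';';
    }

    $href = implode('', $chars);
    $text = implode('', array_slice($chars, 7)); // drop the "mailto:" prefix

    return "<a href=\"$href\">$text</a>";
}

$link = obfuscate_email_sketch('foo@example.com');
// Decoding the entities gives back the plain link.
echo html_entity_decode($link), "\n";
// prints <a href="mailto:foo@example.com">foo@example.com</a>
```

Because the seed is derived from the address itself, the same address is
always encoded the same way, which keeps the output stable between runs.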


Bugs
----

To file bug reports please send email to:

<[email protected]>

Please include with your report: (1) the example input; (2) the output you
expected; (3) the output Markdown actually produced.


Version History
---------------

See the readme file for detailed release notes for this version.


Copyright and License
---------------------

PHP Markdown
Copyright (c) 2004-2013 Michel Fortin
<http://michelf.ca/>
All rights reserved.

Based on Markdown
Copyright (c) 2003-2006 John Gruber
<http://daringfireball.net/>
All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are
met:

*	Redistributions of source code must retain the above copyright notice,
    this list of conditions and the following disclaimer.

*	Redistributions in binary form must reproduce the above copyright
    notice, this list of conditions and the following disclaimer in the
    documentation and/or other materials provided with the distribution.

*	Neither the name "Markdown" nor the names of its contributors may
    be used to endorse or promote products derived from this software
    without specific prior written permission.

This software is provided by the copyright holders and contributors "as
is" and any express or implied warranties, including, but not limited
to, the implied warranties of merchantability and fitness for a
particular purpose are disclaimed. In no event shall the copyright owner
or contributors be liable for any direct, indirect, incidental, special,
exemplary, or consequential damages (including, but not limited to,
procurement of substitute goods or services; loss of use, data, or
profits; or business interruption) however caused and on any theory of
liability, whether in contract, strict liability, or tort (including
negligence or otherwise) arising in any way out of the use of this
software, even if advised of the possibility of such damage.

*/;