Complex classes like ParallelRegex often take on several unrelated responsibilities. To break such a class down, first identify a cohesive component within it. A common approach is to look for fields and methods that share the same prefix or suffix. You can also inspect the cohesion graph to spot unconnected or weakly connected components.
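The suffix heuristic can be sketched in a few lines. The helpers `splitCamelCase` and `groupBySuffix` below are hypothetical names introduced for illustration only; the method names are taken from the ParallelRegex listing. This is a rough aid for spotting candidates, not a substitute for reading the code or the cohesion graph.

```php
<?php
/**
 * Split a camelCase identifier into lowercase words.
 * (Hypothetical helper, not part of the analyzed code base.)
 */
function splitCamelCase(string $name): array
{
    // Split at every position where a lowercase letter or digit
    // is followed by an uppercase letter.
    $words = preg_split('/(?<=[a-z0-9])(?=[A-Z])/', $name);
    return array_map('strtolower', $words);
}

/**
 * Group member names by their last camelCase word and keep only
 * the groups that contain more than one member.
 */
function groupBySuffix(array $names): array
{
    $groups = [];
    foreach ($names as $name) {
        $words = splitCamelCase($name);
        $suffix = end($words);
        $groups[$suffix][] = $name;
    }
    return array_filter($groups, fn($members) => count($members) > 1);
}

// Method names as they appear in the ParallelRegex listing.
$methods = [
    'addPattern', 'needsUnicodeAware', 'match', 'partialMatch',
    'split', 'partialSplit', 'getCompoundedRegex', 'getPerlMatchingFlags',
];

$candidates = groupBySuffix($methods);
print_r($candidates);
```

For this class the sketch reports two groups, `match`/`partialMatch` and `split`/`partialSplit`, which hints that matching and splitting may be separable responsibilities.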
Once you have determined which fields and methods belong together, you can apply the Extract Class refactoring. If the component makes sense as a subclass, Extract Subclass is also a candidate and is often faster to apply.
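As a minimal sketch of Extract Class, assume that the pattern and label bookkeeping forms one cohesive component. The class names `PatternSet` and `RegexMatcher` are hypothetical, and the matching loop is a deliberately naive stand-in: it tries patterns one at a time instead of compounding them into one regex as ParallelRegex does.

```php
<?php
// Hypothetical extracted class: owns the pattern/label bookkeeping.
final class PatternSet
{
    /** @var string[] */
    private $patterns = [];
    /** @var array<int, bool|string> */
    private $labels = [];

    public function add(string $pattern, $label = true): void
    {
        $this->patterns[] = $pattern;
        $this->labels[] = $label;
    }

    public function patterns(): array
    {
        return $this->patterns;
    }

    public function labelAt(int $index)
    {
        return $this->labels[$index];
    }
}

// Illustrative stand-in for the slimmed-down original class.
// It delegates storage to PatternSet and keeps only the matching logic.
class RegexMatcher
{
    private $patternSet;
    private $case;

    public function __construct(bool $case, ?PatternSet $patternSet = null)
    {
        $this->case = $case;
        $this->patternSet = $patternSet ?? new PatternSet();
    }

    public function addPattern(string $pattern, $label = true): void
    {
        $this->patternSet->add($pattern, $label);
    }

    public function match(string $subject, &$match)
    {
        $flags = $this->case ? '' : 'i';
        foreach ($this->patternSet->patterns() as $index => $pattern) {
            // Assumption: patterns contain no pre-escaped slashes.
            $regex = '/' . str_replace('/', '\/', $pattern) . '/' . $flags;
            if (preg_match($regex, $subject, $found)) {
                $match = $found[0];
                return $this->patternSet->labelAt($index);
            }
        }
        $match = '';
        return false;
    }
}

$matcher = new RegexMatcher(false);
$matcher->addPattern('a+', 'letters');
$matcher->addPattern('\d+', 'digits');
$label = $matcher->match('xx 123', $match);
```

After the extraction, the storage class can be tested and changed without touching the matching code, which is the main benefit the refactoring aims for.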
While breaking up the class, it is a good idea to analyze how other classes use ParallelRegex and, based on those observations, apply Extract Interface as well.
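Extract Interface could look roughly like the following, assuming that callers only rely on the three public methods visible in the listing. The interface name `PatternMatcher`, the test double `LiteralMatcher`, and the client function `firstLabel` are hypothetical; only the method signatures are taken from the listing.

```php
<?php
// Hypothetical interface; signatures mirror the public methods of ParallelRegex.
interface PatternMatcher
{
    public function addPattern($pattern, $label = true);
    public function match($subject, &$match);
    public function split($subject, &$split);
}

// Simple stand-in implementation that matches literal substrings only.
final class LiteralMatcher implements PatternMatcher
{
    private $literals = [];

    public function addPattern($pattern, $label = true)
    {
        $this->literals[(string) $pattern] = $label;
    }

    public function match($subject, &$match)
    {
        foreach ($this->literals as $literal => $label) {
            if (strpos($subject, (string) $literal) !== false) {
                $match = (string) $literal;
                return $label;
            }
        }
        $match = '';
        return false;
    }

    public function split($subject, &$split)
    {
        foreach (array_keys($this->literals) as $literal) {
            $literal = (string) $literal;
            $pos = strpos($subject, $literal);
            if ($pos !== false) {
                // Pre-match, match and post-match strings.
                $split = [
                    substr($subject, 0, $pos),
                    $literal,
                    substr($subject, $pos + strlen($literal)),
                ];
                return true;
            }
        }
        $split = [$subject, '', ''];
        return false;
    }
}

// Client code depends on the interface, not on a concrete class.
function firstLabel(PatternMatcher $matcher, string $text)
{
    return $matcher->match($text, $ignored);
}

$matcher = new LiteralMatcher();
$matcher->addPattern('foo', 'FOO');
$label = firstLabel($matcher, 'a foo b');
$matcher->split('a foo b', $parts);
```

Typing clients against the interface keeps them working while ParallelRegex is split into smaller classes behind it.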
<?php

// ...

class ParallelRegex
{
    /** @var string[][] patterns to match */
    protected $patterns;
    /** @var string[][] labels for above patterns */
    protected $labels;
    /** @var string[] the compound regexes matching all patterns */
    protected $regexes;
    /** @var bool case sensitive matching? */
    protected $case;

    /**
     * Constructor. Starts with no patterns.
     *
     * @param boolean $case True for case sensitive, false for insensitive.
     */
    public function __construct($case)
    // ...

    /**
     * Adds a pattern with an optional label.
     *
     * @param mixed $pattern Perl style regex. Must be UTF-8 encoded. If it's a string,
     *                       the (, ) lose their meaning unless they form part of
     *                       a lookahead or lookbehind assertion.
     * @param bool|string $label Label of regex to be returned on a match. Label must be ASCII
     */
    public function addPattern($pattern, $label = true)
    // ...

    /**
     * Decides whether the given pattern needs Unicode-aware regex treatment.
     * Reference: https://www.php.net/manual/en/regexp.reference.unicode.php
     *
     * @param mixed $pattern Perl style regex. Must be UTF-8 encoded.
     * @return boolean True for Unicode-aware, false for byte-oriented.
     *
     * @author Moisés Braga Ribeiro <[email protected]>
     */
    protected function needsUnicodeAware($pattern)
    // ...

    /**
     * Attempts to match all patterns at once against a string.
     *
     * @param string $subject String to match against.
     * @param string $match First matched portion of subject.
     * @return bool|string False if no match found, label if label exists, true if not
     *
     * @author Moisés Braga Ribeiro <[email protected]>
     */
    public function match($subject, &$match)
    // ...

    /**
     * Attempts to match all patterns of a certain type at once against a string.
     *
     * @param string $subject String to match against.
     * @param string $match First matched portion of subject.
     * @param int $offset Offset of the first matched portion of subject.
     * @param boolean $unicode True for Unicode-aware, false for byte-oriented.
     * @return bool|string False if no match found, label if label exists, true if not
     */
    protected function partialMatch($subject, &$match, &$offset, $unicode)
    // ...

    /**
     * Attempts to split the string against all patterns at once.
     *
     * @param string $subject String to match against.
     * @param array $split The split result: array containing pre-match, match & post-match strings
     * @return boolean True on success.
     *
     * @author Moisés Braga Ribeiro <[email protected]>
     */
    public function split($subject, &$split)
    // ...

    /**
     * Attempts to split the string against all patterns of a certain type at once.
     *
     * @param string $subject String to match against.
     * @param array $split The split result: array containing pre-match, match & post-match strings
     * @param boolean $unicode True for Unicode-aware, false for byte-oriented.
     * @return boolean True on success.
     *
     * @author Christopher Smith <[email protected]>
     */
    protected function partialSplit($subject, &$split, $unicode)
    // ...

    /**
     * Compounds the patterns into a single regular expression separated with the
     * "or" operator. Caches the regex. Will automatically escape (, ) and / tokens.
     *
     * @param boolean $unicode True for Unicode-aware, false for byte-oriented.
     * @return null|string
     */
    protected function getCompoundedRegex($unicode)
    // ...

    /**
     * Accessor for perl regex mode flags to use.
     * @param boolean $unicode True for Unicode-aware, false for byte-oriented.
     * @return string Perl regex flags.
     */
    protected function getPerlMatchingFlags($unicode)
    // ...
}