Complex classes like Lexer often do a lot of different things. To break such a class down, we need to identify a cohesive component within it. A common approach is to look for fields and methods that share the same prefix or suffix. You can also look at the cohesion graph to spot unconnected or weakly connected components.
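The cohesion-graph idea can be sketched in a few lines: methods that touch a common field are linked, and each connected group is a candidate component. This is only an illustration; the `$usage` map and every method and field name in it are hypothetical, and a real analysis would derive the map from the class source.

```php
<?php
// Hypothetical map: method name => fields it reads or writes.
$usage = [
    'tokenize'           => ['tokens', 'cursor'],
    'tokenizeExpression' => ['cursor'],
    'cacheGet'           => ['cacheStore'],
    'cacheSet'           => ['cacheStore', 'cacheTtl'],
];

/**
 * Groups methods into components: two methods land in the same
 * component if a chain of shared fields connects them.
 */
function cohesiveComponents(array $usage): array
{
    $components = [];
    foreach ($usage as $method => $fields) {
        $merged = ['methods' => [$method], 'fields' => $fields];
        $rest = [];
        foreach ($components as $component) {
            if (array_intersect($component['fields'], $merged['fields'])) {
                // Shares at least one field: fold it into the merged component.
                $merged['methods'] = array_merge($component['methods'], $merged['methods']);
                $merged['fields'] = array_values(array_unique(
                    array_merge($component['fields'], $merged['fields'])
                ));
            } else {
                $rest[] = $component;
            }
        }
        $rest[] = $merged;
        $components = $rest;
    }

    return array_column($components, 'methods');
}

$components = cohesiveComponents($usage);
```

With this made-up input, the `cache*` methods form a group of their own that is not connected to the tokenizing methods, which is the kind of signal that suggests an extraction candidate.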
Once you have determined the fields that belong together, you can apply the Extract Class refactoring. If the component makes sense as a subclass, Extract Subclass is also a candidate, and is often faster.
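As a rough sketch of Extract Class, suppose a lexer used to keep its own position counter and the methods that update it. The class names `Cursor` and `ExpressionLexer` are hypothetical and are not taken from the Lexer discussed here; the point is only that the cursor-related state moves into its own small class.

```php
<?php
// Hypothetical "before" state: ExpressionLexer held the position counter
// and the methods that update it. Those members are extracted into Cursor.

final class Cursor
{
    private $position = 0;

    public function advance(int $by): void
    {
        $this->position += $by;
    }

    public function position(): int
    {
        return $this->position;
    }
}

class ExpressionLexer
{
    private $cursor;

    public function __construct(?Cursor $cursor = null)
    {
        $this->cursor = $cursor ?? new Cursor();
    }

    /** Splits on single spaces and records each word's start offset. */
    public function tokenize(string $expression): array
    {
        $tokens = [];
        foreach (explode(' ', $expression) as $word) {
            if ($word !== '') {
                $tokens[] = [$word, $this->cursor->position()];
            }
            // Skip past the word and the single space that followed it.
            $this->cursor->advance(strlen($word) + 1);
        }

        return $tokens;
    }
}

$tokens = (new ExpressionLexer())->tokenize('a + b');
```

The lexer now delegates position tracking, so `Cursor` can be tested and changed independently of the tokenizing logic.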
While breaking up the class, it is also a good idea to analyze how other classes use Lexer and, based on these observations, apply Extract Interface.
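A minimal sketch of Extract Interface, assuming that callers only ever need a single `tokenize()` method: the interface exposes just that, and callers type-hint against it instead of the concrete class. `TokenizerInterface`, `WhitespaceLexer`, and `countTokens()` are hypothetical names used for illustration.

```php
<?php
// Hypothetical: the narrow contract that callers actually rely on.
interface TokenizerInterface
{
    /** @return string[] */
    public function tokenize(string $expression): array;
}

class WhitespaceLexer implements TokenizerInterface
{
    public function tokenize(string $expression): array
    {
        // Split on runs of whitespace and drop empty pieces.
        return preg_split('/\s+/', trim($expression), -1, PREG_SPLIT_NO_EMPTY);
    }
}

// A caller that depends on the interface, not on the concrete lexer.
function countTokens(TokenizerInterface $tokenizer, string $expression): int
{
    return count($tokenizer->tokenize($expression));
}

$count = countTokens(new WhitespaceLexer(), 'a + b');
```

Because the caller only knows the interface, the lexer can be split up or replaced without touching code that consumes it.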
```php
<?php
// ...
class Lexer extends \Symfony\Component\ExpressionLanguage\Lexer
{
    /**
     * Tokenizes an expression.
     *
     * @param string $expression The expression to tokenize
     *
     * @return TokenStream A token stream instance
     *
     * @throws SyntaxError
     */
    public function tokenize($expression)
    {
        $length = strlen($expression);
        $stringsAndExpressions = $this->splitIntoStringsAndActualExpressions($expression, $length);

        // Actually tokenize all remaining expressions
        $tokens = array();
        $isMultipleExpressions = isset($stringsAndExpressions[1]);
        foreach ($stringsAndExpressions as $i => $item) {
            if ($isMultipleExpressions && $i > 0) {
                $tokens[] = new Token(Token::OPERATOR_TYPE, '~', $item[2]);
            }
            if ($item[1]) {
                if ($isMultipleExpressions) {
                    $tokens[] = new Token(Token::PUNCTUATION_TYPE, '(', $item[2]);
                }
                foreach ($this->tokenizeExpression($item[0]) as $token) {
                    $token->cursor += $item[2];
                    $tokens[] = $token;
                }
                if ($isMultipleExpressions) {
                    $tokens[] = new Token(Token::PUNCTUATION_TYPE, ')', $item[2]);
                }
            } else {
                $tokens[] = new Token(Token::STRING_TYPE, $item[0], $item[2]);
            }
        }

        $tokens[] = new Token(Token::EOF_TYPE, null, $length);

        return new TokenStream($tokens);
    }

    /**
     * Split a "string with {expression} and string parts" into
     * [
     *   [ "string with an ", false, 0 ],
     *   [ "expression", true, 12 ],
     *   [ " and string parts", false, 23 ]
     * ]
     *
     * @param string $expression The mixed expression string
     * @param int $length Length beginning from 0 to analyze
     *
     * @return array
     */
    protected function splitIntoStringsAndActualExpressions($expression, $length)
    // ...

    /**
     * Actually tokenize an expression - at this point object and property access is
     * transformed, so that "this.property" will be "get('this.property')"
     *
     * Also all function calls (but isset and empty) will be rewritten from
     * function(abc) to call("function", abc) to make dynamic functions possible
     *
     * @param string $expression The expression
     *
     * @return array
     */
    protected function tokenizeExpression($expression)
    // ...
}
```
This check looks for the bodies of `if` statements that have no statements or where all statements have been commented out. This may be the result of changes for debugging, or the code may simply be obsolete.
These `if` bodies can be removed. If you have an empty `if` but statements in the `else` branch, consider inverting the condition: an empty `if` with a populated `else` could be turned into a single `if` with the negated condition and no `else`. This is much more concise to read.
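The inversion can be illustrated with a small sketch. The functions `describeBefore()` and `describeAfter()` and their messages are hypothetical and are not taken from the reported code; they only show that both forms behave the same.

```php
<?php
// Before: the if body is empty because its only statement is commented out.
function describeBefore(int $roll): string
{
    $message = '';
    if ($roll > 3) {
        // $message = 'Check failed';
    } else {
        $message = 'Check succeeded';
    }

    return $message;
}

// After: the condition is inverted and the empty branch is gone.
function describeAfter(int $roll): string
{
    $message = '';
    if ($roll <= 3) {
        $message = 'Check succeeded';
    }

    return $message;
}
```

Both functions return the same result for every input, but the second states the condition that actually matters.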