Duplicate code is one of the most pungent code smells. A common rule of thumb is to restructure code once it is duplicated in three or more places.
Common duplication problems, and corresponding solutions are:
Complex classes like PfPageparser often do many different things. To break such a class down, first identify a cohesive component within it. A common approach is to look for fields and methods that share the same prefix or suffix. You can also inspect the cohesion graph to spot unconnected or weakly connected components.
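The prefix/suffix heuristic can be tried mechanically. The following is a minimal sketch, not part of PfPageparser: the method names are copied from the class listing, while the `group_by_word` helper and the rule "first or last word of a snake_case name" are assumptions made for illustration.

```php
<?php
// Method names copied from the PfPageparser listing (typos in the names are the source's own).
$methods = [
    'get_config', 'load_from_url', 'load_from_file', 'load_fom_string',
    'get_content', 'raw', 'trim_before', 'trim_after', 'trim',
    'split_chunks', 'filter_chunks', 'parse_fom_chunks', 'get_chunks',
    'results', 'preg_get',
];

/**
 * Hypothetical helper: group snake_case names by their first word (prefix)
 * or last word (suffix). Groups with only one member are dropped.
 */
function group_by_word(array $names, bool $use_suffix): array
{
    $groups = [];
    foreach ($names as $name) {
        $words = explode('_', $name);
        $key = $use_suffix ? end($words) : $words[0];
        $groups[$key][] = $name;
    }
    // Keep only words shared by at least two methods.
    return array_filter($groups, fn(array $members): bool => count($members) > 1);
}

$by_prefix = group_by_word($methods, false);
$by_suffix = group_by_word($methods, true);

print_r($by_prefix);
print_r($by_suffix);
```

With these names, the prefix pass groups the `get`, `load` and `trim` methods, and the suffix pass groups the four `*_chunks` methods. The `get` group is probably just accessors, so treat the output as a list of candidates to inspect rather than as a decision.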
Once you have determined which fields belong together, you can apply the Extract Class refactoring. If the component makes sense as a subclass, Extract Subclass is also a candidate and is often faster.
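As an illustration of Extract Class, the sketch below moves chunk handling into its own class. Everything here is hypothetical: `ChunkList`, `PageParserSketch`, `fromContent` and `keep` are invented names, and the method bodies are guesses that handle literal strings only, since the real method bodies are folded in the listing.

```php
<?php
// Hypothetical extracted class: owns the chunk state that previously lived in the parser.
final class ChunkList
{
    /** @var string[] */
    private array $chunks;

    public function __construct(array $chunks = [])
    {
        // Re-index so callers always see a plain list.
        $this->chunks = array_values($chunks);
    }

    // Split content on a literal separator (regex support is left out of this sketch).
    public static function fromContent(string $content, string $separator): self
    {
        if ($separator === '') {
            throw new InvalidArgumentException('Separator must not be empty.');
        }
        return new self(explode($separator, $content));
    }

    // Keep only chunks that contain at least one of the given literal patterns.
    public function keep(array $patterns): self
    {
        $kept = array_filter($this->chunks, function (string $chunk) use ($patterns): bool {
            foreach ($patterns as $pattern) {
                if (strpos($chunk, $pattern) !== false) {
                    return true;
                }
            }
            return false;
        });
        return new self($kept);
    }

    public function all(): array
    {
        return $this->chunks;
    }
}

// Hypothetical slimmed-down parser that delegates chunk work to ChunkList.
final class PageParserSketch
{
    private string $content;
    private ChunkList $chunks;

    public function __construct(string $content)
    {
        $this->content = $content;
        $this->chunks = new ChunkList();
    }

    public function split_chunks(string $separator): self
    {
        $this->chunks = ChunkList::fromContent($this->content, $separator);
        return $this;
    }

    public function filter_chunks(array $pattern_keep): self
    {
        $this->chunks = $this->chunks->keep($pattern_keep);
        return $this;
    }

    public function get_chunks(): array
    {
        return $this->chunks->all();
    }
}

$parser = new PageParserSketch('<li>apple</li><li>pear</li><li>apple pie</li>');
$result = $parser->split_chunks('<li>')->filter_chunks(['apple'])->get_chunks();
```

The fluent calls on the parser stay the same for callers, while the `$chunks` field and its logic now live in one small class that can be tested on its own.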
While breaking up the class, it is a good idea to analyze how other classes use PfPageparser and, based on those observations, to apply Extract Interface as well.
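A minimal sketch of Extract Interface, assuming some caller only ever reads chunks. The interface name `ChunkProvider`, the stand-in class `InMemoryChunks` and the function `count_non_empty_chunks` are all hypothetical; only the `get_chunks(): array` signature comes from the listing.

```php
<?php
// Hypothetical narrow interface: the only method a chunk-reading caller needs.
interface ChunkProvider
{
    /** @return string[] */
    public function get_chunks(): array;
}

// Stand-in implementation for this sketch. In the real code base,
// PfPageparser itself could declare "implements ChunkProvider".
final class InMemoryChunks implements ChunkProvider
{
    private array $chunks;

    public function __construct(array $chunks)
    {
        $this->chunks = $chunks;
    }

    public function get_chunks(): array
    {
        return $this->chunks;
    }
}

// Client code depends on the interface, not on the full parser API.
function count_non_empty_chunks(ChunkProvider $provider): int
{
    $non_empty = array_filter(
        $provider->get_chunks(),
        fn(string $chunk): bool => trim($chunk) !== ''
    );
    return count($non_empty);
}

$count = count_non_empty_chunks(new InMemoryChunks(['a', ' ', 'b']));
```

Callers typed against the narrow interface keep working when the large class is split, and they can be tested with a small stand-in instead of a fully loaded parser.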
<?php
...
class PfPageparser
{
    // Build your next great package.
    private $config;
    private $content="";
    private $chunks=[];
    private $parsed=[];

    public function __construct($config=[]){
        ...

    public function get_config(){
        ...

    public function load_from_url(string $url,array $options=[]): PfPageparser
        ...

    public function load_from_file(string $filename): PfPageparser
        ...

    public function load_fom_string(string $string): PfPageparser
        ...

    public function get_content(){
        ...

    public function raw(){
        ...

    /**
     * @param $pattern
     * @param bool $is_regex
     * @return $this
     */
    public function trim_before($pattern,$is_regex=false): PfPageparser
        ...

    /**
     * @param $pattern
     * @param bool $is_regex
     * @return $this
     */
    public function trim_after($pattern,$is_regex=false): PfPageparser
        ...

    /**
     * @param string $before
     * @param string $after
     * @param bool $is_regex
     * @return $this
     */
    public function trim($before="<body",$after="</body",$is_regex=false): PfPageparser
        ...

    /**
     * @param $pattern
     * @param $is_regex
     * @return $this
     * split the HTML content into chunks based on a text or regex separator
     */
    public function split_chunks($pattern,$is_regex=false): PfPageparser
        ...

    /**
     * @param array $pattern_keep - array of patterns that should be found (combined with OR)
     * @param array $pattern_remove - array of patterns that should not be found (combined with OR)
     * @param bool $is_regex - whether patterns are regex or just strings
     * @return $this
     */
    public function filter_chunks($pattern_keep=[],$pattern_remove=[],bool $is_regex=false): PfPageparser
        ...

    /**
     * @param $pattern
     * @return array
     */
    public function parse_fom_chunks(string $pattern,bool $restart=false): array
        ...

    /**
     * @return array
     */
    public function get_chunks(): array
        ...

    public function results(bool $before_parsing=false): array
        ...

    public function preg_get($pattern,$haystack){
        ...

    /**
     * PROTECTED FUNCTIONS
     */

}
If the same code appears in three or more different places, we strongly encourage you to extract it into a single class or operation.
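In the listing above, `trim_before` and `trim_after` share the same signature, so one way to remove duplication between two such methods is to pull the shared "find the pattern" step into a private helper. The sketch below is an assumption about what the folded bodies might do, not the library's actual code: the class name `TrimSketch`, the helper `locate`, and the choice to keep the match in `trim_before` and drop it in `trim_after` are all hypothetical.

```php
<?php
final class TrimSketch
{
    private string $content;

    public function __construct(string $content)
    {
        $this->content = $content;
    }

    // Shared step that both public methods would otherwise repeat:
    // return the byte offset of the first match, or null if there is none.
    private function locate(string $pattern, bool $is_regex): ?int
    {
        if ($is_regex) {
            if (preg_match($pattern, $this->content, $match, PREG_OFFSET_CAPTURE) === 1) {
                return $match[0][1];
            }
            return null;
        }
        $offset = strpos($this->content, $pattern);
        return $offset === false ? null : $offset;
    }

    // Drop everything before the match (assumption: the match itself is kept).
    public function trim_before(string $pattern, bool $is_regex = false): self
    {
        $offset = $this->locate($pattern, $is_regex);
        if ($offset !== null) {
            $this->content = substr($this->content, $offset);
        }
        return $this;
    }

    // Drop the match and everything after it (assumption: the match is removed).
    public function trim_after(string $pattern, bool $is_regex = false): self
    {
        $offset = $this->locate($pattern, $is_regex);
        if ($offset !== null) {
            $this->content = substr($this->content, 0, $offset);
        }
        return $this;
    }

    public function get_content(): string
    {
        return $this->content;
    }
}

$page = new TrimSketch('<html><body><p>Hi</p></body></html>');
$body = $page->trim_before('<body')->trim_after('#</body#', true)->get_content();
```

Each public method now contains only the line that differs, so a fix to the matching logic (for example, how an unmatched pattern is handled) is made in one place.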