Duplicate code is one of the most pungent code smells. A common rule of thumb is to restructure code once it is duplicated in three or more places.
Common duplication problems and their corresponding solutions are:
Complex classes like PublishDateExtractor often have too many responsibilities. To break such a class down, first identify a cohesive component within it. A common way to find one is to look for fields and methods that share the same prefix or suffix. You can also inspect the cohesion graph to spot unconnected or weakly connected components.
Once you have determined which fields and methods belong together, you can apply the Extract Class refactoring. If the component makes sense as a subclass, Extract Subclass is also a candidate, and is often quicker to apply.
While breaking up the class, it is a good idea to analyze how other classes use PublishDateExtractor and, based on these observations, to apply Extract Interface as well.
<?php
// ...
class PublishDateExtractor extends AbstractModule implements ModuleInterface {
    use ArticleMutatorTrait;

    /**
     * @param Article $article
     *
     * @return \DateTime
     */
    public function run(Article $article) {
        // ...

    private function getDateFromURL() {
        // ...

    /**
     * Check for and determine dates from Schema.org's datePublished property.
     *
     * Checks HTML tags (e.g. <meta>, <time>, etc.) and JSON-LD.
     *
     * @return \DateTime|null
     *
     * @see https://schema.org/datePublished
     */
    private function getDateFromSchemaOrg() {
        // ...

    /**
     * Check for and determine dates based on Dublin Core standards.
     *
     * @return \DateTime|null
     *
     * @see http://dublincore.org/documents/dcmi-terms/#elements-date
     * @see http://dublincore.org/documents/2000/07/16/usageguide/qualified-html.shtml
     */
    private function getDateFromDublinCore() {
        // ...

    /**
     * Check for and determine dates based on OpenGraph standards.
     *
     * @return \DateTime|null
     *
     * @see http://ogp.me/
     * @see http://ogp.me/#type_article
     */
    private function getDateFromOpenGraph() {
        // ...

    /**
     * Check for and determine dates based on Parsely metadata.
     *
     * Checks JSON-LD, <meta> tags and parsely-page.
     *
     * @return \DateTime|null
     *
     * @see https://www.parsely.com/help/integration/jsonld/
     * @see https://www.parsely.com/help/integration/metatags/
     * @see https://www.parsely.com/help/integration/ppage/
     */
    private function getDateFromParsely() {
        // ...
}
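The methods in the listing above share the getDateFrom prefix, which hints at one cohesive component per metadata source. The following is only a minimal sketch of how Extract Class and Extract Interface might be applied, not the project's actual design: the names DateSourceInterface, OpenGraphDateSource and PublishDateFinder are hypothetical, the extractor is assumed to work on a raw HTML string, and the OpenGraph lookup is reduced to a single article:published_time meta tag for illustration.

```php
<?php

// Hypothetical interface (Extract Interface): one source of a publish date.
interface DateSourceInterface
{
    public function findDate(string $html): ?\DateTime;
}

// Hypothetical class (Extract Class): holds only the OpenGraph-specific logic.
final class OpenGraphDateSource implements DateSourceInterface
{
    public function findDate(string $html): ?\DateTime
    {
        // Simplifying assumption: only article:published_time is inspected,
        // and the attributes appear in exactly this order.
        $pattern = '/<meta\s+property="article:published_time"\s+content="([^"]+)"/i';

        if (preg_match($pattern, $html, $matches) !== 1) {
            return null;
        }

        try {
            return new \DateTime($matches[1]);
        } catch (\Exception $e) {
            // An unparsable date string is treated as "no date found".
            return null;
        }
    }
}

// The remaining coordinator only asks each source in turn.
final class PublishDateFinder
{
    /** @var DateSourceInterface[] */
    private array $sources;

    public function __construct(DateSourceInterface ...$sources)
    {
        $this->sources = $sources;
    }

    public function find(string $html): ?\DateTime
    {
        foreach ($this->sources as $source) {
            $date = $source->findDate($html);
            if ($date !== null) {
                return $date;
            }
        }

        return null;
    }
}
```

With this shape, each of the other sources (URL, Schema.org, Dublin Core, Parsely) could become its own small class behind the same interface, and the coordinating class no longer grows with every new metadata format.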
This check looks for variable assignments that are either overwritten by other assignments or where the variable is not used subsequently.
Both the $myVar assignment in line 1 and the $higher assignment in line 2 are dead. The first because $myVar is never used, and the second because $higher is always overwritten on every possible execution path.
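The snippet this description refers to is not reproduced here. A minimal sketch consistent with the description, assuming a simple random branch, could look like the following; the first two statements after the opening tag play the role of "line 1" and "line 2".

```php
<?php

$myVar = 'abc';   // dead: $myVar is never read afterwards
$higher = false;  // dead: both branches below overwrite this value

if (rand(1, 6) > 3) {
    $higher = true;
} else {
    $higher = false;
}

// Only the values assigned inside the branches can ever be observed here.
var_dump($higher);
```

Removing both initial assignments leaves the behavior unchanged, which is what makes them dead.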