Duplicate code is one of the most pungent code smells. A common rule of thumb is to restructure code once it is duplicated in three or more places.
Common duplication problems and their corresponding solutions are:
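One frequent case is two methods that differ only in the value they look up. The sketch below shows the usual fix, extracting the shared logic into one helper. It is illustrative only: the bodies of the methods analyzed here are collapsed in the listing, so nothing in it is taken from them. The names findInherited, pageResources and pageRotation, and the simplified array-based page nodes with a 'parent' entry, are assumptions made for the example.

```php
<?php
// Illustrative only: page nodes are simplified to nested arrays with an
// optional 'parent' entry. This is not the code of the analyzed class.

/**
 * Shared helper: return the value stored under $key on the node itself
 * or on its nearest ancestor, or $default if no node in the chain has it.
 */
function findInherited(array $node, string $key, $default = null)
{
    for ($current = $node; $current !== null; $current = $current['parent'] ?? null) {
        if (array_key_exists($key, $current)) {
            return $current[$key];
        }
    }
    return $default;
}

// The two formerly duplicated lookups shrink to one-line wrappers.
function pageResources(array $page): array
{
    return findInherited($page, 'resources', []);
}

function pageRotation(array $page): int
{
    return findInherited($page, 'rotate', 0);
}
```

With a parent node of ['rotate' => 90] and a child page of ['parent' => $parent], pageRotation($page) takes the value from the parent, while a page without any 'rotate' entry in its chain falls back to 0.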
Complex classes like fpdi_pdf_parser often do many different things. To break such a class down, first identify a cohesive component within it. A common approach is to look for fields and methods that share a prefix or suffix. In this class, for example, getPageBox, getPageBoxes, _getPageBoxes and $availableBoxes all deal with page boxes. You can also inspect the cohesion graph to spot unconnected or weakly connected components.
Once you have determined which fields belong together, you can apply the Extract Class refactoring. If the component makes sense as a subclass, Extract Subclass is also a candidate and is often faster.
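As a rough sketch of Extract Class, the box handling could move into a class of its own, with the parser delegating to it. The class names PageBoxes and SlimPdfParser, the flat array standing in for a /Page dictionary, and the 1-based page numbers are all assumptions made for illustration. Only the list of box names comes from the class under review.

```php
<?php
// Hypothetical result of Extract Class: box handling lives in its own class.
// A /Page is simplified to a flat array here; real PDF objects are nested.
final class PageBoxes
{
    /** Box names as listed in $availableBoxes of the original class. */
    private const AVAILABLE = ['/MediaBox', '/CropBox', '/BleedBox', '/TrimBox', '/ArtBox'];

    /** Return only the known box entries that are present on the page. */
    public function all(array $page): array
    {
        $boxes = [];
        foreach (self::AVAILABLE as $name) {
            if (isset($page[$name])) {
                $boxes[$name] = $page[$name];
            }
        }
        return $boxes;
    }
}

// The slimmed-down parser keeps its public method but delegates the work.
final class SlimPdfParser
{
    /** @var array[] simplified pages, index begins at 0 */
    private $pages;

    /** @var PageBoxes */
    private $boxes;

    public function __construct(array $pages, PageBoxes $boxes)
    {
        $this->pages = array_values($pages);
        $this->boxes = $boxes;
    }

    public function getPageBoxes(int $pageno): array
    {
        // Assumes 1-based page numbers mapped onto the 0-based pages array.
        return $this->boxes->all($this->pages[$pageno - 1]);
    }
}
```

Callers of getPageBoxes need not change, which is what makes the extraction cheap to introduce step by step.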
While breaking up the class, it is a good idea to analyze how other classes use fpdi_pdf_parser and, based on these observations, to apply Extract Interface as well.
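A minimal sketch of Extract Interface, assuming that clients only need to count pages, select one, and read its content. The method names follow the listing. The interface name PageSource, the type declarations, the stand-in ArrayPageSource and the 1-based page numbers are assumptions, not part of the analyzed code.

```php
<?php
// Hypothetical interface covering only what page-reading clients need.
// Method names follow the listing; signatures and types are assumptions.
interface PageSource
{
    public function getPageCount(): int;
    public function setPageno(int $pageno): void;
    public function getContent(): string;
}

// Stand-in implementation for illustration; page numbers are 1-based here.
final class ArrayPageSource implements PageSource
{
    /** @var string[] */
    private $pages;

    /** @var int */
    private $pageno = 1;

    public function __construct(array $pages)
    {
        $this->pages = array_values($pages);
    }

    public function getPageCount(): int
    {
        return count($this->pages);
    }

    public function setPageno(int $pageno): void
    {
        if ($pageno < 1 || $pageno > count($this->pages)) {
            throw new OutOfRangeException("No such page: {$pageno}");
        }
        $this->pageno = $pageno;
    }

    public function getContent(): string
    {
        return $this->pages[$this->pageno - 1];
    }
}

// Client code depends on the interface, not on the concrete parser.
function collectContent(PageSource $source): array
{
    $contents = [];
    for ($n = 1; $n <= $source->getPageCount(); $n++) {
        $source->setPageno($n);
        $contents[] = $source->getContent();
    }
    return $contents;
}
```

Because collectContent only sees PageSource, the real parser and a test double can be swapped freely once the parser implements the interface.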
    <?php
    // ...

    class fpdi_pdf_parser extends pdf_parser {

        /**
         * Pages
         * Index begins at 0
         *
         * @var array
         */
        var $pages;

        /**
         * Page count
         * @var integer
         */
        var $page_count;

        /**
         * Current page number
         * @var integer
         */
        var $pageno;

        /**
         * PDF Version of imported Document
         * @var string
         */
        var $pdfVersion;

        /**
         * FPDI Reference
         * @var object
         */
        var $fpdi;

        /**
         * Available BoxTypes
         *
         * @var array
         */
        var $availableBoxes = array("/MediaBox","/CropBox","/BleedBox","/TrimBox","/ArtBox");

        /**
         * Constructor
         *
         * @param string $filename Source-Filename
         * @param object $fpdi Object of type fpdi
         */
        function fpdi_pdf_parser($filename,&$fpdi) { /* ... */ }

        /**
         * Overwrite parent::error()
         *
         * @param string $msg Error-Message
         */
        function error($msg) { /* ... */ }

        /**
         * Get pagecount from sourcefile
         *
         * @return int
         */
        function getPageCount() { /* ... */ }

        /**
         * Set pageno
         *
         * @param int $pageno Pagenumber to use
         */
        function setPageno($pageno) { /* ... */ }

        /**
         * Get page-resources from current page
         *
         * @return array
         */
        function getPageResources() { /* ... */ }

        /**
         * Get page-resources from /Page
         *
         * @param array $obj Array of pdf-data
         */
        function _getPageResources ($obj) { /* ... */ } // $obj = /Page (flagged as duplicated code)

        /**
         * Get content of current page
         *
         * If /Contents is an array, the streams are concatenated
         *
         * @return string
         */
        function getContent() { /* ... */ }

        /**
         * Resolve all content-objects
         *
         * @param array $content_ref
         * @return array
         */
        function _getPageContent($content_ref) { /* ... */ }

        /**
         * Rebuild content-streams
         *
         * @param array $obj
         * @return string
         */
        function _rebuildContentStream($obj) { /* ... */ }

        /**
         * Get a Box from a page
         * Arrayformat is same as used by fpdf_tpl
         *
         * @param array $page a /Page
         * @param string $box_index Type of Box @see $availableBoxes
         * @return array
         */
        function getPageBox($page, $box_index) { /* ... */ }

        function getPageBoxes($pageno) { /* ... */ }

        /**
         * Get all Boxes from /Page
         *
         * @param array a /Page
         * @return array
         */
        function _getPageBoxes($page) { /* ... */ }

        /**
         * Get the page rotation by pageno
         *
         * @param integer $pageno
         * @return array
         */
        function getPageRotation($pageno) { /* ... */ }

        function _getPageRotation ($obj) { /* ... */ } // $obj = /Page (flagged as duplicated code)

        /**
         * Read all /Page(es)
         *
         * @param object pdf_context
         * @param array /Pages
         * @param array the result-array
         */
        function read_pages (&$c, &$pages, &$result) { /* ... */ }

        /**
         * Get PDF-Version
         *
         * And reset the PDF Version used in FPDI if needed
         */
        function getPDFVersion() { /* ... */ }

    }
PSR-1: Basic Coding Standard recommends that a file either declare new symbols (classes, functions, constants and the like) or cause side effects, but not both. Side effects are anything that executes logic beyond declaring symbols, for example printing output, changing ini settings or writing to a file.
The idea behind this recommendation is that merely autoloading a class should not change the state of an application. It also promotes a cleaner style of programming and makes your code less prone to errors, because the executing logic is not scattered across the files that declare symbols.
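A small sketch of that separation, under assumed names: the file path src/PageLabel.php, the class PageLabel and the bootstrap script are hypothetical and only illustrate the split between declarations and side effects.

```php
<?php
// src/PageLabel.php (hypothetical): declares a symbol and nothing else, so
// including or autoloading it produces no output and changes no state.
final class PageLabel
{
    public function format(int $pageno, int $pageCount): string
    {
        return "Page {$pageno} of {$pageCount}";
    }
}

// bootstrap.php (hypothetical) would hold the side effects instead:
//
//     ini_set('display_errors', '1');
//     require __DIR__ . '/src/PageLabel.php';
//     echo (new PageLabel())->format(1, 3), PHP_EOL;
```

Keeping the ini_set and echo calls out of the class file means a test suite or another script can load PageLabel without triggering them.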
To learn more about PSR-1, please see the PHP-FIG page on PSR-1.