Duplicate code is one of the most pungent code smells. A commonly used rule is to restructure code once it is duplicated in three or more places.
Common duplication problems and their corresponding solutions are:
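One frequent case, sketched here with hypothetical function names that do not come from the analyzed code, is the same small computation repeated in several functions. Moving it into a single helper (Extract Method) leaves one place to fix or change:

```php
<?php
// Hypothetical example: before the refactoring, each of the three functions
// below carried its own copy of the trim-and-lowercase logic.

/** Normalize a file extension: strip surrounding whitespace and dots, then lowercase. */
function normalizeExtension( string $ext ): string {
    return strtolower( trim( $ext, " \t\n\r." ) );
}

function isImageExtension( string $ext ): bool {
    return in_array( normalizeExtension( $ext ), [ 'png', 'gif', 'jpg' ], true );
}

function isTextExtension( string $ext ): bool {
    return in_array( normalizeExtension( $ext ), [ 'txt', 'html' ], true );
}

function describeExtension( string $ext ): string {
    return 'extension: ' . normalizeExtension( $ext );
}
```

With three callers sharing the helper, a change to the normalization rule only has to be made once.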
Complex classes like MimeAnalyzer often do a lot of different things. To break such a class down, we need to identify a cohesive component within that class. A common approach to finding such a component is to look for fields and methods that share the same prefixes or suffixes. You can also look at the cohesion graph to spot any unconnected or weakly connected components.
Once you have determined the fields that belong together, you can apply the Extract Class refactoring. If the component makes sense as a subclass, Extract Subclass is also a candidate, and is often faster.
While breaking up the class, it is a good idea to analyze how other classes use MimeAnalyzer and, based on these observations, to apply Extract Interface as well.
<?php
// ...
class MimeAnalyzer implements LoggerAwareInterface {
    /** @var string */
    protected $typeFile;
    /** @var string */
    protected $infoFile;
    /** @var string */
    protected $xmlTypes;
    /** @var callable */
    protected $initCallback;
    /** @var callable */
    protected $detectCallback;
    /** @var callable */
    protected $guessCallback;
    /** @var callable */
    protected $extCallback;
    /** @var array Mapping of media types to arrays of MIME types */
    protected $mediaTypes = null;
    /** @var array Map of MIME type aliases */
    protected $mimeTypeAliases = null;
    /** @var array Map of MIME types to file extensions (as a space separated list) */
    protected $mimetoExt = null;

    /** @var array Map of file extensions types to MIME types (as a space separated list) */
    public $mExtToMime = null; // legacy name; field accessed by hooks

    /** @var IEContentAnalyzer */
    protected $IEAnalyzer;

    /** @var string Extra MIME types, set for example by media handling extensions */
    private $extraTypes = '';
    /** @var string Extra MIME info, set for example by media handling extensions */
    private $extraInfo = '';

    /** @var LoggerInterface */
    private $logger;

    /**
     * Defines a set of well known MIME types
     * This is used as a fallback to mime.types files.
     * An extensive list of well known MIME types is provided by
     * the file mime.types in the includes directory.
     *
     * This list concatenated with mime.types is used to create a MIME <-> ext
     * map. Each line contains a MIME type followed by a space separated list of
     * extensions. If multiple extensions for a single MIME type exist or if
     * multiple MIME types exist for a single extension then in most cases
     * MediaWiki assumes that the first extension following the MIME type is the
     * canonical extension, and the first time a MIME type appears for a certain
     * extension is considered the canonical MIME type.
     *
     * (Note that appending the type file list to the end of self::$wellKnownTypes
     * sucks because you can't redefine canonical types. This could be fixed by
     * appending self::$wellKnownTypes behind type file list, but who knows
     * what will break? In practice this probably isn't a problem anyway -- Bryan)
     */
    protected static $wellKnownTypes = <<<EOT
application/ogg ogx ogg ogm ogv oga spx
application/pdf pdf
application/vnd.oasis.opendocument.chart odc
application/vnd.oasis.opendocument.chart-template otc
application/vnd.oasis.opendocument.database odb
application/vnd.oasis.opendocument.formula odf
application/vnd.oasis.opendocument.formula-template otf
application/vnd.oasis.opendocument.graphics odg
application/vnd.oasis.opendocument.graphics-template otg
application/vnd.oasis.opendocument.image odi
application/vnd.oasis.opendocument.image-template oti
application/vnd.oasis.opendocument.presentation odp
application/vnd.oasis.opendocument.presentation-template otp
application/vnd.oasis.opendocument.spreadsheet ods
application/vnd.oasis.opendocument.spreadsheet-template ots
application/vnd.oasis.opendocument.text odt
application/vnd.oasis.opendocument.text-master otm
application/vnd.oasis.opendocument.text-template ott
application/vnd.oasis.opendocument.text-web oth
application/javascript js
application/x-shockwave-flash swf
audio/midi mid midi kar
audio/mpeg mpga mpa mp2 mp3
audio/x-aiff aif aiff aifc
audio/x-wav wav
audio/ogg oga spx ogg
image/x-bmp bmp
image/gif gif
image/jpeg jpeg jpg jpe
image/png png
image/svg+xml svg
image/svg svg
image/tiff tiff tif
image/vnd.djvu djvu
image/x.djvu djvu
image/x-djvu djvu
image/x-portable-pixmap ppm
image/x-xcf xcf
text/plain txt
text/html html htm
video/ogg ogv ogm ogg
video/mpeg mpg mpeg
EOT;

    /**
     * Defines a set of well known MIME info entries
     * This is used as a fallback to mime.info files.
     * An extensive list of well known MIME types is provided by
     * the file mime.info in the includes directory.
     */
    protected static $wellKnownInfo = <<<EOT
application/pdf [OFFICE]
application/vnd.oasis.opendocument.chart [OFFICE]
application/vnd.oasis.opendocument.chart-template [OFFICE]
application/vnd.oasis.opendocument.database [OFFICE]
application/vnd.oasis.opendocument.formula [OFFICE]
application/vnd.oasis.opendocument.formula-template [OFFICE]
application/vnd.oasis.opendocument.graphics [OFFICE]
application/vnd.oasis.opendocument.graphics-template [OFFICE]
application/vnd.oasis.opendocument.image [OFFICE]
application/vnd.oasis.opendocument.image-template [OFFICE]
application/vnd.oasis.opendocument.presentation [OFFICE]
application/vnd.oasis.opendocument.presentation-template [OFFICE]
application/vnd.oasis.opendocument.spreadsheet [OFFICE]
application/vnd.oasis.opendocument.spreadsheet-template [OFFICE]
application/vnd.oasis.opendocument.text [OFFICE]
application/vnd.oasis.opendocument.text-template [OFFICE]
application/vnd.oasis.opendocument.text-master [OFFICE]
application/vnd.oasis.opendocument.text-web [OFFICE]
application/javascript text/javascript application/x-javascript [EXECUTABLE]
application/x-shockwave-flash [MULTIMEDIA]
audio/midi [AUDIO]
audio/x-aiff [AUDIO]
audio/x-wav [AUDIO]
audio/mp3 audio/mpeg [AUDIO]
application/ogg audio/ogg video/ogg [MULTIMEDIA]
image/x-bmp image/x-ms-bmp image/bmp [BITMAP]
image/gif [BITMAP]
image/jpeg [BITMAP]
image/png [BITMAP]
image/svg+xml [DRAWING]
image/tiff [BITMAP]
image/vnd.djvu [BITMAP]
image/x-xcf [BITMAP]
image/x-portable-pixmap [BITMAP]
text/plain [TEXT]
text/html [TEXT]
video/ogg [VIDEO]
video/mpeg [VIDEO]
unknown/unknown application/octet-stream application/x-empty [UNKNOWN]
EOT;

    /**
     * @param array $params Configuration map, includes:
     *   - typeFile: path to file with the list of known MIME types
     *   - infoFile: path to file with the MIME type info
     *   - xmlTypes: map of root element names to XML MIME types
     *   - initCallback: initialization callback that is passed this object [optional]
     *   - detectCallback: alternative to finfo that returns the mime type for a file.
     *      For example, the callback can return the output of "file -bi". [optional]
     *   - guessCallback: callback to improve the guessed MIME type using the file data.
     *      This is intended for fixing mistakes in fileinfo or "detectCallback". [optional]
     *   - extCallback: callback to improve the guessed MIME type using the extension. [optional]
     *   - logger: PSR-3 logger [optional]
     * @note Constructing these instances is expensive due to file reads.
     *  A service or singleton pattern should be used to avoid creating instances again and again.
     */
    public function __construct( array $params ) {
        // ...
    }

    protected function loadFiles() {
        // ...
    }

    public function setLogger( LoggerInterface $logger ) {
        // ...
    }

    /**
     * Adds to the list mapping MIME to file extensions.
     * As an extension author, you are encouraged to submit patches to
     * MediaWiki's core to add new MIME types to mime.types.
     * @param string $types
     */
    public function addExtraTypes( $types ) {
        // ...
    }

    /**
     * Adds to the list mapping MIME to media type.
     * As an extension author, you are encouraged to submit patches to
     * MediaWiki's core to add new MIME info to mime.info.
     * @param string $info
     */
    public function addExtraInfo( $info ) {
        // ...
    }

    /**
     * Returns a list of file extensions for a given MIME type as a space
     * separated string or null if the MIME type was unrecognized. Resolves
     * MIME type aliases.
     *
     * @param string $mime
     * @return string|null
     */
    public function getExtensionsForType( $mime ) {
        // ...
    }

    /**
     * Returns a list of MIME types for a given file extension as a space
     * separated string or null if the extension was unrecognized.
     *
     * @param string $ext
     * @return string|null
     */
    public function getTypesForExtension( $ext ) {
        // ...
    }

    /**
     * Returns a single MIME type for a given file extension or null if unknown.
     * This is always the first type from the list returned by getTypesForExtension($ext).
     *
     * @param string $ext
     * @return string|null
     */
    public function guessTypesForExtension( $ext ) {
        // ...
    }

    /**
     * Tests if the extension matches the given MIME type. Returns true if a
     * match was found, null if the MIME type is unknown, and false if the
     * MIME type is known but no matches where found.
     *
     * @param string $extension
     * @param string $mime
     * @return bool|null
     */
    public function isMatchingExtension( $extension, $mime ) {
        // ...
    }

    /**
     * Returns true if the MIME type is known to represent an image format
     * supported by the PHP GD library.
     *
     * @param string $mime
     *
     * @return bool
     */
    public function isPHPImageType( $mime ) {
        // ...
    }

    /**
     * Returns true if the extension represents a type which can
     * be reliably detected from its content. Use this to determine
     * whether strict content checks should be applied to reject
     * invalid uploads; if we can't identify the type we won't
     * be able to say if it's invalid.
     *
     * @todo Be more accurate when using fancy MIME detector plugins;
     *       right now this is the bare minimum getimagesize() list.
     * @param string $extension
     * @return bool
     */
    function isRecognizableExtension( $extension ) {
        // ...
    }

    /**
     * Improves a MIME type using the file extension. Some file formats are very generic,
     * so their MIME type is not very meaningful. A more useful MIME type can be derived
     * by looking at the file extension. Typically, this method would be called on the
     * result of guessMimeType().
     *
     * @param string $mime The MIME type, typically guessed from a file's content.
     * @param string $ext The file extension, as taken from the file name
     *
     * @return string The MIME type
     */
    public function improveTypeFromExtension( $mime, $ext ) {
        // ...
    }

    /**
     * MIME type detection. This uses detectMimeType to detect the MIME type
     * of the file, but applies additional checks to determine some well known
     * file formats that may be missed or misinterpreted by the default MIME
     * detection (namely XML based formats like XHTML or SVG, as well as ZIP
     * based formats like OPC/ODF files).
     *
     * @param string $file The file to check
     * @param string|bool $ext The file extension, or true (default) to extract
     * it from the filename. Set it to false to ignore the extension. DEPRECATED!
     * Set to false, use improveTypeFromExtension($mime, $ext) later to improve MIME type.
     *
     * @return string The MIME type of $file
     */
    public function guessMimeType( $file, $ext = true ) {
        // ...
    }

    /**
     * Guess the MIME type from the file contents.
     *
     * @todo Remove $ext param
     *
     * @param string $file
     * @param mixed $ext
     * @return bool|string
     * @throws UnexpectedValueException
     */
    private function doGuessMimeType( $file, $ext ) {
        // ...
    }

    /**
     * Detect application-specific file type of a given ZIP file from its
     * header data. Currently works for OpenDocument and OpenXML types...
     * If can't tell, returns 'application/zip'.
     *
     * @param string $header Some reasonably-sized chunk of file header
     * @param string|null $tail The tail of the file
     * @param string|bool $ext The file extension, or true to extract it from the filename.
     * Set it to false (default) to ignore the extension. DEPRECATED! Set to false,
     * use improveTypeFromExtension($mime, $ext) later to improve MIME type.
     *
     * @return string
     */
    function detectZipType( $header, $tail = null, $ext = false ) {
        // ...
    }

    /**
     * Internal MIME type detection. Detection is done using the fileinfo
     * extension if it is available. It can be overriden by callback, which could
     * use an external program, for example. If detection fails and $ext is not false,
     * the MIME type is guessed from the file extension, using guessTypesForExtension.
     *
     * If the MIME type is still unknown, getimagesize is used to detect the
     * MIME type if the file is an image. If no MIME type can be determined,
     * this function returns 'unknown/unknown'.
     *
     * @param string $file The file to check
     * @param string|bool $ext The file extension, or true (default) to extract it from the filename.
     * Set it to false to ignore the extension. DEPRECATED! Set to false, use
     * improveTypeFromExtension($mime, $ext) later to improve MIME type.
     *
     * @return string The MIME type of $file
     */
    private function detectMimeType( $file, $ext = true ) {
        // ...
    }

    /**
     * Determine the media type code for a file, using its MIME type, name and
     * possibly its contents.
     *
     * This function relies on the findMediaType(), mapping extensions and MIME
     * types to media types.
     *
     * @todo analyse file if need be
     * @todo look at multiple extension, separately and together.
     *
     * @param string $path Full path to the image file, in case we have to look at the contents
     * (if null, only the MIME type is used to determine the media type code).
     * @param string $mime MIME type. If null it will be guessed using guessMimeType.
     *
     * @return string A value to be used with the MEDIATYPE_xxx constants.
     */
    function getMediaType( $path = null, $mime = null ) {
        // ...
    }

    /**
     * Returns a media code matching the given MIME type or file extension.
     * File extensions are represented by a string starting with a dot (.) to
     * distinguish them from MIME types.
     *
     * This function relies on the mapping defined by $this->mMediaTypes
     * @access private
     * @param string $extMime
     * @return int|string
     */
    function findMediaType( $extMime ) {
        // ...
    }

    /**
     * Get the MIME types that various versions of Internet Explorer would
     * detect from a chunk of the content.
     *
     * @param string $fileName The file name (unused at present)
     * @param string $chunk The first 256 bytes of the file
     * @param string $proposed The MIME type proposed by the server
     * @return array
     */
    public function getIEMimeTypes( $fileName, $chunk, $proposed ) {
        // ...
    }

    /**
     * Get a cached instance of IEContentAnalyzer
     *
     * @return IEContentAnalyzer
     */
    protected function getIEContentAnalyzer() {
        // ...
    }
}
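As a rough illustration of the Extract Class and Extract Interface advice, note that four fields in the listing share the Callback suffix (initCallback, detectCallback, guessCallback, extCallback). The sketch below groups them into a separate class and adds a narrow interface for callers that only need the extension lookups. The names MimeCallbacks and ExtensionTypeLookup are hypothetical and not part of MediaWiki; the real class would need more care, for example around how each callback is invoked.

```php
<?php
/**
 * Hypothetical extracted class (not part of MediaWiki) that owns the
 * optional callbacks MimeAnalyzer currently keeps in four separate fields.
 */
final class MimeCallbacks {
    /** @var callable[] Configured callbacks, keyed by their config name */
    private $callbacks = [];

    public function __construct( array $params ) {
        $names = [ 'initCallback', 'detectCallback', 'guessCallback', 'extCallback' ];
        foreach ( $names as $name ) {
            if ( isset( $params[$name] ) && is_callable( $params[$name] ) ) {
                $this->callbacks[$name] = $params[$name];
            }
        }
    }

    public function has( string $name ): bool {
        return isset( $this->callbacks[$name] );
    }

    /** Invoke the named callback, or return $default if it was not configured. */
    public function call( string $name, array $args = [], $default = null ) {
        return $this->has( $name ) ? ( $this->callbacks[$name] )( ...$args ) : $default;
    }
}

/**
 * Hypothetical interface for callers that only map between extensions and
 * MIME types; both method names are taken from the listing above.
 */
interface ExtensionTypeLookup {
    public function getTypesForExtension( $ext );
    public function getExtensionsForType( $mime );
}

// Usage: only extCallback is configured, so other lookups fall back to the default.
$callbacks = new MimeCallbacks( [
    'extCallback' => function ( $mime, $ext ) {
        return $ext === 'svg' ? 'image/svg+xml' : $mime;
    },
] );
$improved = $callbacks->call( 'extCallback', [ 'text/plain', 'svg' ] );
$guessed = $callbacks->call( 'guessCallback', [], 'unknown/unknown' );
```

MimeAnalyzer would then hold a single MimeCallbacks instance instead of four fields, and could declare that it implements ExtensionTypeLookup so that callers depend on the smaller contract.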
In PHP, under loose comparison (like ==, or !=, or switch conditions), values of different types might be equal. For string values, the empty string '' is a special case; in particular, comparisons that involve it can give unexpected results.
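A small sketch of the behaviour described above, using only core PHP operators. The empty string compares loosely equal to false and null, a non-empty string such as 'ab' does not, and strict comparison (===) never treats values of different types as equal:

```php
<?php
// Each entry pairs a comparison (as text) with its actual result.
$results = [
    "'' == false"   => '' == false,    // true: '' is converted to false
    "'' == null"    => '' == null,     // true: null is treated like ''
    "'ab' == false" => 'ab' == false,  // false
    "'ab' == null"  => 'ab' == null,   // false
    "'' === false"  => '' === false,   // false: different types
    "'' === null"   => '' === null,    // false: different types
];

foreach ( $results as $expression => $value ) {
    echo $expression, ' => ', var_export( $value, true ), "\n";
}
```

Comparisons between strings and numbers, such as '' == 0, are left out on purpose because their result differs between PHP 7 and PHP 8. Where the type of a value is known, strict comparison is usually the safer choice.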