Duplicate code is one of the most pungent code smells. A common rule of thumb is to restructure code once it is duplicated in three or more places.
Common duplication problems and their corresponding solutions are:
Complex classes like MimeAnalyzer often do a lot of different things. To break such a class down, we need to identify a cohesive component within that class. A common approach to finding such a component is to look for fields or methods that share the same prefix or suffix. In MimeAnalyzer, for example, $initCallback, $detectCallback, $guessCallback and $extCallback share a suffix, and $extraTypes and $extraInfo share a prefix. You can also look at the cohesion graph to spot any unconnected or weakly connected components.
Once you have determined the fields that belong together, you can apply the Extract Class refactoring. If the component makes sense as a subclass, Extract Subclass is also a candidate, and is often faster.
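To make the idea concrete, here is a minimal sketch of Extract Class applied to the four fields that share the Callback suffix. The names MimeCallbacks, SlimMimeAnalyzer and applyExtCallback are hypothetical, and the callback signature used here is an assumption; none of this is MediaWiki's actual code.

```php
<?php
// Hypothetical class that takes over the four *Callback fields.
// Assumption: the callbacks arrive in the same $params map as before.
final class MimeCallbacks {
    /** @var array Map of short callback name to callable or null */
    private $callbacks = [];

    public function __construct( array $params ) {
        foreach ( [ 'init', 'detect', 'guess', 'ext' ] as $name ) {
            $this->callbacks[$name] = $params[$name . 'Callback'] ?? null;
        }
    }

    public function has( string $name ): bool {
        return isset( $this->callbacks[$name] );
    }

    /** Calls the named callback, or returns $default if it is not set. */
    public function call( string $name, array $args, $default = null ) {
        if ( !$this->has( $name ) ) {
            return $default;
        }
        return call_user_func_array( $this->callbacks[$name], $args );
    }
}

// Hypothetical slimmed-down analyzer: one collaborator instead of four fields.
class SlimMimeAnalyzer {
    /** @var MimeCallbacks */
    private $callbacks;

    public function __construct( array $params ) {
        $this->callbacks = new MimeCallbacks( $params );
    }

    public function applyExtCallback( string $mime, string $ext ): string {
        // Keep the type unchanged when no extCallback is configured.
        return $this->callbacks->call( 'ext', [ $mime, $ext ], $mime );
    }
}

$analyzer = new SlimMimeAnalyzer( [
    // Assumed callback signature for this sketch: ( $mime, $ext ) => string
    'extCallback' => function ( $mime, $ext ) {
        return ( $mime === 'application/ogg' && $ext === 'ogv' ) ? 'video/ogg' : $mime;
    },
] );
echo $analyzer->applyExtCallback( 'application/ogg', 'ogv' ), "\n";
```

The point of the sketch is that the "is a callback set, and if so call it" logic lives in one place, so the analyzer no longer needs to know how callbacks are stored.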
While breaking up the class, it is a good idea to analyze how other classes use MimeAnalyzer and, based on these observations, to apply Extract Interface as well.
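A hedged sketch of Extract Interface might look as follows. It assumes that some callers only need the extension lookups; the interface name ExtensionTypeLookup, the array-backed ArrayTypeLookup and the helper describeExtension are hypothetical, while the two method names are taken from the MimeAnalyzer listing.

```php
<?php
// Hypothetical narrow interface: only the lookups such a caller needs.
interface ExtensionTypeLookup {
    /** @return string|null Space separated extensions, or null if unknown */
    public function getExtensionsForType( $mime );

    /** @return string|null Space separated MIME types, or null if unknown */
    public function getTypesForExtension( $ext );
}

// Stand-in implementation for this sketch. The real class would instead
// add the interface to its "implements" list.
class ArrayTypeLookup implements ExtensionTypeLookup {
    private $mimeToExt;
    private $extToMime = [];

    public function __construct( array $mimeToExt ) {
        $this->mimeToExt = $mimeToExt;
        // Build the reverse map, keeping the order in which types appear.
        foreach ( $mimeToExt as $mime => $exts ) {
            foreach ( explode( ' ', $exts ) as $ext ) {
                $this->extToMime[$ext][] = $mime;
            }
        }
    }

    public function getExtensionsForType( $mime ) {
        return $this->mimeToExt[$mime] ?? null;
    }

    public function getTypesForExtension( $ext ) {
        return isset( $this->extToMime[$ext] )
            ? implode( ' ', $this->extToMime[$ext] )
            : null;
    }
}

// A caller now depends on the interface, not on the whole analyzer.
function describeExtension( ExtensionTypeLookup $lookup, $ext ) {
    $types = $lookup->getTypesForExtension( $ext );
    return $types === null ? "$ext: unknown" : "$ext: $types";
}

$lookup = new ArrayTypeLookup( [
    'audio/ogg' => 'oga spx ogg',
    'video/ogg' => 'ogv ogm ogg',
] );
echo describeExtension( $lookup, 'ogg' ), "\n";
echo describeExtension( $lookup, 'xyz' ), "\n";
```

Callers typed against the small interface can be tested with a stand-in like the one above, without paying for the file reads that the real constructor performs.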
```php
<?php
// ...

class MimeAnalyzer implements LoggerAwareInterface {
    /** @var string */
    protected $typeFile;
    /** @var string */
    protected $infoFile;
    /** @var string */
    protected $xmlTypes;
    /** @var callable */
    protected $initCallback;
    /** @var callable */
    protected $detectCallback;
    /** @var callable */
    protected $guessCallback;
    /** @var callable */
    protected $extCallback;
    /** @var array Mapping of media types to arrays of MIME types */
    protected $mediaTypes = null;
    /** @var array Map of MIME type aliases */
    protected $mimeTypeAliases = null;
    /** @var array Map of MIME types to file extensions (as a space separated list) */
    protected $mimetoExt = null;

    /** @var array Map of file extensions types to MIME types (as a space separated list) */
    public $mExtToMime = null; // legacy name; field accessed by hooks

    /** @var IEContentAnalyzer */
    protected $IEAnalyzer;

    /** @var string Extra MIME types, set for example by media handling extensions */
    private $extraTypes = '';
    /** @var string Extra MIME info, set for example by media handling extensions */
    private $extraInfo = '';

    /** @var LoggerInterface */
    private $logger;

    /**
     * Defines a set of well known MIME types
     * This is used as a fallback to mime.types files.
     * An extensive list of well known MIME types is provided by
     * the file mime.types in the includes directory.
     *
     * This list concatenated with mime.types is used to create a MIME <-> ext
     * map. Each line contains a MIME type followed by a space separated list of
     * extensions. If multiple extensions for a single MIME type exist or if
     * multiple MIME types exist for a single extension then in most cases
     * MediaWiki assumes that the first extension following the MIME type is the
     * canonical extension, and the first time a MIME type appears for a certain
     * extension is considered the canonical MIME type.
     *
     * (Note that appending the type file list to the end of self::$wellKnownTypes
     * sucks because you can't redefine canonical types. This could be fixed by
     * appending self::$wellKnownTypes behind type file list, but who knows
     * what will break? In practice this probably isn't a problem anyway -- Bryan)
     */
    protected static $wellKnownTypes = <<<EOT
application/ogg ogx ogg ogm ogv oga spx
application/pdf pdf
application/vnd.oasis.opendocument.chart odc
application/vnd.oasis.opendocument.chart-template otc
application/vnd.oasis.opendocument.database odb
application/vnd.oasis.opendocument.formula odf
application/vnd.oasis.opendocument.formula-template otf
application/vnd.oasis.opendocument.graphics odg
application/vnd.oasis.opendocument.graphics-template otg
application/vnd.oasis.opendocument.image odi
application/vnd.oasis.opendocument.image-template oti
application/vnd.oasis.opendocument.presentation odp
application/vnd.oasis.opendocument.presentation-template otp
application/vnd.oasis.opendocument.spreadsheet ods
application/vnd.oasis.opendocument.spreadsheet-template ots
application/vnd.oasis.opendocument.text odt
application/vnd.oasis.opendocument.text-master otm
application/vnd.oasis.opendocument.text-template ott
application/vnd.oasis.opendocument.text-web oth
application/javascript js
application/x-shockwave-flash swf
audio/midi mid midi kar
audio/mpeg mpga mpa mp2 mp3
audio/x-aiff aif aiff aifc
audio/x-wav wav
audio/ogg oga spx ogg
image/x-bmp bmp
image/gif gif
image/jpeg jpeg jpg jpe
image/png png
image/svg+xml svg
image/svg svg
image/tiff tiff tif
image/vnd.djvu djvu
image/x.djvu djvu
image/x-djvu djvu
image/x-portable-pixmap ppm
image/x-xcf xcf
text/plain txt
text/html html htm
video/ogg ogv ogm ogg
video/mpeg mpg mpeg
EOT;

    /**
     * Defines a set of well known MIME info entries
     * This is used as a fallback to mime.info files.
     * An extensive list of well known MIME types is provided by
     * the file mime.info in the includes directory.
     */
    protected static $wellKnownInfo = <<<EOT
application/pdf [OFFICE]
application/vnd.oasis.opendocument.chart [OFFICE]
application/vnd.oasis.opendocument.chart-template [OFFICE]
application/vnd.oasis.opendocument.database [OFFICE]
application/vnd.oasis.opendocument.formula [OFFICE]
application/vnd.oasis.opendocument.formula-template [OFFICE]
application/vnd.oasis.opendocument.graphics [OFFICE]
application/vnd.oasis.opendocument.graphics-template [OFFICE]
application/vnd.oasis.opendocument.image [OFFICE]
application/vnd.oasis.opendocument.image-template [OFFICE]
application/vnd.oasis.opendocument.presentation [OFFICE]
application/vnd.oasis.opendocument.presentation-template [OFFICE]
application/vnd.oasis.opendocument.spreadsheet [OFFICE]
application/vnd.oasis.opendocument.spreadsheet-template [OFFICE]
application/vnd.oasis.opendocument.text [OFFICE]
application/vnd.oasis.opendocument.text-template [OFFICE]
application/vnd.oasis.opendocument.text-master [OFFICE]
application/vnd.oasis.opendocument.text-web [OFFICE]
application/javascript text/javascript application/x-javascript [EXECUTABLE]
application/x-shockwave-flash [MULTIMEDIA]
audio/midi [AUDIO]
audio/x-aiff [AUDIO]
audio/x-wav [AUDIO]
audio/mp3 audio/mpeg [AUDIO]
application/ogg audio/ogg video/ogg [MULTIMEDIA]
image/x-bmp image/x-ms-bmp image/bmp [BITMAP]
image/gif [BITMAP]
image/jpeg [BITMAP]
image/png [BITMAP]
image/svg+xml [DRAWING]
image/tiff [BITMAP]
image/vnd.djvu [BITMAP]
image/x-xcf [BITMAP]
image/x-portable-pixmap [BITMAP]
text/plain [TEXT]
text/html [TEXT]
video/ogg [VIDEO]
video/mpeg [VIDEO]
unknown/unknown application/octet-stream application/x-empty [UNKNOWN]
EOT;

    /**
     * @param array $params Configuration map, includes:
     *   - typeFile: path to file with the list of known MIME types
     *   - infoFile: path to file with the MIME type info
     *   - xmlTypes: map of root element names to XML MIME types
     *   - initCallback: initialization callback that is passed this object [optional]
     *   - detectCallback: alternative to finfo that returns the mime type for a file.
     *      For example, the callback can return the output of "file -bi". [optional]
     *   - guessCallback: callback to improve the guessed MIME type using the file data.
     *      This is intended for fixing mistakes in fileinfo or "detectCallback". [optional]
     *   - extCallback: callback to improve the guessed MIME type using the extension. [optional]
     *   - logger: PSR-3 logger [optional]
     * @note Constructing these instances is expensive due to file reads.
     *  A service or singleton pattern should be used to avoid creating instances again and again.
     */
    public function __construct( array $params ) { /* ... */ }

    protected function loadFiles() { /* ... */ }

    public function setLogger( LoggerInterface $logger ) { /* ... */ }

    /**
     * Adds to the list mapping MIME to file extensions.
     * As an extension author, you are encouraged to submit patches to
     * MediaWiki's core to add new MIME types to mime.types.
     * @param string $types
     */
    public function addExtraTypes( $types ) { /* ... */ }

    /**
     * Adds to the list mapping MIME to media type.
     * As an extension author, you are encouraged to submit patches to
     * MediaWiki's core to add new MIME info to mime.info.
     * @param string $info
     */
    public function addExtraInfo( $info ) { /* ... */ }

    /**
     * Returns a list of file extensions for a given MIME type as a space
     * separated string or null if the MIME type was unrecognized. Resolves
     * MIME type aliases.
     *
     * @param string $mime
     * @return string|null
     */
    public function getExtensionsForType( $mime ) { /* ... */ }

    /**
     * Returns a list of MIME types for a given file extension as a space
     * separated string or null if the extension was unrecognized.
     *
     * @param string $ext
     * @return string|null
     */
    public function getTypesForExtension( $ext ) { /* ... */ }

    /**
     * Returns a single MIME type for a given file extension or null if unknown.
     * This is always the first type from the list returned by getTypesForExtension($ext).
     *
     * @param string $ext
     * @return string|null
     */
    public function guessTypesForExtension( $ext ) { /* ... */ }

    /**
     * Tests if the extension matches the given MIME type. Returns true if a
     * match was found, null if the MIME type is unknown, and false if the
     * MIME type is known but no matches where found.
     *
     * @param string $extension
     * @param string $mime
     * @return bool|null
     */
    public function isMatchingExtension( $extension, $mime ) { /* ... */ }

    /**
     * Returns true if the MIME type is known to represent an image format
     * supported by the PHP GD library.
     *
     * @param string $mime
     *
     * @return bool
     */
    public function isPHPImageType( $mime ) { /* ... */ }

    /**
     * Returns true if the extension represents a type which can
     * be reliably detected from its content. Use this to determine
     * whether strict content checks should be applied to reject
     * invalid uploads; if we can't identify the type we won't
     * be able to say if it's invalid.
     *
     * @todo Be more accurate when using fancy MIME detector plugins;
     * right now this is the bare minimum getimagesize() list.
     * @param string $extension
     * @return bool
     */
    function isRecognizableExtension( $extension ) { /* ... */ }

    /**
     * Improves a MIME type using the file extension. Some file formats are very generic,
     * so their MIME type is not very meaningful. A more useful MIME type can be derived
     * by looking at the file extension. Typically, this method would be called on the
     * result of guessMimeType().
     *
     * @param string $mime The MIME type, typically guessed from a file's content.
     * @param string $ext The file extension, as taken from the file name
     *
     * @return string The MIME type
     */
    public function improveTypeFromExtension( $mime, $ext ) { /* ... */ }

    /**
     * MIME type detection. This uses detectMimeType to detect the MIME type
     * of the file, but applies additional checks to determine some well known
     * file formats that may be missed or misinterpreted by the default MIME
     * detection (namely XML based formats like XHTML or SVG, as well as ZIP
     * based formats like OPC/ODF files).
     *
     * @param string $file The file to check
     * @param string|bool $ext The file extension, or true (default) to extract
     * it from the filename. Set it to false to ignore the extension. DEPRECATED!
     * Set to false, use improveTypeFromExtension($mime, $ext) later to improve MIME type.
     *
     * @return string The MIME type of $file
     */
    public function guessMimeType( $file, $ext = true ) { /* ... */ }

    /**
     * Guess the MIME type from the file contents.
     *
     * @todo Remove $ext param
     *
     * @param string $file
     * @param mixed $ext
     * @return bool|string
     * @throws UnexpectedValueException
     */
    private function doGuessMimeType( $file, $ext ) { /* ... */ }

    /**
     * Detect application-specific file type of a given ZIP file from its
     * header data. Currently works for OpenDocument and OpenXML types...
     * If can't tell, returns 'application/zip'.
     *
     * @param string $header Some reasonably-sized chunk of file header
     * @param string|null $tail The tail of the file
     * @param string|bool $ext The file extension, or true to extract it from the filename.
     * Set it to false (default) to ignore the extension. DEPRECATED! Set to false,
     * use improveTypeFromExtension($mime, $ext) later to improve MIME type.
     *
     * @return string
     */
    function detectZipType( $header, $tail = null, $ext = false ) { /* ... */ }

    /**
     * Internal MIME type detection. Detection is done using the fileinfo
     * extension if it is available. It can be overriden by callback, which could
     * use an external program, for example. If detection fails and $ext is not false,
     * the MIME type is guessed from the file extension, using guessTypesForExtension.
     *
     * If the MIME type is still unknown, getimagesize is used to detect the
     * MIME type if the file is an image. If no MIME type can be determined,
     * this function returns 'unknown/unknown'.
     *
     * @param string $file The file to check
     * @param string|bool $ext The file extension, or true (default) to extract it from the filename.
     * Set it to false to ignore the extension. DEPRECATED! Set to false, use
     * improveTypeFromExtension($mime, $ext) later to improve MIME type.
     *
     * @return string The MIME type of $file
     */
    private function detectMimeType( $file, $ext = true ) { /* ... */ }

    /**
     * Determine the media type code for a file, using its MIME type, name and
     * possibly its contents.
     *
     * This function relies on the findMediaType(), mapping extensions and MIME
     * types to media types.
     *
     * @todo analyse file if need be
     * @todo look at multiple extension, separately and together.
     *
     * @param string $path Full path to the image file, in case we have to look at the contents
     * (if null, only the MIME type is used to determine the media type code).
     * @param string $mime MIME type. If null it will be guessed using guessMimeType.
     *
     * @return string A value to be used with the MEDIATYPE_xxx constants.
     */
    function getMediaType( $path = null, $mime = null ) { /* ... */ }

    /**
     * Returns a media code matching the given MIME type or file extension.
     * File extensions are represented by a string starting with a dot (.) to
     * distinguish them from MIME types.
     *
     * This function relies on the mapping defined by $this->mMediaTypes
     * @access private
     * @param string $extMime
     * @return int|string
     */
    function findMediaType( $extMime ) { /* ... */ }

    /**
     * Get the MIME types that various versions of Internet Explorer would
     * detect from a chunk of the content.
     *
     * @param string $fileName The file name (unused at present)
     * @param string $chunk The first 256 bytes of the file
     * @param string $proposed The MIME type proposed by the server
     * @return array
     */
    public function getIEMimeTypes( $fileName, $chunk, $proposed ) { /* ... */ }

    /**
     * Get a cached instance of IEContentAnalyzer
     *
     * @return IEContentAnalyzer
     */
    protected function getIEContentAnalyzer() { /* ... */ }
}
```
In PHP, under loose comparison (like ==, or !=, or switch conditions), values of different types might be equal. For string values, the empty string '' is a special case, in particular the following results might be unexpected:
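As a hedged illustration, a few such comparisons can be checked directly. The cases follow PHP's documented comparison rules; the helper function classify is hypothetical and exists only to show the switch case.

```php
<?php
// Loose comparison: the empty string compares equal to false and to null.
var_dump( '' == false );   // bool(true)
var_dump( '' == null );    // bool(true)

// A non-empty string does not.
var_dump( 'ab' == false ); // bool(false)
var_dump( 'ab' == null );  // bool(false)

// Strict comparison also compares the types, so neither of these matches.
var_dump( '' === false );  // bool(false)
var_dump( '' === null );   // bool(false)

// switch uses loose comparison, so an empty string falls into "case null".
function classify( $value ) {
    switch ( $value ) {
        case null:
            return 'matched null';
        default:
            return 'no match';
    }
}
echo classify( '' ), "\n";   // matched null
echo classify( 'ab' ), "\n"; // no match
```

When a field such as $mediaTypes defaults to null while others default to '', it is often safer to use === or !== so that the two states are not confused.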