Completed
Push — master ( 76ad1f...386b33 )
by Greg
05:49
created

Extract::archiveType()   C

Complexity

Conditions 12
Paths 84

Size

Total Lines 62

Duplication

Lines 0
Ratio 0 %

Importance

Changes 0
Metric Value
dl 0
loc 62
rs 6.4024
c 0
b 0
f 0
cc 12
nc 84
nop 1

How to fix

Long Method

Small methods make your code easier to understand, in particular if combined with a good name. Besides, if your method is small, finding a good name is usually much easier.

For example, if you find yourself adding comments to a method's body, this is usually a good sign to extract the commented part to a new method, and use the comment as a starting point when coming up with a good name for this new method.

A commonly applied refactoring here is Extract Method.
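As a rough illustration of extracting commented parts into named methods, the sketch below splits the two fallback strategies of archiveType() (magic header bytes, then file extension) into separate functions. The function names archiveTypeFromMagicBytes and archiveTypeFromExtension are hypothetical and do not exist in Robo; the length check on the bytes read is also an addition that the listing does not have.

```php
<?php

/**
 * Hypothetical helper: guess the MIME type from the first two bytes.
 * Returns false when the file cannot be read or the signature is unknown.
 */
function archiveTypeFromMagicBytes(string $filename)
{
    $file = @fopen($filename, 'rb');
    if ($file === false) {
        return false;
    }
    $first = fread($file, 2);
    fclose($file);
    // Assumption: treat files shorter than two bytes as unknown.
    if ($first === false || strlen($first) < 2) {
        return false;
    }
    // Interpret the two bytes as a little-endian 16-bit unsigned int.
    $signature = unpack('v', $first)[1];
    $known = [
        0x8b1f => 'application/x-gzip',  // bytes 0x1f, 0x8b
        0x4b50 => 'application/zip',     // bytes 'P', 'K'
        0x5a42 => 'application/x-bzip2', // bytes 'B', 'Z'
    ];

    return $known[$signature] ?? false;
}

/**
 * Hypothetical helper: guess the MIME type from the file extension.
 */
function archiveTypeFromExtension(string $filename)
{
    // Remove querystring from the filename, if present.
    $filename = basename(current(explode('?', $filename, 2)));
    $extensions = [
        '.tar.gz' => 'application/x-gzip',
        '.tgz' => 'application/x-gzip',
        '.tar' => 'application/x-tar',
    ];
    foreach ($extensions as $extension => $type) {
        if (substr($filename, -strlen($extension)) === $extension) {
            return $type;
        }
    }

    return false;
}
```

With helpers like these, archiveType() would only need to try each strategy in turn and return the first result that is not false, which should reduce the number of conditions and paths in that one method.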

<?php

namespace Robo\Task\Archive;

use Robo\Result;
use Robo\Task\BaseTask;
use Robo\Task\Filesystem\FilesystemStack;
use Robo\Task\Filesystem\DeleteDir;
use Robo\Contract\BuilderAwareInterface;
use Robo\Common\BuilderAwareTrait;

/**
 * Extracts an archive.
 *
 * Note that often, distributions are packaged in tar or zip archives
 * where the topmost folder may contain variable information, such as
 * the release date, or the version of the package.  This information
 * is very useful when unpacking by hand, but arbitrarily-named directories
 * are much less useful to scripts.  Therefore, by default, Extract will
 * remove the top-level directory, and instead store all extracted files
 * into the directory specified by $archivePath.
 *
 * To keep the top-level directory when extracting, use
 * `preserveTopDirectory(true)`.
 *
 * ``` php
 * <?php
 * $this->taskExtract($archivePath)
 *  ->to($destination)
 *  ->preserveTopDirectory(false) // the default
 *  ->run();
 * ?>
 * ```
 */
class Extract extends BaseTask implements BuilderAwareInterface
{
    use BuilderAwareTrait;

    /**
     * @var string
     */
    protected $filename;

    /**
     * @var string
     */
    protected $to;

    /**
     * @var bool
     */
    private $preserveTopDirectory = false;

    /**
     * @param string $filename
     */
    public function __construct($filename)
    {
        $this->filename = $filename;
    }

    /**
     * Location to store extracted files.
     *
     * @param string $to
     *
     * @return $this
     */
    public function to($to)
    {
        $this->to = $to;
        return $this;
    }

    /**
     * @param bool $preserve
     *
     * @return $this
     */
    public function preserveTopDirectory($preserve = true)
    {
        $this->preserveTopDirectory = $preserve;
        return $this;
    }

    /**
     * {@inheritdoc}
     */
    public function run()
    {
        if (!file_exists($this->filename)) {
Duplication introduced by
This code seems to be duplicated across your project.

Duplicated code is one of the most pungent code smells. If you need to duplicate the same code in three or more different places, we strongly encourage you to look into extracting the code into a single class or operation.

You can also find more detailed suggestions in the “Code” section of your repository.

            $this->printTaskError("File {filename} does not exist", ['filename' => $this->filename]);

            return false;
Bug Best Practice introduced by
The return type of return false; (false) is incompatible with the return type declared by the interface Robo\Contract\TaskInterface::run of type Robo\Result.

If you return a value from a function or method, it should be a sub-type of the type declared by the parent type, e.g. an interface or abstract method. This is more formally defined by the Liskov substitution principle, and guarantees that classes that depend on the parent type can use any instance of a child type interchangeably. This principle also belongs to the SOLID principles for object-oriented design.

Let’s take a look at an example:

class Author {
    private $name;

    public function __construct($name) {
        $this->name = $name;
    }

    public function getName() {
        return $this->name;
    }
}

abstract class Post {
    public function getAuthor() {
        return 'Johannes';
    }
}

class BlogPost extends Post {
    public function getAuthor() {
        return new Author('Johannes');
    }
}

class ForumPost extends Post { /* ... */ }

function my_function(Post $post) {
    echo strtoupper($post->getAuthor());
}

Our function my_function expects a Post object, and outputs the author of the post. The base class Post returns a simple string and outputting a simple string will work just fine. However, the child class BlogPost which is a sub-type of Post instead decided to return an object, and is therefore violating the SOLID principles. If a BlogPost were passed to my_function, PHP would not complain, but ultimately fail when executing the strtoupper call in its body.
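Applied to the flagged `return false;`, one possible fix is to return an error result object instead of a bare boolean, much as extractZip() and extractTar() in the same class do with Result::error(). The sketch below is self-contained, so it uses hypothetical stand-ins (TaskResult, Task, ExtractSketch) rather than the real Robo\Result and Robo\Contract\TaskInterface; their shape here is an assumption for illustration only.

```php
<?php

// Hypothetical stand-in for a result type such as Robo\Result.
final class TaskResult
{
    private $successful;
    private $message;

    private function __construct(bool $successful, string $message)
    {
        $this->successful = $successful;
        $this->message = $message;
    }

    public static function error(string $message): self
    {
        return new self(false, $message);
    }

    public static function success(string $message = ''): self
    {
        return new self(true, $message);
    }

    public function wasSuccessful(): bool
    {
        return $this->successful;
    }

    public function getMessage(): string
    {
        return $this->message;
    }
}

// Hypothetical stand-in for the task interface whose run() declares a result type.
interface Task
{
    public function run(): TaskResult;
}

final class ExtractSketch implements Task
{
    private $filename;

    public function __construct(string $filename)
    {
        $this->filename = $filename;
    }

    public function run(): TaskResult
    {
        if (!file_exists($this->filename)) {
            // Return an error result rather than a bare false, so callers
            // can always rely on the declared return type.
            return TaskResult::error("File {$this->filename} does not exist");
        }

        return TaskResult::success();
    }
}
```

Callers can then check `wasSuccessful()` on every code path without first testing whether they received a boolean.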

        }
        if (!($mimetype = static::archiveType($this->filename))) {
            $this->printTaskError("Could not determine type of archive for {filename}", ['filename' => $this->filename]);

            return false;
Bug Best Practice introduced by
The return type of return false; (false) is incompatible with the return type declared by the interface Robo\Contract\TaskInterface::run of type Robo\Result.

        }

        // We will first extract to $extractLocation and then move to $this->to
        $extractLocation = static::getTmpDir();
        @mkdir($extractLocation);
Security Best Practice introduced by
It seems like you do not handle an error condition here. This can introduce security issues, and is generally not recommended.

If you suppress an error, we recommend checking for the error condition explicitly:

// For example instead of
@mkdir($dir);

// Better use
if (@mkdir($dir) === false) {
    throw new \RuntimeException('The directory '.$dir.' could not be created.');
}
        @mkdir(dirname($this->to));
Security Best Practice introduced by
It seems like you do not handle an error condition here. This can introduce security issues, and is generally not recommended.

        $this->startTimer();

        $this->printTaskInfo("Extracting {filename}", ['filename' => $this->filename]);

        $result = $this->extractAppropriateType($mimetype, $extractLocation);
        if ($result->wasSuccessful()) {
            $this->printTaskInfo("{filename} extracted", ['filename' => $this->filename]);
            // Now, we want to move the extracted files to $this->to. There
            // are two possibilities that we must consider:
            //
            // (1) Archived files were encapsulated in a folder with an arbitrary name
            // (2) There was no encapsulating folder, and all the files in the archive
            //     were extracted into $extractLocation
            //
            // In the case of (1), we want to move and rename the encapsulating folder
            // to $this->to.
            //
            // In the case of (2), we will just move and rename $extractLocation.
            $filesInExtractLocation = glob("$extractLocation/*");
            $hasEncapsulatingFolder = ((count($filesInExtractLocation) == 1) && is_dir($filesInExtractLocation[0]));
            if ($hasEncapsulatingFolder && !$this->preserveTopDirectory) {
                $result = (new FilesystemStack())
                    ->inflect($this)
                    ->rename($filesInExtractLocation[0], $this->to)
                    ->run();
                (new DeleteDir($extractLocation))
                    ->inflect($this)
                    ->run();
            } else {
                $result = (new FilesystemStack())
                    ->inflect($this)
                    ->rename($extractLocation, $this->to)
                    ->run();
            }
        }
        $this->stopTimer();
        $result['time'] = $this->getExecutionTime();

        return $result;
    }

    /**
     * @param string $mimetype
     * @param string $extractLocation
     *
     * @return \Robo\Result
     */
    protected function extractAppropriateType($mimetype, $extractLocation)
    {
        // Perform the extraction of a zip file.
        if (($mimetype == 'application/zip') || ($mimetype == 'application/x-zip')) {
            return $this->extractZip($extractLocation);
        }
        return $this->extractTar($extractLocation);
    }

    /**
     * @param string $extractLocation
     *
     * @return \Robo\Result
     */
    protected function extractZip($extractLocation)
    {
        if (!extension_loaded('zlib')) {
            return Result::errorMissingExtension($this, 'zlib', 'zip extracting');
        }

        $zip = new \ZipArchive();
        if (($status = $zip->open($this->filename)) !== true) {
            return Result::error($this, "Could not open zip archive {$this->filename}");
        }
        if (!$zip->extractTo($extractLocation)) {
            return Result::error($this, "Could not extract zip archive {$this->filename}");
        }
        $zip->close();

        return Result::success($this);
    }

    /**
     * @param string $extractLocation
     *
     * @return \Robo\Result
     */
    protected function extractTar($extractLocation)
    {
        if (!class_exists('Archive_Tar')) {
            return Result::errorMissingPackage($this, 'Archive_Tar', 'pear/archive_tar');
        }
        $tar_object = new \Archive_Tar($this->filename);
        if (!$tar_object->extract($extractLocation)) {
Bug Best Practice introduced by
The expression $tar_object->extract($extractLocation) of type null|boolean is loosely compared to false; this is ambiguous if the boolean can be false. You might want to explicitly use !== null instead.

If an expression can have both false and null as possible values, it is generally good practice to always use strict comparison to clearly distinguish between those two values.

$a = canBeFalseAndNull();

// Instead of
if ( ! $a) { }

// Better use one of the explicit versions:
if ($a !== null) { }
if ($a !== false) { }
if ($a !== null && $a !== false) { }
            return Result::error($this, "Could not extract tar archive {$this->filename}");
        }

        return Result::success($this);
    }

    /**
     * @param string $filename
     *
     * @return bool|string
     */
    protected static function archiveType($filename)
    {
        $content_type = false;
        if (class_exists('finfo')) {
            $finfo = new \finfo(FILEINFO_MIME_TYPE);
            $content_type = $finfo->file($filename);
            // If finfo cannot determine the content type, then we will try other methods
            if ($content_type == 'application/octet-stream') {
                $content_type = false;
            }
        }
        // Examining the file's magic header bytes.
        if (!$content_type) {
Bug Best Practice introduced by
The expression $content_type of type string|false is loosely compared to false; this is ambiguous if the string can be empty. You might want to explicitly use === false instead.

In PHP, under loose comparison (like ==, or !=, or switch conditions), values of different types might be equal.

For string values, the empty string '' is a special case, in particular the following results might be unexpected:

''   == false // true
''   == null  // true
'ab' == false // false
'ab' == null  // false

// It is often better to use strict comparison
'' === false // false
'' === null  // false
            if ($file = fopen($filename, 'rb')) {
                $first = fread($file, 2);
                fclose($file);
                if ($first !== false) {
                    // Interpret the two bytes as a little endian 16-bit unsigned int.
                    $data = unpack('v', $first);
                    switch ($data[1]) {
                        case 0x8b1f:
                            // First two bytes of gzip files are 0x1f, 0x8b (little-endian).
                            // See http://www.gzip.org/zlib/rfc-gzip.html#header-trailer
                            $content_type = 'application/x-gzip';
                            break;

                        case 0x4b50:
                            // First two bytes of zip files are 0x50, 0x4b ('PK') (little-endian).
                            // See http://en.wikipedia.org/wiki/Zip_(file_format)#File_headers
                            $content_type = 'application/zip';
                            break;

                        case 0x5a42:
                            // First two bytes of bzip2 files are 0x42, 0x5a ('BZ') (little-endian).
                            // See http://en.wikipedia.org/wiki/Bzip2#File_format
                            $content_type = 'application/x-bzip2';
                            break;
                    }
                }
            }
        }
        // 3. Lastly, if the above methods didn't work, try to guess the mime type from
        // the file extension. This is useful if the file has no identifiable magic
        // header bytes (for example tarballs).
        if (!$content_type) {
Bug Best Practice introduced by
The expression $content_type of type string|false is loosely compared to false; this is ambiguous if the string can be empty. You might want to explicitly use === false instead.

            // Remove querystring from the filename, if present.
            $filename = basename(current(explode('?', $filename, 2)));
            $extension_mimetype = array(
                '.tar.gz' => 'application/x-gzip',
                '.tgz' => 'application/x-gzip',
                '.tar' => 'application/x-tar',
            );
            foreach ($extension_mimetype as $extension => $ct) {
                if (substr($filename, -strlen($extension)) === $extension) {
                    $content_type = $ct;
                    break;
                }
            }
        }

        return $content_type;
    }

    /**
     * @return string
     */
    protected static function getTmpDir()
    {
        return getcwd() . '/tmp' . rand() . time();
    }
}