Conditions | 29 |
Total Lines | 95 |
Lines | 0 |
Ratio | 0 % |
Changes | 0 |
Small methods make your code easier to understand, in particular if combined with a good name. Besides, if your method is small, finding a good name is usually much easier.
For example, if you find yourself adding comments to a method's body, this is usually a good sign to extract the commented part to a new method, and use the comment as a starting point when coming up with a good name for this new method.
Commonly applied refactorings include:
If many parameters/temporary variables are present:
Complex classes like unpack_if_needed() often do a lot of different things. To break such a class down, we need to identify a cohesive component within that class. A common approach to find such a component is to look for fields/methods that share the same prefixes, or suffixes.
Once you have determined the fields that belong together, you can apply the Extract Class refactoring. If the component makes sense as a sub-class, Extract Subclass is also a candidate, and is often faster.
1 | ''' |
||
17 | def unpack_if_needed(destination_path, fpath): |
||
18 | ''' |
||
19 | fpath is the fully qualified path to a single file that |
||
20 | might be a ZIP / TGZ archive. |
||
21 | |||
22 | The function moves the file, or the content if it is an |
||
23 | archive, to the directory given by destination_path. |
||
24 | |||
25 | The function returns two values. The first one is a |
||
26 | directory name if: |
||
27 | |||
28 | - fpath is an archive. |
||
29 | - The archive contains only one this single directory with |
||
30 | arbitrary content. |
||
31 | |||
32 | Otherwise, it is zero. |
||
33 | |||
34 | This is helpful in catching the typical "right-click to compress" |
||
35 | cases for single ZIP files in Explorer / Finder. |
||
36 | |||
37 | The second return value is a boolean indicating if |
||
38 | fpath was an archive. |
||
39 | |||
40 | ''' |
||
41 | single_dir = None |
||
42 | did_unpack = False |
||
43 | |||
44 | dircontent = os.listdir(destination_path) |
||
45 | logger.debug("Content of %s before unarchiving: %s" % |
||
46 | (destination_path, str(dircontent))) |
||
47 | |||
48 | # Perform un-archiving, in case |
||
49 | if zipfile.is_zipfile(fpath): |
||
50 | logger.debug("Detected ZIP file at %s, unpacking it." % (fpath)) |
||
51 | did_unpack = True |
||
52 | with zipfile.ZipFile(fpath, "r") as zip: |
||
53 | infolist = zip.infolist() |
||
54 | directories = [ |
||
55 | entry.filename for entry in infolist if entry.filename.endswith('/')] |
||
56 | logger.debug("List of directory entries: " + str(directories)) |
||
57 | |||
58 | # Consider this case: ['subdir1/', 'subdir1/subdir2/'] |
||
59 | if len(directories) > 1: |
||
60 | redundant = [] |
||
61 | for current in directories: |
||
62 | starts_with_this = [ |
||
63 | el for el in directories if el.startswith(current)] |
||
64 | if len(starts_with_this) == len(directories): |
||
65 | # current is a partial directory name that is contained |
||
66 | # in all others |
||
67 | redundant.append(current) |
||
68 | logger.debug("Redundant directory entries: " + str(redundant)) |
||
69 | directories = [ |
||
70 | entry for entry in directories if entry not in redundant] |
||
71 | logger.debug( |
||
72 | "Updated list of directory entries: " + str(directories)) |
||
73 | |||
74 | files = [ |
||
75 | entry.filename for entry in infolist if not entry.filename.endswith('/')] |
||
76 | logger.debug("List of files: " + str(files)) |
||
77 | if len(directories) == 1: |
||
78 | d = directories[0] |
||
79 | in_this_dir = [entry for entry in files if entry.startswith(d)] |
||
80 | if len(files) == len(in_this_dir): |
||
81 | logger.debug("ZIP archive contains only one subdirectory") |
||
82 | single_dir = d |
||
83 | zip.extractall(destination_path) |
||
84 | elif tarfile.is_tarfile(fpath): |
||
85 | logger.debug("Detected TAR file at %s, unpacking it." % (fpath)) |
||
86 | did_unpack = True |
||
87 | with tarfile.open(fpath) as tar: |
||
88 | infolist = tar.getmembers() |
||
89 | # A TGZ file of one subdirectory with arbitrary files |
||
90 | # has one infolist entry per directory and file |
||
91 | directories = [entry.name for entry in infolist if entry.isdir()] |
||
92 | files = [entry.name for entry in infolist if entry.isfile()] |
||
93 | logger.debug(directories) |
||
94 | logger.debug(files) |
||
95 | if len(directories) == 1: |
||
96 | d = directories[0] |
||
97 | in_this_dir = [entry for entry in files if entry.startswith(d)] |
||
98 | if len(files) == len(in_this_dir): |
||
99 | logger.debug("TGZ archive contains only one subdirectory") |
||
100 | single_dir = d |
||
101 | tar.extractall(destination_path) |
||
102 | else: |
||
103 | if not fpath.startswith(destination_path): |
||
104 | logger.debug( |
||
105 | "File at %s is a single non-archive file, copying it to %s" % (fpath, destination_path)) |
||
106 | shutil.copy(fpath, destination_path) |
||
107 | |||
108 | dircontent = os.listdir(destination_path) |
||
109 | logger.debug("Content of %s after unarchiving: %s" % |
||
110 | (destination_path, str(dircontent))) |
||
111 | return single_dir, did_unpack |
||
112 | |||
223 |