Conditions | 14 |
Total Lines | 84 |
Code Lines | 42 |
Lines | 0 |
Ratio | 0 % |
Changes | 0 |
Small methods make your code easier to understand, in particular if combined with a good name. Besides, if your method is small, finding a good name is usually much easier.
For example, if you find yourself adding comments to a method's body, this is usually a good sign to extract the commented part to a new method, and use the comment as a starting point when coming up with a good name for this new method.
Commonly applied refactorings include:
If many parameters/temporary variables are present:
Complex classes like ocrd.resolver.Resolver.download_to_directory() often do a lot of different things. To break such a class down, we need to identify a cohesive component within that class. A common approach to find such a component is to look for fields/methods that share the same prefixes, or suffixes.
Once you have determined the fields that belong together, you can apply the Extract Class refactoring. If the component makes sense as a sub-class, Extract Subclass is also a candidate, and is often faster.
1 | from tempfile import mkdtemp |
||
26 | def download_to_directory(self, directory, url, basename=None, if_exists='skip', subdir=None): |
||
27 | """ |
||
28 | Download a file to a directory. |
||
29 | |||
30 | Early Shortcut: If `url` is a local file and that file is already in the directory, keep it there. |
||
31 | |||
32 | If `basename` is not given but subdir is, assume user knows what she's doing and |
||
33 | use last URL segment as the basename. |
||
34 | |||
35 | If `basename` is not given and no subdir is given, use the alnum characters in the URL as the basename. |
||
36 | |||
37 | Args: |
||
38 | directory (string): Directory to download files to |
||
39 | basename (string, None): basename part of the filename on disk. |
||
40 | url (string): URL to download from |
||
41 | if_exists (string, "skip"): What to do if target file already exists. \ |
||
42 | One of ``skip`` (default), ``overwrite`` or ``raise`` |
||
43 | subdir (string, None): Subdirectory to create within the directory. Think ``mets:fileGrp``. |
||
44 | |||
45 | Returns: |
||
46 | Local filename string, *relative* to directory |
||
47 | """ |
||
48 | log = getLogger('ocrd.resolver.download_to_directory') # pylint: disable=redefined-outer-name |
||
49 | log.debug("directory=|%s| url=|%s| basename=|%s| if_exists=|%s| subdir=|%s|", directory, url, basename, if_exists, subdir) |
||
50 | |||
51 | if not url: |
||
52 | raise Exception("'url' must be a string") |
||
53 | if not directory: |
||
54 | raise Exception("'directory' must be a string") # actually Path would also work |
||
55 | |||
56 | directory = Path(directory) |
||
57 | directory.mkdir(parents=True, exist_ok=True) |
||
58 | directory = str(directory.resolve()) |
||
59 | |||
60 | subdir_path = Path(subdir if subdir else '') |
||
61 | basename_path = Path(basename if basename else nth_url_segment(url)) |
||
62 | ret = str(Path(subdir_path, basename_path)) |
||
63 | dst_path = Path(directory, ret) |
||
64 | |||
65 | # log.info("\n\tdst_path='%s \n\turl=%s", dst_path, url) |
||
66 | # print('url=%s', url) |
||
67 | # print('directory=%s', directory) |
||
68 | # print('subdir_path=%s', subdir_path) |
||
69 | # print('basename_path=%s', basename_path) |
||
70 | # print('ret=%s', ret) |
||
71 | # print('dst_path=%s', dst_path) |
||
72 | |||
73 | src_path = None |
||
74 | if is_local_filename(url): |
||
75 | try: |
||
76 | # XXX this raises FNFE in Python 3.5 if src_path doesn't exist but not 3.6+ |
||
77 | src_path = Path(get_local_filename(url)).resolve() |
||
78 | except FileNotFoundError as e: |
||
79 | log.error("Failed to resolve URL locally: %s --> '%s' which does not exist" % (url, src_path)) |
||
80 | raise e |
||
81 | if not src_path.exists(): |
||
82 | raise FileNotFoundError("File path passed as 'url' to download_to_directory does not exist: %s" % url) |
||
83 | if src_path == dst_path: |
||
84 | log.debug("Stop early, src_path and dst_path are the same: '%s' (url: '%s')" % (src_path, url)) |
||
85 | return ret |
||
86 | |||
87 | # Respect 'if_exists' arg |
||
88 | if dst_path.exists(): |
||
89 | if if_exists == 'skip': |
||
90 | return ret |
||
91 | if if_exists == 'raise': |
||
92 | raise FileExistsError("File already exists and if_exists == 'raise': %s" % (dst_path)) |
||
93 | |||
94 | # Create dst_path parent dir |
||
95 | dst_path.parent.mkdir(parents=True, exist_ok=True) |
||
96 | |||
97 | # Copy files or download remote assets |
||
98 | if src_path: |
||
99 | log.debug("Copying file '%s' to '%s'", src_path, dst_path) |
||
100 | dst_path.write_bytes(src_path.read_bytes()) |
||
101 | else: |
||
102 | log.debug("Downloading URL '%s' to '%s'", url, dst_path) |
||
103 | response = requests.get(url) |
||
104 | if response.status_code != 200: |
||
105 | raise Exception("HTTP request failed: %s (HTTP %d)" % (url, response.status_code)) |
||
106 | contents = handle_oai_response(response) |
||
107 | dst_path.write_bytes(contents) |
||
108 | |||
109 | return ret |
||
110 | |||
255 |