Passed
Push — main ( 60119b...e5c7f7 )
by Douglas
02:02
created

pocketutils.tools.path_tools.PathTools.prep_file()   A

Complexity

Conditions 4

Size

Total Lines 12
Code Lines 6

Duplication

Lines 0
Ratio 0 %

Importance

Changes 0
Metric  Value
cc      4
eloc    6
nop     3
dl      0
loc     12
rs      10
c       0
b       0
f       0
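For orientation, the method these metrics describe, prep_file(), first checks that an existing path is a regular file (or symlink) and only then creates the parent directory. Below is a minimal standalone sketch of that behavior, assuming only the standard library. The name `prep_file_sketch` is hypothetical, and it raises the built-in `OSError` where the library raises its own exception type, so it illustrates the idea rather than the library's API.

```python
import tempfile
from pathlib import Path


def prep_file_sketch(path, exist_ok=True):
    """Create the parent directory of ``path`` after validating ``path`` itself.

    Hypothetical stand-in for the reviewed method; raises OSError
    where the library raises its own exception type.
    """
    path = Path(path)
    # Check for errors first, so no directories are created before failing.
    if path.exists() and not path.is_file() and not path.is_symlink():
        raise OSError(f"Path {path} exists but is not a file")
    path.parent.mkdir(parents=True, exist_ok=exist_ok)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = Path(tmp) / "a" / "b" / "data.txt"
        prep_file_sketch(target)
        print("parent created:", target.parent.is_dir())
        print("file created:", target.exists())
```

Only the parent chain is created; the file itself is left for the caller to write.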
import sys
# ^ Issue: Missing module docstring
import warnings
# ^ Issue (Unused Code): The import warnings seems to be unused.
from typing import Callable, Mapping, Optional, Sequence
# ^ Issue (Unused Code): Unused Mapping imported from typing

import regex
# ^ Issue: Unable to import 'regex'

from pocketutils.core.exceptions import *
# ^ Issue (Coding Style): The usage of wildcard imports like pocketutils.core.exceptions
#   should generally be avoided.
# ^ Issue (Unused Code): The following names were imported with wildcard, but are not used:
#   ConfigWarning, AlgorithmWarning, DataWarning, Error, ImportFailedWarning,
#   ConstructionError, IgnoringRequestWarning, IllegalStateError, UnsupportedOpError,
#   NotConstructableError, ImmutableError, NaturalExpectedError, OpStateError,
#   MultipleMatchesError, IncompatibleDataError, HashValidationError,
#   HashValidationFailedError, MissingEnvVarError, LockedError, ResourceError,
#   RefusingRequestError, MissingConfigKeyError, ConfigError, BadCommandError, UserError,
#   MismatchedDataError, HardwareError, MissingResourceError, PathError, AlgorithmError,
#   Collection, CalledProcessError, wraps, PathIsNotADirError, PathIsNotAFileError,
#   XFileExistsError, PathExistsError, UploadError, AbstractWrappedError, DownloadError,
#   CacheSaveError, CacheLoadError, SaveError, LoadError, BadWriteError,
#   MissingDeviceError, DeviceConnectionError, DataIntegrityError, UnrecognizedKeyError,
#   LookupFailedError, WrongDimensionError, RequestError, RefusingError,
#   DangerousRequestWarning, AmbiguousRequestError, ReservedError, AlreadyUsedError,
#   LengthError, LengthMismatchError, EmptyCollectionError, ParsingError, XTypeError,
#   XValueError, StringPatternError, OutOfRangeError, ZeroDistanceError, NullValueError,
#   NumericConversionError, InexactRoundError, XKeyError, StrangeRequestWarning,
#   CodeIncompleteError, DeprecatedWarning, ImmatureWarning, ObsoleteWarning, XWarning,
#   XException, ErrorUtils, KeyLike
from pocketutils.tools.base_tools import BaseTools

logger = logging.getLogger("pocketutils")


class PathTools(BaseTools):
    # ^ Issue: Missing class docstring
    @classmethod
    def updir(cls, n: int, *parts) -> Path:
        # ^ Issue (Coding Style, Naming): Argument name "n" doesn't conform to snake_case naming style
        #   ('([^\\W\\dA-Z][^\\WA-Z]{2,}|_[^\\WA-Z]*|__[^\\WA-Z\\d_][^\\WA-Z]+__)$' pattern)
        #   This check looks for invalid names for a range of different identifiers.
        #   You can set regular expressions to which the identifiers must conform if the
        #   defaults do not match your requirements.
        #   If your project includes a Pylint configuration file, the settings contained in
        #   that file take precedence.
        #   To find out more about Pylint, please refer to their site.
        """
        Get an absolute path ``n`` parents from ``os.getcwd()``.
        Does not sanitize.

        Ex: In dir '/home/john/dir_a/dir_b':
            updir(2, 'dir1', 'dir2')  # returns Path('/home/john/dir1/dir2')
        """
        base = Path(os.getcwd())
        for _ in range(n):
            base = base.parent
        for part in parts:
            base = base / part
        return base.resolve()

    @classmethod
    def guess_trash(cls) -> Path:
        """
        Chooses a reasonable path for trash based on the OS.
        This is not reliable. For a more sophisticated solution, see https://github.com/hsoft/send2trash
        However, even that can fail.
        """
        # Issue (Coding Style): The docstring line containing the send2trash URL is too long
        #   as per the coding-style (104/100).
        #   This check looks for lines that are too long. You can specify the maximum line length.
        plat = sys.platform.lower()
        if "darwin" in plat:
            # ^ Issue (unused-code): Unnecessary "elif" after "return"
            return Path.home() / ".Trash"
        elif "win" in plat:
            return Path(Path.home().root) / "$Recycle.Bin"
        else:
            return Path.home() / ".trash"

    @classmethod
    def prep_dir(cls, path: PathLike, exist_ok: bool = True) -> bool:
        """
        Prepares a directory by making it if it doesn't exist.
        If exist_ok is False, calls logger.warning if it already exists.
        Returns whether the path already existed.
        """
        path = Path(path)
        exists = path.exists()
        # On some platforms we get generic exceptions like permissions errors,
        # so these explicit checks give clearer errors
        if exists and not path.is_dir():
            raise DirDoesNotExistError(f"Path {path} exists but is not a directory")
        if exists and not exist_ok:
            logger.warning(f"Directory {path} already exists")
            # ^ Issue: Use lazy % formatting in logging functions
        if not exists:
            # NOTE! exist_ok in mkdir throws an error on Windows
            path.mkdir(parents=True)
        return exists

    @classmethod
    def prep_file(cls, path: PathLike, exist_ok: bool = True) -> None:
        """
        Prepares a file path by making its parent directory.
        Same as calling ``pathlib.Path.mkdir`` on the parent, but first makes sure
        ``path`` is a file if it exists.
        """
        # On some platforms we get generic exceptions like permissions errors,
        # so these explicit checks give clearer errors
        path = Path(path)
        # check for errors first; don't make the dirs and then fail
        if path.exists() and not path.is_file() and not path.is_symlink():
            raise FileDoesNotExistError(f"Path {path} exists but is not a file")
        Path(path.parent).mkdir(parents=True, exist_ok=exist_ok)

    @classmethod
    def sanitize_path(
        cls,
        path: PathLike,
        is_file: Optional[bool] = None,
        *,
        show_warnings: Union[bool, Callable[[str], Any]] = True,
    ) -> Path:
        # ^ Issue (Coding Style), reported for each of the five parameter lines
        #   (cls through show_warnings): Wrong hanging indentation before block (add 4 spaces).
        r"""
        Sanitizes a path for major OSes and filesystems.
        Also see sanitize_path_nodes and sanitize_path_node.
        Platform-dependent.
        A corner case is drive letters in Linux:
        "C:\\Users\\john" is converted to '/C:/Users/john' if os.name=='posix'
        """
        # the idea is to sanitize for both Windows and Posix, regardless of the platform in use
        # the sanitization should be as uniform as possible for both platforms
        # this works for at least Windows+NTFS
        # tilde substitution for long filenames in Windows -- is unsupported
        w = {True: logger.warning, False: lambda _: None}.get(show_warnings, show_warnings)
        # ^ Issue (Coding Style, Naming): Variable name "w" doesn't conform to snake_case naming style
        #   ('([^\\W\\dA-Z][^\\WA-Z]{2,}|_[^\\WA-Z]*|__[^\\WA-Z\\d_][^\\WA-Z]+__)$' pattern)
        # path may be a str or a Path, so convert before calling str methods
        if str(path).startswith("\\\\?"):
            # the callback takes a single message argument; no extra keyword is passed
            w(f"Long UNC Windows paths (\\\\? prefix) are not supported (path '{path}')")
        bits = str(path).strip().replace("\\", "/").split("/")
        new_path = cls.sanitize_path_nodes(bits, is_file=is_file)
        # compare as Path objects; a Path never compares equal to a str
        if new_path != Path(path):
            w(f"Sanitized filename {path} → {new_path}")
        return Path(new_path)

    @classmethod
    def sanitize_path_nodes(cls, bits: Sequence[PathLike], is_file: Optional[bool] = None) -> Path:
        # ^ Issue: Missing function or method docstring
        fixed_bits = [
            bit + os.sep
            if i == 0 and bit.strip() in ["", ".", ".."]
            else cls.sanitize_path_node(
                bit,
                is_file=(False if i < len(bits) - 1 else is_file),
                is_root_or_drive=(None if i == 0 else False),
            )
            for i, bit in enumerate(bits)
            if bit.strip() not in ["", "."]
            or i == 0  # ignore // (empty) just like Path does (but fail on sanitize_path_node(' '))
        ]
        fixed_bits = [bit for i, bit in enumerate(fixed_bits) if i == 0 or bit not in ["", "."]]
        # unfortunately POSIX turns Path('C:\', '5') into C:\/5
        # this isn't an ideal way to fix it, but it works
        pat = regex.compile(r"^([A-Z]:)(?:\\)?$", flags=regex.V1)
        if os.name == "posix" and len(fixed_bits) > 0 and pat.fullmatch(fixed_bits[0]):
            fixed_bits[0] = fixed_bits[0].rstrip("\\")
            fixed_bits.insert(0, "/")
        return Path(*fixed_bits)

    @classmethod
    def sanitize_path_node(
        cls,
        bit: PathLike,
        is_file: Optional[bool] = None,
        is_root_or_drive: Optional[bool] = None,
        include_fat: bool = False,
    ) -> str:
        # ^ Issue (Coding Style), reported for each of the five parameter lines
        #   (cls through include_fat): Wrong hanging indentation before block (add 4 spaces).
        r"""
        Sanitizes a path node such that it will be fine for major OSes and filesystems.
        For example:
        - 'plums;and/or;apples' becomes 'plums;and_or;apples' (escaped /; ';' is not in the forbidden set)
        - 'nul.txt' becomes '_nul_.txt' ('nul' is forbidden in Windows)
        - 'abc  ' becomes 'abc' (no trailing spaces)
        The behavior is platform-independent -- os, sys, and pathlib are not used.
        For example, calling sanitize_path_node(r'C:\') returns r'C:\' on both Windows and Linux.
        If you want to sanitize a whole path, see sanitize_path instead.

        Args:
            bit: The node
            is_file: False for directories, True otherwise, None if unknown
            is_root_or_drive: True if known to be the root ('/') or a drive ('C:\'), None if unknown
            include_fat: Also make compatible with FAT filesystems

        Returns:
            A string
        """
        # since is_file and is_root_or_drive are both Optional[bool], let's be explicit and use 'is' for clarity
        # ^ Issue (Coding Style): This line is too long as per the coding-style (112/100).
        if is_file is True and is_root_or_drive is True:
            raise ContradictoryRequestError("is_file and is_root_or_drive are both true")
        if is_file is True and is_root_or_drive is None:
            is_root_or_drive = False
        if is_root_or_drive is True and is_file is None:
            is_file = False
        source_bit = copy(str(bit))
        bit = str(bit).strip()
        # first, catch root or drive as long as is_root_or_drive is not false
        # if is_root_or_drive is True (which is a weird call), then fail if it's not
        # otherwise, it's not a root or drive letter, so keep going
        if is_root_or_drive is not False:
            # \ is allowed in Windows
            if bit in ["/", "\\"]:
                return bit
            m = regex.compile(r"^([A-Z]:)(?:\\)?$", flags=regex.V1).fullmatch(bit)
            # ^ Issue (Coding Style, Naming): Variable name "m" doesn't conform to snake_case naming style
            #   ('([^\\W\\dA-Z][^\\WA-Z]{2,}|_[^\\WA-Z]*|__[^\\WA-Z\\d_][^\\WA-Z]+__)$' pattern)
            # this is interesting
            # for bit=='C:' and is_root_or_drive=None,
            # it could be either a drive letter
            # or a file path that should be corrected to 'C_'
            # I guess here we're going with a drive letter
            if m is not None:
                # we need C:\ and not C: because:
                # Path('C:\\', '5').is_absolute() is True
                # but Path('C:', '5').is_absolute() is False
                # unfortunately, doing Path('C:\\', '5') on Linux gives 'C:\\/5'
                # I can't handle that here, but sanitize_path() will account for it
                return m.group(1) + "\\"
            if is_root_or_drive is True:
                raise IllegalPathError(f"Node '{bit}' is not the root or a drive letter")
        # note that we can't call WindowsPath.is_reserved because it can't be instantiated on non-Windows
        # ^ Issue (Coding Style): This line is too long as per the coding-style (103/100).
        # also, these appear to be different from the ones defined there
        bad_chars = {
            "<",
            ">",
            ":",
            '"',
            "|",
            "?",
            "*",
            "\\",
            "/",
            *{chr(c) for c in range(128, 128 + 33)},
            *{chr(c) for c in range(0, 32)},
            "\t",
        }
        # don't handle Long UNC paths
        # also cannot be blank or whitespace
        # the $ suffixed ones are for FAT
        # no CLOCK$, even with an ext
        # also no SCREEN$
        bad_strs = {
            "CON",
            "PRN",
            "AUX",
            "NUL",
            "COM1",
            "COM2",
            "COM3",
            "COM4",
            "COM5",
            "COM6",
            "COM7",
            "COM8",
            "COM9",
            "LPT1",
            "LPT2",
            "LPT3",
            "LPT4",
            "LPT5",
            "LPT6",
            "LPT7",
            "LPT8",
            "LPT9",
        }
        if include_fat:
            # sets do not support +=; use in-place union
            bad_strs |= {"$IDLE$", "CONFIG$", "KEYBD$", "SCREEN$", "CLOCK$", "LST"}
        # just dots is invalid
        # compare against a set: a set never equals the string "."
        if set(bit.replace(" ", "")) == {"."} and bit not in ["..", "."]:
            raise IllegalPathError(f"Node '{source_bit}' is invalid")
        for q in bad_chars:
            # ^ Issue (Coding Style, Naming): Variable name "q" doesn't conform to snake_case naming style
            #   ('([^\\W\\dA-Z][^\\WA-Z]{2,}|_[^\\WA-Z]*|__[^\\WA-Z\\d_][^\\WA-Z]+__)$' pattern)
            bit = bit.replace(q, "_")
        if bit.upper() in bad_strs:
            # arbitrary decision
            bit = "_" + bit + "_"
        else:
            stub, ext = os.path.splitext(bit)
            if stub.upper() in bad_strs:
                bit = "_" + stub + "_" + ext
        if bit.strip() == "":
            raise IllegalPathError(f"Node '{source_bit}' is empty or contains only whitespace")
        # do this after
        if len(bit) > 254:
            raise IllegalPathError(f"Node '{source_bit}' has more than 254 characters")
        bit = bit.strip()
        if is_file is not True and (bit == "." or bit == ".."):
            # ^ Issue (Unused Code): Consider merging these comparisons with "in" to "bit in ('.', '..')"
            return bit
        # never allow '.' (or ' ') to end a filename
        bit = bit.rstrip(".")
        return bit


__all__ = ["PathTools"]
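
The node-level rules in sanitize_path_node (replace forbidden characters, wrap Windows-reserved device names, drop trailing dots) can be illustrated with a simplified standalone sketch. It assumes only the standard library. The names `sanitize_node_sketch`, `BAD_CHARS`, and `RESERVED` are hypothetical, the character set is reduced, and root/drive handling, FAT names, and the length check are deliberately left out, so this is an illustration and not a substitute for the reviewed method.

```python
import os

# Reduced sets for illustration; the reviewed code also rejects control
# characters and chr(128)..chr(160).
BAD_CHARS = set('<>:"|?*\\/')
RESERVED = {"CON", "PRN", "AUX", "NUL"} | {
    f"{dev}{i}" for dev in ("COM", "LPT") for i in range(1, 10)
}


def sanitize_node_sketch(node: str) -> str:
    """Return a filename-safe version of a single path node (simplified)."""
    node = node.strip()
    # Replace every forbidden character with an underscore.
    node = "".join("_" if ch in BAD_CHARS else ch for ch in node)
    # Wrap reserved device names, with or without an extension.
    stub, ext = os.path.splitext(node)
    if node.upper() in RESERVED:
        node = f"_{node}_"
    elif stub.upper() in RESERVED:
        node = f"_{stub}_{ext}"
    if node.strip() == "":
        raise ValueError("node is empty or contains only whitespace")
    # Keep '.' and '..' intact; otherwise drop trailing dots.
    if node in (".", ".."):
        return node
    return node.rstrip(".")


if __name__ == "__main__":
    for raw in ["report:v1/final", "nul.txt", "abc  ", "notes..."]:
        print(repr(raw), "->", repr(sanitize_node_sketch(raw)))
```

Keeping the rules free of `sys.platform` checks mirrors the stated goal that a name sanitized on one OS stays valid on the others.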