Passed
Push — main ( 60119b...e5c7f7 )
by Douglas
02:02
created

pocketutils.tools.filesys_tools   F

Complexity

Total Complexity 90

Size/Duplication

Total Lines 506
Duplicated Lines 0 %

Importance

Changes 0
Metric Value
eloc 333
dl 0
loc 506
rs 2
c 0
b 0
f 0
wmc 90

20 Methods

Rating   Name   Duplication   Size   Complexity  
A FilesysTools.save_json() 0 5 2
A FilesysTools.load_json() 0 4 1
F FilesysTools.get_env_info() 0 99 14
A FilesysTools.new_hasher() 0 3 1
A FilesysTools.list_package_versions() 0 15 2
A FilesysTools.tmppath() 0 12 2
A FilesysTools.delete_surefire() 0 31 4
B FilesysTools.open_file() 0 22 7
A FilesysTools.trash() 0 14 2
B FilesysTools.tmpfile() 0 16 6
A FilesysTools.tmpdir() 0 4 2
D FilesysTools.read_any() 0 55 12
B FilesysTools.read_properties_file() 0 28 7
A FilesysTools.replace_in_file() 0 10 2
B FilesysTools.write_lines() 0 27 6
A FilesysTools.new_webresource() 0 5 1
C FilesysTools.write_properties_file() 0 25 10
A FilesysTools.try_cleanup() 0 11 2
A FilesysTools.hash_hex() 0 8 1
B FilesysTools.read_lines_file() 0 13 6

How to fix

Complexity

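Complex classes like pocketutils.tools.filesys_tools often do a lot of different things. To break such a class down, we need to identify a cohesive component within that class. A common approach to finding such a component is to look for fields or methods that share the same prefixes or suffixes. Here, for example, tmppath(), tmpfile(), and tmpdir() share a prefix, and read_properties_file() and write_properties_file() share a suffix.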

Once you have determined the fields that belong together, you can apply the Extract Class refactoring. If the component makes sense as a sub-class, Extract Subclass is also a candidate, and is often faster.
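
As a rough illustration of Extract Class, the sketch below pulls the key=value parsing out of the large tools class into its own small class and leaves a thin delegating method behind. The names PropertiesFormat, FilesysToolsSketch, and read_properties_text are hypothetical, and the built-in ValueError and KeyError stand in for the project's ParsingError and AlreadyUsedError; this is not the project's actual design.

```python
from typing import Dict, Iterable, List, Mapping


class PropertiesFormat:
    """Hypothetical extracted class: it only knows the key=value text format."""

    @staticmethod
    def parse(lines: Iterable[str]) -> Dict[str, str]:
        result: Dict[str, str] = {}
        for number, raw in enumerate(lines, start=1):
            line = raw.strip()
            if not line or line.startswith("#"):
                continue  # skip blank lines and comments
            if line.count("=") != 1:
                raise ValueError(f"Bad line {number}: {line!r}")
            key, value = (part.strip() for part in line.split("="))
            if key in result:
                raise KeyError(f"Duplicate property {key} (line {number})")
            result[key] = value
        return result

    @staticmethod
    def render(properties: Mapping[str, str]) -> List[str]:
        return [f"{key}={value}" for key, value in properties.items()]


class FilesysToolsSketch:
    """Stand-in for the large class: it delegates instead of parsing itself."""

    @classmethod
    def read_properties_text(cls, text: str) -> Dict[str, str]:
        return PropertiesFormat.parse(text.splitlines())
```

After the extraction, the format rules can be tested without touching the file system, and the remaining class loses one responsibility.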

import gzip
Issue: Missing module docstring
import hashlib
import importlib.metadata
Issue (Bug): The name metadata does not seem to exist in module importlib.
Issue: Unable to import 'importlib.metadata'
import json
import locale
import logging
import os
import platform
import shutil
import socket
import stat
import struct
import sys
import tempfile
import warnings
from contextlib import contextmanager
from datetime import datetime, timezone
from getpass import getuser
from pathlib import Path, PurePath
from typing import Any, Generator, Iterable, Mapping, Optional, Sequence, SupportsBytes, Type, Union

import numpy as np
Issue: Unable to import 'numpy'
import pandas as pd
Issue: Unable to import 'pandas'
import regex
Issue: Unable to import 'regex'

from pocketutils.core import JsonEncoder
from pocketutils.core.exceptions import (
    AlreadyUsedError,
    ContradictoryRequestError,
    FileDoesNotExistError,
    ParsingError,
)
from pocketutils.core.hashers import *
Issue (Unused Code): HashValidationFailedError was imported with wildcard, but is not used.
Issue (Coding Style): The usage of wildcard imports like pocketutils.core.hashers should generally be avoided.
Issue (Unused Code): HashableFile was imported with wildcard, but is not used.
Issue (Unused Code): NonHashedFile was imported with wildcard, but is not used.
Issue (Unused Code): PostHashedFile was imported with wildcard, but is not used.
Issue (Unused Code): PreHashedFile was imported with wildcard, but is not used.
Issue (Unused Code): dataclass was imported with wildcard, but is not used.
Issue (Unused Code): Callable was imported with wildcard, but is not used.
Issue (Unused Code): IllegalStateError was imported with wildcard, but is not used.
Issue (Unused Code): PrePostHashedFile was imported with wildcard, but is not used.
from pocketutils.core.input_output import OpenMode, PathLike, Writeable
from pocketutils.core.web_resource import *
Issue (Coding Style): The usage of wildcard imports like pocketutils.core.web_resource should generally be avoided.
Issue (Unused Code): enum was imported with wildcard, but is not used.
Issue (Unused Code): zipfile was imported with wildcard, but is not used.
Issue (Unused Code): request was imported with wildcard, but is not used.
from pocketutils.tools.base_tools import BaseTools
from pocketutils.tools.path_tools import PathTools

logger = logging.getLogger("pocketutils")
COMPRESS_LEVEL = 9
ENCODING = "utf8"

try:
    import jsonpickle
Issue (Unused Code): The import jsonpickle seems to be unused.
    import jsonpickle.ext.numpy as jsonpickle_numpy

    jsonpickle_numpy.register_handlers()
    import jsonpickle.ext.pandas as jsonpickle_pandas

    jsonpickle_pandas.register_handlers()
except ImportError:
    # zero them all out
    jsonpickle, jsonpickle_numpy, jsonpickle_pandas = None, None, None
    logger.debug("Could not import jsonpickle", exc_info=True)


try:
    from defusedxml import ElementTree
except ImportError:
    logger.warning("Could not import defusedxml; falling back to xml")
    from xml.etree import ElementTree


class FilesysTools(BaseTools):
    """
    Tools for file/directory creation, etc.

    Security concerns
    -----------------

    Please note that several of these functions expose security concerns.
    In particular, ``pkl``, ``unpkl``, and any others that involve pickle or its derivatives.
    """

    @classmethod
    def new_hasher(cls, algorithm: str = "sha1") -> Hasher:
Issue: Missing function or method docstring
        return Hasher(algorithm)

    @classmethod
    def new_webresource(
Issue: Missing function or method docstring
        cls, url: str, archive_member: Optional[str], local_path: PathLike
Issue (Coding Style): Wrong hanging indentation before block (add 4 spaces).
    ) -> WebResource:
        return WebResource(url, archive_member, local_path)

    @classmethod
    def get_env_info(cls, *, include_insecure: bool = False) -> Mapping[str, str]:
        """
        Get a dictionary of some system and environment information.
        Includes os_release, hostname, username, mem + disk, shell, etc.

        Args:
            include_insecure: Include data like hostname and username

        .. caution ::
            Even with ``include_insecure=False``, avoid exposing this data to untrusted
            sources. For example, this includes the specific OS release, which could
            be used in an attack.
        """

        now = datetime.now(timezone.utc).astimezone().isoformat()
        uname = platform.uname()
        language_code, encoding = locale.getlocale()
Issue (Unused Code): The variable language_code seems to be unused.
        data = {}

        def _try(os_fn, k: str, *args):
            if any((a is None for a in args)):
Issue (Comprehensibility, Best Practice): The variable a does not seem to be defined.
                return None
            try:
                v = os_fn(*args)
Issue (Coding Style, Naming): Variable name "v" doesn't conform to snake_case naming style ('([^\\W\\dA-Z][^\\WA-Z]{2,}|_[^\\WA-Z]*|__[^\\WA-Z\\d_][^\\WA-Z]+__)$' pattern)
This check looks for invalid names for a range of different identifiers.
You can set regular expressions to which the identifiers must conform if the defaults do not match your requirements.
If your project includes a Pylint configuration file, the settings contained in that file take precedence.
To find out more about Pylint, please refer to their site.
                data[k] = v
                return v
            except OSError:
                return None

        data.update(
            dict(
                platform=platform.platform(),
                python=".".join(str(i) for i in sys.version_info),
                os=uname.system,
                os_release=uname.release,
                os_version=uname.version,
                machine=uname.machine,
                byte_order=sys.byteorder,
                processor=uname.processor,
                build=sys.version,
                python_bits=8 * struct.calcsize("P"),
                environment_info_capture_datetime=now,
                encoding=encoding,
                locale=locale,
                recursion_limit=sys.getrecursionlimit(),
                float_info=sys.float_info,
                int_info=sys.int_info,
                flags=sys.flags,
                hash_info=sys.hash_info,
                implementation=sys.implementation,
                switch_interval=sys.getswitchinterval(),
                filesystem_encoding=sys.getfilesystemencoding(),
            )
        )
        if "LANG" in os.environ:
            data["lang"] = os.environ["LANG"]
        if "SHELL" in os.environ:
            data["shell"] = os.environ["SHELL"]
        if "LC_ALL" in os.environ:
            data["lc_all"] = os.environ["LC_ALL"]
        if hasattr(sys, "winver"):
            data["win_ver"] = (sys.getwindowsversion(),)
Issue (Bug): The Module sys does not seem to have a member named getwindowsversion.
This check looks for calls to members that are non-existent. These calls will fail.
The member could have been renamed or removed.
        if hasattr(sys, "macver"):
            data["mac_ver"] = (sys.mac_ver(),)
Issue (Bug): The Module sys does not seem to have a member named mac_ver.
        if hasattr(sys, "linux_distribution"):
            data["linux_distribution"] = (sys.linux_distribution(),)
Issue (Bug): The Module sys does not seem to have a member named linux_distribution.
        if include_insecure:
            try:
                data["username"] = getuser()
            except ModuleNotFoundError:
Issue (Comprehensibility, Best Practice): The variable ModuleNotFoundError does not seem to be defined.
                pass
            data.update(
                dict(
                    hostname=socket.gethostname(),
                    cwd=os.getcwd(),
                    login=os.getlogin(),
                )
            )
            pid = _try(os.getpid, "pid")
            ppid = _try(os.getppid, "parent_pid")
            if hasattr(os, "getpriority"):
                _try(os.getpriority, "priority", os.PRIO_PROCESS, pid)
                _try(os.getpriority, "parent_priority", os.PRIO_PROCESS, ppid)
        try:
            import psutil
Issue: Import outside toplevel (psutil)
        except ImportError:
            psutil = None
            logger.warning("psutil is not installed, so cannot get extended env info")
        if psutil is not None:
            data.update(
                dict(
                    disk_used=psutil.disk_usage(".").used,
                    disk_free=psutil.disk_usage(".").free,
                    memory_used=psutil.virtual_memory().used,
                    memory_available=psutil.virtual_memory().available,
                )
            )
        return {k: str(v) for k, v in dict(data).items()}

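
The three "Bug" notes on win_ver, mac_ver, and linux_distribution point at members that the sys module does not provide on most platforms. One possible alternative, sketched here under the assumption that the standard platform module is acceptable, gathers the same kind of information per operating system. The function name os_version_info is hypothetical and not part of pocketutils.

```python
import platform
import sys
from typing import Dict


def os_version_info() -> Dict[str, str]:
    """Collect OS-specific version details using only the platform module."""
    info: Dict[str, str] = {}
    if sys.platform == "win32":
        info["win_ver"] = str(platform.win32_ver())
    elif sys.platform == "darwin":
        info["mac_ver"] = str(platform.mac_ver())
    elif hasattr(platform, "freedesktop_os_release"):  # available on Python 3.10+
        try:
            info["linux_distribution"] = str(platform.freedesktop_os_release())
        except OSError:
            pass  # no os-release file on this system
    return info
```

Every value is converted to a string, which matches how get_env_info stringifies its results before returning them.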
    @classmethod
    def list_package_versions(cls) -> Mapping[str, str]:
        """
        Returns installed packages and their version numbers.
        Reliable; uses importlib (Python 3.8+).
        """
        # calling .metadata reads the metadata file
        # and .version is an alias to .metadata["version"]
        # so make sure to only read once
        # TODO: get installed extras?
Issue (Coding Style): TODO and FIXME comments should generally be avoided.
        dct = {}
        for d in importlib.metadata.distributions():
Issue (Coding Style, Naming): Variable name "d" doesn't conform to snake_case naming style ('([^\\W\\dA-Z][^\\WA-Z]{2,}|_[^\\WA-Z]*|__[^\\WA-Z\\d_][^\\WA-Z]+__)$' pattern)
Issue (Bug): The Module importlib does not seem to have a member named metadata.
            meta = d.metadata
            dct[meta["name"]] = meta["version"]
        return dct

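
The "Unable to import 'importlib.metadata'" and "no member named metadata" notes are what an analyzer running on a pre-3.8 interpreter would report. A minimal sketch of a guarded import follows. It assumes the importlib_metadata backport from PyPI is installed on older interpreters, and the standalone function is only an illustration of the pattern, not the project's code.

```python
try:
    from importlib import metadata as importlib_metadata  # standard library on Python 3.8+
except ImportError:
    import importlib_metadata  # assumption: the PyPI backport is installed


def list_package_versions():
    """Map each installed distribution name to its version string."""
    versions = {}
    for dist in importlib_metadata.distributions():
        meta = dist.metadata  # read the metadata file once per distribution
        name = meta.get("Name")
        if name is not None:  # skip distributions with broken metadata
            versions[name] = meta.get("Version")
    return versions
```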
    @classmethod
    def delete_surefire(cls, path: PathLike) -> Optional[Exception]:
        """
        Deletes files or directories cross-platform, but working around multiple issues in Windows.

        Returns:
            None, or an Exception for minor warnings

        Raises:
            IOError: If it can't delete
        """
        # we need this because of Windows
        path = Path(path)
        logger.debug(f"Permanently deleting {path} ...")
Issue: Use lazy % formatting in logging functions
        chmod_err = None
        try:
            os.chmod(str(path), stat.S_IRWXU)
        except Exception as e:
Issue (Best Practice): Catching very general exceptions such as Exception is usually not recommended.
Generally, you would want to handle very specific errors in the exception handler. This ensures that you do not hide other types of errors which should be fixed.
So, unless you specifically plan to handle any error, consider adding a more specific exception.
Issue (Coding Style, Naming): Variable name "e" doesn't conform to snake_case naming style ('([^\\W\\dA-Z][^\\WA-Z]{2,}|_[^\\WA-Z]*|__[^\\WA-Z\\d_][^\\WA-Z]+__)$' pattern)
            chmod_err = e
        # another reason for returning exception:
        # We don't want to interrupt the current line being printed like in slow_delete
        if path.is_dir():
            shutil.rmtree(str(path), ignore_errors=True)  # ignore_errors because of Windows
            try:
                path.unlink(missing_ok=True)  # again, because of Windows
            except IOError:
                pass  # almost definitely because it doesn't exist
        else:
            path.unlink(missing_ok=True)
        logger.debug(f"Permanently deleted {path}")
Issue: Use lazy % formatting in logging functions
        return chmod_err

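
The "Use lazy % formatting in logging functions" note recurs throughout this file. The small sketch below shows the difference it refers to: an f-string is rendered before the logger decides whether to emit the record, while %-style arguments are only rendered if the record is actually handled. The Expensive class and the demo function are illustrative names only.

```python
import logging

logger = logging.getLogger("lazy_logging_demo")


class Expensive:
    """Counts how many times it is rendered to a string."""

    def __init__(self) -> None:
        self.renders = 0

    def __str__(self) -> str:
        self.renders += 1
        return "/tmp/example"


def demo():
    logger.setLevel(logging.INFO)  # DEBUG records are discarded
    path = Expensive()
    logger.debug(f"Permanently deleting {path} ...")  # eager: rendered anyway
    eager = path.renders
    logger.debug("Permanently deleting %s ...", path)  # lazy: never rendered
    lazy = path.renders - eager
    return eager, lazy
```

With DEBUG disabled, the f-string call still pays for one string conversion, while the %-style call pays for none.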
    @classmethod
    def trash(cls, path: PathLike, trash_dir: Optional[PathLike] = None) -> None:
        """
        Trash a file or directory.

        Args:
            path: The path to move to the trash
            trash_dir: If None, uses :meth:`pocketutils.tools.path_tools.PathTools.guess_trash`
        """
        if trash_dir is None:
            trash_dir = PathTools.guess_trash()
        logger.debug(f"Trashing {path} to {trash_dir} ...")
Issue: Use lazy % formatting in logging functions
        shutil.move(str(path), str(trash_dir))
        logger.debug(f"Trashed {path} to {trash_dir}")
Issue: Use lazy % formatting in logging functions

    @classmethod
    def try_cleanup(cls, path: Path, *, bound: Type[Exception] = PermissionError) -> None:
        """
        Try to delete a file (probably temp file), if it exists, and log any PermissionError.
        """
        path = Path(path)
        # noinspection PyBroadException
        try:
            path.unlink(missing_ok=True)
        except bound:
            logger.error(f"Permission error preventing deleting {path}")
Issue: Use lazy % formatting in logging functions

    @classmethod
    def read_lines_file(cls, path: PathLike, ignore_comments: bool = False) -> Sequence[str]:
        """
        Returns a list of lines in the file.
        Optionally skips lines starting with '#' or that only contain whitespace.
        """
        lines = []
        with FilesysTools.open_file(path, "r") as f:
Issue (Coding Style, Naming): Variable name "f" doesn't conform to snake_case naming style ('([^\\W\\dA-Z][^\\WA-Z]{2,}|_[^\\WA-Z]*|__[^\\WA-Z\\d_][^\\WA-Z]+__)$' pattern)
            for line in f.readlines():
                line = line.strip()
                if not ignore_comments or not line.startswith("#") and not len(line.strip()) == 0:
                    lines.append(line)
        return lines

    @classmethod
    def read_properties_file(cls, path: PathLike) -> Mapping[str, str]:
        """
        Reads a .properties file.
        A list of lines with key=value pairs (with an equals sign).
        Lines beginning with # are ignored.
        Each line must contain exactly 1 equals sign.

        Args:
            path: Read the file at this local path

        Returns:
            A dict mapping keys to values, both with surrounding whitespace stripped
        """
        dct = {}
        with FilesysTools.open_file(path, "r") as f:
Issue (Coding Style, Naming): Variable name "f" doesn't conform to snake_case naming style ('([^\\W\\dA-Z][^\\WA-Z]{2,}|_[^\\WA-Z]*|__[^\\WA-Z\\d_][^\\WA-Z]+__)$' pattern)
            for i, line in enumerate(f.readlines()):
                line = line.strip()
                if len(line) == 0 or line.startswith("#"):
                    continue
                if line.count("=") != 1:
                    raise ParsingError(f"Bad line {i} in {path}", resource=path)
                k, v = line.split("=")
Issue (Coding Style, Naming): Variable name "v" doesn't conform to snake_case naming style ('([^\\W\\dA-Z][^\\WA-Z]{2,}|_[^\\WA-Z]*|__[^\\WA-Z\\d_][^\\WA-Z]+__)$' pattern)
                k, v = k.strip(), v.strip()
Issue (Coding Style, Naming): Variable name "v" doesn't conform to snake_case naming style ('([^\\W\\dA-Z][^\\WA-Z]{2,}|_[^\\WA-Z]*|__[^\\WA-Z\\d_][^\\WA-Z]+__)$' pattern)
                if k in dct:
                    raise AlreadyUsedError(f"Duplicate property {k} (line {i})", key=k)
                dct[k] = v
        return dct

    @classmethod
    def write_properties_file(
Issue: Missing function or method docstring
        cls, properties: Mapping[Any, Any], path: Union[str, PurePath], mode: str = "o"
Issue (Coding Style): Wrong hanging indentation before block (add 4 spaces).
    ):
        if not OpenMode(mode).write:
            raise ContradictoryRequestError(f"Cannot write text to {path} in mode {mode}")
        with FilesysTools.open_file(path, mode) as f:
Issue (Coding Style, Naming): Variable name "f" doesn't conform to snake_case naming style ('([^\\W\\dA-Z][^\\WA-Z]{2,}|_[^\\WA-Z]*|__[^\\WA-Z\\d_][^\\WA-Z]+__)$' pattern)
            bads = []
            for k, v in properties.items():
Issue (Coding Style, Naming): Variable name "v" doesn't conform to snake_case naming style ('([^\\W\\dA-Z][^\\WA-Z]{2,}|_[^\\WA-Z]*|__[^\\WA-Z\\d_][^\\WA-Z]+__)$' pattern)
                if "=" in k or "=" in v or "\n" in k or "\n" in v:
                    bads.append(k)
                f.write(
                    str(k).replace("=", "--").replace("\n", "\\n")
                    + "="
                    + str(v).replace("=", "--").replace("\n", "\\n")
                    + "\n"
                )
            if 0 < len(bads) <= 10:
                logger.warning(
Issue: Use lazy % formatting in logging functions
                    "At least one properties entry contains an equals sign or newline (\\n). "
                    f"These were escaped: {', '.join(bads)}"
                )
            elif len(bads) > 0:
                logger.warning(
                    "At least one properties entry contains an equals sign or newline (\\n), "
                    "which were escaped."
                )

    @classmethod
    def save_json(cls, data: Any, path: PathLike, mode: str = "w") -> None:
Issue: Missing function or method docstring
        warnings.warn("save_json will be removed; use orjson instead", DeprecationWarning)
        with cls.open_file(path, mode) as f:
Issue (Coding Style, Naming): Variable name "f" doesn't conform to snake_case naming style ('([^\\W\\dA-Z][^\\WA-Z]{2,}|_[^\\WA-Z]*|__[^\\WA-Z\\d_][^\\WA-Z]+__)$' pattern)
            json.dump(data, f, ensure_ascii=False, cls=JsonEncoder)

    @classmethod
    def load_json(cls, path: PathLike):
Issue: Missing function or method docstring
        warnings.warn("load_json will be removed; use orjson instead", DeprecationWarning)
        return json.loads(Path(path).read_text(encoding="utf8"))

    @classmethod
    def read_any(
Issue (Best Practice): Too many return statements (10/6)
        cls, path: PathLike
Issue (Coding Style): Wrong hanging indentation before block (add 4 spaces).
    ) -> Union[
        str,
        bytes,
        Sequence[str],
        pd.DataFrame,
        Sequence[int],
        Sequence[float],
        Sequence[str],
        Mapping[str, str],
    ]:
        """
        Reads a variety of simple formats based on filename extension, including '.txt', '.csv', '.xml', '.properties', '.json'.
Issue (Coding Style): This line is too long as per the coding-style (126/100).
This check looks for lines that are too long. You can specify the maximum line length.
        Also reads '.data' (binary), '.lines' (text lines).
        And formatted lists: '.strings', '.floats', and '.ints' (ex: "[1, 2, 3]").
        """
        path = Path(path)
        ext = path.suffix.lstrip(".")

        def load_list(dtype):
            return [
                dtype(s)
                for s in FilesysTools.read_lines_file(path)[0]
                .replace(" ", "")
                .replace("[", "")
                .replace("]", "")
                .split(",")
            ]

        if ext == "lines":
Issue (Unused Code): Unnecessary "elif" after "return"
            return FilesysTools.read_lines_file(path)
        elif ext == "txt":
            return path.read_text("utf-8")
        elif ext == "data":
            return path.read_bytes()
        elif ext == "json":
            return FilesysTools.load_json(path)
        elif ext in ["npy", "npz"]:
            return np.load(str(path), allow_pickle=False)
        elif ext == "properties":
            return FilesysTools.read_properties_file(path)
        elif ext == "csv":
            return pd.read_csv(path)
        elif ext == "ints":
            return load_list(int)
        elif ext == "floats":
            return load_list(float)
        elif ext == "strings":
            return load_list(str)
        elif ext == "xml":
            return ElementTree.parse(path).getroot()
        else:
            raise TypeError(f"Did not recognize resource file type for file {path}")

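
The "Too many return statements" and "Unnecessary elif after return" notes on read_any both follow from the long if/elif chain. One way to address them, shown here only as a sketch, is a lookup table from extension to reader function. READERS and read_any_sketch are hypothetical names, and only a few standard-library formats are covered to keep the example self-contained.

```python
import json
from pathlib import Path
from typing import Any, Callable, Dict


def _read_lines(path: Path):
    return path.read_text(encoding="utf-8").splitlines()


# Maps a filename extension (without the dot) to the function that reads it.
READERS: Dict[str, Callable[[Path], Any]] = {
    "txt": lambda p: p.read_text(encoding="utf-8"),
    "data": lambda p: p.read_bytes(),
    "json": lambda p: json.loads(p.read_text(encoding="utf-8")),
    "lines": _read_lines,
}


def read_any_sketch(path) -> Any:
    path = Path(path)
    ext = path.suffix.lstrip(".")
    try:
        reader = READERS[ext]
    except KeyError:
        raise TypeError(f"Did not recognize resource file type for file {path}") from None
    return reader(path)
```

With a table, the function has a single return statement, and a branch that forgets to return its result cannot occur.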
    @classmethod
    @contextmanager
    def open_file(cls, path: PathLike, mode: str):
        """
        Opens a file in a safer way, always using the encoding set in Kale (utf8) by default.
        This avoids the problems of accidentally overwriting, forgetting to set mode, and not setting the encoding.
Issue (Coding Style): This line is too long as per the coding-style (115/100).
        Note that the default encoding on open() is not UTF-8 on Windows.
        Raises specific informative errors.
        Cannot set overwrite in append mode.
        """
        path = Path(path)
        mode = OpenMode(mode)
        if mode.write and mode.safe and path.exists():
            raise FileDoesNotExistError(f"Path {path} already exists")
        if not mode.read:
            PathTools.prep_file(path, exist_ok=mode.overwrite or mode.append)
        # each handle is opened in a with-block so it is closed when the caller's block exits
        if mode.gzipped:
            with gzip.open(path, mode.internal, compresslevel=COMPRESS_LEVEL) as handle:
                yield handle
        elif mode.binary:
            with open(path, mode.internal) as handle:
                yield handle
        else:
            with open(path, mode.internal, encoding=ENCODING) as handle:
                yield handle

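
A generator decorated with @contextmanager does not close anything on its own: whatever it yields stays open unless the generator closes it after the yield. The sketch below shows the pattern in isolation, assuming plain text or gzip-compressed text files. The name open_text and its parameters are hypothetical and much simpler than the OpenMode-based API above.

```python
import gzip
from contextlib import contextmanager
from pathlib import Path
from typing import IO, Iterator, Union


@contextmanager
def open_text(
    path: Union[str, Path], mode: str = "r", *, gzipped: bool = False
) -> Iterator[IO[str]]:
    """Yield a UTF-8 text handle and always close it afterwards."""
    if gzipped:
        handle = gzip.open(path, mode + "t", encoding="utf8")
    else:
        handle = open(path, mode, encoding="utf8")
    try:
        yield handle
    finally:
        handle.close()  # runs even if the caller's block raises
```

Closing in a finally clause matters most for writes, where buffered data may otherwise not reach the disk until the object is garbage collected.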
    @classmethod
    def write_lines(cls, iterable: Iterable[Any], path: PathLike, mode: str = "w") -> int:
        """
        Just writes an iterable line-by-line to a file, using '\n'.
        Makes the parent directory if needed.
        Checks that the iterable is a "true iterable" (not a string or bytes).

        Returns:
            The number of lines written (the same as len(iterable) if iterable has a length)

        Raises:
            FileExistsError: If the path exists and append is False
            PathIsNotFileError: If append is True, and the path exists but is not a file
        """
        path = Path(path)
        mode = OpenMode(mode)
        if not mode.overwrite or mode.binary:
            raise ContradictoryRequestError(f"Wrong mode for writing a text file: {mode}")
        if not cls.is_true_iterable(iterable):
            raise TypeError("Not a true iterable")  # TODO include iterable if small
Issue (Coding Style): TODO and FIXME comments should generally be avoided.
        PathTools.prep_file(path, exist_ok=mode.overwrite or mode.append)
        n = 0
Issue (Coding Style, Naming): Variable name "n" doesn't conform to snake_case naming style ('([^\\W\\dA-Z][^\\WA-Z]{2,}|_[^\\WA-Z]*|__[^\\WA-Z\\d_][^\\WA-Z]+__)$' pattern)
        with cls.open_file(path, mode) as f:
Issue (Coding Style, Naming): Variable name "f" doesn't conform to snake_case naming style ('([^\\W\\dA-Z][^\\WA-Z]{2,}|_[^\\WA-Z]*|__[^\\WA-Z\\d_][^\\WA-Z]+__)$' pattern)
            for x in iterable:
Issue (Coding Style, Naming): Variable name "x" doesn't conform to snake_case naming style ('([^\\W\\dA-Z][^\\WA-Z]{2,}|_[^\\WA-Z]*|__[^\\WA-Z\\d_][^\\WA-Z]+__)$' pattern)
                f.write(str(x) + "\n")
                n += 1
Issue (Coding Style, Naming): Variable name "n" doesn't conform to snake_case naming style ('([^\\W\\dA-Z][^\\WA-Z]{2,}|_[^\\WA-Z]*|__[^\\WA-Z\\d_][^\\WA-Z]+__)$' pattern)
        return n

449
    @classmethod
    def hash_hex(cls, x: SupportsBytes, algorithm: str) -> str:
        """
        Returns the hex-encoded hash of the object (converted to bytes).
        """
        m = hashlib.new(algorithm)
        m.update(bytes(x))
        return m.hexdigest()

    @classmethod
    def replace_in_file(cls, path: PathLike, changes: Mapping[str, str]) -> None:
        """
        Uses ``regex.sub`` repeatedly to modify (AND REPLACE) a file's content.
        Each key in ``changes`` is a pattern, and its value is the replacement.
        """
        path = Path(path)
        data = path.read_text(encoding="utf-8")
        for key, value in changes.items():
            data = regex.sub(key, value, data, flags=regex.V1 | regex.MULTILINE | regex.DOTALL)
        path.write_text(data, encoding="utf-8")

    @classmethod
    def tmppath(cls, path: Optional[PathLike] = None, **kwargs) -> Generator[Path, None, None]:
        """
        Makes a temporary Path. Won't create ``path`` but will delete it at the end.
        If ``path`` is None, will use ``tempfile.mkstemp``.
        """
        if path is None:
            fd, path = tempfile.mkstemp()
            os.close(fd)  # mkstemp returns an open descriptor; close it to avoid a leak
        try:
            yield Path(path, **kwargs)
        finally:
            Path(path).unlink()

    @classmethod
    def tmpfile(
        cls, path: Optional[PathLike] = None, *, spooled: bool = False, **kwargs
    ) -> Generator[Writeable, None, None]:
        """
        Simple wrapper around tempfile.TemporaryFile, tempfile.NamedTemporaryFile,
        and tempfile.SpooledTemporaryFile.
        """
        if spooled:
            with tempfile.SpooledTemporaryFile(**kwargs) as x:
                yield x
        elif path is None:
            with tempfile.TemporaryFile(**kwargs) as x:
                yield x
        else:
            # NOTE: the first positional parameter of NamedTemporaryFile is ``mode``, not a path
            with tempfile.NamedTemporaryFile(str(path), **kwargs) as x:
                yield x

    @classmethod
    def tmpdir(cls, **kwargs) -> Generator[Path, None, None]:
        """
        Makes a temporary directory with ``tempfile.TemporaryDirectory`` and yields its Path.
        The directory and its contents are removed at the end.
        """
        with tempfile.TemporaryDirectory(**kwargs) as x:
            yield Path(x)


__all__ = ["FilesysTools"]
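
The temporary-path and line-writing helpers above can be tried out in isolation with a minimal, standard-library-only sketch. The names `tmppath_sketch` and `write_lines_sketch` are hypothetical stand-ins, not part of pocketutils, and the use of `contextlib.contextmanager` is an assumption: the listing annotates these methods as generators, which is the usual shape for that decorator, but the decorator itself is not visible in this part of the file.

```python
import os
import tempfile
from contextlib import contextmanager
from pathlib import Path
from typing import Any, Generator, Iterable, Optional


@contextmanager
def tmppath_sketch(path: Optional[os.PathLike] = None) -> Generator[Path, None, None]:
    """Hypothetical stand-in for a tmppath-style helper: yields a path, deletes it on exit."""
    if path is None:
        fd, name = tempfile.mkstemp()
        os.close(fd)  # mkstemp hands back an open descriptor; close it right away
        path = name
    try:
        yield Path(path)
    finally:
        # missing_ok needs Python 3.8+; it tolerates a caller who never created the file
        Path(path).unlink(missing_ok=True)


def write_lines_sketch(iterable: Iterable[Any], path: Path) -> int:
    """Hypothetical stand-in for a write_lines-style helper: returns the number of lines written."""
    if isinstance(iterable, (str, bytes)):
        raise TypeError("Not a true iterable")
    n = 0
    with path.open("w", encoding="utf-8") as f:
        for x in iterable:
            f.write(str(x) + "\n")
            n += 1  # incremented inside the loop, once per line
    return n


with tmppath_sketch() as tmp:
    count = write_lines_sketch(["a", "b", "c"], tmp)
    content = tmp.read_text(encoding="utf-8")
```

Keeping the counter inside the loop is what makes the return value match the number of lines, and the `try`/`finally` around the `yield` is what guarantees cleanup even if the body of the `with` block raises.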