Passed
Push — main ( e5b5eb...6c03a0 )
by Douglas
03:46 queued 01:46
created

mandos.entry.utils._arg_utils   F

Complexity

Total Complexity 73

Size/Duplication

Total Lines 260
Duplicated Lines 0 %

Importance

Changes 0
Metric Value
eloc 225
dl 0
loc 260
rs 2.56
c 0
b 0
f 0
wmc 73

13 Methods

Rating   Name   Duplication   Size   Complexity  
A ArgUtils.get_target_types() 0 3 1
A ArgUtils.get_taxonomy() 0 16 3
B ArgUtils.parse_taxa() 0 24 6
A ParsedTaxa.empty() 0 3 1
B ArgUtils.parse_taxon() 0 12 8
A ArgUtils.definition_bullets() 0 5 1
A ArgUtils.parse_taxa_ids() 0 9 3
A ArgUtils._get_std_taxon() 0 15 2
B ArgUtils.list() 0 19 6
A ArgUtils.definition_list() 0 4 1
F EntryUtils.adjust_dir_name() 0 34 15
B EntryUtils._check_suffix() 0 10 6
F EntryUtils.adjust_filename() 0 37 20

How to fix: Complexity

Complex classes like mandos.entry.utils._arg_utils often do many different things. To break such a class down, first identify a cohesive component within it. A common way to find such a component is to look for fields or methods that share the same prefix or suffix.
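As a rough illustration of the prefix heuristic, the sketch below groups the method names from the ratings table by their first underscore-separated word. The helper name `group_by_prefix` and the choice to ignore a leading underscore are assumptions made for this example; they are not part of the analysis tool or of mandos.

```python
from collections import defaultdict
from typing import Dict, Iterable, List

# Method names copied from the ratings table above (class prefixes dropped).
METHODS = [
    "get_target_types", "get_taxonomy", "parse_taxa", "empty", "parse_taxon",
    "definition_bullets", "parse_taxa_ids", "_get_std_taxon", "list",
    "definition_list", "adjust_dir_name", "_check_suffix", "adjust_filename",
]


def group_by_prefix(names: Iterable[str]) -> Dict[str, List[str]]:
    """Group names by their first underscore-separated word (hypothetical helper)."""
    groups = defaultdict(list)
    for name in names:
        # Ignore a leading "_" so private helpers join their public siblings.
        prefix = name.lstrip("_").split("_", 1)[0]
        groups[prefix].append(name)
    # Only prefixes shared by two or more methods hint at a cohesive component.
    return {p: sorted(ns) for p, ns in groups.items() if len(ns) > 1}


candidates = group_by_prefix(METHODS)
for prefix, names in sorted(candidates.items()):
    print(f"{prefix}: {', '.join(names)}")
```

On this listing the heuristic surfaces four groups (`adjust`, `definition`, `get`, `parse`). The `parse_*` and `adjust_*` groups look like the most plausible extraction candidates, although a shared prefix is only a hint and the methods still need to be read.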

Once you have determined the fields that belong together, you can apply the Extract Class refactoring. If the component makes sense as a sub-class, Extract Subclass is also a candidate, and is often faster.
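A minimal sketch of what Extract Class could look like for the taxon-parsing methods follows. The names `TaxonParser` and `ArgUtilsFacade` are hypothetical, and the parsing rules are deliberately simplified (digits become IDs, anything else is treated as a name), so this does not reproduce the exact behaviour of `ArgUtils.parse_taxon`.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass(frozen=True)
class TaxonParser:
    """Hypothetical extracted class that owns only the parsing behaviour."""

    id_only: bool = False

    def parse_one(self, taxon: str) -> Union[int, str]:
        # Simplified rule (an assumption): digits are IDs, the rest are names.
        taxon = taxon.strip()
        if taxon.isdigit():
            return int(taxon)
        if self.id_only:
            raise ValueError(f"Taxon {taxon} must be an ID")
        return taxon

    def parse_many(self, taxa: str) -> List[Union[int, str]]:
        # Split on commas and skip empty entries such as a trailing comma.
        return [self.parse_one(t) for t in taxa.split(",") if t.strip()]


class ArgUtilsFacade:
    """Hypothetical slimmed-down original class that delegates to the parser."""

    @classmethod
    def parse_taxa_ids(cls, taxa: str) -> List[int]:
        return TaxonParser(id_only=True).parse_many(taxa)


print(TaxonParser().parse_many("7742, Homo sapiens,"))
print(ArgUtilsFacade.parse_taxa_ids("1, 2"))
```

Keeping a thin delegating method on the original class lets existing callers continue to work while the parsing logic is tested and measured on its own.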

from __future__ import annotations
# Issue: Missing module docstring

import os
from dataclasses import dataclass
from pathlib import Path
from typing import (
    AbstractSet,
    Any,
    Callable,
    Iterable,
    Mapping,
    Optional,
    Sequence,
    Set,
    Tuple,
    TypeVar,
    Union,
)

import decorateme
# Issue: Unable to import 'decorateme'
from pocketutils.core.exceptions import PathExistsError, XTypeError, XValueError
# Issue: Unable to import 'pocketutils.core.exceptions'
from pocketutils.misc.typer_utils import Arg, Opt
# Issue: Unable to import 'pocketutils.misc.typer_utils'
from pocketutils.tools.filesys_tools import FilesysTools
# Issue: Unable to import 'pocketutils.tools.filesys_tools'
from pocketutils.tools.path_tools import PathTools
# Issue: Unable to import 'pocketutils.tools.path_tools'
from regex import regex
# Issue: Unable to import 'regex'
from typeddfs.df_errors import FilenameSuffixError
# Issue: Unable to import 'typeddfs.df_errors'

from mandos.model.apis.chembl_support.chembl_targets import TargetType
from mandos.model.settings import SETTINGS
from mandos.model.taxonomy import KnownTaxa, Taxonomy
# Issue (Unused Code): Unused Taxonomy imported from mandos.model.taxonomy
from mandos.model.taxonomy_caches import LazyTaxonomy, TaxonomyFactories
from mandos.model.utils.globals import Globals
from mandos.model.utils.setup import logger

T = TypeVar("T", covariant=True)
# Issue (Coding Style, Naming): Class name "T" doesn't conform to PascalCase naming style ('[^\\W\\da-z][^\\W_]+$' pattern)
#
# This check looks for invalid names for a range of different identifiers.
# You can set regular expressions to which the identifiers must conform if the defaults do not match your requirements.
# If your project includes a Pylint configuration file, the settings contained in that file take precedence.
# To find out more about Pylint, please refer to their site.
_dir_name_pattern = regex.compile(r"([^*]*)(?:\*(\..+))?", flags=regex.V1)


@dataclass(frozen=True, repr=True, order=True)
# Issue: Missing class docstring
class ParsedTaxa:
    source: str
    allow: Sequence[Union[int, str]]
    forbid: Sequence[Union[int, str]]
    ancestors: Sequence[Union[int, str]]

    @classmethod
    def empty(cls) -> ParsedTaxa:
        # Issue: Missing function or method docstring
        return ParsedTaxa("", [], [], [])


@decorateme.auto_utils()
# Issue: Missing class docstring
class ArgUtils:
    @classmethod
    def definition_bullets(cls, dct: Mapping[Any, Any], colon: str = ": ", indent: int = 12) -> str:
        # Issue: Missing function or method docstring
        joiner = os.linesep * 2 + " " * indent
        jesus = [f" - {k}{colon}{v}" for k, v in dct.items()]
        return joiner.join(jesus)

    @classmethod
    def definition_list(cls, dct: Mapping[Any, Any], colon: str = ": ", sep: str = "; ") -> str:
        # Issue: Missing function or method docstring
        jesus = [f"{k}{colon}{v}" for k, v in dct.items()]
        return sep.join(jesus)

    @classmethod
    def list(
        # Issue: Missing function or method docstring
        # Issue (Coding Style), reported on every parameter line of this signature:
        # Wrong hanging indentation before block (add 4 spaces).
        cls,
        lst: Iterable[Any],
        *,
        attr: Union[None, str, Callable[[Any], Any]] = None,
        sep: str = ", ",
    ) -> str:
        x = []
        # Issue (Coding Style, Naming), reported here and on every `x += ...` line below:
        # Variable name "x" doesn't conform to snake_case naming style ('([^\\W\\dA-Z][^\\WA-Z]{2,}|_[^\\WA-Z]*|__[^\\WA-Z\\d_][^\\WA-Z]+__)$' pattern)
        for v in lst:
            # Issue (Coding Style, Naming): Variable name "v" doesn't conform to snake_case naming style (same pattern as above)
            if attr is None and hasattr(v, "name"):
                x += [v.name]
            elif attr is None:
                x += [str(v)]
            elif isinstance(attr, str):
                x += [str(getattr(v, attr))]
            else:
                x += [str(attr(v))]
        return sep.join(x)

    @classmethod
    def get_taxonomy(
        # Issue: Missing function or method docstring
        # Issue (Coding Style), reported on every parameter line of this signature:
        # Wrong hanging indentation before block (add 4 spaces).
        cls,
        taxa: Optional[str],
        *,
        local_only: bool = False,
        allow_forbid: bool = True,
    ) -> Optional[LazyTaxonomy]:
        if taxa is None or len(taxa) == 0:
            return None
        parsed = cls.parse_taxa(taxa, allow_forbid=allow_forbid)
        return TaxonomyFactories.get_smart_taxonomy(
            allow=parsed.allow,
            forbid=parsed.forbid,
            ancestors=parsed.ancestors,
            local_only=local_only,
        )

    @classmethod
    def parse_taxa(
        # Issue: Missing function or method docstring
        # Issue (Coding Style), reported on every parameter line of this signature:
        # Wrong hanging indentation before block (add 4 spaces).
        cls,
        taxa: Optional[str],
        *,
        allow_forbid: bool = True,
    ) -> ParsedTaxa:
        if taxa is None or taxa == "":
            return ParsedTaxa.empty()
        ancestors = f"{KnownTaxa.cellular},{KnownTaxa.viral}"
        if ":" in taxa:
            ancestors = taxa.split(":", 1)[1]
            taxa = taxa.split(":", 1)[0]
        taxa_objs = [t.strip() for t in taxa.split(",") if len(t.strip()) > 0]
        allow = [t.strip().lstrip("+") for t in taxa_objs if not t.startswith("-")]
        forbid = [t.strip().lstrip("-") for t in taxa_objs if t.startswith("-")]
        ancestors = [t.strip() for t in ancestors.split(",")]
        if not allow_forbid and len(forbid) > 0:
            raise XValueError(f"Cannot use '-' in {taxa}")
        return ParsedTaxa(
            source=taxa,
            allow=[ArgUtils.parse_taxon(t, id_only=False) for t in allow],
            forbid=[ArgUtils.parse_taxon(t, id_only=False) for t in forbid],
            ancestors=[ArgUtils.parse_taxon(t, id_only=True) for t in ancestors],
        )

    @classmethod
    def parse_taxa_ids(cls, taxa: str) -> Sequence[int]:
        """
        Does not allow negatives.
        """
        if taxa is None or taxa == "":
            return []
        taxa = [t.strip() for t in taxa.split(",") if len(t.strip()) > 0]
        return [ArgUtils.parse_taxon(t, id_only=True) for t in taxa]

    @classmethod
    def parse_taxon(cls, taxon: Union[int, str], *, id_only: bool = False) -> Union[int, str]:
        # Issue: Missing function or method docstring
        std = cls._get_std_taxon(taxon)
        if isinstance(taxon, str) and taxon in std:
            return std
        if isinstance(taxon, str) and not id_only:
            # Issue (unused-code): Unnecessary "elif" after "return"
            return taxon
        elif isinstance(taxon, str) and taxon.isdigit():
            return int(taxon)
        if id_only:
            raise XTypeError(f"Taxon {taxon} must be an ID")
        raise XTypeError(f"Taxon {taxon} must be an ID or name")

    @classmethod
    def _get_std_taxon(cls, taxa: str) -> str:
        x = dict(
            # Issue (Coding Style, Naming): Variable name "x" doesn't conform to snake_case naming style ('([^\\W\\dA-Z][^\\WA-Z]{2,}|_[^\\WA-Z]*|__[^\\WA-Z\\d_][^\\WA-Z]+__)$' pattern)
            vertebrata=KnownTaxa.vertebrata,
            vertebrate=KnownTaxa.vertebrata,
            vertebrates=KnownTaxa.vertebrata,
            cellular=KnownTaxa.cellular,
            cell=KnownTaxa.cellular,
            cells=KnownTaxa.cellular,
            viral=KnownTaxa.viral,
            virus=KnownTaxa.viral,
            viruses=KnownTaxa.viral,
            all=f"{Globals.cellular_taxon},{Globals.viral_taxon}",
            # Issue (Bug): The Class Globals does not seem to have a member named viral_taxon.
            # Issue (Bug): The Class Globals does not seem to have a member named cellular_taxon.
            # This check looks for calls to members that are non-existent. These calls will fail.
            # The member could have been renamed or removed.
        ).get(taxa)
        return taxa if x is None else str(x)

    @staticmethod
    def get_target_types(st: str) -> Set[str]:
        # Issue (Coding Style, Naming): Argument name "st" doesn't conform to snake_case naming style ('([^\\W\\dA-Z][^\\WA-Z]{2,}|_[^\\WA-Z]*|__[^\\WA-Z\\d_][^\\WA-Z]+__)$' pattern)
        # Issue: Missing function or method docstring
        return {s.name for s in TargetType.resolve(st)}


@decorateme.auto_utils()
# Issue: Missing class docstring
class EntryUtils:
    @classmethod
    def adjust_filename(
        # Issue: Missing function or method docstring
        # Issue (Coding Style, Naming): Argument name "to" doesn't conform to snake_case naming style ('([^\\W\\dA-Z][^\\WA-Z]{2,}|_[^\\WA-Z]*|__[^\\WA-Z\\d_][^\\WA-Z]+__)$' pattern)
        # Issue (Coding Style), reported on every parameter line of this signature:
        # Wrong hanging indentation before block (add 4 spaces).
        cls,
        to: Optional[Path],
        default: Union[str, Path],
        replace: bool,
        *,
        suffixes: Union[None, AbstractSet[str], Callable[[Union[Path, str]], Any]] = None,
        quiet: bool = False,
    ) -> Path:
        if to is None:
            path = Path(default)
        elif str(to).startswith("."):
            path = Path(default).with_suffix(str(to))
        elif str(to).startswith("*."):
            path = Path(default).with_suffix(str(to)[1:])
        elif to.is_dir() or to.suffix == "":
            path = to / default
        else:
            path = Path(to)
        path = Path(path)
        if os.name == "nt" and SETTINGS.sanitize_paths:
            new_path = Path(*PathTools.sanitize_nodes(path.parts, is_file=True))
            if new_path.resolve() != path.resolve():
                if not quiet:
                    logger.warning(f"Sanitized filename {path} → {new_path}")
                path = new_path
        info = FilesysTools.get_info(path)
        if info.exists and not info.is_file and not info.is_socket and not info.is_char_device:
            raise PathExistsError(f"Path {path} exists and is not a file")
        if info.exists and not replace:
            raise PathExistsError(f"File {path} already exists")
        cls._check_suffix(path.suffix, suffixes)
        if info.exists and replace and not quiet:
            logger.info(f"Overwriting existing file {path}")
        logger.log("TRACE" if quiet else "INFO", f"Output file is {path}")
        return path

    @classmethod
    def adjust_dir_name(
        # Issue (Coding Style, Naming): Argument name "to" doesn't conform to snake_case naming style ('([^\\W\\dA-Z][^\\WA-Z]{2,}|_[^\\WA-Z]*|__[^\\WA-Z\\d_][^\\WA-Z]+__)$' pattern)
        # Issue: Missing function or method docstring
        # Issue (Coding Style), reported on every parameter line of this signature:
        # Wrong hanging indentation before block (add 4 spaces).
        cls,
        to: Optional[Path],
        default: Union[str, Path],
        *,
        suffixes: Union[None, AbstractSet[str], Callable[[Union[Path, str]], Any]] = None,
        quiet: bool = False,
    ) -> Tuple[Path, str]:
        out_dir = Path(default)
        suffix = SETTINGS.table_suffix
        if to is not None:
            m: regex.Match = _dir_name_pattern.fullmatch(str(to))
            # Issue (Coding Style, Naming): Variable name "m" doesn't conform to snake_case naming style ('([^\\W\\dA-Z][^\\WA-Z]{2,}|_[^\\WA-Z]*|__[^\\WA-Z\\d_][^\\WA-Z]+__)$' pattern)
            out_dir = default if m.group(1) == "" else m.group(1)
            if m.group(2) is not None and m.group(2) != "":
                suffix = m.group(2)
            # str() is needed here because `default` may be a Path, which has no startswith().
            if str(out_dir).startswith(".") and not quiet:
                logger.warning(f"Writing to {out_dir}. Was it meant as a suffix instead?")
            out_dir = Path(out_dir)
        if os.name == "nt" and SETTINGS.sanitize_paths:
            new_dir = Path(*PathTools.sanitize_nodes(out_dir._parts, is_file=False))
            # Issue (Coding Style, Best Practice): It seems like _parts was declared protected and should not be accessed from this context.
            #
            # Prefixing a member variable _ is usually regarded as the equivalent of declaring it with
            # protected visibility that exists in other languages. Consequently, such a member should
            # only be accessed from the same class or a child class:
            #
            #     class MyParent:
            #         def __init__(self):
            #             self._x = 1
            #             self.y = 2
            #
            #     class MyChild(MyParent):
            #         def some_method(self):
            #             return self._x    # Ok, since accessed from a child class
            #
            #     class AnotherClass:
            #         def some_method(self, instance_of_my_child):
            #             return instance_of_my_child._x   # Would be flagged as AnotherClass is not
            #                                              # a child class of MyParent
            if new_dir.resolve() != out_dir.resolve():
                logger.warning(f"Sanitized directory {out_dir} → {new_dir}")
                out_dir = new_dir
        info = FilesysTools.get_info(out_dir)
        if info.exists and not info.is_dir:
            raise PathExistsError(f"Path {out_dir} already exists and is not a directory")
        cls._check_suffix(suffix, suffixes)
        if info.exists:
            n_files = len(list(out_dir.iterdir()))
            if n_files > 0 and not quiet:
                logger.debug(f"Directory {out_dir} is non-empty")
        logger.debug(f"Output dir is {out_dir} (suffix: {suffix})")
        return out_dir, suffix

    @classmethod
    def _check_suffix(cls, suffix, suffixes):
        if suffixes is not None and callable(suffixes):
            try:
                suffixes(suffix)  # make sure it's ok
            except FilenameSuffixError:
                raise XValueError(f"Unsupported file format {suffix}")
        elif suffixes is not None:
            if suffix not in suffixes:
                raise XValueError(f"Unsupported file format {suffix}")


__all__ = ["Arg", "ArgUtils", "EntryUtils", "Opt"]
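
The directory argument handled by `adjust_dir_name` packs a directory and an optional suffix into one string, so it may help to try the pattern in isolation. The sketch below reuses the pattern text from `_dir_name_pattern`, but compiles it with the standard-library `re` module instead of the third-party `regex` package (a substitution made only so the example has no dependencies). The function name `split_dir_spec` is hypothetical.

```python
import re
from typing import Optional, Tuple

# Same pattern text as _dir_name_pattern in the listing, compiled with stdlib re.
DIR_NAME_PATTERN = re.compile(r"([^*]*)(?:\*(\..+))?")


def split_dir_spec(spec: str) -> Tuple[str, Optional[str]]:
    """Split 'dir*.suffix' into (dir, suffix); the suffix is None if absent."""
    match = DIR_NAME_PATTERN.fullmatch(spec)
    if match is None:
        # fullmatch fails when '*' is not followed by a '.'-prefixed suffix.
        raise ValueError(f"Cannot parse directory spec {spec!r}")
    return match.group(1), match.group(2)


for spec in ["out*.csv", "out", "*.tsv.gz"]:
    print(spec, "->", split_dir_spec(spec))
```

An input such as `a*b` does not match at all, so `fullmatch` returns `None`. The listing calls `m.group(1)` without checking for that case, which may be worth guarding in the real method.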