Passed
Push — dependabot/pip/flake8-bugbear-... ( 93dece...8d4b2b )

StandardTargetTraversalStrategy.parse()   C

Complexity

Conditions 10

Size

Total Lines 62
Code Lines 57

Duplication

Lines 0
Ratio 0 %

Importance

Changes 0
Metric  Value
cc      10
eloc    57
nop     2
dl      0
loc     62
rs      5.6072
c       0
b       0
f       0

How to fix   Long Method    Complexity   

Long Method

Small methods make your code easier to understand, in particular if combined with a good name. Besides, if your method is small, finding a good name is usually much easier.

For example, if you find yourself adding comments to a method's body, this is usually a good sign to extract the commented part to a new method, and use the comment as a starting point when coming up with a good name for this new method.
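As a minimal sketch of that advice (not taken from the reviewed project; every function name here is hypothetical), the first function below keeps its steps in one body and marks them with comments. The second version turns each commented step into a small named method:

```python
from typing import List


def load_rules_long(text: str) -> List[str]:
    """Before: one body, with a comment marking each step."""
    rules = []
    for raw in text.splitlines():
        # skip blank lines and comments
        if not raw.strip() or raw.lstrip().startswith("#"):
            continue
        # normalize whitespace
        rules.append(" ".join(raw.split()))
    return rules


def is_rule_line(raw: str) -> bool:
    """Extracted from the 'skip blank lines and comments' step."""
    stripped = raw.strip()
    return bool(stripped) and not stripped.startswith("#")


def normalize_whitespace(raw: str) -> str:
    """Extracted from the 'normalize whitespace' step."""
    return " ".join(raw.split())


def load_rules(text: str) -> List[str]:
    """After: the body reads as a sentence built from the extracted names."""
    return [normalize_whitespace(raw) for raw in text.splitlines() if is_rule_line(raw)]
```

Both versions return the same list; the comments from the long version became the names of the extracted functions.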

Commonly applied refactorings include Extract Method.

Complexity

Complex units like mandos.search.chembl.target_traversal.StandardTargetTraversalStrategy.parse() often do a lot of different things. To break such a unit down, we need to identify a cohesive component within its class. A common approach to find such a component is to look for fields/methods that share the same prefixes or suffixes. In parse(), for instance, the locals pat_type, pat_rel, pat_accept, pat_src_words and pat_dest_words share the pat_ prefix and together define the line grammar.

Once you have determined the fields that belong together, you can apply the Extract Class refactoring. If the component makes sense as a sub-class, Extract Subclass is also a candidate, and is often faster.
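One way Extract Class could look for a line-oriented parser is sketched below. This is an illustration under stated assumptions, not the project's code: `EdgeLineParser` and `ParsedLine` are hypothetical names, the grammar is a simplified subset (no `src:`/`dest:` patterns or trailing comments), and relation names are plain strings rather than `TargetRelType` members.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ParsedLine:
    """Hypothetical value object for one parsed rule line."""

    src: str
    rel: str
    dest: str
    accept: Optional[str]


class EdgeLineParser:
    """Hypothetical class owning the grammar that a long parse() would build inline."""

    # Simplified grammar: <type> <rel> <type> [accept:<symbol>]
    _PATTERN = re.compile(r" *([a-z_]+) *([<>~=]) *([a-z_]+) *(?:accept:([\-*^$]))? *")
    _REL_NAMES = {
        ">": "superset_of",
        "<": "subset_of",
        "~": "overlaps_with",
        "=": "equivalent_to",
    }

    def parse_line(self, line: str) -> ParsedLine:
        match = self._PATTERN.fullmatch(line)
        if match is None:
            raise ValueError(f"Could not parse line '{line}'")
        src, rel, dest, accept = match.groups()
        # accept is None when the optional accept: clause is absent
        return ParsedLine(src, self._REL_NAMES[rel], dest, accept)
```

With the grammar in its own class, the caller only loops over lines and builds edges, and the regular expression can be tested without touching the strategy classes.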

# [review] Missing module docstring
from __future__ import annotations

import abc
import enum
import sys
import sre_compile
import re
from pathlib import Path
# [review: Unused Code] Unused Union imported from typing
from typing import Dict, Sequence, Type, Set, Union

from mandos.model import MandosResources
from mandos.model.chembl_api import ChemblApi
# [review: Unused Code] Unused TargetFactory imported from mandos.model.chembl_support.chembl_targets
from mandos.model.chembl_support.chembl_targets import (
    TargetType,
    ChemblTarget,
    TargetFactory,
)
from mandos.model.chembl_support.chembl_target_graphs import (
    ChemblTargetGraph,
    TargetNode,
    TargetEdgeReqs,
    TargetRelType,
)


# [review] Missing class docstring
class Acceptance(enum.Enum):
    always = enum.auto()
    never = enum.auto()
    at_start = enum.auto()
    at_end = enum.auto()


# [review: Documentation] Empty class docstring
class TargetTraversalStrategy(metaclass=abc.ABCMeta):
    """"""

    # [review] Missing function or method docstring
    @classmethod
    def api(cls) -> ChemblApi:
        raise NotImplementedError()

    # [review] Missing function or method docstring
    def traverse(self, target: ChemblTargetGraph) -> Sequence[ChemblTargetGraph]:
        return self.__call__(target)

    def __call__(self, target: ChemblTargetGraph) -> Sequence[ChemblTargetGraph]:
        """

        Returns:

        """
        raise NotImplementedError()


# [review: Documentation] Empty class docstring
class StandardTargetTraversalStrategy(TargetTraversalStrategy, metaclass=abc.ABCMeta):
    """"""

    # NOTE: chaining @classmethod with @property (used below and in the subclasses)
    # is deprecated since Python 3.11 and was removed in Python 3.13.

    # [review] Missing function or method docstring
    @classmethod
    @property
    def edges(cls) -> Set[TargetEdgeReqs]:
        raise NotImplementedError()

    # [review] Missing function or method docstring
    @classmethod
    @property
    def acceptance(cls) -> Dict[TargetEdgeReqs, Acceptance]:
        raise NotImplementedError()

    # [review] Missing function or method docstring
    @classmethod
    def read(cls, path: Path) -> Set[TargetEdgeReqs]:
        lines = [
            line
            for line in path.read_text(encoding="utf8").splitlines()
            if not line.startswith("#") and len(line.strip()) > 0
        ]
        return cls.parse(lines)

    # [review] Missing function or method docstring
    # [review: Comprehensibility] This function exceeds the maximum number of variables (26/15).
    @classmethod
    def parse(cls, lines: Sequence[str]) -> Set[TargetEdgeReqs]:
        pat_type = r"([a-z_]+)"
        pat_rel = r"([<>~=])"
        pat_accept = r"(?:accept:([\-*^$]?))?"
        pat_src_words = r"(?:src:'''(.+?)''')?"
        pat_dest_words = r"(?:dest:'''(.+?)''')?"
        comment = r"(?:#(.*))?"
        # [review: Coding Style] This line is too long as per the coding-style (117/100).
        # This check looks for lines that are too long. You can specify the maximum line length.
        pat = f"^ *{pat_type} *{pat_rel} *{pat_type} *{pat_accept} * {pat_src_words} *{pat_dest_words} *{comment} *$"
        pat = re.compile(pat)
        to_rel = {
            ">": TargetRelType.superset_of,
            "<": TargetRelType.subset_of,
            "~": TargetRelType.overlaps_with,
            "=": TargetRelType.equivalent_to,
            "*": TargetRelType.any_link,
            ".": TargetRelType.self_link,
        }
        to_accept = {
            "*": Acceptance.always,
            "-": Acceptance.never,
            "^": Acceptance.at_start,
            "$": Acceptance.at_end,
        }
        edges = set()
        # NOTE: this mapping is filled in below but never returned or stored.
        edge_to_acceptance: Dict[TargetEdgeReqs, Acceptance] = {}
        for line in lines:
            match = pat.fullmatch(line)
            if match is None:
                raise AssertionError(f"Could not parse line '{line}'")
            try:
                src_str = match.group(1).lower()
                sources = TargetType.all_types() if src_str == "any" else [TargetType[src_str]]
                rel = to_rel[match.group(2)]
                dest_str = match.group(3).lower()
                targets = TargetType.all_types() if dest_str == "any" else [TargetType[dest_str]]
                # group(4) is None when the optional accept: clause is missing
                accept = to_accept[match.group(4).lower()]
                src_pat = (
                    None
                    if match.group(5) is None or match.group(5) == ""
                    else re.compile(match.group(5))
                )
                dest_pat = (
                    None
                    if match.group(6) is None or match.group(6) == ""
                    else re.compile(match.group(6))
                )
            # AttributeError covers a missing accept: clause; re.error is the documented
            # exception for an invalid src/dest pattern
            except (AttributeError, KeyError, TypeError, re.error) as e:
                raise AssertionError(f"Could not parse line '{line}'") from e
            for source in sources:
                for dest in targets:
                    edge = TargetEdgeReqs(
                        src_type=source,
                        src_pattern=src_pat,
                        rel_type=rel,
                        dest_type=dest,
                        dest_pattern=dest_pat,
                    )
                    edges.add(edge)
                    edge_to_acceptance[edge] = accept
        return edges

    def __call__(self, target: ChemblTargetGraph) -> Sequence[ChemblTarget]:
        if not target.type.is_traversable:
            return [target.target]
        found = target.traverse(self.edges)
        return [f.target for f in found if self.accept(f)]

    # [review] Missing function or method docstring
    def accept(self, target: TargetNode) -> bool:
        acceptance_type = self.acceptance[target.link_reqs]
        return (
            acceptance_type is Acceptance.always
            or (acceptance_type is Acceptance.at_start and target.is_start)
            or (acceptance_type is Acceptance.at_end and target.is_end)
        )


# [review: Documentation] Empty class docstring
class TargetTraversalStrategy0(StandardTargetTraversalStrategy, metaclass=abc.ABCMeta):
    """"""

    @classmethod
    @property
    def edges(cls) -> Set[TargetEdgeReqs]:
        return cls.read(MandosResources.path("strategies", "strategy0.txt"))


# [review: Documentation] Empty class docstring
class TargetTraversalStrategy1(StandardTargetTraversalStrategy, metaclass=abc.ABCMeta):
    """"""

    @classmethod
    @property
    def edges(cls) -> Set[TargetEdgeReqs]:
        return cls.read(MandosResources.path("strategies", "strategy1.txt"))


class TargetTraversalStrategy2(StandardTargetTraversalStrategy, metaclass=abc.ABCMeta):
    """
    Traverse the DAG up and down, following only desired links.
    Some links from complex to complex group are "overlaps with"
    (ex: CHEMBL4296059).
    It's also rare to need to go from a selectivity group "down" to complex group / family / etc.;
    usually they have a link upwards.
    So...
    If it's a single protein, it's too risky to traverse up into complexes.
    That's because lots of proteins *occasionally* make complexes, and there are some weird ones.
    BUT we want to catch some obvious cases like GABA A subunits.
    ChEMBL calls many of these "something subunit something".
    This is the only time we'll allow going directly from protein to complex.
    In this case, we'll also disallow links from protein to family,
    just because we're pretty sure it's a subunit.
    But we can go from single protein to complex to complex group to family.
    """

    @classmethod
    @property
    def edges(cls) -> Set[TargetEdgeReqs]:
        return cls.read(MandosResources.path("strategies", "strategy2.txt"))


class TargetTraversalStrategies:
    """
    Factory.
    """

    @classmethod
    def by_name(cls, fully_qualified: str, api: ChemblApi) -> TargetTraversalStrategy:
        """
        For dependency injection.

        Args:
            fully_qualified:
            api:

        Returns:

        """
        # [review: Coding Style, Naming] Variable name "s" doesn't conform to snake_case naming style
        # ('([^\\W\\dA-Z][^\\WA-Z]{2,}|_[^\\WA-Z]*|__[^\\WA-Z\\d_][^\\WA-Z]+__)$' pattern).
        # This check looks for invalid names for a range of different identifiers. You can set
        # regular expressions to which the identifiers must conform if the defaults do not match
        # your requirements. If your project includes a Pylint configuration file, the settings
        # contained in that file take precedence.
        s = fully_qualified
        mod = s[: s.rfind(".")]
        # skip the dot itself; slicing from rfind(".") would yield ".ClassName"
        clz = s[s.rfind(".") + 1 :]
        # [review: Coding Style, Naming] Variable name "x" doesn't conform to snake_case naming style
        x = getattr(sys.modules[mod], clz)
        return cls.create(x, api)

    # [review] Missing function or method docstring
    @classmethod
    def strategy0(cls, api: ChemblApi) -> TargetTraversalStrategy:
        return cls.create(TargetTraversalStrategy0, api)

    # [review] Missing function or method docstring
    @classmethod
    def strategy1(cls, api: ChemblApi) -> TargetTraversalStrategy:
        return cls.create(TargetTraversalStrategy1, api)

    # [review] Missing function or method docstring
    @classmethod
    def strategy2(cls, api: ChemblApi) -> TargetTraversalStrategy:
        return cls.create(TargetTraversalStrategy2, api)

    # noinspection PyAbstractClass
    @classmethod
    def create(cls, clz: Type[TargetTraversalStrategy], api: ChemblApi) -> TargetTraversalStrategy:
        """
        Factory method.

        Args:
            clz:
            api:

        Returns:

        """

        # [review: Coding Style, Naming] Class name "X" doesn't conform to PascalCase naming style
        # ('[^\\W\\da-z][^\\W_]+$' pattern)
        # [review] Missing class docstring
        class X(clz):
            # [review] Missing function or method docstring
            @classmethod
            def api(cls) -> ChemblApi:
                return api

        X.__name__ = clz.__name__
        return X()


__all__ = ["TargetTraversalStrategy", "TargetTraversalStrategies"]
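
The name lookup in by_name() splits a dotted path by hand. A minimal alternative sketch, assuming a standalone helper (`resolve_class` is a hypothetical name, not part of the reviewed module), uses `str.rpartition` and `importlib.import_module`. Unlike a `sys.modules` lookup, `import_module` also loads a module that has not been imported yet, which may or may not be wanted here.

```python
import importlib
from typing import Any


def resolve_class(fully_qualified: str) -> Any:
    """Resolve 'package.module.ClassName' to the named attribute of that module."""
    # rpartition splits on the last dot and drops the dot from both halves
    module_name, _, class_name = fully_qualified.rpartition(".")
    if not module_name:
        raise ValueError(f"Expected 'module.ClassName', got '{fully_qualified}'")
    module = importlib.import_module(module_name)
    return getattr(module, class_name)
```

For example, `resolve_class("pathlib.Path")` returns the `Path` class, and a name without a dot raises `ValueError` instead of failing later with a less specific error.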