mandos.search.chembl.protein_search   A

Complexity

Total Complexity 13

Size/Duplication

Total Lines 142
Duplicated Lines 0 %

Importance

Changes 0
Metric Value
eloc 59
dl 0
loc 142
rs 10
c 0
b 0
f 0
wmc 13

8 Methods

Rating   Name   Duplication   Size   Complexity  
A ProteinSearch.traversal_strategy() 0 5 2
A ProteinSearch.should_include() 0 18 1
A ProteinSearch.query() 0 2 1
A ProteinSearch.to_hit() 0 24 1
A ProteinSearch.process() 0 26 4
A ProteinSearch.find_all() 0 5 1
A ProteinSearch.find() 0 16 2
A ProteinSearch.default_traversal_strategy() 0 3 1
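
As a quick consistency check on the figures above, the per-method complexity values can be summed and compared with the reported total. This is only a sketch: it assumes that "Total Complexity" (and `wmc`) is the plain sum of the per-method complexity column, which matches the usual definition of weighted methods per class but is not stated on this page.

```python
# Per-method complexity, copied from the "Complexity" column of the table above.
method_complexity = {
    "traversal_strategy": 2,
    "should_include": 1,
    "query": 1,
    "to_hit": 1,
    "process": 4,
    "find_all": 1,
    "find": 2,
    "default_traversal_strategy": 1,
}

# Assumption: wmc is the unweighted sum of the per-method values.
wmc = sum(method_complexity.values())
print(wmc)  # 13, the same as the reported Total Complexity
```
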
import abc
# ^ Issue: Missing module docstring
import logging
from dataclasses import dataclass
from typing import Sequence, TypeVar

from pocketutils.core.dot_dict import NestedDotDict
# ^ Issue: Unable to import 'pocketutils.core.dot_dict'

from mandos.model import AbstractHit, ChemblCompound, Search
from mandos.model.targets import Target, TargetFactory
from mandos.search.chembl.target_traversal_strategy import (
    TargetTraversalStrategies,
    TargetTraversalStrategy,
)

logger = logging.getLogger("mandos")


@dataclass(frozen=True, order=True, repr=True)
class ProteinHit(AbstractHit, metaclass=abc.ABCMeta):
    """
    A protein target entry for a compound.
    """


H = TypeVar("H", bound=ProteinHit, covariant=True)
# ^ Issue (naming): Class name "H" doesn't conform to PascalCase naming style ('[^\\W\\da-z][^\\W_]+$' pattern)


class ProteinSearch(Search[H], metaclass=abc.ABCMeta):
    """
    Abstract search.
    """

    def find_all(self, compounds: Sequence[str]) -> Sequence[H]:
        # ^ Issue: Missing function or method docstring
        logger.info(
            # ^ Issue: Use lazy % formatting in logging functions
            f"Using traversal strategy {self.traversal_strategy.__class__.__name__} for {self.search_name}"
            # ^ Issue (coding style): This line is too long (107/100)
        )
        return super().find_all(compounds)

    def query(self, parent_form: ChemblCompound) -> Sequence[NestedDotDict]:
        # ^ Issue: Missing function or method docstring
        raise NotImplementedError()

    @property
    def traversal_strategy(self) -> TargetTraversalStrategy:
        # ^ Issue: Missing function or method docstring
        if self.config.traversal_strategy is None:
            return self.default_traversal_strategy
        return TargetTraversalStrategies.by_name(self.config.traversal_strategy, self.api)

    @property
    def default_traversal_strategy(self) -> TargetTraversalStrategy:
        # ^ Issue: Missing function or method docstring
        raise NotImplementedError()

    def should_include(
        self, lookup: str, compound: ChemblCompound, data: NestedDotDict, target: Target
        # ^ Issue (coding style): Wrong hanging indentation before block (add 4 spaces).
    ) -> bool:
        """
        Filter based on the returned (activity/mechanism) data.
        Do NOT filter on the target itself, including on whether it is a valid target:
        return True in those cases (better yet, don't check at all).

        Args:
            lookup: The compound identifier passed to ``find``
            compound: The compound that ``get_compound`` returned for ``lookup``
            data: A single result from ``query``
            target: The target found for ``target_chembl_id`` in ``data``, before traversal

        Returns:
            True if hits should be generated from ``data``
        """
        raise NotImplementedError()

    def to_hit(
        self, lookup: str, compound: ChemblCompound, data: NestedDotDict, best_target: Target
        # ^ Issue (coding style): Wrong hanging indentation before block (add 4 spaces).
        # ^ Issue (unused code): The argument lookup seems to be unused.
        # ^ Issue (unused code): The argument compound seems to be unused.
    ) -> Sequence[H]:
        """
        Turns the final data into ``H``.

        ``data`` is a NestedDotDict for a single element returned by ``api_endpoint.filter``.
        Its keys MUST MATCH the constructor of ``H``, EXCEPT for ``object_id``
        and ``object_name``, which come from traversal and are added here from ``best_target``.

        Note that this has a default behavior but could be overridden to split into multiple hits
        and/or to add additional attributes that might come from ``best_target``.

        Args:
            lookup: The compound identifier passed to ``find``
            compound: The compound that ``get_compound`` returned for ``lookup``
            data: A single result from ``query`` that passed ``should_include``
            best_target: A target returned by the traversal strategy

        Returns:
            A sequence of hits.
        """
        h = self.get_h()
        # ^ Issue (naming): Variable name "h" doesn't conform to snake_case naming style
        return [h(**data, object_id=best_target.chembl, object_name=best_target.name)]

    def find(self, lookup: str) -> Sequence[H]:
        """
        Finds the hits for a single compound.

        Args:
            lookup: The compound identifier, resolved with ``get_compound``

        Returns:
            The hits from all of the query results
        """
        form = self.get_compound(lookup)
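        results = self.query(form)
        hits = []
        for result in results:
            result = NestedDotDict(result)
            hits.extend(self.process(lookup, form, result))
        return hits

    def process(self, lookup: str, compound: ChemblCompound, data: NestedDotDict) -> Sequence[H]:
        """
        Converts a single query result into hits.

        Args:
            lookup: The compound identifier passed to ``find``
            compound: The compound that ``get_compound`` returned for ``lookup``
            data: A single result from ``query``

        Returns:
            The hits from ``to_hit`` for each target returned by the traversal strategy,
            or an empty list if ``target_chembl_id`` is missing or ``should_include`` is False
        """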
        if data.get("target_chembl_id") is None:
            logger.debug(f"target_chembl_id missing from mechanism '{data}' for compound {lookup}")
            # ^ Issue: Use lazy % formatting in logging functions
            return []
        chembl_id = data["target_chembl_id"]
        target_obj = TargetFactory.find(chembl_id, self.api)
        if not self.should_include(lookup, compound, data, target_obj):
            return []
        # traverse() will return the source target if it's a non-traversable type (like DNA)
        # and the subclass decided whether to filter those
        # so don't worry about that here
        ancestors = self.traversal_strategy(target_obj)
        # ^ Issue (bug): self.traversal_strategy does not seem to be callable.
        #   The property returns a TargetTraversalStrategy instance, so this call
        #   relies on that class defining __call__.
        lst = []
        for ancestor in ancestors:
            lst.extend(self.to_hit(lookup, compound, data, ancestor))
        return lst


__all__ = ["ProteinHit", "ProteinSearch"]
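
The control flow of `process()` and the default `to_hit()` can be tried out without ChEMBL access. The following is a minimal sketch with stand-in classes rather than the real mandos types: `Target` and `Hit` below are simplified placeholders, the `action` field and the IDs `T1`/`T2` are hypothetical, and the lookup, filter, and traversal steps are passed in as plain callables. Unlike the real `to_hit()`, which passes all of `data` to the hit constructor, the sketch drops `target_chembl_id` first so that the placeholder `Hit` stays small.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence


@dataclass(frozen=True)
class Target:
    """Stand-in for mandos.model.targets.Target (only the two fields used here)."""

    chembl: str
    name: str


@dataclass(frozen=True)
class Hit:
    """Stand-in for a concrete ProteinHit subclass; `action` is a hypothetical field."""

    action: str
    object_id: str
    object_name: str


def process(
    data: Dict[str, str],
    find_target: Callable[[str], Target],
    should_include: Callable[[Dict[str, str], Target], bool],
    traverse: Callable[[Target], Sequence[Target]],
) -> List[Hit]:
    # 1. Records without a target ID produce no hits.
    chembl_id = data.get("target_chembl_id")
    if chembl_id is None:
        return []
    target = find_target(chembl_id)
    # 2. The data-level filter decides whether to continue.
    if not should_include(data, target):
        return []
    # 3. One hit per target returned by traversal; object_id and object_name
    #    come from the traversed target, everything else from the data.
    fields = {k: v for k, v in data.items() if k != "target_chembl_id"}
    return [Hit(**fields, object_id=t.chembl, object_name=t.name) for t in traverse(target)]


# Hypothetical targets: a subunit (T1) whose parent is a complex (T2).
TARGETS = {"T1": Target("T1", "Subunit"), "T2": Target("T2", "Complex")}
PARENTS = {"T1": ["T2"]}


def traverse(target: Target) -> List[Target]:
    # Return the parents if there are any, otherwise the source target itself.
    parents = [TARGETS[p] for p in PARENTS.get(target.chembl, [])]
    return parents or [target]


hits = process(
    {"target_chembl_id": "T1", "action": "inhibitor"},
    TARGETS.__getitem__,
    lambda data, target: True,
    traverse,
)
print(hits)
```

Here the single record for `T1` yields one hit whose `object_id` and `object_name` belong to `T2`, which illustrates why those two fields are supplied by traversal rather than by the query data.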