Completed
Push — master ( b60f49...c7f259 )
by Rich
01:20
created

Mol.from_binary()   A

Complexity

Conditions 1

Size

Total Lines 12

Duplication

Lines 0
Ratio 0 %

Importance

Changes 0
Metric Value
cc 1
dl 0
loc 12
rs 9.4285
c 0
b 0
f 0
#! /usr/bin/env python
#
# Copyright (C) 2015-2016 Rich Lewis <[email protected]>
# License: 3-clause BSD

"""
## skchem.core.mol

Defining molecules in scikit-chem.
"""

import rdkit.Chem
Configuration introduced by
The import rdkit.Chem could not be resolved.

This can be caused by one of the following:

1. Missing Dependencies

This error could indicate a configuration issue of Pylint. Make sure that your libraries are available by adding the necessary commands.

# .scrutinizer.yml
before_commands:
    - sudo pip install abc # Python2
    - sudo pip3 install abc # Python3
Tip: We are currently not using virtualenv to run pylint, when installing your modules make sure to use the command for the correct version.

2. Missing __init__.py files

This error could also result from missing __init__.py files in your module folders. Make sure that you place one file in each sub-folder.

import rdkit.Chem.inchi
Configuration introduced by
The import rdkit.Chem.inchi could not be resolved.
from rdkit.Chem.rdDepictor import Compute2DCoords
Configuration introduced by
The import rdkit.Chem.rdDepictor could not be resolved.
from rdkit.Chem.rdMolDescriptors import CalcMolFormula, CalcExactMolWt
Configuration introduced by
The import rdkit.Chem.rdMolDescriptors could not be resolved.

import json

from . import Atom, Bond, Conformer
Unused Code introduced by
The import Atom seems to be unused.
from .base import ChemicalObject, AtomView, PropertyView
from ..utils import Suppressor
Unused Code introduced by
Unused Suppressor imported from utils

class Mol(rdkit.Chem.rdchem.Mol, ChemicalObject):

    """Class representing a Molecule in scikit-chem.

    Mol objects inherit directly from rdkit Mol objects.  Therefore, they
    contain atom and bond information, and may also include properties and
    atom bookmarks.

    Example:
        Constructors are implemented as class methods with the `from_` prefix.

        ```python
        m = Mol.from_smiles('c1ccccc1')
        ```

        Serializers are implemented as instance methods with the `to_` prefix.

        ```python
        m.to_smiles()
        ```

    """

    def __init__(self, *args, **kwargs):

        """
        The default constructor.

        Note:
            This will rarely be used directly; prefer the `from_` constructors.

        Args:
            *args: Arguments to be passed to the rdkit Mol constructor.
            **kwargs: Arguments to be passed to the rdkit Mol constructor.
        """
        super(Mol, self).__init__(*args, **kwargs)
        self.__two_d = None  # id of the 2D conformer, computed lazily in _two_d()

    @property
    def name(self):

        """ str: The name of the molecule, or None if it has no name. """

        try:
            return self.GetProp('_Name')
Bug introduced by
The Instance of Mol does not seem to have a member named GetProp.

This check looks for calls to members that are non-existent. These calls will fail.

The member could have been renamed or removed.

        except KeyError:
            return None

    @name.setter
    def name(self, value):
Coding Style introduced by
This method should have a docstring.

The coding style of this project requires that you add a docstring to this code element. Below, you find an example for methods:

class SomeClass:
    def some_method(self):
        """Do x and return foo."""

If you would like to know more about docstrings, we recommend to read PEP-257: Docstring Conventions.

        if value is None:
            self.ClearProp('_Name')
Bug introduced by
The Instance of Mol does not seem to have a member named ClearProp.
        else:
            self.SetProp('_Name', value)
Bug introduced by
The Instance of Mol does not seem to have a member named SetProp.

    @property
    def atoms(self):

        """ AtomView: An iterable over the atoms of the molecule. """

        if not hasattr(self, '_atoms'):
            self._atoms = AtomView(self)
Coding Style introduced by
The attribute _atoms was defined outside __init__.

It is generally a good practice to initialize all attributes to default values in the __init__ method:

class Foo:
    def __init__(self, x=None):
        self.x = x

        return self._atoms

    @property
    def bonds(self):

        """ List[skchem.Bond]: An iterable over the bonds of the molecule. """

        return [Bond.from_super(self.GetBondWithIdx(i))
Bug introduced by
The Instance of Mol does not seem to have a member named GetBondWithIdx.
                for i in range(self.GetNumBonds())]
Bug introduced by
The Instance of Mol does not seem to have a member named GetNumBonds.

    @property
    def mass(self):

        """ float: the exact (monoisotopic) mass of the molecule. """

        return CalcExactMolWt(self)

    @property
    def props(self):

        """ PropertyView: A dictionary of the properties of the molecule. """

        if not hasattr(self, '_props'):
            self._props = PropertyView(self)
Coding Style introduced by
The attribute _props was defined outside __init__.
        return self._props

    @property
    def conformers(self):

        """ List[Conformer]: conformers of the molecule. """

        return [Conformer.from_super(self.GetConformer(i))
Bug introduced by
The Instance of Mol does not seem to have a member named GetConformer.
            for i in range(len(self.GetConformers()))]
Bug introduced by
The Instance of Mol does not seem to have a member named GetConformers.

    def to_formula(self):

        """ str: the chemical formula of the molecule.

        Raises:
            ValueError: if the formula cannot be determined."""

        # formula may be undefined if atoms are uncertainly typed
        # e.g. if the molecule was initialized through SMARTS
        try:
            return CalcMolFormula(self)
        except RuntimeError:
            raise ValueError('Formula is undefined for {}'.format(self))
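
    def _two_d(self):

        """ Return a conformer with coordinates in two dimensions. """

        # `self.__two_d` is name-mangled to `_Mol__two_d`, so look it up under
        # that name; it may also be missing if `__init__` was never called.
        if getattr(self, '_Mol__two_d', None) is None:
            self.__two_d = Compute2DCoords(self)
        return self.conformers[self.__two_d]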
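Review note on the double-underscore attribute used in `_two_d`: inside a class body, an attribute written as `self.__two_d` is stored under the mangled name `_Mol__two_d`, whereas a string passed to `hasattr` is not mangled. The standalone sketch below illustrates the effect; the `Cache` class and its method names are hypothetical and are not part of scikit-chem.

```python
class Cache(object):
    """Hypothetical stand-in for a class that caches a value lazily."""

    def __init__(self):
        # Stored on the instance as `_Cache__value` because of name mangling.
        self.__value = None

    def has_plain_name(self):
        # The string '__value' is not mangled, so this lookup fails.
        return hasattr(self, '__value')

    def has_mangled_name(self):
        # Looking up the mangled name finds the attribute.
        return hasattr(self, '_Cache__value')


cache = Cache()
plain = cache.has_plain_name()
mangled = cache.has_mangled_name()
print(plain, mangled)
```

A single leading underscore (for example `_two_d_id`, also a hypothetical name) avoids mangling altogether and would make such checks behave as written.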

    def to_dict(self, kind="chemdoodle"):

        """ A dictionary representation of the molecule.

        Args:
            kind (str):
                The type of representation to use.  Only `chemdoodle` is
                currently supported.
                Defaults to 'chemdoodle'.

        Returns:
            dict:
                dictionary representation of the molecule."""

        if kind == "chemdoodle":
            return self._to_dict_chemdoodle()

        else:
            raise NotImplementedError

    def _to_dict_chemdoodle(self):

        """ Chemdoodle dict representation of the molecule.

        Documentation of the format may be found on the [chemdoodle website](https://web.chemdoodle.com/docs/chemdoodle-json-format/)"""
Coding Style introduced by
This line is too long as per the coding-style (136/100).

This check looks for lines that are too long. You can specify the maximum line length.

        atom_positions = [p.to_dict() for p in self._two_d().atom_positions]
        atom_elements = [a.element for a in self.atoms]

        for i, atom_position in enumerate(atom_positions):
            atom_position['l'] = atom_elements[i]

        bonds = [b.to_dict() for b in self.bonds]

        return {"m": [{"a": atom_positions, "b": bonds}]}
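Review note: for orientation, the sketch below builds a dictionary of the same overall shape (`m` holding one molecule with atoms `a` and bonds `b`) for a made-up two-atom fragment. The coordinates are invented, and the per-atom keys `x`, `y` and per-bond keys `b`, `e`, `o` are assumptions based on the ChemDoodle JSON format; this file does not show what `to_dict()` returns for positions or bonds.

```python
import json

# Made-up 2D positions and one bond for a two-atom fragment. The key names
# are assumptions based on the ChemDoodle JSON format, not taken from skchem.
atom_positions = [{'x': 0.0, 'y': 0.0}, {'x': 1.5, 'y': 0.0}]
atom_elements = ['C', 'O']
bonds = [{'b': 0, 'e': 1, 'o': 1}]

# Attach the element symbol to each position under the label key 'l'.
for atom_position, element in zip(atom_positions, atom_elements):
    atom_position['l'] = element

doc = {'m': [{'a': atom_positions, 'b': bonds}]}
encoded = json.dumps(doc, sort_keys=True)
print(encoded)
```

Pairing positions and elements with `zip` is equivalent to the indexed loop in the method above when both lists have the same length.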

    def to_json(self, kind='chemdoodle'):

        """ Serialize a molecule using JSON.

        Args:
            kind (str):
                The type of serialization to use.  Only `chemdoodle` is
                currently supported.

        Returns:
            str: the json string. """

        return json.dumps(self.to_dict(kind=kind))

    def to_inchi_key(self):

        """ The InChI key of the molecule.

        Returns:
            str: the InChI key.

        Raises:
            ImportError: if the InChI module is not available.
            RuntimeError: if the molecule could not be encoded."""

        if not rdkit.Chem.inchi.INCHI_AVAILABLE:
            raise ImportError("InChI module not available.")

        res = rdkit.Chem.InchiToInchiKey(self.to_inchi())
Bug introduced by
The Instance of Mol does not seem to have a member named to_inchi.

        if res is None:
            raise RuntimeError("The molecule could not be encoded as InChI key.")

        return res

    def to_binary(self):

        """ Serialize the molecule to binary encoding.

        Args:
            None

        Returns:
            bytes: the molecule in bytes.

        Notes:
            Due to limitations in RDKit, not all data is serialized.  Notably,
            properties are not, so e.g. compound names are not saved."""

        return self.ToBinary()
Bug introduced by
The Instance of Mol does not seem to have a member named ToBinary.

    @classmethod
    def from_binary(cls, binary):

        """ Decode a molecule from a binary serialization.

        Args:
            binary (bytes): The byte string to decode.

        Returns:
            skchem.Mol: The molecule encoded in the binary."""

        return cls(binary)

    def __repr__(self):
        try:
            formula = self.to_formula()
        except ValueError:
            # if we can't generate the formula, just say it is unknown
            formula = 'unknown'

        return '<{klass} name="{name}" formula="{formula}" at {address}>'.format(
            klass=self.__class__.__name__,
            name=self.name,
            formula=formula,
            address=hex(id(self)))

    def __contains__(self, item):
        if isinstance(item, Mol):
            return self.HasSubstructMatch(item)
Bug introduced by
The Instance of Mol does not seem to have a member named HasSubstructMatch.
        else:
            raise NotImplementedError('No way to check if {} contains {}'.format(self, item))

    def _repr_javascript(self):

        """ Rich printing in javascript. """

        return self.to_json()

    def __str__(self):
        return '<Mol: {}>'.format(self.to_smiles())
Bug introduced by
The Instance of Mol does not seem to have a member named to_smiles.

def bind_constructor(constructor_name, name_to_bind=None):

    """ Bind an (rdkit) constructor to the class """

    @classmethod
    def constructor(_, in_arg, name=None, *args, **kwargs):

        """ The constructor to be bound. """

        m = getattr(rdkit.Chem, 'MolFrom' + constructor_name)(in_arg, *args, **kwargs)
Coding Style Naming introduced by
The name m does not conform to the variable naming conventions ([a-z_][a-z0-9_]{2,30}$).

This check looks for invalid names for a range of different identifiers.

You can set regular expressions to which the identifiers must conform if the defaults do not match your requirements.

If your project includes a Pylint configuration file, the settings contained in that file take precedence.

To find out more about Pylint, please refer to their site.

        if m is None:
            raise ValueError('Failed to parse molecule, {}'.format(in_arg))
        m = Mol.from_super(m)
Coding Style Naming introduced by
The name m does not conform to the variable naming conventions ([a-z_][a-z0-9_]{2,30}$).
        m.name = name
        return m

    setattr(Mol, 'from_{}'.format(constructor_name).lower()
            if name_to_bind is None else name_to_bind, constructor)

def bind_serializer(serializer_name, name_to_bind=None):

    """ Bind an (rdkit) serializer to the class """

    def serializer(self, *args, **kwargs):

        """ The serializer to be bound. """
        return getattr(rdkit.Chem, 'MolTo' + serializer_name)(self, *args, **kwargs)

    setattr(Mol, 'to_{}'.format(serializer_name).lower()
            if name_to_bind is None else name_to_bind, serializer)

CONSTRUCTORS = ['Inchi', 'Smiles', 'Mol2Block', 'Mol2File', 'MolBlock',
                'MolFile', 'PDBBlock', 'PDBFile', 'Smarts', 'TPLBlock', 'TPLFile']
SERIALIZERS = ['Inchi', 'Smiles', 'MolBlock', 'PDBBlock', 'Smarts', 'TPLBlock', 'TPLFile']

list(map(bind_constructor, CONSTRUCTORS))
introduced by
Used builtin function 'map'
list(map(bind_serializer, SERIALIZERS))
introduced by
Used builtin function 'map'
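Review note: the binding pattern used by `bind_constructor` (building a `classmethod` in a closure and attaching it with `setattr`) can be tried without RDKit. The sketch below is a minimal illustration under stated assumptions: `Shape`, `PARSERS` and the parser functions are hypothetical stand-ins for `Mol` and the `rdkit.Chem.MolFrom*` functions, and a plain `for` loop replaces `list(map(...))`, which is one way to address the 'map' warnings above.

```python
class Shape(object):
    """Hypothetical class standing in for Mol."""

    def __init__(self, sides):
        self.sides = sides


# Hypothetical parsers standing in for rdkit.Chem.MolFrom* functions.
# Like the RDKit parsers, the first one returns None on failure.
PARSERS = {
    'Name': lambda text: {'triangle': 3, 'square': 4}.get(text),
    'Count': lambda text: int(text),
}


def bind_constructor(parser_name):
    """Attach a `from_<name>` classmethod to Shape for the given parser."""
    parser = PARSERS[parser_name]

    @classmethod
    def constructor(cls, in_arg):
        sides = parser(in_arg)
        if sides is None:
            raise ValueError('Failed to parse shape, {}'.format(in_arg))
        return cls(sides)

    setattr(Shape, 'from_{}'.format(parser_name).lower(), constructor)


# An explicit loop makes it clear the calls are made for their side effects.
for parser_name in PARSERS:
    bind_constructor(parser_name)

square = Shape.from_name('square')
pentagon = Shape.from_count('5')
print(square.sides, pentagon.sides)
```

Because the methods are attached at import time, static analysers such as Pylint cannot see them, which is consistent with the "does not seem to have a member named to_smiles" report above.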