Completed
Push — master ( 86beb5...b25ebe )
by Rich
01:28
created

Mol.remove_hs()   A

Complexity

Conditions 2

Size

Total Lines 21

Duplication

Lines 0
Ratio 0 %

Importance

Changes 1
Bugs 0 Features 1
Metric Value
c 1
b 0
f 1
dl 0
loc 21
rs 9.3142
cc 2
#! /usr/bin/env python
#
# Copyright (C) 2015-2016 Rich Lewis <[email protected]>
# License: 3-clause BSD

"""
## skchem.core.mol

Defining molecules in scikit-chem.
"""

import rdkit.Chem
Configuration introduced by
The import rdkit.Chem could not be resolved.

This can be caused by one of the following:

1. Missing Dependencies

This error could indicate a configuration issue of Pylint. Make sure that your libraries are available by adding the necessary commands.

# .scrutinizer.yml
before_commands:
    - sudo pip install abc # Python2
    - sudo pip3 install abc # Python3
Tip: We are currently not using virtualenv to run pylint, when installing your modules make sure to use the command for the correct version.

2. Missing __init__.py files

This error could also result from missing __init__.py files in your module folders. Make sure that you place one file in each sub-folder.

import rdkit.Chem.inchi
Configuration introduced by
The import rdkit.Chem.inchi could not be resolved.

from rdkit.Chem import AddHs, RemoveHs
Configuration introduced by
The import rdkit.Chem could not be resolved.

from rdkit.Chem.rdDepictor import Compute2DCoords
Configuration introduced by
The import rdkit.Chem.rdDepictor could not be resolved.

from rdkit.Chem.rdMolDescriptors import CalcMolFormula, CalcExactMolWt
Configuration introduced by
The import rdkit.Chem.rdMolDescriptors could not be resolved.

import json

from . import Atom, Bond, Conformer
Unused Code introduced by
The import Atom seems to be unused.

from .base import ChemicalObject, AtomView, PropertyView
from ..utils import Suppressor
Unused Code introduced by
Unused Suppressor imported from utils

class Mol(rdkit.Chem.rdchem.Mol, ChemicalObject):

    """Class representing a Molecule in scikit-chem.

    Mol objects inherit directly from rdkit Mol objects.  Therefore, they
    contain atom and bond information, and may also include properties and
    atom bookmarks.

    Example:
        Constructors are implemented as class methods with the `from_` prefix.

        ```python
        m = Mol.from_smiles('c1ccccc1')
        ```

        Serializers are implemented as instance methods with the `to_` prefix.

        ```python
        m.to_smiles()
        ```

    """

    def __init__(self, *args, **kwargs):

        """
        The default constructor.

        Note:
            This will be rarely used, as it can only create an empty molecule.

        Args:
            *args: Arguments to be passed to the rdkit Mol constructor.
            **kwargs: Arguments to be passed to the rdkit Mol constructor.
        """
        super(Mol, self).__init__(*args, **kwargs)
        self.__two_d = None #set in constructor

    @property
    def name(self):

        """ str: The name of the molecule, or `None` if it is not set. """

        try:
            return self.GetProp('_Name')
Bug introduced by
The Instance of Mol does not seem to have a member named GetProp.

This check looks for calls to members that are non-existent. These calls will fail.

The member could have been renamed or removed.

        except KeyError:
            return None

    @name.setter
    def name(self, value):
Coding Style introduced by
This method should have a docstring.

The coding style of this project requires that you add a docstring to this code element. Below, you find an example for methods:

class SomeClass:
    def some_method(self):
        """Do x and return foo."""

If you would like to know more about docstrings, we recommend to read PEP-257: Docstring Conventions.

        if value is None:
            self.ClearProp('_Name')
Bug introduced by
The Instance of Mol does not seem to have a member named ClearProp.

        else:
            self.SetProp('_Name', value)
Bug introduced by
The Instance of Mol does not seem to have a member named SetProp.

    @property
    def atoms(self):

        """ List[skchem.Atom]: An iterable over the atoms of the molecule. """

        if not hasattr(self, '_atoms'):
            self._atoms = AtomView(self)
Coding Style introduced by
The attribute _atoms was defined outside __init__.

It is generally a good practice to initialize all attributes to default values in the __init__ method:

class Foo:
    def __init__(self, x=None):
        self.x = x

        return self._atoms

    @property
    def bonds(self):

        """ List[skchem.Bond]: An iterable over the bonds of the molecule. """

        return [Bond.from_super(self.GetBondWithIdx(i)) \
Bug introduced by
The Instance of Mol does not seem to have a member named GetBondWithIdx.

                for i in range(self.GetNumBonds())]
Bug introduced by
The Instance of Mol does not seem to have a member named GetNumBonds.

    @property
    def mass(self):

        """ float: the mass of the molecule. """

        return CalcExactMolWt(self)

    @property
    def props(self):

        """ PropertyView: A dictionary of the properties of the molecule. """

        if not hasattr(self, '_props'):
            self._props = PropertyView(self)
Coding Style introduced by
The attribute _props was defined outside __init__.

        return self._props

    @property
    def conformers(self):

        """ List[Conformer]: conformers of the molecule. """

        return [Conformer.from_super(self.GetConformer(i)) \
Bug introduced by
The Instance of Mol does not seem to have a member named GetConformer.

            for i in range(len(self.GetConformers()))]
Bug introduced by
The Instance of Mol does not seem to have a member named GetConformers.

    def to_formula(self):

        """ str: the chemical formula of the molecule.

        Raises:
            ValueError"""

        # formula may be undefined if atoms are uncertainly typed
        # e.g. if the molecule was initialized through SMARTS
        try:
            return CalcMolFormula(self)
        except RuntimeError:
            raise ValueError('Formula is undefined for {}'.format(self))

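    def _two_d(self):

        """ Return a conformer with coordinates in two dimensions. """

        # `hasattr(self, '__two_d')` is always False: the string is not
        # name-mangled, whereas the attribute is stored as `_Mol__two_d`.
        if getattr(self, '_Mol__two_d', None) is None:
            self.__two_d = Compute2DCoords(self)
        return self.conformers[self.__two_d]
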
    def add_hs(self, inplace=False, add_coords=True, explicit_only=False, only_on_atoms=False):
        """ Add hydrogens to the molecule.

        Args:
            inplace (bool):
                Whether to add Hs to `Mol`, or return a new `Mol`. Default is `False`, return a new `Mol`.
Coding Style introduced by
This line is too long as per the coding-style (106/100).

This check looks for lines that are too long. You can specify the maximum line length.

            add_coords (bool):
                Whether to set 3D coordinates for added Hs. Default is `True`.
            explicit_only (bool):
                Whether to add only explicit Hs, or also implicit ones. Default is `False`.
            only_on_atoms (iterable<bool>):
                An iterable specifying the atoms to add Hs to.
        Returns:
            skchem.Mol:
                `Mol` with Hs added.
        """
        if inplace:
            raise NotImplementedError('Inplace addition of Hs is not yet supported.')
        return self.__class__.from_super(AddHs(self, addCoords=add_coords,
                                           onlyOnAtoms=only_on_atoms, explicitOnly=explicit_only))
Coding Style introduced by
Wrong continued indentation.
onlyOnAtoms=only_on_atoms, explicitOnly=explicit_only))
^ |

    def remove_hs(self, inplace=False, sanitize=True, update_explicit=False, implicit_only=False):

        """ Remove hydrogens from the molecule.

        Args:
            inplace (bool):
                Whether to remove Hs from `Mol`, or return a new `Mol`. Default is `False`, return a new `Mol`.
Coding Style introduced by
This line is too long as per the coding-style (106/100).

            sanitize (bool):
                Whether to sanitize after Hs are removed. Default is `True`.
            update_explicit (bool):
                Whether to update explicit count after the removal. Default is `False`.
            implicit_only (bool):
                Whether to remove only implicit Hs, or explicit ones too. Default is `False`.
        Returns:
            skchem.Mol:
                `Mol` with Hs removed.
        """
        if inplace:
            raise NotImplementedError('Inplace removal of Hs is not yet supported.')
        return self.__class__.from_super(RemoveHs(self, implicitOnly=implicit_only,
                                              updateExplicitCount=update_explicit, sanitize=sanitize))
Coding Style introduced by
Wrong continued indentation.
updateExplicitCount=update_explicit, sanitize=sanitize))
^ |
Coding Style introduced by
This line is too long as per the coding-style (102/100).

    def to_dict(self, kind="chemdoodle"):

        """ A dictionary representation of the molecule.

        Args:
            kind (str):
                The type of representation to use.  Only `chemdoodle` is
                currently supported.
                Defaults to `chemdoodle`.

        Returns:
            dict:
                dictionary representation of the molecule."""

        if kind == "chemdoodle":
            return self._to_dict_chemdoodle()

        else:
            raise NotImplementedError

    def _to_dict_chemdoodle(self):

        """ Chemdoodle dict representation of the molecule.

        Documentation of the format may be found on the [chemdoodle website](https://web.chemdoodle.com/docs/chemdoodle-json-format/)"""
Coding Style introduced by
This line is too long as per the coding-style (136/100).

        atom_positions = [p.to_dict() for p in self._two_d().atom_positions]
        atom_elements = [a.element for a in self.atoms]

        for i, atom_position in enumerate(atom_positions):
            atom_position['l'] = atom_elements[i]

        bonds = [b.to_dict() for b in self.bonds]

        return {"m": [{"a": atom_positions, "b": bonds}]}

    def to_json(self, kind='chemdoodle'):

        """ Serialize a molecule using JSON.

        Args:
            kind (str):
                The type of serialization to use.  Only `chemdoodle` is
                currently supported.

        Returns:
            str: the json string. """

        return json.dumps(self.to_dict(kind=kind))

    def to_inchi_key(self):

        """ The InChI key of the molecule.

        Returns:
            str: the InChI key.

        Raises:
            ImportError: if the InChI module is not available.
            RuntimeError: if the molecule could not be encoded as an InChI key."""

        if not rdkit.Chem.inchi.INCHI_AVAILABLE:
            raise ImportError("InChI module not available.")

        res = rdkit.Chem.InchiToInchiKey(self.to_inchi())
Bug introduced by
The Instance of Mol does not seem to have a member named to_inchi.

        if res is None:
            raise RuntimeError("The molecule could not be encoded as InChI key.")

        return res

    def to_binary(self):

        """  Serialize the molecule to binary encoding.

        Args:
            None

        Returns:
            bytes: the molecule in bytes.

        Notes:
            Due to limitations in RDKit, not all data is serialized.  Notably,
            properties are not, so e.g. compound names are not saved."""

        return self.ToBinary()
Bug introduced by
The Instance of Mol does not seem to have a member named ToBinary.

    @classmethod
    def from_binary(cls, binary):

        """ Decode a molecule from a binary serialization.

        Args:
            binary: The bytes string to decode.

        Returns:
            skchem.Mol: The molecule encoded in the binary."""

        return cls(binary)

    def __repr__(self):
        try:
            formula = self.to_formula()
        except ValueError:
            # if we can't generate the formula, just say it is unknown
            formula = 'unknown'

        return '<{klass} name="{name}" formula="{formula}" at {address}>'.format(
            klass=self.__class__.__name__,
            name=self.name,
            formula=formula,
            address=hex(id(self)))

    def __contains__(self, item):
        if isinstance(item, Mol):
            return self.HasSubstructMatch(item)
Bug introduced by
The Instance of Mol does not seem to have a member named HasSubstructMatch.

        else:
            raise NotImplementedError('No way to check if {} contains {}'.format(self, item))

    def __eq__(self, item):
        if isinstance(item, self.__class__):
            return (self in item) and (item in self)
        else:
            return False

    def _repr_javascript(self):

        """ Rich printing in javascript. """

        return self.to_json()

    def __str__(self):
        return '<Mol: {}>'.format(self.to_smiles())
Bug introduced by
The Instance of Mol does not seem to have a member named to_smiles.


def bind_constructor(constructor_name, name_to_bind=None):

    """ Bind an (rdkit) constructor to the class """

    @classmethod
    def constructor(_, in_arg, name=None, *args, **kwargs):

        """ The constructor to be bound. """

        m = getattr(rdkit.Chem, 'MolFrom' + constructor_name)(in_arg, *args, **kwargs)
Coding Style Naming introduced by
The name m does not conform to the variable naming conventions ([a-z_][a-z0-9_]{2,30}$).

This check looks for invalid names for a range of different identifiers.

You can set regular expressions to which the identifiers must conform if the defaults do not match your requirements.

If your project includes a Pylint configuration file, the settings contained in that file take precedence.

To find out more about Pylint, please refer to their site.

        if m is None:
            raise ValueError('Failed to parse molecule, {}'.format(in_arg))
        m = Mol.from_super(m)
Coding Style Naming introduced by
The name m does not conform to the variable naming conventions ([a-z_][a-z0-9_]{2,30}$).

        m.name = name
        return m

    setattr(Mol, 'from_{}'.format(constructor_name).lower() \
        if name_to_bind is None else name_to_bind, constructor)

def bind_serializer(serializer_name, name_to_bind=None):

    """ Bind an (rdkit) serializer to the class """

    def serializer(self, *args, **kwargs):

        """ The serializer to be bound. """
        return getattr(rdkit.Chem, 'MolTo' + serializer_name)(self, *args, **kwargs)

    setattr(Mol, 'to_{}'.format(serializer_name).lower() \
        if name_to_bind is None else name_to_bind, serializer)

CONSTRUCTORS = ['Inchi', 'Smiles', 'Mol2Block', 'Mol2File', 'MolBlock', \
                    'MolFile', 'PDBBlock', 'PDBFile', 'Smarts', 'TPLBlock', 'TPLFile']
SERIALIZERS = ['Inchi', 'Smiles', 'MolBlock', 'MolFile', 'PDBBlock', 'Smarts', 'TPLBlock', 'TPLFile']
Coding Style introduced by
This line is too long as per the coding-style (101/100).

list(map(bind_constructor, CONSTRUCTORS))
introduced by
Used builtin function 'map'

list(map(bind_serializer, SERIALIZERS))
introduced by
Used builtin function 'map'
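
The two `map` notes concern the last lines of the module, where `bind_constructor` and `bind_serializer` are called only for their side effect of attaching methods to `Mol`. A minimal, rdkit-free sketch of the same binding pattern, written with a plain `for` loop, might look like the following. `Record`, `PARSERS`, and `bind_parser` are hypothetical stand-ins chosen for illustration; they are not part of scikit-chem or RDKit.

```python
class Record(object):
    """Hypothetical stand-in for Mol: wraps a parsed payload."""

    def __init__(self, payload):
        self.payload = payload


# Hypothetical stand-ins for the rdkit.Chem.MolFrom* functions.
PARSERS = {'Int': int, 'Float': float}


def bind_parser(cls, suffix, parser):
    """Attach a `from_<suffix>` class method to `cls`."""

    @classmethod
    def constructor(klass, in_arg):
        # `parser` is captured per call of bind_parser, so each bound
        # constructor keeps its own parsing function.
        return klass(parser(in_arg))

    setattr(cls, 'from_{}'.format(suffix).lower(), constructor)


# A for loop makes the side effect explicit; no throwaway list is built.
for suffix, parser in PARSERS.items():
    bind_parser(Record, suffix, parser)

print(Record.from_int('42').payload)
```

Unlike the listing, this sketch constructs the result through `klass`, so a subclass of `Record` would get instances of itself; whether that is desirable for `Mol` is a design choice the report does not address.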
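
One subtlety in `Mol._two_d` is not flagged in the report. Inside a class body, `self.__two_d` is name-mangled to `self._Mol__two_d`, but the string passed to `hasattr` is not mangled, so `hasattr(self, '__two_d')` is always `False` and the 2D coordinates are recomputed on every call. The sketch below reproduces the effect without rdkit. `Recomputes` and `Caches` are hypothetical classes, and `_compute` is an assumed stand-in for `Compute2DCoords`.

```python
class Recomputes(object):
    """Mirrors the caching check used in the listing."""

    def __init__(self):
        self.__cache = None  # actually stored as `_Recomputes__cache`
        self.calls = 0

    def _compute(self):
        # Stand-in for an expensive call such as Compute2DCoords.
        self.calls += 1
        return 0

    def value(self):
        # The string '__cache' is not mangled, so this test never finds
        # the attribute and the branch runs on every call.
        if not hasattr(self, '__cache'):
            self.__cache = self._compute()
        return self.__cache


class Caches(object):
    """One possible fix: a single-underscore attribute and a None check."""

    def __init__(self):
        self._cache = None
        self.calls = 0

    def _compute(self):
        self.calls += 1
        return 0

    def value(self):
        if self._cache is None:
            self._cache = self._compute()
        return self._cache


slow, fast = Recomputes(), Caches()
for _ in range(3):
    slow.value()
    fast.value()

print(slow.calls, fast.calls)
```

The `is None` check also covers the fact that `__init__` pre-sets the attribute to `None`, which a corrected `hasattr` test alone would not handle.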