Completed
Push — master ( 9a5d50...d0f258 )
by Rich
13:27 queued 06:27
created

bind_constructor()   A

Complexity: Conditions 4
Size: Total Lines 19
Duplication: Lines 0, Ratio 0 %
Code Coverage: Tests 10, CRAP Score 4
Importance: Changes 2, Bugs 0, Features 1
Metric  Value
c       2
b       0
f       1
dl      0
loc     19
ccs     10
cts     10
cp      1
rs      9.2
cc      4
crap    4
#! /usr/bin/env python
#
# Copyright (C) 2015-2016 Rich Lewis <[email protected]>
# License: 3-clause BSD

"""
## skchem.core.mol

Defining molecules in scikit-chem.
"""

import copy

import rdkit.Chem
Issue (Configuration): The import rdkit.Chem could not be resolved.

This can be caused by one of the following:

1. Missing Dependencies

This error could indicate a configuration issue of Pylint. Make sure that your libraries are available by adding the necessary commands.

# .scrutinizer.yml
before_commands:
    - sudo pip install abc # Python2
    - sudo pip3 install abc # Python3

Tip: We are currently not using virtualenv to run pylint; when installing your modules, make sure to use the command for the correct version.

2. Missing __init__.py files

This error could also result from missing __init__.py files in your module folders. Make sure that you place one file in each sub-folder.
import rdkit.Chem.inchi
Issue (Configuration): The import rdkit.Chem.inchi could not be resolved.
from rdkit.Chem import AddHs, RemoveHs
Issue (Configuration): The import rdkit.Chem could not be resolved.
from rdkit.Chem.rdMolDescriptors import CalcMolFormula, CalcExactMolWt
Issue (Configuration): The import rdkit.Chem.rdMolDescriptors could not be resolved.

import json

from .atom import AtomView
from .bond import BondView
from .conformer import ConformerView
from .base import ChemicalObject, PropertyView
from ..utils import Suppressor


class Mol(rdkit.Chem.rdchem.Mol, ChemicalObject):

    """Class representing a Molecule in scikit-chem.

    Mol objects inherit directly from rdkit Mol objects.  Therefore, they
    contain atom and bond information, and may also include properties and
    atom bookmarks.

    Example:
        Constructors are implemented as class methods with the `from_` prefix.

        >>> import skchem
        >>> m = skchem.Mol.from_smiles('CC(=O)Cl'); m # doctest: +ELLIPSIS
        <Mol name="None" formula="C2H3ClO" at ...>

        This is an rdkit Mol:

        >>> from rdkit.Chem import Mol as RDKMol
        >>> isinstance(m, RDKMol)
        True

        A name can be given at initialization:
        >>> m = skchem.Mol.from_smiles('CC(=O)Cl', name='acetyl chloride'); m # doctest: +ELLIPSIS
        <Mol name="acetyl chloride" formula="C2H3ClO" at ...>

        >>> m.name
        'acetyl chloride'

        Serializers are implemented as instance methods with the `to_` prefix.

        >>> m.to_smiles()
        'CC(=O)Cl'

        >>> m.to_inchi()
        'InChI=1S/C2H3ClO/c1-2(3)4/h1H3'

        >>> m.to_inchi_key()
        'WETWJCDKMRHUPV-UHFFFAOYSA-N'

        RDKit properties are accessible through the `props` property:

        >>> m.SetProp('example_key', 'example_value') # set prop with rdkit directly
        >>> m.props['example_key']
        'example_value'

        >>> m.SetIntProp('float_key', 42) # set int prop with rdkit directly
        >>> m.props['float_key']
        42

        They can be set too:

        >>> m.props['example_set'] = 'set_value'
        >>> m.GetProp('example_set') # getting with rdkit directly
        'set_value'

        We can export the properties into a dict or a pandas series:

        >>> m.props.to_series()
        example_key    example_value
        example_set        set_value
        float_key                 42
        dtype: object

        Atoms and bonds are provided in views:

        >>> m.atoms # doctest: +ELLIPSIS
        <AtomView values="['C', 'C', 'O', 'Cl']" at ...>

        >>> m.bonds # doctest: +ELLIPSIS
        <BondView values="['C-C', 'C=O', 'C-Cl']" at ...>

        These are iterable:
        >>> [a.symbol for a in m.atoms]
        ['C', 'C', 'O', 'Cl']

        The view provides shorthands for some attributes to get these as
        numpy arrays:

        >>> m.atoms.symbol
        array(['C', 'C', 'O', 'Cl'],
              dtype='<U3')

        Atom and bond props can also be set:

        >>> m.atoms[0].props['atom_key'] = 'atom_value'
        >>> m.atoms[0].props['atom_key']
        'atom_value'

        The properties for atoms on the whole molecule can be accessed like so:

        >>> m.atoms.props # doctest: +ELLIPSIS
        <MolPropertyView values="{'atom_key': ['atom_value', None, None, None]}" at ...>

        The properties can be exported as a pandas dataframe:
        >>> m.atoms.props.to_frame()
                    atom_key
        atom_idx
        0         atom_value
        1               None
        2               None
        3               None

    """

    def __init__(self, *args, **kwargs):

        """
        The default constructor.

        Note:
            This is rarely used, as it can only create an empty molecule.

        Args:
            *args: Arguments to be passed to the rdkit Mol constructor.
            **kwargs: Keyword arguments to be passed to the rdkit Mol
                constructor.
        """
        super(Mol, self).__init__(*args, **kwargs)
        self.__two_d = None # set in constructor

    @property
    def name(self):

        """ str: The name of the molecule, or None if no name is set. """

        try:
            return self.GetProp('_Name')
Issue (Bug): The Instance of Mol does not seem to have a member named GetProp.

This check looks for calls to members that are non-existent. These calls will fail.

The member could have been renamed or removed.

        except KeyError:
            return None

    @name.setter
    def name(self, value):
Issue (Coding Style): This method should have a docstring.

The coding style of this project requires that you add a docstring to this code element. Below, you find an example for methods:

class SomeClass:
    def some_method(self):
        """Do x and return foo."""

If you would like to know more about docstrings, we recommend reading PEP-257: Docstring Conventions.

        if value is None:
            self.ClearProp('_Name')
Issue (Bug): The Instance of Mol does not seem to have a member named ClearProp.
        else:
            self.SetProp('_Name', value)
Issue (Bug): The Instance of Mol does not seem to have a member named SetProp.

    @property
    def atoms(self):

        """ List[skchem.Atom]: An iterable over the atoms of the molecule. """

        if not hasattr(self, '_atoms'):
            self._atoms = AtomView(self)
Issue (Coding Style): The attribute _atoms was defined outside __init__.

It is generally a good practice to initialize all attributes to default values in the __init__ method:

class Foo:
    def __init__(self, x=None):
        self.x = x

        return self._atoms

    @property
    def bonds(self):

        """ List[skchem.Bond]: An iterable over the bonds of the molecule. """

        if not hasattr(self, '_bonds'):
            self._bonds = BondView(self)
Issue (Coding Style): The attribute _bonds was defined outside __init__.
        return self._bonds

    @property
    def mass(self):

        """ float: the exact mass of the molecule. """

        return CalcExactMolWt(self)

    @property
    def props(self):

        """ PropertyView: A dictionary of the properties of the molecule. """

        if not hasattr(self, '_props'):
            self._props = PropertyView(self)
Issue (Coding Style): The attribute _props was defined outside __init__.
        return self._props

    @property
    def conformers(self):

        """ List[Conformer]: conformers of the molecule. """

        if not hasattr(self, '_conformers'):
            self._conformers = ConformerView(self)
Issue (Coding Style): The attribute _conformers was defined outside __init__.
        return self._conformers

    def to_formula(self):

        """ str: the chemical formula of the molecule.

        Raises:
            ValueError: If the formula is undefined for the molecule."""

        # formula may be undefined if atoms are uncertainly typed
        # e.g. if the molecule was initialized through SMARTS
        try:
            with Suppressor():
                return CalcMolFormula(self)
        except RuntimeError:
            raise ValueError('Formula is undefined for {}'.format(self))

    def add_hs(self, inplace=False, add_coords=True, explicit_only=False,
               only_on_atoms=False):
        """ Add hydrogens to self.

        Args:
            inplace (bool):
                Whether to add Hs to this `Mol` in place, or return a new
                `Mol`.  In-place addition is not yet supported.
            add_coords (bool):
                Whether to set 3D coordinates for added Hs.
            explicit_only (bool):
                Whether to add only explicit Hs, or also implicit ones.
            only_on_atoms (iterable<bool>):
                An iterable specifying the atoms to add Hs.
        Returns:
            skchem.Mol:
                `Mol` with Hs added.
        """
        if inplace:
            msg = 'Inplace addition of Hs is not yet supported.'
            raise NotImplementedError(msg)
        raw = AddHs(self, addCoords=add_coords, onlyOnAtoms=only_on_atoms,
                    explicitOnly=explicit_only)
        return self.__class__.from_super(raw)

    def remove_hs(self, inplace=False, sanitize=True, update_explicit=False,
                  implicit_only=False):

        """ Remove hydrogens from self.

        Args:
            inplace (bool):
                Whether to remove Hs from this `Mol` in place, or return a
                new `Mol`.  In-place removal is not yet supported.
            sanitize (bool):
                Whether to sanitize after Hs are removed.
            update_explicit (bool):
                Whether to update explicit count after the removal.
            implicit_only (bool):
                Whether to remove only implicit Hs, or explicit Hs as well.
        Returns:
            skchem.Mol:
                `Mol` with Hs removed.
        """
        if inplace:
            msg = 'Inplace removal of Hs is not yet supported.'
            raise NotImplementedError(msg)
        raw = RemoveHs(self, implicitOnly=implicit_only,
                       updateExplicitCount=update_explicit, sanitize=sanitize)
        return self.__class__.from_super(raw)

    def to_dict(self, kind="chemdoodle", conformer_id=-1):

        """ A dictionary representation of the molecule.

        Args:
            kind (str):
                The type of representation to use.  Only `chemdoodle` is
                currently supported.
            conformer_id (int):
                The index of the conformer to take atom positions from.

        Returns:
            dict:
                dictionary representation of the molecule."""

        if kind == "chemdoodle":
            return self._to_dict_chemdoodle(conformer_id=conformer_id)
        else:
            raise NotImplementedError

    def _to_dict_chemdoodle(self, conformer_id=-1):

        """ Chemdoodle dict representation of the molecule.

        Documentation of the format may be found on the `chemdoodle website \
        <https://web.chemdoodle.com/docs/chemdoodle-json-format>`_"""

        try:
            pos = self.conformers[conformer_id].positions
        except IndexError as e:
Issue (Coding Style Naming): The name e does not conform to the variable naming conventions ([a-z_][a-z0-9_]{2,30}$).

This check looks for invalid names for a range of different identifiers.

You can set regular expressions to which the identifiers must conform if the defaults do not match your requirements.

If your project includes a Pylint configuration file, the settings contained in that file take precedence.

To find out more about Pylint, please refer to their site.

            if conformer_id == -1:
                # no conformers available, so we generate one with 2d coords,
                # save the positions, then delete the conf

                self.conformers.append_2d()
                pos = self.conformers[0].positions

                del self.conformers[0]
            else:
                raise e

        atoms = [{'x': p[0], 'y': p[1], 'z': p[2], 'l': s}
                 for s, p in zip(self.atoms.symbol, pos.round(4))]

        bonds = [b.to_dict() for b in self.bonds]

        return {"m": [{"a": atoms, "b": bonds}]}

    def to_json(self, kind='chemdoodle'):

        """ Serialize a molecule using JSON.

        Args:
            kind (str):
                The type of serialization to use.  Only `chemdoodle` is
                currently supported.

        Returns:
            str: the json string. """

        return json.dumps(self.to_dict(kind=kind))

    def to_inchi_key(self):

        """ The InChI key of the molecule.

        Returns:
            str: the InChI key.

        Raises:
            ImportError: If the InChI module is not available.
            RuntimeError: If an InChI key could not be generated."""

        if not rdkit.Chem.inchi.INCHI_AVAILABLE:
            raise ImportError("InChI module not available.")

        res = rdkit.Chem.InchiToInchiKey(self.to_inchi())
Issue (Bug): The Instance of Mol does not seem to have a member named to_inchi.

        if res is None:
            raise RuntimeError("An InChI key could not be generated.")

        return res

    def to_binary(self):

        """  Serialize the molecule to binary encoding.

        Returns:
            bytes: the molecule in bytes.

        Notes:
            Due to limitations in RDKit, not all data is serialized.  Notably,
            properties are not, so e.g. compound names are not saved."""

        return self.ToBinary()
Issue (Bug): The Instance of Mol does not seem to have a member named ToBinary.

    @classmethod
    def from_binary(cls, binary):

        """ Decode a molecule from a binary serialization.

        Args:
            binary: The bytes string to decode.

        Returns:
            skchem.Mol: The molecule encoded in the binary."""

        return cls(binary)

    def copy(self):

        """ Return a copy of the molecule. """

        return Mol.from_super(copy.deepcopy(self))

    def __repr__(self):
        try:
            formula = self.to_formula()
        except ValueError:
            # if we can't generate the formula, just say it is unknown
            formula = 'unknown'

        return '<{klass} name="{name}" formula="{form}" at {address}>'.format(
            klass=self.__class__.__name__,
            name=self.name,
            form=formula,
            address=hex(id(self)))

    def __contains__(self, item):
        if isinstance(item, Mol):
            return self.HasSubstructMatch(item)
Issue (Bug): The Instance of Mol does not seem to have a member named HasSubstructMatch.
        else:
            msg = 'No way to check if {} contains {}'.format(self, item)
            raise NotImplementedError(msg)

    def __eq__(self, item):
        if isinstance(item, self.__class__):
            return (self in item) and (item in self)
        else:
            return False

    def __str__(self):
        return '<Mol: {}>'.format(self.to_smiles())
Issue (Bug): The Instance of Mol does not seem to have a member named to_smiles.
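The four view properties above (`atoms`, `bonds`, `props`, `conformers`) all create their view lazily behind a `hasattr` check, which is what triggers the "defined outside __init__" issues. A minimal sketch of the alternative Pylint suggests, using hypothetical stand-in classes `View` and `Owner` rather than the real `AtomView` and `Mol`:

```python
class View(object):
    """ Hypothetical stand-in for AtomView, BondView and friends. """

    def __init__(self, owner):
        self.owner = owner


class Owner(object):
    """ Hypothetical class with one lazily created view. """

    def __init__(self):
        # declare the cache up front so it is never defined outside __init__
        self._atoms = None

    @property
    def atoms(self):
        """ View: created on first access, then reused. """
        if self._atoms is None:
            self._atoms = View(self)
        return self._atoms


owner = Owner()
first = owner.atoms
second = owner.atoms  # same object as `first`; the view is built only once
```

This only works if every instance actually runs `__init__`. The source does not show how `from_super` builds its instances, so if it bypasses `__init__`, the `hasattr` check used in `Mol` is the safer form.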


def bind_constructor(constructor_name, name_to_bind=None):

    """ Bind an (rdkit) constructor to the class """

    @classmethod
    def constructor(_, in_arg, name=None, *args, **kwargs):

        """ The constructor to be bound. """

        m = getattr(rdkit.Chem, 'MolFrom' + constructor_name)(in_arg, *args,
                                                              **kwargs)
Issue (Coding Style Naming): The name m does not conform to the variable naming conventions ([a-z_][a-z0-9_]{2,30}$).
        if m is None:
            raise ValueError('Failed to parse molecule, {}'.format(in_arg))
        m = Mol.from_super(m)
Issue (Coding Style Naming): The name m does not conform to the variable naming conventions ([a-z_][a-z0-9_]{2,30}$).
        m.name = name
        return m

    setattr(Mol, 'from_{}'.format(constructor_name).lower()
                 if name_to_bind is None else name_to_bind, constructor)
Issue (Coding Style): Wrong continued indentation.
if name_to_bind is None else name_to_bind, constructor)
| ^
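The binding pattern used by `bind_constructor` can be tried without rdkit. The sketch below is an illustration only: `backend`, `Thing`, and the `ThingFrom...` parsers are hypothetical stand-ins for `rdkit.Chem`, `Mol`, and the `MolFrom...` functions, and the wrapper calls `cls(...)` directly instead of `Mol.from_super`.

```python
import types

# Hypothetical stand-in for the rdkit.Chem namespace: parser functions
# named 'ThingFrom<Format>' that return None when parsing fails.
backend = types.SimpleNamespace(
    ThingFromCsv=lambda text: text.split(',') if text else None,
    ThingFromWords=lambda text: text.split() if text else None,
)


class Thing(object):
    """ Hypothetical wrapper class that receives the bound constructors. """

    def __init__(self, parts, name=None):
        self.parts = parts
        self.name = name


def bind_constructor(constructor_name, name_to_bind=None):
    """ Bind backend.ThingFrom<constructor_name> to Thing as a classmethod. """

    @classmethod
    def constructor(cls, in_arg, name=None, *args, **kwargs):
        parser = getattr(backend, 'ThingFrom' + constructor_name)
        raw = parser(in_arg, *args, **kwargs)
        if raw is None:
            raise ValueError('Failed to parse {!r}'.format(in_arg))
        return cls(raw, name=name)

    # resolve the attribute name first, then bind in a single short call
    if name_to_bind is None:
        name_to_bind = 'from_{}'.format(constructor_name).lower()
    setattr(Thing, name_to_bind, constructor)


for fmt in ['Csv', 'Words']:
    bind_constructor(fmt)

thing = Thing.from_csv('a,b,c', name='letters')
```

Resolving `name_to_bind` in its own `if` statement keeps the `setattr` call on one line, which would also avoid the continued-indentation issue reported above.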


def bind_serializer(serializer_name, name_to_bind=None):

    """ Bind an (rdkit) serializer to the class """

    def serializer(self, *args, **kwargs):

        """ The serializer to be bound. """
        with Suppressor():
            return getattr(rdkit.Chem, 'MolTo' + serializer_name)(self, *args,
                                                                  **kwargs)

    setattr(Mol, 'to_{}'.format(serializer_name).lower()
                 if name_to_bind is None else name_to_bind, serializer)
Issue (Coding Style): Wrong continued indentation.
if name_to_bind is None else name_to_bind, serializer)
| ^

CONSTRUCTORS = ['Inchi', 'Smiles', 'Mol2Block', 'Mol2File', 'MolBlock',
                'MolFile', 'PDBBlock', 'PDBFile', 'Smarts', 'TPLBlock',
                'TPLFile']
SERIALIZERS = ['Inchi', 'Smiles', 'MolBlock', 'MolFile', 'PDBBlock', 'Smarts',
               'TPLBlock', 'TPLFile']

list(map(bind_constructor, CONSTRUCTORS))
Issue: Used builtin function 'map'
list(map(bind_serializer, SERIALIZERS))
Issue: Used builtin function 'map'
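For reference, the dictionary that `_to_dict_chemdoodle` returns has the shape `{"m": [{"a": atoms, "b": bonds}]}`. The sketch below builds such a structure from plain Python data and serializes it the way `to_json` does. The coordinates are made up for illustration, and the bond keys (`b`, `e`, `o`) are an assumption, since the output of `BondView`'s `to_dict` is not shown in this file.

```python
import json

# Hypothetical inputs: element symbols, 3D positions and
# (begin, end, order) bonds for acetyl chloride.
# The coordinates are made up for illustration.
symbols = ['C', 'C', 'O', 'Cl']
positions = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0),
             (2.25, 1.3, 0.0), (2.25, -1.3, 0.0)]
bonds_in = [(0, 1, 1), (1, 2, 2), (1, 3, 1)]

# One dict per atom: coordinates rounded to 4 places plus the label 'l',
# mirroring the list comprehension in _to_dict_chemdoodle.
atoms = [{'x': round(x, 4), 'y': round(y, 4), 'z': round(z, 4), 'l': s}
         for s, (x, y, z) in zip(symbols, positions)]

# Assumption: each bond is stored as begin index 'b', end index 'e'
# and bond order 'o'.
bonds = [{'b': b, 'e': e, 'o': o} for b, e, o in bonds_in]

doc = {'m': [{'a': atoms, 'b': bonds}]}
text = json.dumps(doc)
```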