Completed: push to master (87dea9...e88bec) by Rich, created 15:57

Mol.add_hs()   A

Complexity: Conditions 2
Size: Total Lines 23
Duplication: Lines 0, Ratio 0 %
Code Coverage: Tests 2, CRAP Score 3.4578
Importance: Changes 1, Bugs 0, Features 1

Metric  Value
c       1
b       0
f       1
dl      0
loc     23
ccs     2
cts     7
cp      0.2857
rs      9.0856
cc      2
crap    3.4578
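
The CRAP score in the table can be reproduced from the complexity and coverage figures. The sketch below assumes the commonly cited Change Risk Anti-Patterns formula, CRAP = cc^2 * (1 - coverage)^3 + cc, and assumes that `ccs` and `cts` are the covered and total statement counts; neither abbreviation is defined on this page. `crap_score` is a hypothetical helper, not part of any tool.

```python
def crap_score(complexity, coverage):
    """CRAP = cc^2 * (1 - coverage)^3 + cc, with coverage as a fraction in [0, 1]."""
    return complexity ** 2 * (1.0 - coverage) ** 3 + complexity


# Values taken from the metric table: cc = 2, ccs = 2, cts = 7, cp = 0.2857.
coverage_exact = 2.0 / 7.0    # assumed meaning: covered / total statements
coverage_reported = 0.2857    # cp as printed in the table

print(round(coverage_exact, 4))                    # 0.2857
print(round(crap_score(2, coverage_reported), 4))  # 3.4578
```

Under these assumptions the printed `cp` and `crap` values agree with the table, which suggests the score was computed from the rounded coverage.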
#! /usr/bin/env python
#
# Copyright (C) 2015-2016 Rich Lewis <[email protected]>
# License: 3-clause BSD

"""
## skchem.core.mol

Defining molecules in scikit-chem.
"""

import copy

# Issue (Configuration): The imports of rdkit.Chem, rdkit.Chem.inchi and
# rdkit.Chem.rdMolDescriptors could not be resolved.
#
# This can be caused by one of the following:
#
# 1. Missing Dependencies
#
#    This error could indicate a configuration issue of Pylint. Make sure
#    that your libraries are available by adding the necessary commands.
#
#        # .scrutinizer.yml
#        before_commands:
#            - sudo pip install abc # Python2
#            - sudo pip3 install abc # Python3
#
#    Tip: virtualenv is currently not used to run pylint, so when installing
#    your modules make sure to use the command for the correct version.
#
# 2. Missing __init__.py files
#
#    This error could also result from missing __init__.py files in your
#    module folders. Make sure that you place one file in each sub-folder.
import rdkit.Chem
import rdkit.Chem.inchi
from rdkit.Chem import AddHs, RemoveHs
from rdkit.Chem.rdMolDescriptors import CalcMolFormula, CalcExactMolWt

import json

from .atom import AtomView
from .bond import BondView
from .conformer import ConformerView
from .base import ChemicalObject, PropertyView
from ..utils import Suppressor


class Mol(rdkit.Chem.rdchem.Mol, ChemicalObject):

    """Class representing a Molecule in scikit-chem.

    Mol objects inherit directly from rdkit Mol objects.  Therefore, they
    contain atom and bond information, and may also include properties and
    atom bookmarks.

    Example:
        Constructors are implemented as class methods with the `from_` prefix.

        >>> import skchem
        >>> m = skchem.Mol.from_smiles('CC(=O)Cl'); m # doctest: +ELLIPSIS
        <Mol name="None" formula="C2H3ClO" at ...>

        This is an rdkit Mol:

        >>> from rdkit.Chem import Mol as RDKMol
        >>> isinstance(m, RDKMol)
        True

        A name can be given at initialization:
        >>> m = skchem.Mol.from_smiles('CC(=O)Cl', name='acetyl chloride'); m # doctest: +ELLIPSIS
        <Mol name="acetyl chloride" formula="C2H3ClO" at ...>

        >>> m.name
        'acetyl chloride'

        Serializers are implemented as instance methods with the `to_` prefix.

        >>> m.to_smiles()
        'CC(=O)Cl'

        >>> m.to_inchi()
        'InChI=1S/C2H3ClO/c1-2(3)4/h1H3'

        >>> m.to_inchi_key()
        'WETWJCDKMRHUPV-UHFFFAOYSA-N'

        RDKit properties are accessible through the `props` property:

        >>> m.SetProp('example_key', 'example_value') # set prop with rdkit directly
        >>> m.props['example_key']
        'example_value'

        >>> m.SetIntProp('float_key', 42) # set int prop with rdkit directly
        >>> m.props['float_key']
        42

        They can be set too:

        >>> m.props['example_set'] = 'set_value'
        >>> m.GetProp('example_set') # getting with rdkit directly
        'set_value'

        We can export the properties into a dict or a pandas series:

        >>> m.props.to_series()
        example_key    example_value
        example_set        set_value
        float_key                 42
        dtype: object

        Atoms and bonds are provided in views:

        >>> m.atoms # doctest: +ELLIPSIS
        <AtomView values="['C', 'C', 'O', 'Cl']" at ...>

        >>> m.bonds # doctest: +ELLIPSIS
        <BondView values="['C-C', 'C=O', 'C-Cl']" at ...>

        These are iterable:
        >>> [a.symbol for a in m.atoms]
        ['C', 'C', 'O', 'Cl']

        The view provides shorthands for some attributes to get these as pandas
        objects:

        >>> m.atoms.symbol
        array(['C', 'C', 'O', 'Cl'],
              dtype='<U3')

        Atom and bond props can also be set:

        >>> m.atoms[0].props['atom_key'] = 'atom_value'
        >>> m.atoms[0].props['atom_key']
        'atom_value'

        The properties for atoms on the whole molecule can be accessed like so:

        >>> m.atoms.props # doctest: +ELLIPSIS
        <MolPropertyView values="{'atom_key': ['atom_value', None, None, None]}" at ...>

        The properties can be exported as a pandas dataframe:
        >>> m.atoms.props.to_frame()
                    atom_key
        atom_idx
        0         atom_value
        1               None
        2               None
        3               None

    """

    def __init__(self, *args, **kwargs):

        """
        The default constructor.

        Note:
            This will be rarely used, as it can only create an empty molecule.

        Args:
            *args: Arguments to be passed to the rdkit Mol constructor.
            **kwargs: Arguments to be passed to the rdkit Mol constructor.
        """
        super(Mol, self).__init__(*args, **kwargs)
        self.__two_d = None # set in constructor

    @property
    def name(self):

        """ str: The name of the molecule, or None if no name is set. """

        try:
            # Issue (Bug): The Instance of Mol does not seem to have a member
            # named GetProp. This check looks for calls to members that are
            # non-existent. These calls will fail. The member could have been
            # renamed or removed.
            return self.GetProp('_Name')
        except KeyError:
            return None

    # Issue (Coding Style): This method should have a docstring. The coding
    # style of this project requires that you add a docstring to this code
    # element, e.g. """Do x and return foo.""" (see PEP-257: Docstring
    # Conventions).
    @name.setter
    def name(self, value):

        if value is None:
            # Issue (Bug): The Instance of Mol does not seem to have a member
            # named ClearProp.
            self.ClearProp('_Name')
        else:
            # Issue (Bug): The Instance of Mol does not seem to have a member
            # named SetProp.
            self.SetProp('_Name', value)

    @property
    def atoms(self):

        """ List[skchem.Atom]: An iterable over the atoms of the molecule. """

        if not hasattr(self, '_atoms'):
            # Issue (Coding Style): The attribute _atoms was defined outside
            # __init__. It is generally a good practice to initialize all
            # attributes to default values in the __init__ method.
            self._atoms = AtomView(self)
        return self._atoms

    @property
    def bonds(self):

        """ List[skchem.Bond]: An iterable over the bonds of the molecule. """

        if not hasattr(self, '_bonds'):
            # Issue (Coding Style): The attribute _bonds was defined outside
            # __init__.
            self._bonds = BondView(self)
        return self._bonds

    @property
    def mass(self):

        """ float: the exact mass of the molecule. """

        return CalcExactMolWt(self)

    @property
    def props(self):

        """ PropertyView: A dictionary of the properties of the molecule. """

        if not hasattr(self, '_props'):
            # Issue (Coding Style): The attribute _props was defined outside
            # __init__.
            self._props = PropertyView(self)
        return self._props

    @property
    def conformers(self):

        """ List[Conformer]: conformers of the molecule. """

        if not hasattr(self, '_conformers'):
            # Issue (Coding Style): The attribute _conformers was defined
            # outside __init__.
            self._conformers = ConformerView(self)
        return self._conformers

    def to_formula(self):

        """ str: the chemical formula of the molecule.

        Raises:
            ValueError: if the formula is undefined for the molecule."""

        # formula may be undefined if atoms are uncertainly typed
        # e.g. if the molecule was initialized through SMARTS
        try:
            with Suppressor():
                return CalcMolFormula(self)
        except RuntimeError:
            raise ValueError('Formula is undefined for {}'.format(self))

    def add_hs(self, inplace=False, add_coords=True, explicit_only=False,
               only_on_atoms=False):
        """ Add hydrogens to self.

        Args:
            inplace (bool):
                Whether to add Hs to `Mol`, or return a new `Mol`.
            add_coords (bool):
                Whether to set 3D coordinates for added Hs.
            explicit_only (bool):
                Whether to add only explicit Hs, or also implicit ones.
            only_on_atoms (iterable<int>):
                An iterable of atom indices specifying the atoms to add Hs to.
        Returns:
            skchem.Mol:
                `Mol` with Hs added.
        """
        if inplace:
            msg = 'Inplace addition of Hs is not yet supported.'
            raise NotImplementedError(msg)
        raw = AddHs(self, addCoords=add_coords, onlyOnAtoms=only_on_atoms,
                    explicitOnly=explicit_only)
        return self.__class__.from_super(raw)

    def remove_hs(self, inplace=False, sanitize=True, update_explicit=False,
                  implicit_only=False):

        """ Remove hydrogens from self.

        Args:
            inplace (bool):
                Whether to remove Hs from `Mol`, or return a new `Mol`.
            sanitize (bool):
                Whether to sanitize after Hs are removed.
            update_explicit (bool):
                Whether to update explicit count after the removal.
            implicit_only (bool):
                Whether to remove only implicit Hs, or explicit Hs as well.
        Returns:
            skchem.Mol:
                `Mol` with Hs removed.
        """
        if inplace:
            msg = 'Inplace removal of Hs is not yet supported.'
            raise NotImplementedError(msg)
        raw = RemoveHs(self, implicitOnly=implicit_only,
                       updateExplicitCount=update_explicit, sanitize=sanitize)
        return self.__class__.from_super(raw)

    def to_dict(self, kind="chemdoodle", conformer_id=-1):

        """ A dictionary representation of the molecule.

        Args:
            kind (str):
                The type of representation to use.  Only `chemdoodle` is
                currently supported.
            conformer_id (int):
                The id of the conformer whose positions are used.

        Returns:
            dict:
                dictionary representation of the molecule."""

        if kind == "chemdoodle":
            return self._to_dict_chemdoodle(conformer_id=conformer_id)
        else:
            raise NotImplementedError

    def _to_dict_chemdoodle(self, conformer_id=-1):

        """ Chemdoodle dict representation of the molecule.

        Documentation of the format may be found on the `chemdoodle website \
        <https://web.chemdoodle.com/docs/chemdoodle-json-format>`_"""

        try:
            pos = self.conformers[conformer_id].positions
        # Issue (Coding Style Naming): The name e does not conform to the
        # variable naming conventions ([a-z_][a-z0-9_]{2,30}$). This check
        # looks for invalid names for a range of different identifiers. If
        # your project includes a Pylint configuration file, the settings
        # contained in that file take precedence.
        except IndexError as e:
            if conformer_id == -1:
                # no conformers available, so we generate one with 2d coords,
                # save the positions, then delete the conf

                self.conformers.append_2d()
                pos = self.conformers[0].positions

                del self.conformers[0]
            else:
                raise e

        atoms = [{'x': p[0], 'y': p[1], 'z': p[2], 'l': s}
                 for s, p in zip(self.atoms.symbol, pos.round(4))]

        bonds = [b.to_dict() for b in self.bonds]

        return {"m": [{"a": atoms, "b": bonds}]}

    def to_json(self, kind='chemdoodle'):

        """ Serialize a molecule using JSON.

        Args:
            kind (str):
                The type of serialization to use.  Only `chemdoodle` is
                currently supported.

        Returns:
            str: the json string. """

        return json.dumps(self.to_dict(kind=kind))

    def to_inchi_key(self):

        """ The InChI key of the molecule.

        Returns:
            str: the InChI key.

        Raises:
            ImportError: if the InChI module is not available.
            RuntimeError: if an InChI key could not be generated."""

        if not rdkit.Chem.inchi.INCHI_AVAILABLE:
            raise ImportError("InChI module not available.")

        # Issue (Bug): The Instance of Mol does not seem to have a member
        # named to_inchi.
        res = rdkit.Chem.InchiToInchiKey(self.to_inchi())

        if res is None:
            raise RuntimeError("An InChI key could not be generated.")

        return res

    def to_binary(self):

        """  Serialize the molecule to binary encoding.

        Returns:
            bytes: the molecule in bytes.

        Notes:
            Due to limitations in RDKit, not all data is serialized.  Notably,
            properties are not, so e.g. compound names are not saved."""

        # Issue (Bug): The Instance of Mol does not seem to have a member
        # named ToBinary.
        return self.ToBinary()

    @classmethod
    def from_binary(cls, binary):

        """ Decode a molecule from a binary serialization.

        Args:
            binary: The bytes string to decode.

        Returns:
            skchem.Mol: The molecule encoded in the binary."""

        return cls(binary)

    def copy(self):

        """ Return a copy of the molecule. """

        return Mol.from_super(copy.deepcopy(self))

    def __repr__(self):
        try:
            formula = self.to_formula()
        except ValueError:
            # if we can't generate the formula, just say it is unknown
            formula = 'unknown'

        return '<{klass} name="{name}" formula="{form}" at {address}>'.format(
            klass=self.__class__.__name__,
            name=self.name,
            form=formula,
            address=hex(id(self)))

    def __contains__(self, item):
        if isinstance(item, Mol):
            # Issue (Bug): The Instance of Mol does not seem to have a member
            # named HasSubstructMatch.
            return self.HasSubstructMatch(item)
        else:
            msg = 'No way to check if {} contains {}'.format(self, item)
            raise NotImplementedError(msg)

    def __eq__(self, item):
        if isinstance(item, self.__class__):
            return (self in item) and (item in self)
        else:
            return False

    def __str__(self):
        # Issue (Bug): The Instance of Mol does not seem to have a member
        # named to_smiles.
        return '<Mol: {}>'.format(self.to_smiles())
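
The `atoms`, `bonds`, `props` and `conformers` properties of `Mol` share one pattern: the view is built on first access and then cached on the instance. A minimal, self-contained sketch of that pattern follows; `View` and `Owner` are hypothetical stand-ins for `AtomView` and `Mol`, not the scikit-chem classes.

```python
class View(object):
    """Hypothetical stand-in for AtomView: it only remembers its owner."""

    def __init__(self, owner):
        self.owner = owner


class Owner(object):
    """Hypothetical stand-in for Mol, using the same hasattr-based cache."""

    @property
    def atoms(self):
        # Build the view on first access, then reuse the cached instance.
        if not hasattr(self, '_atoms'):
            self._atoms = View(self)
        return self._atoms


obj = Owner()
first = obj.atoms
second = obj.atoms
print(first is second)     # True: the view is created only once
print(first.owner is obj)  # True: the view points back at its owner

# The lazy check also works for an instance whose __init__ never ran.
bare = Owner.__new__(Owner)
print(isinstance(bare.atoms, View))  # True
```

The review notes above recommend initializing such attributes in `__init__`. Whether that would work here depends on how `from_super` creates instances, which this page does not show, so the last lines of the sketch only illustrate one possible reason for the lazy form.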


def bind_constructor(constructor_name, name_to_bind=None):

    """ Bind an (rdkit) constructor to the class """

    @classmethod
    def constructor(_, in_arg, name=None, *args, **kwargs):

        """ The constructor to be bound. """

        # Issue (Coding Style Naming): The name m does not conform to the
        # variable naming conventions ([a-z_][a-z0-9_]{2,30}$). This applies
        # to both assignments to m below.
        m = getattr(rdkit.Chem, 'MolFrom' + constructor_name)(in_arg, *args,
                                                              **kwargs)
        if m is None:
            raise ValueError('Failed to parse molecule, {}'.format(in_arg))
        m = Mol.from_super(m)
        m.name = name
        return m

    # Issue (Coding Style): Wrong continued indentation.
    setattr(Mol, 'from_{}'.format(constructor_name).lower()
                 if name_to_bind is None else name_to_bind, constructor)


def bind_serializer(serializer_name, name_to_bind=None):

    """ Bind an (rdkit) serializer to the class """

    def serializer(self, *args, **kwargs):

        """ The serializer to be bound. """
        with Suppressor():
            return getattr(rdkit.Chem, 'MolTo' + serializer_name)(self, *args,
                                                                  **kwargs)

    # Issue (Coding Style): Wrong continued indentation.
    setattr(Mol, 'to_{}'.format(serializer_name).lower()
                 if name_to_bind is None else name_to_bind, serializer)
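
The two binder functions above attach methods to `Mol` at import time by looking up `MolFrom*` and `MolTo*` functions by name. The sketch below reproduces that technique without rdkit, so it can be run anywhere. Everything in it is hypothetical: `Thing` stands in for `Mol`, and `backend` stands in for the `rdkit.Chem` namespace.

```python
import types


class Thing(object):
    """Hypothetical stand-in for Mol."""

    def __init__(self, value):
        self.value = value


def _from_text(text):
    # Mimic a parser that signals failure by returning None.
    return Thing(text) if text else None


def _to_upper(thing):
    return thing.value.upper()


# Hypothetical stand-in for the rdkit.Chem namespace.
backend = types.SimpleNamespace(ThingFromText=_from_text,
                                ThingToUpper=_to_upper)


def bind_constructor(name):
    """Attach backend.ThingFrom<name> to Thing as classmethod from_<name>."""

    @classmethod
    def constructor(cls, in_arg, *args, **kwargs):
        res = getattr(backend, 'ThingFrom' + name)(in_arg, *args, **kwargs)
        if res is None:
            raise ValueError('Failed to parse {!r}'.format(in_arg))
        return res

    setattr(Thing, 'from_{}'.format(name).lower(), constructor)


def bind_serializer(name):
    """Attach backend.ThingTo<name> to Thing as instance method to_<name>."""

    def serializer(self, *args, **kwargs):
        return getattr(backend, 'ThingTo' + name)(self, *args, **kwargs)

    setattr(Thing, 'to_{}'.format(name).lower(), serializer)


bind_constructor('Text')
bind_serializer('Upper')

t = Thing.from_text('ccO')
print(t.to_upper())  # CCO
```

Because the methods only exist after the binder calls run, a static checker cannot see them. That is consistent with the "no member named to_smiles" and "to_inchi" notes reported for `Mol` on this page.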

CONSTRUCTORS = ['Inchi', 'Smiles', 'Mol2Block', 'Mol2File', 'MolBlock',
                'MolFile', 'PDBBlock', 'PDBFile', 'Smarts', 'TPLBlock',
                'TPLFile']
SERIALIZERS = ['Inchi', 'Smiles', 'MolBlock', 'MolFile', 'PDBBlock', 'Smarts',
               'TPLBlock', 'TPLFile']

# Issue: Used builtin function 'map' (reported for both calls below).
list(map(bind_constructor, CONSTRUCTORS))
list(map(bind_serializer, SERIALIZERS))
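
The dictionary returned by `Mol._to_dict_chemdoodle` has the shape `{"m": [{"a": atoms, "b": bonds}]}`, which `to_json` then passes to `json.dumps`. The sketch below builds such a document by hand and round-trips it through JSON. The atom keys (`x`, `y`, `z`, `l`) come from the code above. The bond keys (`b`, `e`, `o`) are an assumption based on the ChemDoodle JSON format, since `Bond.to_dict` is not shown on this page, and the coordinates are made up.

```python
import json

# Hypothetical data for a three-atom chain C-C=O.
symbols = ['C', 'C', 'O']
positions = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (2.25, 1.299, 0.0)]

# Same atom layout as in _to_dict_chemdoodle: coordinates plus a label.
atoms = [{'x': x, 'y': y, 'z': z, 'l': s}
         for s, (x, y, z) in zip(symbols, positions)]

# Assumed bond keys: 'b' and 'e' are atom indices, 'o' is the bond order.
bonds = [{'b': 0, 'e': 1, 'o': 1}, {'b': 1, 'e': 2, 'o': 2}]

doc = {'m': [{'a': atoms, 'b': bonds}]}
text = json.dumps(doc, sort_keys=True)

restored = json.loads(text)
print(len(restored['m'][0]['a']))     # 3
print(restored['m'][0]['a'][2]['l'])  # O
print(restored == doc)                # True
```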