Completed
Push — master ( 1baf98...c0c140 )
by Rich
01:33
created

ChemAxonStandardizer   A

Complexity

Total Complexity 10

Size/Duplication

Total Lines 55
Duplicated Lines 0 %

Importance

Changes 1
Bugs 0 Features 1
Metric Value
c 1
b 0
f 1
dl 0
loc 55
rs 10
wmc 10

5 Methods

Rating   Name               Duplication   Size   Complexity
A        _transform_by()    0             13     2
A        _transform_mol()   0             3      1
A        __init__()         0             6      2
A        _transform_ser()   0             6      1
A        transform()        0             13     4
#! /usr/bin/env python
#
# Copyright (C) 2016 Rich Lewis <[email protected]>
# License: 3-clause BSD

"""
## skchem.standardizers.chemaxon

Module wrapping ChemAxon Standardizer.  Must have standardizer installed and
license activated.
"""

import os
from tempfile import NamedTemporaryFile
import subprocess
import logging

logger = logging.getLogger(__name__)
Coding Style Naming introduced by
The name logger does not conform to the constant naming conventions ((([A-Z_][A-Z0-9_]*)|(__.*__))$).

This check looks for invalid names for a range of different identifiers.

You can set regular expressions to which the identifiers must conform if the defaults do not match your requirements.

If your project includes a Pylint configuration file, the settings contained in that file take precedence.

To find out more about Pylint, please refer to their site.
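
The patterns quoted in the naming messages on this page can be tried out directly. The sketch below assumes that a pattern is matched from the first character of the identifier; the helper `conforms` is hypothetical and is not part of Pylint.

```python
import re

# Patterns copied from the messages on this page.
CONST_RGX = re.compile(r'(([A-Z_][A-Z0-9_]*)|(__.*__))$')
ARGUMENT_RGX = re.compile(r'[a-z_][a-z0-9_]{2,30}$')


def conforms(pattern, name):
    """Return True if name matches pattern from its first character."""
    return pattern.match(name) is not None


# Module-level names are checked as constants, so `logger` is flagged.
print(conforms(CONST_RGX, 'logger'))          # False
print(conforms(CONST_RGX, 'DEFAULT_CONFIG'))  # True

# Argument names need at least three characters, all lower case.
print(conforms(ARGUMENT_RGX, 'X'))    # False
print(conforms(ARGUMENT_RGX, 'by'))   # False
print(conforms(ARGUMENT_RGX, 'mol'))  # True
```

Names such as `X` and `y` are a common convention in scikit-learn style code, so relaxing the argument pattern in a Pylint configuration file may be preferable to renaming.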

import pandas as pd
Configuration introduced by
The import pandas could not be resolved.

This can be caused by one of the following:

1. Missing Dependencies

This error could indicate a configuration issue of Pylint. Make sure that your libraries are available by adding the necessary commands.

# .scrutinizer.yml
before_commands:
    - sudo pip install abc # Python2
    - sudo pip3 install abc # Python3
Tip: We are currently not using virtualenv to run pylint. When installing your modules, make sure to use the command for the correct Python version.

2. Missing __init__.py files

This error could also result from missing __init__.py files in your module folders. Make sure that you place one file in each sub-folder.
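
One way to check for the second cause is to walk the source tree and list folders that contain Python files but no `__init__.py`. The sketch below uses only the standard library; the function name `missing_init_dirs` is hypothetical.

```python
import os


def missing_init_dirs(root):
    """Return folders at or below root that hold .py files but no __init__.py.

    Paths are returned relative to root, sorted alphabetically.
    """
    missing = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip hidden folders such as .git by pruning them in place.
        dirnames[:] = [d for d in dirnames if not d.startswith('.')]
        has_python = any(name.endswith('.py') for name in filenames)
        if has_python and '__init__.py' not in filenames:
            missing.append(os.path.relpath(dirpath, root))
    return sorted(missing)


if __name__ == '__main__':
    # Report on the current working directory.
    for path in missing_init_dirs('.'):
        print(path)
```

A folder of standalone scripts will also be reported, so the output is a list of candidates to review rather than a list of errors.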

from .. import core
from .. import io

# ideally we will programmatically build this file, but for now just use it.
DEFAULT_CONFIG = os.path.join(os.path.dirname(__file__), 'default_config.xml')

class ChemAxonStandardizer(object):

    """ Object wrapping the ChemAxon Standardizer, for standardizing molecules.

    Args:
        config_path (str):
            The path of the config file. If None, use the default one.
        keep_failed (bool):
            Whether molecules that fail to standardize should be kept.
            Stored on the instance; no active code path reads it yet.
    """
    def __init__(self, config_path=None, keep_failed=True):

        if not config_path:
            config_path = DEFAULT_CONFIG
        self.config_path = config_path
        self.keep_failed = keep_failed

    def transform(self, obj):
Coding Style introduced by
This method should have a docstring.

The coding style of this project requires that you add a docstring to this code element. Below, you find an example for methods:

class SomeClass:
    def some_method(self):
        """Do x and return foo."""

If you would like to know more about docstrings, we recommend to read PEP-257: Docstring Conventions.

        if isinstance(obj, core.Mol):
            return self._transform_mol(obj)
        elif isinstance(obj, pd.Series):
            return self._transform_ser(obj)
        elif isinstance(obj, pd.DataFrame):
            obj = obj.copy()
            res = self._transform_ser(obj.structure)
            # assign by column label; .iloc only accepts integer positions
            obj['structure'] = res
            return obj

        else:
            raise NotImplementedError

    def _transform_mol(self, mol):
        mol = pd.DataFrame([mol], index=[mol.name], columns=['structure'])
        return self.transform(mol).structure.iloc[0]


    def _transform_by(self, X, by='sdf'):
Coding Style Naming introduced by
The name X does not conform to the argument naming conventions ([a-z_][a-z0-9_]{2,30}$).

Coding Style Naming introduced by
The name by does not conform to the argument naming conventions ([a-z_][a-z0-9_]{2,30}$).

        with NamedTemporaryFile() as f_in, NamedTemporaryFile() as f_out:
            getattr(io, 'write_' + by)(X, f_in.name)
            args = ['standardize', f_in.name,
                             '-c', self.config_path,
                             '-f', 'sdf',
                             '-o', f_out.name,
                             '--ignore-error']
Coding Style introduced by
Wrong continued indentation (reported for each of the four continuation lines of args above).
            logger.debug('Running %s', ' '.join(args))
            subprocess.call(args)
            out = getattr(io, 'read_' + by)(f_out.name).structure
        return out


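`subprocess.call` returns the exit status of the command, and `_transform_by` discards it, so a failed run is only noticed when the output file is read back. Below is a minimal sketch of the same temporary-file round trip that checks the status by swapping in `subprocess.check_call`. The `standardize` executable is replaced by the running Python interpreter so that the example runs without ChemAxon installed; `run_with_tempfiles` and the stand-in script are hypothetical and are not part of skchem or ChemAxon.

```python
import subprocess
import sys
from tempfile import NamedTemporaryFile

# Stand-in for the external tool: copy argv[1] to argv[2] in upper case.
STAND_IN = ('import sys\n'
            'with open(sys.argv[1]) as src, open(sys.argv[2], "w") as dst:\n'
            '    dst.write(src.read().upper())\n')


def run_with_tempfiles(text):
    """Write text to a temp file, run the command, return what it wrote."""
    # Reopening a NamedTemporaryFile by name works on POSIX systems only.
    with NamedTemporaryFile('w') as f_in, NamedTemporaryFile() as f_out:
        f_in.write(text)
        f_in.flush()  # make sure the child process sees the data
        args = [sys.executable, '-c', STAND_IN, f_in.name, f_out.name]
        # check_call raises CalledProcessError on a non-zero exit status.
        subprocess.check_call(args)
        with open(f_out.name) as result:
            return result.read()


print(run_with_tempfiles('cco'))
```

Whether a non-zero status should abort the whole batch depends on how the tool reports per-molecule failures when `--ignore-error` is set, which this sketch does not model.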
    def _transform_ser(self, X, y=None):
Coding Style Naming introduced by
The name X does not conform to the argument naming conventions ([a-z_][a-z0-9_]{2,30}$).

Coding Style Naming introduced by
The name y does not conform to the argument naming conventions ([a-z_][a-z0-9_]{2,30}$).

Unused Code introduced by
The argument y seems to be unused.
        # TODO: should check for errors, and schedule multiple runs with
Coding Style introduced by
TODO and FIXME comments should generally be avoided.
        # different serialization methods here.
80
81
        out = self._transform_by(X)
82
        return out
83
84
        # failed = X.ix[failed]
85
        # now_succ = self._transform_by(failed, 'smiles')
86
        # still_failed = set(now_succ.index).difference(set(failed.index))
87
        #
88
        # for fail in still_failed:
89
        #     logger.warn('%s failed to standardize.', fail)
90
        #
91
        # out = out.append(now_succ)
92
        # if self.keep_failed:
93
        #     out = out.append(X.ix[still_failed])
94
        # return out.structure
95