Completed
Push — master ( 01edc4...2b1c29 )
by Rich
01:29
created

ChemAxonFeatureCalculator.transform()   B

Complexity

Conditions 5

Size

Total Lines 11

Duplication

Lines 0
Ratio 0 %

Importance

Changes 0
Metric Value
dl 0
loc 11
rs 8.5454
c 0
b 0
f 0
cc 5
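The `cc 5` figure above is consistent with a McCabe-style count for `transform()`: one base path plus one for each `if`/`elif` test. The sketch below is not Scrutinizer's implementation. It is a simplified counter (it ignores boolean operators and comprehensions) applied to a copy of the method, and the helper name `cyclomatic_complexity` is made up for illustration.

```python
import ast

# Copy of ChemAxonFeatureCalculator.transform() from the listing below.
# ast.parse only needs valid syntax, so the undefined names are harmless.
SOURCE = '''
def transform(self, obj):
    if isinstance(obj, core.Mol):
        return self._transform_series(pd.Series(obj)).iloc[0]
    elif isinstance(obj, pd.Series):
        return self._transform_series(obj)
    elif isinstance(obj, pd.DataFrame):
        return self._transform_series(obj.structure)
    elif isinstance(obj, (tuple, list)):
        return self._transform_series(obj)
    else:
        raise NotImplementedError
'''


def cyclomatic_complexity(source):
    """Count branch nodes and add one (simplified, hypothetical helper)."""
    tree = ast.parse(source)
    branch_nodes = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)
    # Each `elif` is parsed as a nested ast.If, so it is counted as well.
    decisions = sum(isinstance(node, branch_nodes) for node in ast.walk(tree))
    return decisions + 1


print(cyclomatic_complexity(SOURCE))  # prints 5
```

The four `isinstance` tests account for four decision points; the trailing `else` adds none.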
#! /usr/bin/env python
#
# Copyright (C) 2016 Rich Lewis <[email protected]>
# License: 3-clause BSD

"""
## skchem.descriptors.atom

Module specifying atom based descriptor generators.
"""

import logging
import subprocess
import tempfile

import pandas as pd
Configuration: The import pandas could not be resolved.

This can be caused by one of the following:

1. Missing Dependencies

This error could indicate a configuration issue of Pylint. Make sure that your libraries are available by adding the necessary commands.

# .scrutinizer.yml
before_commands:
    - sudo pip install abc # Python2
    - sudo pip3 install abc # Python3

Tip: We are currently not using virtualenv to run pylint, when installing your modules make sure to use the command for the correct version.

2. Missing __init__.py files

This error could also result from missing __init__.py files in your module folders. Make sure that you place one file in each sub-folder.
import numpy as np
Configuration: The import numpy could not be resolved.
import time

from ..io import write_sdf
from .. import core
from ..utils import NamedProgressBar, line_count

LOGGER = logging.getLogger(__name__)

# TODO: fix averagemicrospeciescharge
# TODO: fix logd logp logs
# TODO: oen (orbital electronegativity) - sigma + pi
# TODO: water accessible surface area
Coding Style: TODO and FIXME comments should generally be avoided. (Reported for each of the four TODO comments above.)

class ChemAxonFeatureCalculator(object):
Coding Style: This class should have a docstring.

The coding style of this project requires that you add a docstring to this code element. Below, you find an example for methods:

class SomeClass:
    def some_method(self):
        """Do x and return foo."""

If you would like to know more about docstrings, we recommend reading PEP-257: Docstring Conventions.
    _optimal_feats = ['acceptorcount', 'accsitecount', 'aliphaticatomcount', 'aliphaticbondcount', 'aliphaticringcount',
Coding Style: This line is too long as per the coding-style (120/100).
This check looks for lines that are too long. You can specify the maximum line length.
                      'aromaticatomcount', 'aromaticbondcount', 'aromaticringcount', 'asymmetricatomcount',
Coding Style: This line is too long as per the coding-style (107/100).
                      'averagemolecularpolarizability', 'axxpol', 'ayypol', 'azzpol', 'balabanindex', 'bondcount',
Coding Style: This line is too long as per the coding-style (114/100).
                      'carboaliphaticringcount', 'carboaromaticringcount', 'carboringcount', 'chainatomcount',
Coding Style: This line is too long as per the coding-style (110/100).
                      'chainbondcount', 'charge', 'chiralcentercount', 'connectedgraph', 'cyclomaticnumber', 'dipole',
Coding Style: This line is too long as per the coding-style (118/100).
                      'donorcount', 'donorsitecount', 'doublebondstereoisomercount', 'dreidingenergy', 'formalcharge',
Coding Style: This line is too long as per the coding-style (118/100).
                      'fragmentcount', 'fsp3', 'fusedaliphaticringcount', 'fusedaromaticringcount', 'fusedringcount',
Coding Style: This line is too long as per the coding-style (117/100).
                      'hararyindex', 'heteroaliphaticringcount', 'heteroaromaticringcount', 'heteroringcount', 'hlb',
Coding Style: This line is too long as per the coding-style (117/100).
                      'hmopienergy', 'hyperwienerindex', 'largestringsize', 'largestringsystemsize',
                      'markushenumerationcount', 'maximalprojectionarea', 'maximalprojectionradius',
                      'maximalprojectionsize', 'minimalprojectionarea', 'minimalprojectionradius',
                      'minimalprojectionsize', 'mmff94energy', 'molpol', 'pienergy', 'plattindex', 'psa', 'randicindex',
Coding Style: This line is too long as per the coding-style (120/100).
                      'refractivity', 'resonantcount', 'ringatomcount', 'ringbondcount', 'ringcount', 'ringsystemcount',
Coding Style: This line is too long as per the coding-style (120/100).
                      'rotatablebondcount', 'smallestatomringsize', 'smallestringsize', 'smallestringsystemsize',
Coding Style: This line is too long as per the coding-style (113/100).
                      'stereodoublebondcount', 'stereoisomercount', 'szegedindex', 'tetrahedralstereoisomercount',
Coding Style: This line is too long as per the coding-style (114/100).
                      'vdwsa', 'volume', 'wateraccessiblesurfacearea', 'wienerindex', 'wienerpolarity']
Coding Style: This line is too long as per the coding-style (103/100).

    _all_feats = ['acc', 'acceptor', 'acceptorcount', 'acceptormultiplicity', 'acceptorsitecount', 'acceptortable',
Coding Style: This line is too long as per the coding-style (115/100).
                  'accsitecount', 'aliphaticatom', 'aliphaticatomcount', 'aliphaticbondcount', 'aliphaticringcount',
Coding Style: This line is too long as per the coding-style (116/100).
                  'aliphaticringcountofsize', 'aromaticatom', 'aromaticatomcount', 'aromaticbondcount',
Coding Style: This line is too long as per the coding-style (103/100).
                  'aromaticelectrophilicityorder', 'aromaticnucleophilicityorder', 'aromaticringcount',
Coding Style: This line is too long as per the coding-style (103/100).
                  'aromaticringcountofsize', 'asa', 'asymmetricatom', 'asymmetricatomcount', 'asymmetricatoms',
Coding Style: This line is too long as per the coding-style (111/100).
                  'atomicpolarizability', 'atompol', 'averagemicrospeciescharge', 'averagemolecularpolarizability',
Coding Style: This line is too long as per the coding-style (115/100).
                  'averagepol', 'avgpol', 'axxpol', 'ayypol', 'azzpol', 'balabanindex', 'bondcount',
                  'canonicalresonant', 'canonicaltautomer', 'carboaliphaticringcount', 'carboaromaticringcount',
Coding Style: This line is too long as per the coding-style (112/100).
                  'carboringcount', 'chainatom', 'chainatomcount', 'chainbondcount', 'charge', 'chargedensity',
Coding Style: This line is too long as per the coding-style (111/100).
                  'chargedistribution', 'chiralcenter', 'chiralcentercount', 'chiralcenters', 'connectedgraph',
Coding Style: This line is too long as per the coding-style (111/100).
                  'cyclomaticnumber', 'dipole', 'distancedegree', 'don', 'donor', 'donorcount', 'donormultiplicity',
Coding Style: This line is too long as per the coding-style (116/100).
                  'donorsitecount', 'donortable', 'donsitecount', 'doublebondstereoisomercount', 'dreidingenergy',
Coding Style: This line is too long as per the coding-style (114/100).
                  'eccentricity', 'electrondensity', 'electrophilicityorder', 'electrophiliclocalizationenergy',
Coding Style: This line is too long as per the coding-style (112/100).
                  'enumerationcount', 'enumerations', 'formalcharge', 'fragmentcount', 'fsp3',
                  'fusedaliphaticringcount', 'fusedaromaticringcount', 'fusedringcount', 'generictautomer',
Coding Style: This line is too long as per the coding-style (107/100).
                  'hararyindex', 'hasvalidconformer', 'hbda', 'hbonddonoracceptor', 'heteroaliphaticringcount',
Coding Style: This line is too long as per the coding-style (111/100).
                  'heteroaromaticringcount', 'heteroringcount', 'hindrance', 'hlb', 'hmochargedensity',
Coding Style: This line is too long as per the coding-style (103/100).
                  'hmoelectrondensity', 'hmoelectrophilicityorder', 'hmoelectrophiliclocalizationenergy', 'hmohuckel',
Coding Style: This line is too long as per the coding-style (118/100).
                  'hmohuckeleigenvalue', 'hmohuckeleigenvector', 'hmohuckelorbitals', 'hmohuckeltable',
Coding Style: This line is too long as per the coding-style (103/100).
                  'hmolocalizationenergy', 'hmonucleophilicityorder', 'hmonucleophiliclocalizationenergy',
Coding Style: This line is too long as per the coding-style (106/100).
                  'hmopienergy', 'huckel', 'huckeleigenvalue', 'huckeleigenvector', 'huckelorbitals', 'huckeltable',
Coding Style: This line is too long as per the coding-style (116/100).
                  'hyperwienerindex', 'ioncharge', 'isoelectricpoint', 'largestatomringsize', 'largestringsize',
Coding Style: This line is too long as per the coding-style (112/100).
                  'largestringsystemsize', 'localizationenergy', 'logd', 'logdcalculator', 'logp', 'logpcalculator',
Coding Style: This line is too long as per the coding-style (116/100).
                  'logs', 'majormicrospecies', 'majorms', 'majorms2', 'majortautomer', 'markushenumerationcount',
Coding Style: This line is too long as per the coding-style (113/100).
                  'markushenumerations', 'maximalprojectionarea', 'maximalprojectionradius', 'maximalprojectionsize',
Coding Style: This line is too long as per the coding-style (117/100).
                  'minimalprojectionarea', 'minimalprojectionradius', 'minimalprojectionsize', 'mmff94energy',
Coding Style: This line is too long as per the coding-style (110/100).
                  'molecularpolarizability', 'molecularsurfacearea', 'molpol', 'moststabletautomer', 'msa', 'msacc',
Coding Style: This line is too long as per the coding-style (116/100).
                  'msdon', 'name', 'nucleophilicityorder', 'nucleophiliclocalizationenergy', 'oen',
                  'orbitalelectronegativity', 'pi', 'pichargedensity', 'pienergy', 'pka', 'pkacalculator', 'pkat',
Coding Style: This line is too long as per the coding-style (114/100).
                  'plattindex', 'pol', 'polarizability', 'polarsurfacearea', 'psa', 'randicindex',
                  'randommarkushenumerations', 'refractivity', 'resonantcount', 'resonants', 'ringatom',
Coding Style: This line is too long as per the coding-style (104/100).
                  'ringatomcount', 'ringbondcount', 'ringcount', 'ringcountofatom', 'ringcountofsize',
Coding Style: This line is too long as per the coding-style (102/100).
                  'ringsystemcount', 'ringsystemcountofsize', 'rotatablebondcount', 'smallestatomringsize',
Coding Style: This line is too long as per the coding-style (107/100).
                  'smallestringsize', 'smallestringsystemsize', 'stereodoublebondcount', 'stereoisomercount',
Coding Style: This line is too long as per the coding-style (109/100).
                  'stericeffectindex', 'sterichindrance', 'szegedindex', 'tautomercount', 'tautomers',
Coding Style: This line is too long as per the coding-style (102/100).
                  'tetrahedralstereoisomercount', 'tholepolarizability', 'topanal', 'topologyanalysistable',
Coding Style: This line is too long as per the coding-style (108/100).
                  'totalchargedensity', 'tpol', 'tpolarizability', 'vdwsa', 'volume', 'wateraccessiblesurfacearea',
Coding Style: This line is too long as per the coding-style (115/100).
                  'wienerindex', 'wienerpolarity']

    def __init__(self, feat_set='optimal'):
        if feat_set == 'all':
Duplication: This code seems to be duplicated in your project.
            self.index = self._all_feats
        elif feat_set == 'optimal':
            self.index = self._optimal_feats
        elif feat_set in self._all_feats:
            self.index = [feat_set]
        elif isinstance(feat_set, (list, tuple)):
            valid = np.array([feat in self._all_feats for feat in feat_set])
            if all(valid):
                self.index = feat_set
            else:
                raise NotImplementedError('Descriptor \'{}\' not available.'.format(np.array(feat_set)[~valid]))
Coding Style: This line is too long as per the coding-style (112/100).
        else:
            raise NotImplementedError('Feature set {} not available.'.format(feat_set))

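The feature-set branches in `__init__` above are what the duplication check flags. One possible way to share them is sketched here under assumptions: `resolve_feature_set` and its `named_sets` argument are hypothetical names, not part of skchem, and the sketch reports every unknown descriptor in a plain list instead of indexing with a NumPy mask.

```python
def resolve_feature_set(feat_set, all_feats, named_sets=None):
    """Return the list of feature names selected by ``feat_set``.

    Hypothetical helper: ``named_sets`` maps aliases such as 'all' or
    'optimal' to lists of feature names.
    """
    named_sets = named_sets or {}
    if isinstance(feat_set, str):
        # A named alias takes priority over a single descriptor name.
        if feat_set in named_sets:
            return list(named_sets[feat_set])
        if feat_set in all_feats:
            return [feat_set]
        raise NotImplementedError('Feature set {!r} not available.'.format(feat_set))
    if isinstance(feat_set, (list, tuple)):
        missing = [feat for feat in feat_set if feat not in all_feats]
        if missing:
            raise NotImplementedError('Descriptors {} not available.'.format(missing))
        return list(feat_set)
    raise NotImplementedError('Feature set {!r} not available.'.format(feat_set))
```

Each calculator could then pass its own `_all_feats` and alias table, so the validation lives in one place.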
    def transform(self, obj):
Coding Style: This method should have a docstring.
        if isinstance(obj, core.Mol):
            return self._transform_series(pd.Series(obj)).iloc[0]
        elif isinstance(obj, pd.Series):
            return self._transform_series(obj)
        elif isinstance(obj, pd.DataFrame):
            return self._transform_series(obj.structure)
        elif isinstance(obj, (tuple, list)):
            # wrap plain sequences so that `series.index` is available downstream
            return self._transform_series(pd.Series(obj))
        else:
            raise NotImplementedError

    def _transform_series(self, series):
        with tempfile.NamedTemporaryFile(suffix='.sdf') as in_file, tempfile.NamedTemporaryFile() as out_file:
Coding Style: This line is too long as per the coding-style (110/100).
            # write mols to file
            write_sdf(series, in_file.name)
            args = ['cxcalc', in_file.name, '-o', out_file.name] + self.index

            LOGGER.debug('Running: ' + ' '.join(args))

            # call command line
            bar = NamedProgressBar(name=self.__class__.__name__, max_value=len(series))
Black listed name "bar"
            p = subprocess.Popen(args, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
Coding Style Naming: The name p does not conform to the variable naming conventions ([a-z_][a-z0-9_]{2,30}$).

This check looks for invalid names for a range of different identifiers. You can set regular expressions to which the identifiers must conform if the defaults do not match your requirements. If your project includes a Pylint configuration file, the settings contained in that file take precedence. To find out more about Pylint, please refer to their site.

Bug: The Module subprocess does not seem to have a member named DEVNULL.

This check looks for calls to members that are non-existent. These calls will fail. The member could have been renamed or removed.

            # monitor
            while p.poll() is None:
                bar.update(line_count(out_file.name))
Bug: The Instance of NamedProgressBar does not seem to have a member named update.
                time.sleep(1)
            bar.update(len(series))
Bug: The Instance of NamedProgressBar does not seem to have a member named update.
            p.wait()

            try:
                finished = pd.read_table(out_file.name).set_index('id')
            except Exception:
Best Practice: Catching very general exceptions such as Exception is usually not recommended.

Generally, you would want to handle very specific errors in the exception handler. This ensures that you do not hide other types of errors which should be fixed. So, unless you specifically plan to handle any error, consider adding a more specific exception.

                finished = None
        # the read may have failed; only re-index an actual result
        if finished is not None:
            finished.index = series.index
        return finished

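The run-and-poll pattern in `_transform_series` can be tried without ChemAxon installed. This is a minimal sketch, assuming Python 3 and using only the standard library: `run_quietly` and `read_table_rows` are hypothetical helpers, `csv` stands in for `pd.read_table`, and the `getattr` fallback covers interpreters where `subprocess.DEVNULL` is absent (it was added in Python 3.3, which may be what the checker is reporting).

```python
import csv
import os
import subprocess
import time


def run_quietly(args, poll_interval=0.1):
    """Run a command with its output discarded and poll until it exits."""
    devnull = getattr(subprocess, 'DEVNULL', None)
    opened = None
    if devnull is None:
        # Fallback for interpreters without subprocess.DEVNULL.
        opened = devnull = open(os.devnull, 'w')
    try:
        proc = subprocess.Popen(args, stdout=devnull, stderr=devnull)
        while proc.poll() is None:
            # A progress bar update would go here.
            time.sleep(poll_interval)
        return proc.returncode
    finally:
        if opened is not None:
            opened.close()


def read_table_rows(path):
    """Parse a tab-separated file, raising instead of returning None."""
    try:
        with open(path, newline='') as handle:
            rows = list(csv.DictReader(handle, delimiter='\t'))
    except (OSError, csv.Error) as err:
        # Only I/O and parse errors are handled; anything else propagates.
        raise RuntimeError('could not read results: {}'.format(err))
    if not rows:
        raise RuntimeError('no results were written to {}'.format(path))
    return rows
```

Raising a specific error keeps a failed external call from surfacing later as an unrelated `AttributeError` on `None`.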
class ChemAxonAtomFeatureCalculator(object):
Coding Style: This class should have a docstring.
Unused Code: This abstract class does not seem to be used anywhere.

    _all_feats = ['acceptormultiplicity', 'aliphaticatom', 'aromaticatom', 'aromaticelectrophilicityorder',
Coding Style: This line is too long as per the coding-style (107/100).
                  'asymmetricatom', 'atomicpolarizability', 'chainatom', 'chargedensity', 'chiralcenter',
Coding Style: This line is too long as per the coding-style (105/100).
                  'distancedegree', 'donormultiplicity', 'eccentricity', 'electrondensity',
                  'electrophiliclocalizationenergy', 'hindrance', 'hmochargedensity', 'hmoelectrondensity', 'hmoelectrophilicityorder',
Coding Style: This line is too long as per the coding-style (135/100).
                  'hmoelectrophiliclocalizationenergy', 'hmonucleophilicityorder', 'hmonucleophiliclocalizationenergy', 'ioncharge', 'largestatomringsize',
Coding Style: This line is too long as per the coding-style (155/100).
                  'nucleophilicityorder', 'nucleophiliclocalizationenergy', 'oen', 'pichargedensity', 'ringatom', 'ringcountofatom',
Coding Style: This line is too long as per the coding-style (132/100).
                  'stericeffectindex', 'totalchargedensity']

    _h_inc_feats = ['acc', 'atomicpolarizability', 'charge', 'distancedegree', 'don',
       'eccentricity', 'hindrance', 'largestatomringsize', 'oen',
Coding Style: Wrong continued indentation.
       'ringcountofatom', 'smallestatomringsize', 'stericeffectindex']
Coding Style: Wrong continued indentation.

    def __init__(self, feat_set='all', include_hs=False, max_atoms=75):
Unused Code: The argument include_hs seems to be unused.
        """
        Args:
            feat_set (str or list<str>):
                The feature sets to calculate.
                - a single identifier as a `str`
                - a list of identifiers
                - 'h_inclusive' for those that also calculate for Hs.
                - 'all' for all

            max_atoms:
                - The maximum number of atoms available.
        """
        self.max_atoms = max_atoms

        if feat_set in self._all_feats:
Duplication: This code seems to be duplicated in your project.
            self.index = [feat_set]
        elif feat_set == 'h_inclusive':
            self.index = self._h_inc_feats
        elif feat_set == 'all':
            self.index = self._all_feats
        elif isinstance(feat_set, (list, tuple)):
            valid = np.array([feat in self._all_feats for feat in feat_set])
            if all(valid):
                # `feature_names` is a read-only property backed by `index`
                self.index = feat_set
            else:
                raise NotImplementedError('Descriptor \'{}\' not available.'.format(np.array(feat_set)[~valid]))
Coding Style: This line is too long as per the coding-style (112/100).
        else:
            raise NotImplementedError('{} feature set is not available'.format(feat_set))

    @property
    def feature_names(self):
Coding Style: This method should have a docstring.
        return self.index

    def transform(self, obj):
Coding Style: This method should have a docstring.
        if isinstance(obj, core.Atom):
            return self._transform_atom(obj)
        elif isinstance(obj, core.Mol):
            return self._transform_mol(obj)
        elif isinstance(obj, pd.Series):
            return self._transform_series(obj)
        elif isinstance(obj, pd.DataFrame):
            return self._transform_series(obj.structure)
        elif isinstance(obj, (tuple, list)):
            return self._transform_series(obj)
        else:
            raise NotImplementedError

    def _transform_atom(self, atom):
        raise NotImplementedError('Cannot calculate atom wise with Chemaxon')

    def _transform_mol(self, mol):
        # make into series then use self._transform_series
        ser = pd.Series([mol], name=mol.name)
        res = self._transform_series(ser)
        return res.iloc[0]

    def _transform_series(self, series):
        with tempfile.NamedTemporaryFile(suffix='.sdf') as in_file, tempfile.NamedTemporaryFile() as out_file:
Coding Style: This line is too long as per the coding-style (110/100).
            # write mols to file
            write_sdf(series, in_file.name)
            args = ['cxcalc', in_file.name, '-o', out_file.name] + self.index

            LOGGER.debug('Running: ' + ' '.join(args))

            # call command line
            p = subprocess.Popen(args, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
0 ignored issues
show
Coding Style Naming introduced by
The name p does not conform to the variable naming conventions ([a-z_][a-z0-9_]{2,30}$).

This check looks for invalid names for a range of different identifiers.

You can set regular expressions to which the identifiers must conform if the defaults do not match your requirements.

If your project includes a Pylint configuration file, the settings contained in that file take precedence.

To find out more about Pylint, please refer to their site.
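
The naming rule quoted above can be checked by hand. The sketch below is only an illustration, assuming the pattern is applied with `re.match` (anchored at the start of the name); the helper `conforms` and the sample names are hypothetical and not part of skchem.

```python
import re

# Pattern quoted in the issue text above.
NAME_RE = re.compile(r'[a-z_][a-z0-9_]{2,30}$')


def conforms(name):
    """Return True if the name satisfies the quoted naming pattern."""
    return NAME_RE.match(name) is not None


# Print each candidate name alongside whether it passes the check.
for candidate in ['p', 's', 'proc', 'smiles']:
    print(candidate, conforms(candidate))
```

Single-letter names such as `p` and `s` fail because the pattern requires at least three characters in total.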

Bug introduced by
The Module subprocess does not seem to have a member named DEVNULL.

This check looks for calls to members that are non-existent. These calls will fail.

The member could have been renamed or removed.
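
`subprocess.DEVNULL` was added in Python 3.3, so this warning may simply mean that the analysis ran against Python 2 (an assumption about the checker's environment, not something the report states). A minimal sketch of a fallback that works on both follows; the helper `run_quietly` is hypothetical and not part of skchem.

```python
import os
import subprocess
import sys


def run_quietly(args):
    """Run a command, discard its output, and return its exit code."""
    devnull = getattr(subprocess, 'DEVNULL', None)
    if devnull is not None:
        # Python 3.3+: use the built-in sentinel.
        return subprocess.call(args, stdout=devnull, stderr=devnull)
    # Older interpreters: open the platform null device explicitly.
    with open(os.devnull, 'wb') as sink:
        return subprocess.call(args, stdout=sink, stderr=sink)


# Example: run a child Python process whose output is suppressed.
exit_code = run_quietly([sys.executable, '-c', 'print("hidden")'])
```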

            bar = NamedProgressBar(name=self.__class__.__name__, max_value=len(series))
introduced by
Black listed name "bar"

            while p.poll() is None:
                bar.update(line_count(out_file.name))
Bug introduced by
The Instance of NamedProgressBar does not seem to have a member named update.

                time.sleep(1)
            bar.update(len(series))
Bug introduced by
The Instance of NamedProgressBar does not seem to have a member named update.

            p.wait()
            finished = pd.read_table(out_file.name).set_index('id')

        def to_padded(s):
Coding Style Naming introduced by
The name s does not conform to the argument naming conventions ([a-z_][a-z0-9_]{2,30}$).

Coding Style introduced by
This function should have a docstring.

The coding style of this project requires that you add a docstring to this code element. Below, you find an example for methods:

class SomeClass:
    def some_method(self):
        """Do x and return foo."""

If you would like to know more about docstrings, we recommend reading PEP 257: Docstring Conventions.

            res = np.repeat(np.nan, self.max_atoms)

            def parse_string(s):
Coding Style Naming introduced by
The name s does not conform to the argument naming conventions ([a-z_][a-z0-9_]{2,30}$).

Coding Style introduced by
This function should have a docstring.

                if s == '':
                    return np.nan
                elif s == 'false':
                    return 0
                elif s == 'true':
                    return 1
                else:
                    return float(s)

            ans = np.array([parse_string(i) for i in s.split(';')])
            res[:len(ans)] = ans
            return res
        res = np.array([[to_padded(i) for k, i in val.items()] for idx, val in finished.T.items()])
Unused Code introduced by
The variable idx seems to be unused.
Unused Code introduced by
The variable k seems to be unused.
        res = pd.Panel(res, items=series.index, major_axis=pd.Index(finished.columns, name='cx_atom_desc'),
Coding Style introduced by
This line is too long as per the coding-style (107/100).

                        minor_axis=pd.Index(range(self.max_atoms), name='atom_idx'))
Coding Style introduced by
Wrong continued indentation.
minor_axis=pd.Index(range(self.max_atoms), name='atom_idx'))
|^
        return res.swapaxes(1, 2)  # to be consistent with AtomFeatureCalculator
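
The parse-and-pad step at the end of `_transform_series` can be tried in isolation. The sketch below is a standalone approximation using only NumPy: `MAX_ATOMS` stands in for `self.max_atoms`, the sample cells are invented for illustration, and a plain 3-D array is used in place of `pd.Panel`, which was removed in later pandas releases. It is not the module's own code.

```python
import numpy as np

MAX_ATOMS = 4  # hypothetical stand-in for self.max_atoms


def parse_value(text):
    """Map one semicolon-separated field to a float (NaN when empty)."""
    if text == '':
        return np.nan
    if text == 'false':
        return 0.0
    if text == 'true':
        return 1.0
    return float(text)


def to_padded(cell, max_atoms=MAX_ATOMS):
    """Parse a 'v1;v2;...' cell and right-pad it with NaN to max_atoms."""
    padded = np.full(max_atoms, np.nan)
    values = np.array([parse_value(part) for part in cell.split(';')])
    padded[:len(values)] = values
    return padded


# Invented sample: two molecules, two descriptors each.
rows = [
    ['0.1;0.2;0.3', 'true;false;true'],
    ['1.5;;2.5;3.5', 'false;false;true;true'],
]

# Shape (molecule, descriptor, atom), mirroring the array passed to pd.Panel.
cube = np.array([[to_padded(cell) for cell in row] for row in rows])

# Shape (molecule, atom, descriptor), mirroring res.swapaxes(1, 2).
per_atom = cube.swapaxes(1, 2)
```

As in the original, a cell with more than `max_atoms` values raises an error on assignment rather than being truncated.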