Completed
Push — master ( 4d093f...9b11f9 )
by Rich
01:42
created

ChemAxonAtomFeatureCalculator   A

Complexity

Total Complexity 26

Size/Duplication

Total Lines 106
Duplicated Lines 29.25 %

Importance

Changes 0
Metric Value
wmc 26
dl 31
loc 106
rs 10
c 0
b 0
f 0

8 Methods

Rating  Name                 Duplication  Size  Complexity
A       parse_string()       0            9     4
B       transform()          0            13    6
B       to_padded()          0            16    6
F       _transform_series()  0            33    10
C       __init__()           31           31    7
A       feature_names()      0            3     1
A       _transform_mol()     0            4     1
A       _transform_atom()    0            2     1

How to fix   Duplicated Code   

Duplicated Code

Duplicate code is one of the most pungent code smells. A rule that is often used is to re-structure code once it is duplicated in three or more places.

Common duplication problems, and corresponding solutions are:

#! /usr/bin/env python
#
# Copyright (C) 2016 Rich Lewis <[email protected]>
# License: 3-clause BSD

"""
## skchem.descriptors.atom

Module specifying atom based descriptor generators.
"""

import logging
import subprocess
import tempfile

import pandas as pd
Configuration: The import pandas could not be resolved.

This can be caused by one of the following:

1. Missing Dependencies

This error could indicate a configuration issue of Pylint. Make sure that your libraries are available by adding the necessary commands.

# .scrutinizer.yml
before_commands:
    - sudo pip install abc # Python2
    - sudo pip3 install abc # Python3
Tip: We are currently not using virtualenv to run pylint. When installing your modules, make sure to use the command for the correct Python version.

2. Missing __init__.py files

This error could also result from missing __init__.py files in your module folders. Make sure that you place one file in each sub-folder.

import numpy as np
Configuration: The import numpy could not be resolved.

LOGGER = logging.getLogger(__file__)
from ..io import write_sdf, read_sdf
Unused Code: Unused read_sdf imported from io
from .. import core

atom_feats = ['acceptormultiplicity', 'aliphaticatom', 'aromaticatom', 'aromaticelectrophilicityorder',
Coding Style: This line is too long as per the coding-style (103/100).

This check looks for lines that are too long. You can specify the maximum line length.

Coding Style Naming: The name atom_feats does not conform to the constant naming conventions ((([A-Z_][A-Z0-9_]*)|(__.*__))$).

This check looks for invalid names for a range of different identifiers.

You can set regular expressions to which the identifiers must conform if the defaults do not match your requirements.

If your project includes a Pylint configuration file, the settings contained in that file take precedence.

To find out more about Pylint, please refer to their site.

              'asymmetricatom', 'atomicpolarizability', 'chainatom', 'chargedensity', 'chiralcenter', 'distancedegree',
Coding Style: This line is too long as per the coding-style (119/100).
              'donormultiplicity', 'eccentricity', 'electrondensity', 'electrophiliclocalizationenergy', 'hindrance',
Coding Style: This line is too long as per the coding-style (117/100).
              'hmochargedensity', 'hmoelectrondensity', 'hmoelectrophilicityorder', 'hmoelectrophiliclocalizationenergy',
Coding Style: This line is too long as per the coding-style (121/100).
              'hmonucleophilicityorder', 'hmonucleophiliclocalizationenergy', 'ioncharge', 'largestatomringsize', 'nucleophilicityorder',
Coding Style: This line is too long as per the coding-style (137/100).
              'nucleophiliclocalizationenergy', 'oen', 'pichargedensity', 'ringatom', 'ringcountofatom', 'stericeffectindex',
Coding Style: This line is too long as per the coding-style (125/100).
              'totalchargedensity']

# TODO: fix averagemicrospeciescharge
Coding Style: TODO and FIXME comments should generally be avoided.
# TODO: fix logd logp logs
Coding Style: TODO and FIXME comments should generally be avoided.
# TODO: oen (orbital electronegativity) - sigma + pi
Coding Style: TODO and FIXME comments should generally be avoided.
# TODO: water accessible surface area
Coding Style: TODO and FIXME comments should generally be avoided.

class ChemAxonFeatureCalculator(object):
Coding Style: This class should have a docstring.

The coding style of this project requires that you add a docstring to this code element. Below, you find an example for methods:

class SomeClass:
    def some_method(self):
        """Do x and return foo."""

If you would like to know more about docstrings, we recommend to read PEP-257: Docstring Conventions.

    _optimal_feats = ['acceptorcount', 'accsitecount', 'aliphaticatomcount', 'aliphaticbondcount', 'aliphaticringcount',
Coding Style: This line is too long as per the coding-style (120/100).
                      'aromaticatomcount', 'aromaticbondcount', 'aromaticringcount', 'asymmetricatomcount',
Coding Style: This line is too long as per the coding-style (107/100).
                      'averagemolecularpolarizability', 'axxpol', 'ayypol', 'azzpol', 'balabanindex', 'bondcount',
Coding Style: This line is too long as per the coding-style (114/100).
                      'carboaliphaticringcount', 'carboaromaticringcount', 'carboringcount', 'chainatomcount',
Coding Style: This line is too long as per the coding-style (110/100).
                      'chainbondcount', 'charge', 'chiralcentercount', 'connectedgraph', 'cyclomaticnumber', 'dipole',
Coding Style: This line is too long as per the coding-style (118/100).
                      'donorcount', 'donorsitecount', 'doublebondstereoisomercount', 'dreidingenergy', 'formalcharge',
Coding Style: This line is too long as per the coding-style (118/100).
                      'fragmentcount', 'fsp3', 'fusedaliphaticringcount', 'fusedaromaticringcount', 'fusedringcount',
Coding Style: This line is too long as per the coding-style (117/100).
                      'hararyindex', 'heteroaliphaticringcount', 'heteroaromaticringcount', 'heteroringcount', 'hlb',
Coding Style: This line is too long as per the coding-style (117/100).
                      'hmopienergy', 'hyperwienerindex', 'largestringsize', 'largestringsystemsize',
                      'markushenumerationcount', 'maximalprojectionarea', 'maximalprojectionradius',
                      'maximalprojectionsize', 'minimalprojectionarea', 'minimalprojectionradius',
                      'minimalprojectionsize', 'mmff94energy', 'molpol', 'pienergy', 'plattindex', 'psa', 'randicindex',
Coding Style: This line is too long as per the coding-style (120/100).
                      'refractivity', 'resonantcount', 'ringatomcount', 'ringbondcount', 'ringcount', 'ringsystemcount',
Coding Style: This line is too long as per the coding-style (120/100).
                      'rotatablebondcount', 'smallestatomringsize', 'smallestringsize', 'smallestringsystemsize',
Coding Style: This line is too long as per the coding-style (113/100).
                      'stereodoublebondcount', 'stereoisomercount', 'szegedindex', 'tetrahedralstereoisomercount',
Coding Style: This line is too long as per the coding-style (114/100).
                      'vdwsa', 'volume', 'wateraccessiblesurfacearea', 'wienerindex', 'wienerpolarity']
Coding Style: This line is too long as per the coding-style (103/100).

    _all_feats = ['acc', 'acceptor', 'acceptorcount', 'acceptormultiplicity', 'acceptorsitecount', 'acceptortable',
Coding Style: This line is too long as per the coding-style (115/100).
                  'accsitecount', 'aliphaticatom', 'aliphaticatomcount', 'aliphaticbondcount', 'aliphaticringcount',
Coding Style: This line is too long as per the coding-style (116/100).
                  'aliphaticringcountofsize', 'aromaticatom', 'aromaticatomcount', 'aromaticbondcount',
Coding Style: This line is too long as per the coding-style (103/100).
                  'aromaticelectrophilicityorder', 'aromaticnucleophilicityorder', 'aromaticringcount',
Coding Style: This line is too long as per the coding-style (103/100).
                  'aromaticringcountofsize', 'asa', 'asymmetricatom', 'asymmetricatomcount', 'asymmetricatoms',
Coding Style: This line is too long as per the coding-style (111/100).
                  'atomicpolarizability', 'atompol', 'averagemicrospeciescharge', 'averagemolecularpolarizability',
Coding Style: This line is too long as per the coding-style (115/100).
                  'averagepol', 'avgpol', 'axxpol', 'ayypol', 'azzpol', 'balabanindex', 'bondcount',
                  'canonicalresonant', 'canonicaltautomer', 'carboaliphaticringcount', 'carboaromaticringcount',
Coding Style: This line is too long as per the coding-style (112/100).
                  'carboringcount', 'chainatom', 'chainatomcount', 'chainbondcount', 'charge', 'chargedensity',
Coding Style: This line is too long as per the coding-style (111/100).
                  'chargedistribution', 'chiralcenter', 'chiralcentercount', 'chiralcenters', 'connectedgraph',
Coding Style: This line is too long as per the coding-style (111/100).
                  'cyclomaticnumber', 'dipole', 'distancedegree', 'don', 'donor', 'donorcount', 'donormultiplicity',
Coding Style: This line is too long as per the coding-style (116/100).
                  'donorsitecount', 'donortable', 'donsitecount', 'doublebondstereoisomercount', 'dreidingenergy',
Coding Style: This line is too long as per the coding-style (114/100).
                  'eccentricity', 'electrondensity', 'electrophilicityorder', 'electrophiliclocalizationenergy',
Coding Style: This line is too long as per the coding-style (112/100).
                  'enumerationcount', 'enumerations', 'formalcharge', 'fragmentcount', 'fsp3',
                  'fusedaliphaticringcount', 'fusedaromaticringcount', 'fusedringcount', 'generictautomer',
Coding Style: This line is too long as per the coding-style (107/100).
                  'hararyindex', 'hasvalidconformer', 'hbda', 'hbonddonoracceptor', 'heteroaliphaticringcount',
Coding Style: This line is too long as per the coding-style (111/100).
                  'heteroaromaticringcount', 'heteroringcount', 'hindrance', 'hlb', 'hmochargedensity',
Coding Style: This line is too long as per the coding-style (103/100).
                  'hmoelectrondensity', 'hmoelectrophilicityorder', 'hmoelectrophiliclocalizationenergy', 'hmohuckel',
Coding Style: This line is too long as per the coding-style (118/100).
                  'hmohuckeleigenvalue', 'hmohuckeleigenvector', 'hmohuckelorbitals', 'hmohuckeltable',
Coding Style: This line is too long as per the coding-style (103/100).
                  'hmolocalizationenergy', 'hmonucleophilicityorder', 'hmonucleophiliclocalizationenergy',
Coding Style: This line is too long as per the coding-style (106/100).
                  'hmopienergy', 'huckel', 'huckeleigenvalue', 'huckeleigenvector', 'huckelorbitals', 'huckeltable',
Coding Style: This line is too long as per the coding-style (116/100).
                  'hyperwienerindex', 'ioncharge', 'isoelectricpoint', 'largestatomringsize', 'largestringsize',
Coding Style: This line is too long as per the coding-style (112/100).
                  'largestringsystemsize', 'localizationenergy', 'logd', 'logdcalculator', 'logp', 'logpcalculator',
Coding Style: This line is too long as per the coding-style (116/100).
                  'logs', 'majormicrospecies', 'majorms', 'majorms2', 'majortautomer', 'markushenumerationcount',
Coding Style: This line is too long as per the coding-style (113/100).
                  'markushenumerations', 'maximalprojectionarea', 'maximalprojectionradius', 'maximalprojectionsize',
Coding Style: This line is too long as per the coding-style (117/100).
                  'minimalprojectionarea', 'minimalprojectionradius', 'minimalprojectionsize', 'mmff94energy',
Coding Style: This line is too long as per the coding-style (110/100).
                  'molecularpolarizability', 'molecularsurfacearea', 'molpol', 'moststabletautomer', 'msa', 'msacc',
Coding Style: This line is too long as per the coding-style (116/100).
                  'msdon', 'name', 'nucleophilicityorder', 'nucleophiliclocalizationenergy', 'oen',
                  'orbitalelectronegativity', 'pi', 'pichargedensity', 'pienergy', 'pka', 'pkacalculator', 'pkat',
Coding Style: This line is too long as per the coding-style (114/100).
                  'plattindex', 'pol', 'polarizability', 'polarsurfacearea', 'psa', 'randicindex',
                  'randommarkushenumerations', 'refractivity', 'resonantcount', 'resonants', 'ringatom',
Coding Style: This line is too long as per the coding-style (104/100).
                  'ringatomcount', 'ringbondcount', 'ringcount', 'ringcountofatom', 'ringcountofsize',
Coding Style: This line is too long as per the coding-style (102/100).
                  'ringsystemcount', 'ringsystemcountofsize', 'rotatablebondcount', 'smallestatomringsize',
Coding Style: This line is too long as per the coding-style (107/100).
                  'smallestringsize', 'smallestringsystemsize', 'stereodoublebondcount', 'stereoisomercount',
Coding Style: This line is too long as per the coding-style (109/100).
                  'stericeffectindex', 'sterichindrance', 'szegedindex', 'tautomercount', 'tautomers',
Coding Style: This line is too long as per the coding-style (102/100).
                  'tetrahedralstereoisomercount', 'tholepolarizability', 'topanal', 'topologyanalysistable',
Coding Style: This line is too long as per the coding-style (108/100).
                  'totalchargedensity', 'tpol', 'tpolarizability', 'vdwsa', 'volume', 'wateraccessiblesurfacearea',
Coding Style: This line is too long as per the coding-style (115/100).
                  'wienerindex', 'wienerpolarity']

    def __init__(self, feat_set='all'):
Duplication: This code seems to be duplicated in your project.
        if feat_set == 'all':
            self.index = self._all_feats
        elif feat_set == 'optimal':
            self.index = self._optimal_feats
        elif feat_set in self._all_feats:
            self.index = [feat_set]
        elif isinstance(feat_set, (list, tuple)):
            valid = np.array([feat in self._all_feats for feat in feat_set])
            if all(valid):
                self.index = feat_set
            else:
                self.index = feat_set
                raise NotImplementedError('Descriptor \'{}\' not available.'.format(np.array(feat_set)[~valid]))
Coding Style: This line is too long as per the coding-style (112/100).
        else:
            raise NotImplementedError('Feature set {} not available.'.format(feat_set))
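
Both `__init__` methods flagged as duplicated resolve `feat_set` against class-level lists in nearly the same way. One possible restructuring is to move that lookup into a single shared function. The sketch below is only an illustration: `resolve_feature_set` and its `named_sets` parameter are hypothetical names that do not exist in skchem, the feature lists are toy stand-ins, and it raises `ValueError` where the source raises `NotImplementedError`.

```python
def resolve_feature_set(feat_set, all_feats, named_sets):
    """Return the list of feature names selected by ``feat_set``.

    ``named_sets`` maps shortcut names (for example 'all' or 'optimal')
    to lists of feature names. Hypothetical helper, not part of skchem.
    """
    # A shortcut name such as 'all' selects a predefined list.
    if isinstance(feat_set, str) and feat_set in named_sets:
        return list(named_sets[feat_set])
    # A single known feature name becomes a one-element list.
    if isinstance(feat_set, str) and feat_set in all_feats:
        return [feat_set]
    # A list or tuple is accepted only if every entry is known.
    if isinstance(feat_set, (list, tuple)):
        unknown = [feat for feat in feat_set if feat not in all_feats]
        if unknown:
            raise ValueError('Descriptors {} not available.'.format(unknown))
        return list(feat_set)
    raise ValueError('Feature set {!r} not available.'.format(feat_set))


# Toy feature lists standing in for the class attributes in the source.
ALL_FEATS = ['charge', 'dipole', 'fsp3']
NAMED_SETS = {'all': ALL_FEATS, 'optimal': ['charge', 'fsp3']}

print(resolve_feature_set('optimal', ALL_FEATS, NAMED_SETS))
print(resolve_feature_set(['dipole'], ALL_FEATS, NAMED_SETS))
```

Under this assumption, each calculator's `__init__` would reduce to one call that passes its own `_all_feats` and a small dictionary of its shortcut names, so the validation logic lives in one place.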
    def transform(self, obj):
Coding Style: This method should have a docstring.
        if isinstance(obj, core.Mol):
            return self._transform_series(pd.Series(obj)).iloc[0]
        elif isinstance(obj, pd.Series):
            return self._transform_series(obj)
        elif isinstance(obj, pd.DataFrame):
            return self._transform_series(obj.structure)
        elif isinstance(obj, (tuple, list)):
            return self._transform_series(obj)
        else:
            raise NotImplementedError

    def _transform_series(self, series):

        with tempfile.NamedTemporaryFile(suffix='.sdf') as in_file, tempfile.NamedTemporaryFile() as out_file:
Coding Style: This line is too long as per the coding-style (110/100).
            # write mols to file
            write_sdf(series, in_file.name)
            args = ['cxcalc', in_file.name, '-o', out_file.name] + self.index

            LOGGER.info('Running: ' + ' '.join(args))

            # call command line
            subprocess.call(args)
            try:
                finished = pd.read_table(out_file.name).set_index('id')
            except Exception:
Best Practice: Catching very general exceptions such as Exception is usually not recommended.

Generally, you would want to handle very specific errors in the exception handler. This ensures that you do not hide other types of errors which should be fixed.

So, unless you specifically plan to handle any error, consider adding a more specific exception.

                finished = None
        finished.index = series.index
        return finished
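
The broad `except Exception` in this method leaves `finished` as `None`, so the following `finished.index = series.index` would itself fail with an `AttributeError` and hide the original problem. One alternative is to check the external command directly and report a specific error. The sketch below assumes Python 3.5 or later for `subprocess.run`; `run_tool` is a hypothetical helper, and a child Python process stands in for `cxcalc`, which is not assumed to be installed.

```python
import subprocess
import sys


def run_tool(args):
    """Run an external command and raise a descriptive error if it fails."""
    try:
        # check=True turns a non-zero exit status into CalledProcessError.
        subprocess.run(args, check=True)
    except FileNotFoundError as err:
        raise RuntimeError('Executable not found: {}'.format(args[0])) from err
    except subprocess.CalledProcessError as err:
        raise RuntimeError(
            'Command failed with exit status {}: {}'.format(
                err.returncode, ' '.join(args))) from err


# A child Python process stands in for the real command line tool.
run_tool([sys.executable, '-c', 'pass'])

try:
    run_tool([sys.executable, '-c', 'import sys; sys.exit(3)'])
except RuntimeError as err:
    print(err)
```

With a check like this in place, the output file would only be parsed after the command is known to have succeeded, and any parsing error could then be caught by its specific type.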
class ChemAxonAtomFeatureCalculator(object):
Coding Style: This class should have a docstring.
Unused Code: This abstract class does not seem to be used anywhere.

    _all_feats = ['acceptormultiplicity', 'aliphaticatom', 'aromaticatom', 'aromaticelectrophilicityorder',
Coding Style: This line is too long as per the coding-style (107/100).
                  'asymmetricatom', 'atomicpolarizability', 'chainatom', 'chargedensity', 'chiralcenter',
Coding Style: This line is too long as per the coding-style (105/100).
                  'distancedegree', 'donormultiplicity', 'eccentricity', 'electrondensity',
                  'electrophiliclocalizationenergy', 'hindrance', 'hmochargedensity', 'hmoelectrondensity', 'hmoelectrophilicityorder',
Coding Style: This line is too long as per the coding-style (135/100).
                  'hmoelectrophiliclocalizationenergy', 'hmonucleophilicityorder', 'hmonucleophiliclocalizationenergy', 'ioncharge', 'largestatomringsize',
Coding Style: This line is too long as per the coding-style (155/100).
                  'nucleophilicityorder', 'nucleophiliclocalizationenergy', 'oen', 'pichargedensity', 'ringatom', 'ringcountofatom',
Coding Style: This line is too long as per the coding-style (132/100).
                  'stericeffectindex', 'totalchargedensity']

    _h_inc_feats = ['acc', 'atomicpolarizability', 'charge', 'distancedegree', 'don',
       'eccentricity', 'hindrance', 'largestatomringsize', 'oen',
Coding Style: Wrong continued indentation.
       'ringcountofatom', 'smallestatomringsize', 'stericeffectindex']
Coding Style: Wrong continued indentation.

    def __init__(self, feat_set='all', include_hs=False, max_atoms=75):
Duplication: This code seems to be duplicated in your project.
Unused Code: The argument include_hs seems to be unused.

        """
        Args:
            feat_set (str or list<str>):
                The feature sets to calculate.
                - a single identifier as a `str`
                - a list of identifiers
                - 'h_inc' or those that also calculate for Hs.
                - 'all' for all

            max_atoms:
                - The maximum number of atoms available.
        """
        self.max_atoms = max_atoms

        if feat_set in self._all_feats:
            self.index = [feat_set]
        elif feat_set == 'h_inclusive':
            self.index = self._h_inc_feats
        elif feat_set == 'all':
            self.index = self._all_feats
        elif isinstance(feat_set, (list, tuple)):
            valid = np.array([feat in self._all_feats for feat in feat_set])
            if all(valid):
                self.index = feat_set
            else:
                raise NotImplementedError('Descriptor \'{}\' not available.'.format(np.array(feat_set)[~valid]))
Coding Style: This line is too long as per the coding-style (112/100).
            self.feature_names = feat_set
        else:
            raise NotImplementedError('{} feature set is not available'.format(feat_set))

    @property
    def feature_names(self):
Coding Style: This method should have a docstring.
        return self.index
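
Because `feature_names` is defined as a read-only property, the assignment `self.feature_names = feat_set` in this class's `__init__` would raise an `AttributeError` whenever a list or tuple is passed. A small self-contained sketch of that behaviour follows; it uses a hypothetical `Demo` class rather than the real calculator.

```python
class Demo(object):
    """Hypothetical stand-in with a read-only property, as in the source."""

    def __init__(self, names):
        self.index = list(names)

    @property
    def feature_names(self):
        # Read-only view of the selected features; no setter is defined.
        return self.index


demo = Demo(['charge', 'oen'])
print(demo.feature_names)

try:
    # Mirrors `self.feature_names = feat_set` in the reviewed __init__.
    demo.feature_names = ['dipole']
except AttributeError as err:
    print('AttributeError:', err)
```

Since `self.index` is already set in that branch, dropping the assignment would be one way to avoid the error.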
    def transform(self, obj):
Coding Style: This method should have a docstring.
        if isinstance(obj, core.Atom):
            return self._transform_atom(obj)
        elif isinstance(obj, core.Mol):
            return self._transform_mol(obj)
        elif isinstance(obj, pd.Series):
            return self._transform_series(obj)
        elif isinstance(obj, pd.DataFrame):
            return self._transform_series(obj.structure)
        elif isinstance(obj, (tuple, list)):
            return self._transform_series(obj)
        else:
            raise NotImplementedError

    def _transform_atom(self, atom):
205
        raise NotImplementedError('Cannot calculate atom wise with Chemaxon')
206
207
    def _transform_mol(self, mol):
208
        ser = pd.Series([mol], name=mol.name)
209
        res = self._transform_series(ser)
210
        return res.iloc[0]
211
        # make into series then use self._transform_mol
212
213
    def _transform_series(self, series):
214
215
        with tempfile.NamedTemporaryFile(suffix='.sdf') as in_file, tempfile.NamedTemporaryFile() as out_file:
Coding Style:
This line is too long as per the coding-style (110/100).

            # write mols to file
            write_sdf(series, in_file.name)
            args = ['cxcalc', in_file.name, '-o', out_file.name] + self.index

            LOGGER.info('Running: ' + ' '.join(args))

            # call command line
            subprocess.call(args)
            finished = pd.read_table(out_file.name).set_index('id')

        def to_padded(s):
Coding Style Naming:
The name s does not conform to the argument naming conventions ([a-z_][a-z0-9_]{2,30}$).

This check looks for invalid names for a range of different identifiers.

You can set regular expressions to which the identifiers must conform if the defaults do not match your requirements.

If your project includes a Pylint configuration file, the settings contained in that file take precedence.

To find out more about Pylint, please refer to their site.
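
Renaming the single-letter argument and adding docstrings would likely address both the naming and docstring warnings on these helpers. The following is a minimal sketch under stated assumptions: the helpers are lifted to module level, `parse_value`, `values`, `text` and the explicit `max_atoms` parameter are illustrative names not found in the original, and plain lists with `math.nan` stand in for the NumPy array.

```python
import math


def parse_value(text):
    """Convert one ';'-separated output token to a number (NaN if empty)."""
    if text == '':
        return math.nan
    if text == 'false':
        return 0
    if text == 'true':
        return 1
    return float(text)


def to_padded(values, max_atoms):
    """Parse a ';'-separated string and pad the result with NaN to max_atoms."""
    parsed = [parse_value(token) for token in values.split(';')]
    if len(parsed) > max_atoms:
        # Assumption: more values than max_atoms is treated as an error here.
        raise ValueError('more values than max_atoms')
    return parsed + [math.nan] * (max_atoms - len(parsed))
```

Every argument name here is at least three characters long, so it matches the `[a-z_][a-z0-9_]{2,30}$` pattern quoted in the warning.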

Coding Style:
This function should have a docstring.

            res = np.repeat(np.nan, self.max_atoms)

            def parse_string(s):
Coding Style Naming:
The name s does not conform to the argument naming conventions ([a-z_][a-z0-9_]{2,30}$).

Coding Style:
This function should have a docstring.

                if s == '':
                    return np.nan
                elif s == 'false':
                    return 0
                elif s == 'true':
                    return 1
                else:
                    return float(s)

            ans = np.array([parse_string(i) for i in s.split(';')])
            res[:len(ans)] = ans
            return res
        res = np.array([[to_padded(i) for k, i in val.items()] for idx, val in finished.T.items()])
Unused Code:
The variable idx seems to be unused.
Unused Code:
The variable k seems to be unused.
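
Both warnings would probably disappear if the nested loops iterated over values only, since the keys are never read. The sketch below uses a plain dict of dicts as a stand-in for the `finished` table; the data and the `pad` helper are made up for illustration. With pandas, iterating row by row over `finished.values` is one likely equivalent, though that is an assumption about the intended behaviour rather than a tested replacement.

```python
def pad(cell, width=3):
    """Hypothetical stand-in for to_padded: split on ';' and pad with None."""
    parts = cell.split(';')
    return parts + [None] * (width - len(parts))


# Made-up table: outer keys are molecule ids, inner keys are descriptor names.
table = {
    'mol_1': {'desc_a': '1;2', 'desc_b': '3;4;5'},
    'mol_2': {'desc_a': '6', 'desc_b': '7;8'},
}

# Iterating over .values() avoids binding keys that are never used.
result = [[pad(cell) for cell in row.values()] for row in table.values()]
```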
        res = pd.Panel(res, items=series.index, major_axis=pd.Index(finished.columns, name='cx_atom_desc'),
Coding Style:
This line is too long as per the coding-style (107/100).

                        minor_axis=pd.Index(range(self.max_atoms), name='atom_idx'))
Coding Style:
Wrong continued indentation.
minor_axis=pd.Index(range(self.max_atoms), name='atom_idx'))
|^
        return res.swapaxes(1, 2)  # to be consistent with AtomFeatureCalculator