Completed
Push — master ( 5bdf16...98b503 )
by Rich
30:19 queued 15:26
created

moran_coefficient()   B

Complexity: Conditions 2
Size: Total Lines 38
Duplication: Lines 0, Ratio 0 %
Importance: Changes 1, Bugs 0, Features 1

Metric  Value
cc      2
c       1
b       0
f       1
dl      0
loc     38
rs      8.8571

#! /usr/bin/env python
#
# Copyright (C) 2016 Rich Lewis <[email protected]>
# License: 3-clause BSD

"""
# skchem.features.descriptors.autocorrelation

Autocorrelation descriptors for scikit-chem.
"""

from functools import partial

from .caching import cache
from .fundamentals import adjacency_matrix, distance_matrix, atom_props

import numpy as np

Configuration: Unable to import 'caching' (invalid syntax (<string>, line 109))
Configuration: The import fundamentals could not be resolved.
Configuration: The import numpy could not be resolved.

This can be caused by one of the following:

1. Missing Dependencies

This error could indicate a configuration issue of Pylint. Make sure that your libraries are available by adding the necessary commands.

# .scrutinizer.yml
before_commands:
    - sudo pip install abc # Python2
    - sudo pip3 install abc # Python3

Tip: We are currently not using virtualenv to run pylint, when installing your modules make sure to use the command for the correct version.

2. Missing __init__.py files

This error could also result from missing __init__.py files in your module folders. Make sure that you place one file in each sub-folder.

@cache.inject(distance_matrix, atom_props)
def moreau_broto_autocorrelation(mol, dist_mat, prop, prop_name='atomic_mass',
                                 c_scaled=False, centred=False,
                                 ks=range(1, 9)):

Coding Style (Naming): The name ks does not conform to the argument naming conventions ([a-z_][a-z0-9_]{2,30}$).

This check looks for invalid names for a range of different identifiers.

You can set regular expressions to which the identifiers must conform if the defaults do not match your requirements.

If your project includes a Pylint configuration file, the settings contained in that file take precedence.

Best practice: Too many arguments (7/5)
Unused Code: The argument prop_name seems to be unused.
Unused Code: The argument mol seems to be unused.
Unused Code: The argument c_scaled seems to be unused.

    r""" The Moreau-Broto autocorrelation.

    $$ ATS_k = \frac{1}{2} \cdot \sum_{i=1}^A \sum_{j=1}^A w_i \cdot w_j \cdot \delta(d_{ij}, k) $$

    With special case $$ ATS_0 = \sum_{i=1}^A w_i^2 $$

    Where $A$ is the number of atoms, $w$ is an atomic property, $d_{ij}$ is
    the topological distance between atoms $i$ and $j$, and
    $\delta(d_{ij}, k)$ is 1 if $d_{ij} = k$ and 0 otherwise.

    Args:
        mol (skchem.Mol):
            The molecule for which to calculate the descriptor.

        prop_name (str):
            The name of the atomic property.

        c_scaled (bool):
            Whether the properties should be scaled against sp3 carbon.

        centred (bool):
            Whether the descriptor should be divided by the number of
            contributions (avoids dependence on molecular size).

        ks (iterable):
            The lags to calculate the descriptor over.

    Returns:
        np.array:
            The descriptor value for each lag in ks.

    Examples:
        >>> import skchem
        >>> m = skchem.Mol.from_smiles('CC(O)CCO')
        >>> moreau_broto_autocorrelation(m, centred=False,
        ...                              prop_name='atomic_mass',
        ...                              ks=range(4))  # doctest: +ELLIPSIS
        array([ 1088.99...,   817.12...,   865.02...,   528.59...])
    """

    div = np.array([(dist_mat == k).sum() for k in ks]) if centred else 1

    return np.array([(1 if k == 0 else 0.5) * prop.dot(dist_mat == k).dot(prop)
                     for k in ks]) / div

Comprehensibility (Bug): k is re-defining a name which is already available in the outer scope (previously defined on line 238 of the source file, in the module-level DESCRIPTORS comprehension).

It is generally a bad practice to shadow variables from the outer-scope. In most cases, this is done unintentionally and might lead to unexpected behavior:

param = 5

class Foo:
    def __init__(self, param):   # "param" would be flagged here
        self.param = param

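To make the double sum behind the Moreau-Broto autocorrelation concrete, here is a minimal, dependency-free sketch for a hypothetical three-atom chain. The distance matrix `DIST`, the weights `W` and the helper `ats` are illustration-only assumptions and are not part of scikit-chem; the arithmetic mirrors the expression `prop.dot(dist_mat == k).dot(prop)` in the function above.

```python
def ats(dist, w, k):
    """Moreau-Broto autocorrelation at lag k (hypothetical helper).

    `dist` is a square nested list of topological distances and `w` a list
    of atomic property values.
    """
    n = len(w)
    # Sum w_i * w_j over all ordered pairs (i, j) at distance exactly k.
    total = sum(w[i] * w[j]
                for i in range(n) for j in range(n)
                if dist[i][j] == k)
    # Lag 0 only matches the diagonal, so nothing is counted twice there;
    # for k > 0 every unordered pair appears twice, hence the factor 1/2.
    return total if k == 0 else 0.5 * total


# Hypothetical chain A-B-C: made-up distances and property values.
DIST = [[0, 1, 2],
        [1, 0, 1],
        [2, 1, 0]]
W = [1.0, 2.0, 3.0]

ats_values = [ats(DIST, W, k) for k in range(3)]
print(ats_values)
```

For this toy input the lag-0 term is the sum of squared weights (1 + 4 + 9), the lag-1 term covers the two bonds (1·2 + 2·3) and the lag-2 term the single end-to-end pair (1·3).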
@cache.inject(distance_matrix, atom_props)
def moran_coefficient(mol, dist_mat, prop, prop_name='atomic_mass',
                      c_scaled=False, ks=range(1, 9)):

Coding Style (Naming): The name ks does not conform to the argument naming conventions ([a-z_][a-z0-9_]{2,30}$).
Best practice: Too many arguments (6/5)
Unused Code: The argument prop_name seems to be unused.
Unused Code: The argument mol seems to be unused.
Unused Code: The argument c_scaled seems to be unused.

    """ Moran coefficient for lags ks.

    Args:
        mol (skchem.Mol):
            The molecule for which to calculate the descriptor.

        prop_name (str):
            The name of the atomic property.

        c_scaled (bool):
            Whether the properties should be scaled against sp3 carbon.

        ks (iterable):
            The lags to calculate the descriptor over.

    Returns:
        np.array:
            The coefficient for each lag in ks.
    """

    prop = prop - prop.mean()

    res = []

    for k in ks:
        geodesic = dist_mat == k
        num = prop.dot(geodesic).dot(prop) / geodesic.sum()
        denom = (prop ** 2).sum() / len(prop)
        res.append(num / denom)

    return np.array(res)

Comprehensibility (Bug): k is re-defining a name which is already available in the outer scope (previously defined on line 238 of the source file, in the module-level DESCRIPTORS comprehension).

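The loop above divides the mean cross-product of the centred property over all atom pairs at lag k by the property's variance. A small pure-Python sketch of the same arithmetic follows; the helper `moran` and the three-atom data are hypothetical and only serve to show the steps.

```python
def moran(dist, w, k):
    """Moran coefficient at lag k (hypothetical pure-Python helper)."""
    n = len(w)
    mean = sum(w) / n
    centred = [x - mean for x in w]
    # All ordered atom pairs (i, j) whose topological distance equals k.
    pairs = [(i, j) for i in range(n) for j in range(n) if dist[i][j] == k]
    # Mean cross-product over those pairs.
    # Raises ZeroDivisionError if no pair lies at this lag.
    num = sum(centred[i] * centred[j] for i, j in pairs) / len(pairs)
    # Variance of the property over all atoms (divisor n, as in the source).
    denom = sum(x * x for x in centred) / n
    return num / denom


# Hypothetical chain A-B-C: made-up distances and property values.
DIST = [[0, 1, 2],
        [1, 0, 1],
        [2, 1, 0]]
W = [1.0, 2.0, 3.0]

moran_1 = moran(DIST, W, 1)
moran_2 = moran(DIST, W, 2)
print(moran_1, moran_2)
```

In this toy chain the central atom sits exactly at the mean, so bonded neighbours contribute nothing at lag 1, while the two ends deviate in opposite directions and give a negative value at lag 2.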
@cache.inject(distance_matrix, atom_props)
def geary_coefficient(mol, dist_mat, prop, prop_name='atomic_mass',
                      c_scaled=False, ks=range(1, 9)):

Coding Style (Naming): The name ks does not conform to the argument naming conventions ([a-z_][a-z0-9_]{2,30}$).
Best practice: Too many arguments (6/5)
Unused Code: The argument prop_name seems to be unused.
Unused Code: The argument mol seems to be unused.
Unused Code: The argument c_scaled seems to be unused.

    """ The Geary coefficient for *ks* lags.

    Args:
        mol (skchem.Mol):
            The molecule for which to calculate the descriptor.

        prop_name (str):
            The name of the atomic property.

        c_scaled (bool):
            Whether the properties should be scaled against sp3 carbon.

        ks (iterable):
            The lags to calculate the descriptor over.

    Returns:
        np.array:
            The coefficient for each lag in ks.

    """

    res = []
    for k in ks:
        geodesic = dist_mat == k
        num = (0.5 * ((prop - prop[:, np.newaxis]) ** 2 * geodesic).sum()
               / geodesic.sum())
        denom = ((prop - prop.mean()) ** 2).sum() / (len(prop) - 1)
        res.append(num / denom)

    return np.array(res)

Comprehensibility (Bug): k is re-defining a name which is already available in the outer scope (previously defined on line 238 of the source file, in the module-level DESCRIPTORS comprehension).

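The expression `prop - prop[:, np.newaxis]` above builds the matrix of all pairwise property differences by broadcasting. The sketch below spells out the same calculation with explicit loops; the helper `geary` and the three-atom data are hypothetical illustration values.

```python
def geary(dist, w, k):
    """Geary coefficient at lag k (hypothetical pure-Python helper)."""
    n = len(w)
    mean = sum(w) / n
    # All ordered atom pairs (i, j) whose topological distance equals k.
    pairs = [(i, j) for i in range(n) for j in range(n) if dist[i][j] == k]
    # Half the mean squared difference over those pairs.
    # Raises ZeroDivisionError if no pair lies at this lag.
    num = 0.5 * sum((w[i] - w[j]) ** 2 for i, j in pairs) / len(pairs)
    # Sample variance of the property (divisor n - 1, as in the source).
    denom = sum((x - mean) ** 2 for x in w) / (n - 1)
    return num / denom


# Hypothetical chain A-B-C: made-up distances and property values.
DIST = [[0, 1, 2],
        [1, 0, 1],
        [2, 1, 0]]
W = [1.0, 2.0, 3.0]

geary_1 = geary(DIST, W, 1)
geary_2 = geary(DIST, W, 2)
print(geary_1, geary_2)
```

Unlike the Moran coefficient, this quantity is built from squared differences, so it cannot be negative; values below 1 indicate that atoms at that lag carry similar property values.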
@cache
@cache.inject(adjacency_matrix, distance_matrix)
def galvez_matrix(mol, dist_mat, adj_mat):

    """ The Galvez matrix.

    Args:
        mol (skchem.Mol):
            The molecule for which to calculate the matrix.

    Returns:
        np.array
    """

    temp = dist_mat ** -2
    np.fill_diagonal(temp, 0)
    galvez_mat = temp.dot(adj_mat) + np.diag(mol.atoms.valence_vertex_degree)
    return galvez_mat


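As a rough illustration of the three steps in `galvez_matrix` (inverse squared distances with a zeroed diagonal, a matrix product with the adjacency matrix, and the valence vertex degrees added on the diagonal), here is a pure-Python sketch. The helper `galvez` and all input values are hypothetical. Note also that NumPy refuses to raise integer arrays to negative integer powers, so the original presumably receives a floating-point distance matrix; that is an assumption, not something the source states.

```python
def galvez(dist, adj, valence_degree):
    """Galvez-style matrix from nested lists (hypothetical helper)."""
    n = len(dist)
    # Inverse squared distances, with the diagonal set to zero.
    inv_sq = [[0.0 if i == j else dist[i][j] ** -2 for j in range(n)]
              for i in range(n)]
    # Matrix product inv_sq . adj, written out explicitly.
    prod = [[sum(inv_sq[i][m] * adj[m][j] for m in range(n))
             for j in range(n)]
            for i in range(n)]
    # Add the valence vertex degrees on the diagonal.
    for i in range(n):
        prod[i][i] += valence_degree[i]
    return prod


# Hypothetical chain A-B-C; the degrees are made-up illustration values.
DIST = [[0, 1, 2],
        [1, 0, 1],
        [2, 1, 0]]
ADJ = [[0, 1, 0],
       [1, 0, 1],
       [0, 1, 0]]
DEGREES = [1, 2, 1]

galvez_mat = galvez(DIST, ADJ, DEGREES)
for row in galvez_mat:
    print(row)
```

The result is generally not symmetric (here the entry in row 0, column 1 differs from the one in row 1, column 0), which is what makes the difference with its transpose informative.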
@cache
@cache.inject(galvez_matrix)
def charge_matrix(mol, galvez_mat):

    """ The charge matrix.

    Args:
        mol (skchem.Mol):
            The molecule for which to calculate the matrix.

    Returns:
        np.array

    """

    ct_mat = galvez_mat - galvez_mat.T
    ct_mat[np.diag_indices_from(ct_mat)] = mol.atoms.depleted_degree

    return ct_mat


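The charge matrix is the difference between the Galvez matrix and its transpose, with the diagonal overwritten afterwards. A minimal sketch, assuming a small hand-written input matrix and made-up diagonal values (`charge_terms`, `GALVEZ` and `DEPLETED` are hypothetical names):

```python
def charge_terms(galvez_mat, diagonal):
    """Difference with the transpose, then overwrite the diagonal
    (hypothetical helper working on nested lists)."""
    n = len(galvez_mat)
    ct = [[galvez_mat[i][j] - galvez_mat[j][i] for j in range(n)]
          for i in range(n)]
    for i in range(n):
        ct[i][i] = diagonal[i]
    return ct


# Made-up, non-symmetric input matrix and diagonal values.
GALVEZ = [[2.0, 0.25, 1.0],
          [0.0, 4.0, 0.0],
          [1.0, 0.25, 2.0]]
DEPLETED = [1, 2, 1]

ct = charge_terms(GALVEZ, DEPLETED)
for row in ct:
    print(row)
```

Off the diagonal the result is antisymmetric: the entry for (i, j) is the negative of the entry for (j, i), which is why the charge index that follows works with absolute values.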
@cache
@cache.inject(charge_matrix, distance_matrix)
def topological_charge_index(mol, c_mat, dist_mat, ks=range(11)):

    """ The Galvez topological charge index for lags ks.

    Args:
        mol (skchem.Mol):
            The molecule for which to calculate the descriptor.

        ks (iterable):
            The lags for which to calculate the descriptor.

    Returns:
        np.array

    """
    return np.array([0.5 * np.abs(c_mat)[dist_mat == k].sum() for k in ks])

Coding Style (Naming): The name ks does not conform to the argument naming conventions ([a-z_][a-z0-9_]{2,30}$).
Unused Code: The argument mol seems to be unused.


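For each lag, the topological charge index sums the absolute charge terms of all atom pairs at that distance and halves the result. The sketch below shows this with explicit loops; the helper `topological_charge` and the input matrices are hypothetical illustration values.

```python
def topological_charge(ct, dist, ks):
    """Half the sum of |ct[i][j]| over pairs at distance k, for each k in ks
    (hypothetical pure-Python helper)."""
    n = len(ct)
    return [0.5 * sum(abs(ct[i][j])
                      for i in range(n) for j in range(n)
                      if dist[i][j] == k)
            for k in ks]


# Made-up charge terms and topological distances for a three-atom chain.
CT = [[1, 0.25, 0.0],
      [-0.25, 2, -0.25],
      [0.0, 0.25, 1]]
DIST = [[0, 1, 2],
        [1, 0, 1],
        [2, 1, 0]]

tci = topological_charge(CT, DIST, range(3))
print(tci)
```

The lag used in the distance mask has to be the variable the comprehension iterates over; otherwise every entry of the result would be computed for the same lag.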
@cache.inject(topological_charge_index)
def mean_topological_charge_index(mol, tci, ks=range(11)):

    """ Mean topological charge index for lags ks.

    Args:
        mol (skchem.Mol):
            The molecule for which to calculate the descriptor.

        ks (iterable):
            The lags for which to calculate the descriptor.

    Returns:
        np.array
    """

    return tci / (len(mol.atoms) - 1)

Coding Style (Naming): The name ks does not conform to the argument naming conventions ([a-z_][a-z0-9_]{2,30}$).
Unused Code: The argument ks seems to be unused.


@cache.inject(topological_charge_index)
def total_charge_index(mol, tci, ks=range(11)):

    """ Total charge index: the sum of the topological charge indices over
    all lags, divided by the number of atoms minus one.

    Args:
        mol (skchem.Mol):
            The molecule for which to calculate the descriptor.

        ks (iterable):
            The lags for which to calculate the descriptor.

    Returns:
        float
    """

    return tci.sum() / (len(mol.atoms) - 1)

Coding Style (Naming): The name ks does not conform to the argument naming conventions ([a-z_][a-z0-9_]{2,30}$).
Unused Code: The argument ks seems to be unused.


PROPS = ['atomic_mass', 'van_der_waals_volume', 'sanderson_electronegativity',
         'polarisability', 'ionisation_energy', 'intrinsic_state']

KS = range(1, 9)

# Keyword names follow the function signatures above (ks, prop_name, centred);
# each partial computes the descriptor for a single lag.
DESCRIPTORS = [partial(moreau_broto_autocorrelation, ks=(k,), prop_name=p,
                       centred=c)
               for k in KS for c in (False, True) for p in PROPS]

FS = (moran_coefficient, geary_coefficient)

DESCRIPTORS += [partial(f, ks=(k,), prop_name=p)
                for f in FS for k in KS for p in PROPS]

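`functools.partial` stores keyword arguments without checking them against the target function's signature, so a misspelt keyword only surfaces as a `TypeError` when the partial is eventually called. The sketch below demonstrates this with a hypothetical stand-in function `descriptor`; it is not one of the scikit-chem descriptors.

```python
from functools import partial


def descriptor(values, ks=(1,), centred=False):
    """Hypothetical stand-in: one scaled sum per lag in ks."""
    result = [sum(values) * k for k in ks]
    if centred:
        result = [r / len(values) for r in result]
    return result


# Keyword names match the signature: the partial can be called later.
ok = partial(descriptor, ks=(2,), centred=True)
ok_result = ok([1.0, 2.0, 3.0])

# A misspelt keyword is accepted when the partial is created...
bad = partial(descriptor, centered=True)
try:
    bad([1.0, 2.0, 3.0])
    bad_failed = False
except TypeError:
    # ...and only rejected when the underlying function is called.
    bad_failed = True

print(ok_result, bad_failed)
```

Calling each partial once in a test is therefore a cheap way to catch keyword mismatches in a list of pre-bound descriptors.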
__all__ = ['moreau_broto_autocorrelation', 'moran_coefficient',
           'geary_coefficient', 'topological_charge_index',
           'mean_topological_charge_index', 'total_charge_index']