Completed
Push — master ( 01edc4...2b1c29 )
by Rich
01:29
created

ChemAxonNMRPredictor   A

Complexity

Total Complexity 28

Size/Duplication

Total Lines 109
Duplicated Lines 0 %

Importance

Changes 1
Bugs 0 Features 1
Metric Value
wmc 28
c 1
b 0
f 1
dl 0
loc 109
rs 10

9 Methods

Rating   Name                  Duplication   Size   Complexity
B        transform()           0             11     5
A        max_atoms()           0             3      2
B        _transform_series()   0             23     4
A        feature_names()       0             3      1
A        index()               0             3      1
A        __init__()            0             4      1
A        _transform_mol()      0             5      1
F        parse_output()        0             27     9
A        element()             0             3      2
#! /usr/bin/env python
#
# Copyright (C) 2015-2016 Rich Lewis <[email protected]>
# License: 3-clause BSD

"""
## skchem.descriptors.nmr

Module for NMR prediction.
"""
import tempfile
import subprocess
import re
import logging
import time

import numpy as np
Issue (Configuration): The import numpy could not be resolved.

This can be caused by one of the following:

1. Missing Dependencies

This error could indicate a configuration issue of Pylint. Make sure that your libraries are available by adding the necessary commands.

# .scrutinizer.yml
before_commands:
    - sudo pip install abc # Python2
    - sudo pip3 install abc # Python3
Tip: We are currently not using virtualenv to run pylint. When installing your modules, make sure to use the command for the correct version.

2. Missing __init__.py files

This error could also result from missing __init__.py files in your module folders. Make sure that you place one file in each sub-folder.
import pandas as pd
Issue (Configuration): The import pandas could not be resolved.

from ..io import write_sdf
from ..utils import NamedProgressBar
from .. import core

LOGGER = logging.getLogger(__name__)

def mol_count(filename):
Issue (Coding Style): This function should have a docstring.

The coding style of this project requires that you add a docstring to this code element. Below is an example for methods:

class SomeClass:
    def some_method(self):
        """Do x and return foo."""

If you would like to know more about docstrings, we recommend reading PEP-257: Docstring Conventions.
    return sum(1 for l in open(filename, 'rb') if l == b'##PEAKASSIGNMENTS=(XYMA)\r\n')
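The one-line body above opens the file without ever closing it explicitly. A minimal sketch of the same count with a context manager and the docstring the inspection asks for is given here. The docstring wording and the names `marker`, `handle` and `line` are assumptions made for illustration, not the author's code.

```python
def mol_count(filename):
    """Count the molecules written so far to an NMR output file.

    Each molecule contributes one peak-assignment header line, so counting
    those header lines counts the molecules.
    """
    marker = b'##PEAKASSIGNMENTS=(XYMA)\r\n'
    # The context manager closes the handle even if reading fails.
    with open(filename, 'rb') as handle:
        return sum(1 for line in handle if line == marker)
```

Because the function is polled once per second while the external process runs, closing the handle on every call avoids relying on the garbage collector to release it.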


class ChemAxonNMRPredictor(object):
Issue (Coding Style): This class should have a docstring.

    def __init__(self, element='C', max_atoms='auto'):
        self.element = element
        self._detect_max_atoms = False
        self.max_atoms = max_atoms

    @property
    def element(self):
Issue (Coding Style): This method should have a docstring.
        return self._element

    @element.setter
    def element(self, val):
Issue (Coding Style): This method should have a docstring.
        val = val.upper()
        if val in ('C', 'H'):
            self._element = val
Issue (Coding Style): The attribute _element was defined outside __init__.

It is generally good practice to initialize all attributes to default values in the __init__ method:

class Foo:
    def __init__(self, x=None):
        self.x = x
        else:
            raise ValueError('ChemAxon can only predict 1H or 13C shifts.')

    @property
    def max_atoms(self):
Issue (Coding Style): This method should have a docstring.
        return self._max_atoms

    @max_atoms.setter
    def max_atoms(self, arg):
Issue (Coding Style): This method should have a docstring.
        if arg == 'auto':
            self._detect_max_atoms = True
            self._max_atoms = -1
Issue (Coding Style): The attribute _max_atoms was defined outside __init__.
        else:
            self._max_atoms = arg
Issue (Coding Style): The attribute _max_atoms was defined outside __init__.
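The "defined outside __init__" findings arise because `_element` and `_max_atoms` are first assigned inside property setters. One way to address them, sketched here under assumptions, is to declare each private attribute in `__init__` before the setters run. The class name `ShiftPredictorSketch` and the docstrings are hypothetical; the validation logic mirrors the listing.

```python
class ShiftPredictorSketch(object):
    """Hypothetical stand-in for the predictor, reduced to its two settings."""

    def __init__(self, element='C', max_atoms='auto'):
        # Declare every private attribute first, so each one is visibly
        # created in __init__ before a property setter assigns it.
        self._element = None
        self._max_atoms = -1
        self._detect_max_atoms = False
        self.element = element
        self.max_atoms = max_atoms

    @property
    def element(self):
        """Nucleus to predict shifts for: 'C' or 'H'."""
        return self._element

    @element.setter
    def element(self, val):
        val = val.upper()
        if val not in ('C', 'H'):
            raise ValueError('ChemAxon can only predict 1H or 13C shifts.')
        self._element = val

    @property
    def max_atoms(self):
        """Number of atom columns, or -1 until it is detected from the data."""
        return self._max_atoms

    @max_atoms.setter
    def max_atoms(self, arg):
        if arg == 'auto':
            self._detect_max_atoms = True
            self._max_atoms = -1
        else:
            self._max_atoms = arg
```

The setters still do the validation, so behaviour is unchanged; the extra assignments only make the attribute set explicit in one place.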

    @property
    def index(self):
Issue (Coding Style): This method should have a docstring.
        return pd.RangeIndex(self.max_atoms, name='atom_idx')

    @property
    def feature_names(self):
Issue (Coding Style): This method should have a docstring.
        return self.index

    def transform(self, obj):
Issue (Coding Style): This method should have a docstring.
        if isinstance(obj, core.Mol):
            return self._transform_mol(obj)
        elif isinstance(obj, pd.Series):
            return self._transform_series(obj)
        elif isinstance(obj, pd.DataFrame):
            return self._transform_series(obj.structure)
        elif isinstance(obj, (tuple, list)):
            return self._transform_series(obj)
        else:
            raise NotImplementedError

    def _transform_mol(self, mol):
        # make into series then use self._transform_series
        ser = pd.Series([mol], name=mol.name)
        res = self._transform_series(ser)
        return res.iloc[0]

    def _transform_series(self, ser):

        if self._detect_max_atoms:
            self.max_atoms = ser.ms.atoms.apply(len).max()

        with tempfile.NamedTemporaryFile(suffix='.sdf') as infile, tempfile.NamedTemporaryFile() as outfile:
Issue (Coding Style): This line is too long as per the coding-style (108/100).

This check looks for lines that are too long. You can specify the maximum line length.

            write_sdf(ser, infile.name)
            args = ['cxcalc', infile.name, '-o', outfile.name] + [self.element.lower() + 'nmr']

            LOGGER.info('Running: ' + ' '.join(args))

            p = subprocess.Popen(args, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
Issue (Coding Style, Naming): The name p does not conform to the variable naming conventions ([a-z_][a-z0-9_]{2,30}$).

This check looks for invalid names for a range of different identifiers.

You can set regular expressions to which the identifiers must conform if the defaults do not match your requirements.

If your project includes a Pylint configuration file, the settings contained in that file take precedence.

To find out more about Pylint, please refer to their site.

Issue (Bug): The Module subprocess does not seem to have a member named DEVNULL.

This check looks for calls to members that are non-existent. These calls will fail.

The member could have been renamed or removed.
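`subprocess.DEVNULL` was added in Python 3.3, so this finding may simply mean the check ran against an older interpreter; that is an assumption, since the report does not say which version was used. A minimal sketch of a call that works either way follows. The helper name `run_quietly` is hypothetical.

```python
import os
import subprocess


def run_quietly(args):
    """Run a command with stdout and stderr discarded; return its exit code."""
    devnull = getattr(subprocess, 'DEVNULL', None)
    if devnull is not None:
        # Python 3.3+: the constant can be passed to Popen directly.
        return subprocess.Popen(args, stdout=devnull, stderr=devnull).wait()
    # Older interpreters: open the null device by hand.
    with open(os.devnull, 'wb') as sink:
        return subprocess.Popen(args, stdout=sink, stderr=sink).wait()
```

The listing polls the process in a loop to drive a progress bar, so it would keep the `Popen` object rather than wait immediately; the sketch only shows how the output streams can be silenced portably.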
            bar = NamedProgressBar(name=self.__class__.__name__, max_value=len(ser))
Issue: Black listed name "bar"

            while p.poll() is None:
                bar.update(mol_count(outfile.name))
Issue (Bug): The Instance of NamedProgressBar does not seem to have a member named update.
                time.sleep(1)
            bar.update(len(ser))
Issue (Bug): The Instance of NamedProgressBar does not seem to have a member named update.
            p.wait()

            res = self.parse_output(outfile.name, n_mols=len(ser))
            return pd.DataFrame(res, index=ser.index, columns=self.index)


    def parse_output(self, outfile, n_mols=None):

        """ Read the NMR output file. """

        if n_mols is None:
            n_mols = mol_count(outfile)

        res = np.repeat(np.nan, n_mols * self.max_atoms).reshape(n_mols, self.max_atoms)
        regex = re.compile(b'\((-?\d+.\d+),\d+,[A-Z],<([0-9\,]+)>\)\r\n')
Issue (Bug): A suspicious escape sequence \( was found. Did you maybe forget to add an r prefix?
Issue (Bug): A suspicious escape sequence \d was found. Did you maybe forget to add an r prefix?
Issue (Bug): A suspicious escape sequence \, was found. Did you maybe forget to add an r prefix?
Issue (Bug): A suspicious escape sequence \) was found. Did you maybe forget to add an r prefix?

Escape sequences in Python are generally interpreted according to rules similar to standard C. Only if strings are prefixed with r or R are backslashes left uninterpreted, so that they reach the regular expression engine unchanged.

The escape sequence that was used indicates that you might have intended to write a regular expression.

Learn more about the available escape sequences in the Python documentation.
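A minimal sketch of the same pattern written as a raw bytes literal is given here, which is what the four findings above point towards. It also escapes the dot in the shift value so that it matches only a decimal point, a change that goes beyond what the inspection reports. The names `PEAK_ROW` and `parse_peak_row` are hypothetical, and the sample rows in the usage note are made up to fit the pattern.

```python
import re

# Raw bytes literal: backslashes reach the regex engine untouched.
# The dot in the shift is escaped so it only matches a decimal point.
PEAK_ROW = re.compile(br'\((-?\d+\.\d+),\d+,[A-Z],<([0-9,]+)>\)\r\n')


def parse_peak_row(row):
    """Return (shift, atom indices) for one peak-assignment row."""
    match = PEAK_ROW.match(row)
    if match is None:
        raise ValueError('Unrecognised peak row: %r' % row)
    shift, idxs = match.groups()
    return float(shift), [int(idx) for idx in idxs.split(b',')]
```

For example, `parse_peak_row(b'(128.50,1,A,<0,3>)\r\n')` yields the shift `128.5` and the atom indices `[0, 3]`. Raising `ValueError` on a non-matching row is a design choice of the sketch; the listing calls `.groups()` on the match directly.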

        mol_idx = 0

        with open(outfile, 'rb') as f:
Issue (Coding Style, Naming): The name f does not conform to the variable naming conventions ([a-z_][a-z0-9_]{2,30}$).
            # loop through the file - inner loop will also advance the pointer
            for l in f:
Issue (Coding Style, Naming): The name l does not conform to the variable naming conventions ([a-z_][a-z0-9_]{2,30}$).
                if l == b'##PEAKASSIGNMENTS=(XYMA)\r\n':
                    for row in f:
                        if row == b'##END=\r\n':
                            break
                        else:
                            LOGGER.debug('Row to parse: %s', row)
                            shift, idxs = regex.match(row).groups()
                            shift, idxs = float(shift), [int(idx) for idx in idxs.split(b',')]
                            for atom_idx in idxs:
                                res[mol_idx, atom_idx] = shift
                    mol_idx += 1
        return res
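The parsing loop relies on the outer and inner `for` statements sharing one file iterator, which is why the outer loop resumes after each `##END=` line. A standard-library sketch of that technique on an in-memory stream follows. It swaps the NumPy array for nested lists so that it runs without third-party packages. `parse_shifts` and the sample data, including the `##TITLE=` line, are assumptions made up for illustration.

```python
import io
import re

HEADER = b'##PEAKASSIGNMENTS=(XYMA)\r\n'
FOOTER = b'##END=\r\n'
ROW = re.compile(br'\((-?\d+\.\d+),\d+,[A-Z],<([0-9,]+)>\)\r\n')


def parse_shifts(stream, n_mols, max_atoms):
    """Return an n_mols x max_atoms nested list of shifts, NaN where unassigned."""
    res = [[float('nan')] * max_atoms for _ in range(n_mols)]
    mol_idx = 0
    for line in stream:
        if line != HEADER:
            continue
        # The inner loop pulls from the same iterator, so the outer loop
        # resumes on the line after the footer.
        for row in stream:
            if row == FOOTER:
                break
            shift, idxs = ROW.match(row).groups()
            for atom_idx in idxs.split(b','):
                res[mol_idx][int(atom_idx)] = float(shift)
        mol_idx += 1
    return res


# Made-up sample: two molecules, three atom columns.
sample = io.BytesIO(
    b'##TITLE=mol0\r\n'
    b'##PEAKASSIGNMENTS=(XYMA)\r\n'
    b'(128.50,1,A,<0,2>)\r\n'
    b'##END=\r\n'
    b'##PEAKASSIGNMENTS=(XYMA)\r\n'
    b'(21.00,1,A,<1>)\r\n'
    b'##END=\r\n'
)
shifts = parse_shifts(sample, n_mols=2, max_atoms=3)
```

In this sample the first molecule receives a shift for atoms 0 and 2, and the second for atom 1; every other cell stays NaN, matching how the listing pads molecules that have fewer atoms than `max_atoms`.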