ChEMBLConverter.__init__() - Code Metrics - Inspection of "removed some default features from chembl" - richlewis42/scikit-chem - Scrutinizer

Completed

Push — master ( 475af1...18e358 )

by Rich

created 2016-08-15 13:37 UTC

ChEMBLConverter.__init__() B

↳ Parent: ChEMBLConverter

Complexity

Conditions

Size

Total Lines

Duplication

Lines	0
Ratio	0 %

Code Coverage

Tests	1
CRAP Score	1.7703

Importance

Changes	2
Bugs	0	Features	1

Metric	Value
c	2
b	0
f	1
dl	0
loc	33
ccs	1
cts	12
cp	0.0833
rs	8.8571
cc	1
crap	1.7703

#! /usr/bin/env python
#
# Copyright (C) 2016 Rich Lewis <[email protected]>
# License: 3-clause BSD

"""
# skchem.data.converters.chembl

Dataset constructor for ChEMBL
"""

import pandas as pd
import os

from .base import Converter, default_pipeline, contiguous_order, Feature
from ...cross_validation import SimThresholdSplit
from ... import descriptors


class ChEMBLConverter(Converter):

    """ Converter for the ChEMBL dataset. """

    def __init__(self, directory, output_directory, output_filename='chembl.h5'):

        output_path = os.path.join(output_directory, output_filename)

        infile = os.path.join(directory, 'chembl_raw.h5')
        ms, y = self.parse_infile(infile)

        pipeline = default_pipeline()

        ms, y = pipeline.transform_filter(ms, y)

        cv = SimThresholdSplit(min_threshold=0.6, n_jobs=-1).fit(ms)
        train, valid, test = cv.split((70, 15, 15))
        (ms, y, train, valid, test) = contiguous_order((ms, y, train, valid, test), (train, valid, test))
        splits = (('train', train), ('valid', valid), ('test', test))

        features = (
        Feature(fper=descriptors.MorganFeaturizer(),
                key='X_morg',
                axis_names=['batch', 'features']),
        Feature(fper=descriptors.PhysicochemicalFeaturizer(),
                key='X_pc',
                axis_names=['batch', 'features']),
        Feature(fper=descriptors.AtomFeaturizer(max_atoms=100),
                key='A',
                axis_names=['batch', 'atom_idx', 'features']),
        Feature(fper=descriptors.GraphDistanceTransformer(max_atoms=100),
                key='G',
                axis_names=['batch', 'atom_idx', 'atom_idx']),
        Feature(fper=descriptors.SpacialDistanceTransformer(max_atoms=100),
                key='G_d'))

        self.run(ms, y, output_path, splits=splits)


    def parse_infile(self, filename):

        ms = pd.read_hdf(filename, 'structure')

        y = pd.read_hdf(filename, 'targets/Y')

        return ms, y
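
The `parse_infile` method above has no docstring and never touches `self`, which is what the inspection flags for it. A minimal sketch of one way to address both points is shown here. The class name `ChEMBLConverterSketch` and the `reader` parameter are hypothetical and not part of skchem; `reader` exists only so the helper can be tried without an HDF5 file on disk.

```python
class ChEMBLConverterSketch:
    """Hypothetical stand-in for ChEMBLConverter; only the parsing helper is shown."""

    @staticmethod
    def parse_infile(filename, reader=None):
        """Read the structures and the targets from the raw ChEMBL HDF5 file.

        `reader` is an assumption added for this sketch. When it is omitted,
        pandas.read_hdf is used, as in the inspected code.
        """
        if reader is None:
            # Imported lazily so the sketch can run without pandas installed.
            import pandas as pd
            reader = pd.read_hdf

        ms = reader(filename, 'structure')
        y = reader(filename, 'targets/Y')
        return ms, y
```

Because the helper is a static method, existing calls of the form `self.parse_infile(infile)` would keep working unchanged.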



Issues

Line numbers refer to the inspected source file.

Line 12: import pandas as pd
Configuration, introduced 2016-08-15 13:30 UTC
The import `pandas` could not be resolved. This can be caused by one of the following:

1. Missing Dependencies

This error could indicate a configuration issue of Pylint. Make sure that your libraries are available by adding the necessary commands.

    # .scrutinizer.yml
    before_commands:
        - sudo pip install abc # Python2
        - sudo pip3 install abc # Python3

Tip: We are currently not using virtualenv to run pylint; when installing your modules, make sure to use the command for the correct version.

2. Missing __init__.py files

This error could also result from missing `__init__.py` files in your module folders. Make sure that you place one file in each sub-folder.

Line 24: def __init__(self, directory, output_directory, output_filename='chembl.h5'):
Bug, introduced 2016-08-15 13:30 UTC
The `__init__` method of the super-class `Converter` is not called. It is generally advisable to initialize the super-class by calling its `__init__` method:

    class SomeParent:
        def __init__(self):
            self.x = 1

    class SomeChild(SomeParent):
        def __init__(self):
            # Initialize the super class
            SomeParent.__init__(self)

Lines 29, 33, 35, 37, 61, 62: names `ms`, `y` and `cv`
Coding Style / Naming, introduced 2016-08-15 13:30 UTC
The names `ms` (lines 29, 33, 37, 61), `y` (lines 29, 33, 37, 62) and `cv` (line 35) do not conform to the variable naming conventions (`[a-z_][a-z0-9_]{2,30}$`). This check looks for invalid names for a range of different identifiers. You can set regular expressions to which the identifiers must conform if the defaults do not match your requirements. If your project includes a Pylint configuration file, the settings contained in that file take precedence. To find out more about Pylint, please refer to their site.

Line 37: (ms, y, train, valid, test) = contiguous_order((ms, y, train, valid, test), (train, valid, test))
Coding Style, introduced 2016-08-15 13:30 UTC
This line is too long as per the coding-style (105/100). This check looks for lines that are too long. You can specify the maximum line length.

Line 40: features = (
Unused Code, introduced 2016-08-15 13:41 UTC
The variable `features` seems to be unused.

Lines 41, 44, 47, 50, 53: the five `Feature(fper=descriptors. ...` entries
Coding Style, introduced 2016-08-15 13:41 UTC
Wrong hanging indentation. Reported once for each entry: `MorganFeaturizer()`, `PhysicochemicalFeaturizer()`, `AtomFeaturizer(max_atoms=100)`, `GraphDistanceTransformer(max_atoms=100)` and `SpacialDistanceTransformer(max_atoms=100)`.

Line 59: def parse_infile(self, filename):
Coding Style, introduced 2016-08-15 13:30 UTC
This method should have a docstring. The coding style of this project requires that you add a docstring to this code element. Below, you find an example for methods:

    class SomeClass:
        def some_method(self):
            """Do x and return foo."""

If you would like to know more about docstrings, we recommend to read PEP-257: Docstring Conventions.

Coding Style, introduced 2016-08-15 13:30 UTC
This method could be written as a function/class method. If a method does not access any attributes of the class, it could also be implemented as a function or static method. This can help improve readability. For example

    class Foo:
        def some_method(self, x, y):
            return x + y;

could be written as

    class Foo:
        @classmethod
        def some_method(cls, x, y):
            return x + y;

Line 63: return ms, y
Coding Style, introduced 2016-08-15 13:30 UTC
Final newline missing
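
Of the findings in this inspection, the only one tagged Bug is the missing call to the `Converter` initializer. A minimal sketch of the pattern follows, using stand-in classes. `ConverterStandIn`, its `initialized` attribute and the stored `output_path` attribute are assumptions made for illustration; the real `skchem` base class is not shown on this page and may take arguments or do more work in `__init__`, in which case the call would need to be adapted. The zero-argument `super()` form also assumes Python 3.

```python
import os


class ConverterStandIn:
    """Hypothetical base class; the real skchem Converter may differ."""

    def __init__(self):
        # Any state the base class sets up is only created if this runs.
        self.initialized = True


class ChEMBLConverterSketch(ConverterStandIn):
    """Sketch of a converter that initializes its super-class first."""

    def __init__(self, directory, output_directory, output_filename='chembl.h5'):
        # Initialize the super-class before doing subclass-specific work.
        super().__init__()

        # Same path construction as in the inspected code; keeping the
        # results as attributes is an assumption of this sketch.
        self.output_path = os.path.join(output_directory, output_filename)
        self.infile = os.path.join(directory, 'chembl_raw.h5')
```

For example, `ChEMBLConverterSketch('raw', 'out')` gives an object whose `initialized` attribute was set by the base class, which would be missing if the `super().__init__()` line were dropped.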
