from functools import partial

from six.moves import UserDict

from topik.singleton_registry import _base_register_decorator


# This subclass serves to establish a new singleton instance of functions
# for this particular step in topic modeling.  No implementation necessary.
class TokenizerRegistry(UserDict, object):
    """Uses the Borg design pattern.  Core idea is that there is a global
    registry for each step's possible methods.
    """
    __shared_state = {}

    def __init__(self, *args, **kwargs):
        """Share state across all instances (Borg), then init the dict."""
        self.__dict__ = self.__shared_state
        super(TokenizerRegistry, self).__init__(*args, **kwargs)
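
# A minimal sketch of the Borg behavior established above (illustrative
# only; the registered function and its key are hypothetical).  Because
# every instance shares one __dict__, a tokenizer registered through any
# handle is visible through all others:
#
#     a = TokenizerRegistry()
#     b = TokenizerRegistry()
#     a['fake_tokenizer'] = lambda corpus: corpus
#     assert b['fake_tokenizer'] is a['fake_tokenizer']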

# a nicer, more pythonic handle to our singleton instance
registered_tokenizers = TokenizerRegistry()

# fill in the registration function
register = partial(_base_register_decorator, registered_tokenizers)
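
# A hedged sketch of registering a custom tokenizer.  The exact contract
# comes from topik.singleton_registry._base_register_decorator; the function
# name and the assumed token-list output shape below are hypothetical:
#
#     @register
#     def my_tokenizer(corpus, **kwargs):
#         for text in corpus:
#             yield text.lower().split()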


def tokenize(corpus, method="simple", **kwargs):
    """Break documents up into component words, optionally eliminating stopwords.

    Output from this function is used as input to vectorization steps.

    corpus: iterable corpus object containing the text to be processed.
        Each iteration call should return a new document's content.
    method: string id of tokenizer to use.  For keys, see
        topik.tokenizers.registered_tokenizers (which is a dictionary of
        functions).
    kwargs: arbitrary dictionary of extra parameters, passed through to the
        chosen tokenizer.
    """
    return registered_tokenizers[method](corpus, **kwargs)
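
# Example usage, a sketch assuming the default "simple" tokenizer has been
# registered and that the corpus is any iterable yielding one document's
# text per iteration (per the docstring above):
#
#     docs = ["The quick brown fox.", "Topic modeling with topik."]
#     tokens = tokenize(docs, method="simple")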