InMemoryOutput   A

Complexity

Total Complexity 17

Size/Duplication

Total Lines 54
Duplicated Lines 0 %

Importance

Changes 3
Bugs 1 Features 1
Metric Value
wmc 17
c 3
b 1
f 1
dl 0
loc 54
rs 10

5 Methods

Rating   Name                        Duplication   Size   Complexity
B        get_filtered_data()         0             8      5
B        import_from_iterable()      0             14     5
A        get_date_filtered_data()    0             4      1
A        save()                      0             7      1
B        __init__()                  0             13     5
from six.moves import UserDict
Coding Style: This module should have a docstring.

The coding style of this project requires a docstring on this code element. Below is an example for a method:

class SomeClass:
    def some_method(self):
        """Do x and return foo."""

To learn more about docstrings, we recommend reading PEP 257: Docstring Conventions.

Configuration: The import six.moves could not be resolved.

This can be caused by one of the following:

1. Missing Dependencies

This error could indicate a configuration issue of Pylint. Make sure that your libraries are available by adding the necessary commands.

# .scrutinizer.yml
before_commands:
    - sudo pip install abc # Python2
    - sudo pip3 install abc # Python3

Tip: Pylint is currently not run inside a virtualenv, so when installing your modules make sure to use the command for the correct Python version.

2. Missing __init__.py files

This error could also result from missing __init__.py files in your module folders. Make sure that you place one file in each sub-folder.

import types

from ._registry import register_output
from .base_output import OutputInterface


class GreedyDict(UserDict, object):
Coding Style: This class should have a docstring.
Comprehensibility Best Practice: The variable UserDict does not seem to be defined.
Comprehensibility Best Practice: The variable object does not seem to be defined.
    def __setitem__(self, key, value):
        if isinstance(value, types.GeneratorType):
Comprehensibility Best Practice: The variable types does not seem to be defined.
Comprehensibility Best Practice: The variable value does not seem to be defined.
            value = [val for val in value]
Comprehensibility Best Practice: The variable val does not seem to be defined.
        super(GreedyDict, self).__setitem__(key, value)
Comprehensibility Best Practice: The variable key does not seem to be defined.
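
To illustrate what `__setitem__` is doing here, the following is a minimal sketch, assuming Python 3, where `six.moves.UserDict` should resolve to `collections.UserDict`. The class name `GreedyDictSketch` is a stand-in for illustration and is not part of the project; `list(value)` replaces the comprehension only for brevity.

```python
import types
from collections import UserDict


class GreedyDictSketch(UserDict):
    """Dict that stores generators as lists so they can be read repeatedly."""

    def __setitem__(self, key, value):
        # A generator can only be consumed once; materialize it on assignment.
        if isinstance(value, types.GeneratorType):
            value = list(value)
        super().__setitem__(key, value)


store = GreedyDictSketch()
store["tokens"] = (word.lower() for word in ["Foo", "Bar"])
first_read = store["tokens"]
second_read = store["tokens"]
```

Both reads return the same list, which would not hold if the generator itself had been stored.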

    def __iter__(self):
        for val in self.data.values():
Bug: The Instance of GreedyDict does not seem to have a member named data.

This check looks for calls to members that are non-existent. These calls will fail.

The member could have been renamed or removed.

            yield val
Comprehensibility Best Practice: The variable val does not seem to be defined.
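
Apart from the `data` warning (`UserDict` does define that attribute, so the warning looks like a false positive), the value-yielding `__iter__` may deserve a second look. On Python 3, `collections.UserDict` inherits `items()` from `collections.abc.Mapping`, which iterates the mapping and looks up each yielded element as a key. The sketch below assumes Python 3 and uses hypothetical class names; it shows the failure mode and one possible workaround, not the project's actual fix.

```python
from collections import UserDict


class ValueIterDict(UserDict):
    """Mimics the reviewed __iter__: iteration yields values, not keys."""

    def __iter__(self):
        for val in self.data.values():
            yield val


class SafeValueIterDict(ValueIterDict):
    """Possible workaround: delegate key-based views to the underlying dict."""

    def items(self):
        return self.data.items()

    def keys(self):
        return self.data.keys()


broken = ValueIterDict()
broken["a"] = "apple"
try:
    list(broken.items())
    items_failed = False
except (KeyError, TypeError):
    # The inherited items() treats each yielded value as a key.
    items_failed = True

safe = SafeValueIterDict()
safe["a"] = "apple"
safe_items = list(safe.items())
```

If the class only ever runs on Python 2, where `UserDict.items()` reads `self.data` directly, this problem would not show up.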


@register_output
Coding Style: This class should have a docstring.
Unused Code: This interface does not seem to be used anywhere.
Comprehensibility Best Practice: The variable register_output does not seem to be defined.
class InMemoryOutput(OutputInterface):
Comprehensibility Best Practice: The variable OutputInterface does not seem to be defined.
    def __init__(self, iterable=None, hash_field=None,
                 tokenized_corpora=None,
                 vectorized_corpora=None, modeled_corpora=None):
        super(InMemoryOutput, self).__init__()

        self.corpus = GreedyDict()

        if iterable:
Comprehensibility Best Practice: The variable iterable does not seem to be defined.
            self.import_from_iterable(iterable, hash_field)
Comprehensibility Best Practice: The variable hash_field does not seem to be defined.

        self.tokenized_corpora = tokenized_corpora if tokenized_corpora else GreedyDict()
Comprehensibility Best Practice: The variable tokenized_corpora does not seem to be defined.
        self.vectorized_corpora = vectorized_corpora if vectorized_corpora else GreedyDict()
Comprehensibility Best Practice: The variable vectorized_corpora does not seem to be defined.
        self.modeled_corpora = modeled_corpora if modeled_corpora else GreedyDict()
Comprehensibility Best Practice: The variable modeled_corpora does not seem to be defined.

    def import_from_iterable(self, iterable, field_to_hash):
        """
        iterable: generally a list of dicts, but possibly a list of strings
            This is your data.  Your dictionary structure defines the schema
            of the elasticsearch index.
        """
        self.hash_field=field_to_hash
Coding Style: Exactly one space required around assignment
self.hash_field=field_to_hash
               ^
Comprehensibility Best Practice: The variable field_to_hash does not seem to be defined.
        for item in iterable:
Comprehensibility Best Practice: The variable iterable does not seem to be defined.
            if isinstance(item, basestring):
Comprehensibility Best Practice: Undefined variable 'basestring'
Comprehensibility Best Practice: The variable item does not seem to be defined.
Comprehensibility Best Practice: The variable basestring does not seem to be defined.
                item = {field_to_hash: item}
            elif field_to_hash not in item and field_to_hash in item.values()[0]:
                item = item.values()[0]
            id = hash(item[field_to_hash])
Bug Best Practice: This seems to re-define the built-in id.

It is generally discouraged to redefine built-ins as this makes code very hard to read.

Coding Style Naming: The name id does not conform to the variable naming conventions ([a-z_][a-z0-9_]{2,30}$).

This check looks for invalid names for a range of different identifiers.

You can set regular expressions to which the identifiers must conform if the defaults do not match your requirements.

If your project includes a Pylint configuration file, the settings contained in that file take precedence.

To find out more about Pylint, please refer to their site.

            self.corpus[id] = item
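
Several of the flagged lines in this method are tied to Python 2 (`basestring`, indexing the result of `item.values()`), and `id` shadows a built-in. The following is a rough sketch of the same loop as a standalone function, assuming Python 3 only. The names `build_corpus`, `first_value` and `doc_id` are illustrative and do not come from the project.

```python
def build_corpus(iterable, field_to_hash):
    """Return {hash(item[field_to_hash]): item} for each item in iterable."""
    corpus = {}
    for item in iterable:
        if isinstance(item, str):
            # Bare strings are wrapped so every entry is a dict.
            item = {field_to_hash: item}
        elif field_to_hash not in item:
            # dict views are not indexable on Python 3; take the first value.
            first_value = next(iter(item.values()))
            if field_to_hash in first_value:
                item = first_value
        doc_id = hash(item[field_to_hash])  # avoids shadowing built-in id()
        corpus[doc_id] = item
    return corpus


docs = ["plain text", {"text": "a dict"}, {"wrapper": {"text": "nested"}}]
corpus = build_corpus(docs, "text")
```

Note that string hashes are randomized per process on Python 3 unless `PYTHONHASHSEED` is fixed, so keys produced this way are not stable across runs. That may matter if the corpus is saved and reloaded.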

    # TODO: generalize for datetimes
Coding Style: TODO and FIXME comments should generally be avoided.
    # TODO: validate input data to ensure that it has valid year data
Coding Style: TODO and FIXME comments should generally be avoided.
    def get_date_filtered_data(self, field_to_get, start, end, filter_field="year"):
Coding Style: This method should have a docstring.
        return self.get_filtered_data(field_to_get,
Comprehensibility Best Practice: The variable field_to_get does not seem to be defined.
                                      "{}<=int({}['{}'])<={}".format(start, "{}",
Comprehensibility Best Practice: The variable start does not seem to be defined.
                                                                     filter_field, end))
Comprehensibility Best Practice: The variable filter_field does not seem to be defined.
Comprehensibility Best Practice: The variable end does not seem to be defined.
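
The two-stage formatting in this method is easy to misread, so here is a small worked example of the string it builds. The values for `start`, `end` and `doc` are made up for illustration; the `eval` call is there only to mirror what `get_filtered_data` does with the result.

```python
start, end, filter_field = 1990, 2000, "year"

# First format(): the literal "{}" argument leaves a placeholder for the document.
filter_template = "{}<=int({}['{}'])<={}".format(start, "{}", filter_field, end)

# Second format(), done per document inside get_filtered_data().
doc = {"year": "1995", "title": "example"}
expression = filter_template.format(doc)
matches = eval(expression)  # evaluated only to mirror the reviewed code
```

Here `filter_template` is `1990<=int({}['year'])<=2000`, and the second `format()` pastes the text form of the whole document into the placeholder. This only works while every value in the document prints as a valid Python literal.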

    def get_filtered_data(self, field_to_get, filter=""):
Bug Best Practice: This seems to re-define the built-in filter.
        if not filter:
Comprehensibility Best Practice: The variable filter does not seem to be defined.
            for doc_id, doc in self.corpus.items():
Bug: The Instance of GreedyDict does not seem to have a member named items.
                yield doc_id, doc[field_to_get]
Comprehensibility Best Practice: The variable doc_id does not seem to be defined.
Comprehensibility Best Practice: The variable field_to_get does not seem to be defined.
Comprehensibility Best Practice: The variable doc does not seem to be defined.
        else:
            for doc_id, doc in self.corpus.items():
Bug: The Instance of GreedyDict does not seem to have a member named items.
                if eval(filter.format(doc)):
Security Best Practice: eval should generally only be used if absolutely necessary.

The usage of eval might allow for executing arbitrary code if a user manages to inject dynamic input. Please use this language feature with care and only when you are sure of the input.

                    yield doc_id, doc[field_to_get]
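
One way to address the `eval` warning would be to accept a callable instead of a format string. The sketch below is a standalone illustration under that assumption. The function signatures, the `predicate` parameter and the `year_between` helper are hypothetical and not part of the reviewed class.

```python
def get_filtered_data(corpus, field_to_get, predicate=None):
    """Yield (doc_id, doc[field_to_get]) for documents accepted by predicate."""
    for doc_id, doc in corpus.items():
        if predicate is None or predicate(doc):
            yield doc_id, doc[field_to_get]


def year_between(start, end, filter_field="year"):
    """Build a predicate equivalent to the date filter string."""
    def predicate(doc):
        return start <= int(doc[filter_field]) <= end
    return predicate


corpus = {
    1: {"year": "1985", "text": "old"},
    2: {"year": "1995", "text": "mid"},
}
selected = list(get_filtered_data(corpus, "text", year_between(1990, 2000)))
everything = list(get_filtered_data(corpus, "text"))
```

With a callable, no document content is ever turned into source code, and the parameter no longer needs to be called `filter`.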

    def save(self, filename):
Bug: Arguments number differs from overridden 'save' method
        saved_data = {"iterable": self.corpus,
Comprehensibility Best Practice: The variable self does not seem to be defined.
                      "hash_field": self.hash_field,
                      "modeled_corpora": self.modeled_corpora,
                      "vectorized_corpora": self.vectorized_corpora,
                      "tokenized_corpora": self.tokenized_corpora}
        return super(InMemoryOutput, self).save(filename, saved_data)
Comprehensibility Best Practice: The variable filename does not seem to be defined.
Comprehensibility Best Practice: The variable saved_data does not seem to be defined.
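
The "arguments number differs" warning could be resolved by keeping the override's parameters identical to the base method and defaulting the extra one. The sketch below assumes the base `save` takes `(filename, saved_data)`, as the `super()` call suggests. `BaseOutputSketch` and `InMemorySketch` are hypothetical stand-ins, and the real `OutputInterface.save` signature may differ.

```python
class BaseOutputSketch:
    """Hypothetical stand-in for OutputInterface; the real signature may differ."""

    def save(self, filename, saved_data=None):
        return filename, saved_data


class InMemorySketch(BaseOutputSketch):
    def __init__(self):
        self.corpus = {"doc": 1}

    def save(self, filename, saved_data=None):
        # Same parameters as the base method; fill in defaults when omitted.
        if saved_data is None:
            saved_data = {"iterable": self.corpus}
        return super().save(filename, saved_data)


result = InMemorySketch().save("out.pkl")
```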