1
|
|
|
from collections import Counter |
|
|
|
|
2
|
|
|
import itertools |
3
|
|
|
|
4
|
|
|
def _accumulate_terms(tokenized_corpus): |
|
|
|
|
5
|
|
|
global_terms=set() |
|
|
|
|
6
|
|
|
document_term_counts = {} |
7
|
|
|
doc_lengths = {} |
8
|
|
|
global_term_frequency_counter = Counter() |
9
|
|
|
for doc_id, doc in tokenized_corpus: |
10
|
|
|
doc_terms = set(doc) |
11
|
|
|
global_terms.update(doc_terms) |
12
|
|
|
doc_lengths[doc_id] = len(doc) |
13
|
|
|
document_term_counts[doc_id] = len(doc_terms) |
14
|
|
|
global_term_frequency_counter.update(doc) |
15
|
|
|
id_term_map = {} |
16
|
|
|
global_term_frequency = {} |
17
|
|
|
for term_id, term in enumerate(global_terms): |
18
|
|
|
id_term_map[term_id] = term |
19
|
|
|
global_term_frequency[term_id] = global_term_frequency_counter[term] |
20
|
|
|
|
21
|
|
|
return id_term_map, document_term_counts, doc_lengths, global_term_frequency |
22
|
|
|
|
23
|
|
|
|
24
|
|
|
class VectorizerOutput(object):
    """Hold the results of vectorizing a tokenized corpus.

    Can be constructed in two mutually exclusive ways:

    1. From a ``tokenized_corpus`` plus a ``vectorizer_func`` — vocabulary
       statistics are accumulated and the vectorizer is applied.
    2. From precomputed components (``id_term_map``,
       ``document_term_counts``, ``doc_lengths``, ``term_frequency``,
       ``vectors``) — e.g. when loading previously saved output.

    Raises ``ValueError`` if neither complete set of arguments is given.
    """

    def __init__(self, tokenized_corpus=None, vectorizer_func=None,
                 id_term_map=None, document_term_counts=None, doc_lengths=None,
                 term_frequency=None, vectors=None):
        """Build vectorizer output from a corpus or from saved components.

        Parameters
        ----------
        tokenized_corpus : iterable of (doc_id, tokens), optional
            Source documents; may be a one-shot generator.
        vectorizer_func : callable, optional
            Called as ``vectorizer_func(corpus_iter, self)``; must return a
            mapping of doc_id -> vector.
        id_term_map : dict, optional
            Precomputed term id -> term mapping.
        document_term_counts : dict, optional
            Precomputed doc_id -> distinct-term count mapping.
        doc_lengths : dict, optional
            Precomputed doc_id -> token count mapping.
        term_frequency : dict, optional
            Precomputed term id -> corpus-wide frequency mapping.
        vectors : dict, optional
            Precomputed doc_id -> vector mapping.
        """
        if tokenized_corpus and vectorizer_func and not vectors:
            # tee so the corpus (possibly a one-shot generator) can be
            # consumed twice: once for stats, once for vectorizing.
            iter1, iter2 = itertools.tee(tokenized_corpus)
            self._id_term_map, self._document_term_counts, self._doc_lengths, \
                self._term_frequency = _accumulate_terms(iter1)
            # `term_id` (not `id`) to avoid shadowing the builtin.
            self._term_id_map = {term: term_id
                                 for term_id, term in self._id_term_map.items()}
            self._vectors = vectorizer_func(iter2, self)
        elif (id_term_map and document_term_counts and doc_lengths and
              term_frequency and vectors):
            self._id_term_map = id_term_map
            self._term_id_map = {term: term_id
                                 for term_id, term in self._id_term_map.items()}
            self._document_term_counts = document_term_counts
            self._doc_lengths = doc_lengths
            self._term_frequency = term_frequency
            self._vectors = vectors
        else:
            raise ValueError(
                "Must provide either tokenized corpora and vectorizer func, "
                "or global term collection, document term counts, and vectors.")

    def get_vectors(self):
        """Yield ``(doc_id, vector)`` pairs for every vectorized document."""
        for doc_id, vector in self._vectors.items():
            yield doc_id, vector

    def __len__(self):
        """Return the number of vectorized documents."""
        return len(self._vectors)

    @property
    def id_term_map(self):
        """dict: term id -> term string."""
        return self._id_term_map

    @property
    def term_id_map(self):
        """dict: term string -> term id (inverse of ``id_term_map``)."""
        return self._term_id_map

    @property
    def global_term_count(self):
        """int: number of distinct terms in the vocabulary."""
        return len(self.id_term_map)

    @property
    def document_term_counts(self):
        """dict: doc_id -> number of distinct terms in that document."""
        return self._document_term_counts

    @property
    def doc_lengths(self):
        """dict: doc_id -> total token count of that document."""
        return self._doc_lengths

    @property
    def term_frequency(self):
        """dict: term id -> total occurrences across the corpus."""
        return self._term_frequency

    @property
    def vectors(self):
        """dict: doc_id -> vector produced by the vectorizer function."""
        return self._vectors
|
|
|
|
82
|
|
|
|
The coding style of this project requires that you add a docstring to this code element. Below, you find an example for methods:
If you would like to know more about docstrings, we recommend to read PEP-257: Docstring Conventions.