annif.analyzer.spacy.SpacyAnalyzer.tokenize_words() - Code Metrics - Inspection of "Add spaCy analyzer" - NatLibFi/Annif - Measure and Improve Code Quality continuously with Scrutinizer

Passed

Pull Request — master (#527)

by Osma

created 2021-11-22 14:21 UTC

SpacyAnalyzer.tokenize_words() A

↳ Parent: annif.analyzer.spacy

Complexity

Conditions

Size

Total Lines	3
Code Lines	3

Duplication

Lines	0
Ratio	0 %

Importance

Changes

Metric	Value
cc	1
eloc	3
nop	2
dl	0
loc	3
rs	10
c	0
b	0
f	0

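"""spaCy analyzer for Annif. Lemmatizes words using a spaCy pipeline."""

import spacy
from . import analyzer


class SpacyAnalyzer(analyzer.Analyzer):
    name = "spacy"

    def __init__(self, param, **kwargs):
        # param is the name or path of the spaCy pipeline, passed to spacy.load
        self.param = param
        # Skip NER and the dependency parser, which this analyzer does not use
        self.nlp = spacy.load(param, exclude=['ner', 'parser'])
        super().__init__(**kwargs)

    def tokenize_words(self, text):
        # Take the lemma of every token, then keep only the valid lemmas
        return [lemma for lemma in (token.lemma_ for token in self.nlp(text))
                if self.is_valid_token(lemma)]

    def normalize_word(self, word):
        # doc[:] is a span covering the whole input; return that span's lemma
        doc = self.nlp(word)
        return doc[:].lemma_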
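
The lemmatize-then-filter pattern in tokenize_words() can be tried without installing spaCy or a language model. The following is only a sketch under stated assumptions: FakeToken and fake_nlp are hypothetical stand-ins for a spaCy token and for self.nlp (the stand-in lowercases words instead of lemmatizing them), and is_valid_token is an assumed rule (minimum length of 3 and at least one letter), not necessarily the rule implemented by Annif's Analyzer base class.

```python
import unicodedata
from dataclasses import dataclass


@dataclass
class FakeToken:
    """Hypothetical stand-in for a spaCy token; only lemma_ is modelled."""
    lemma_: str


def fake_nlp(text):
    """Hypothetical stand-in for self.nlp: splits on whitespace and
    lowercases each word instead of performing real lemmatization."""
    return [FakeToken(word.lower()) for word in text.split()]


def is_valid_token(word, min_length=3):
    """Assumed validity rule: long enough and containing at least one letter."""
    if len(word) < min_length:
        return False
    return any(unicodedata.category(ch).startswith("L") for ch in word)


def tokenize_words(text):
    """Same shape as SpacyAnalyzer.tokenize_words: lemmatize, then filter."""
    return [lemma for lemma in (token.lemma_ for token in fake_nlp(text))
            if is_valid_token(lemma)]


if __name__ == "__main__":
    print(tokenize_words("The 2021 Analyzer is OK"))
```

With these stand-ins, tokens that are too short ("is", "OK") or contain no letters ("2021") are dropped, so only the lowercased "the" and "analyzer" remain. A real spaCy pipeline would return actual lemmas rather than lowercased words.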


