annif.analyzer.estnltk.EstNLTKAnalyzer.tokenize_words() - Code Metrics - Inspection of "[WIP] EstNLTK analyzer" - NatLibFi/Annif

Passed

Pull Request — main (#818)

by Osma

created 2024-11-12 19:07 UTC

EstNLTKAnalyzer.tokenize_words() A

↳ Parent: annif.analyzer.estnltk

Complexity

Conditions

Size

Total Lines	11
Code Lines	9

Duplication

Lines	0
Ratio	0 %

Importance

Changes

Metric	Value
cc (cyclomatic complexity)	1
eloc (code lines)	9
nop (number of parameters)	3
dl (duplicated lines)	0
loc (total lines)	11
rs	9.95
c	0
b	0
f	0

"""EstNLTK analyzer for Annif which uses EstNLTK for lemmatization"""

from __future__ import annotations

from . import analyzer


class EstNLTKAnalyzer(analyzer.Analyzer):
    name = "estnltk"

    def __init__(self, param: str, **kwargs) -> None:
        self.param = param
        super().__init__(**kwargs)

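    def tokenize_words(self, text: str, filter: bool = True) -> list[str]:
        import estnltk  # deferred import: EstNLTK is loaded only when tokenizing

        txt = estnltk.Text(text.strip())
        txt.tag_layer()  # run EstNLTK's default tagging, which provides txt.lemma
        lemmas = [
            lemma
            for lemma in [candidates[0] for candidates in txt.lemma]  # first candidate
            if (not filter or self.is_valid_token(lemma))
        ]
        return lemmas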
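
The selection and filtering step inside tokenize_words() can be tried without installing EstNLTK. The sketch below is illustrative only: `select_lemmas` is a hypothetical helper, `sample` is hand-written stand-in data shaped like `txt.lemma` (one list of candidate lemmas per word), and `is_valid_token` is a simplified assumption about what a token check might look like, not Annif's actual implementation.

```python
from __future__ import annotations


def is_valid_token(word: str, min_length: int = 3) -> bool:
    """Simplified stand-in check (assumption): long enough and has a letter."""
    return len(word) >= min_length and any(ch.isalpha() for ch in word)


def select_lemmas(ambiguous_lemmas: list[list[str]], filter: bool = True) -> list[str]:
    """Keep the first candidate lemma of each word, optionally filtering."""
    return [
        lemma
        for lemma in (candidates[0] for candidates in ambiguous_lemmas)
        if not filter or is_valid_token(lemma)
    ]


# Hand-written stand-in for txt.lemma; this is not real EstNLTK output.
sample = [["raamat"], ["olema", "olev"], ["ja"], ["."], ["2024"]]

filtered = select_lemmas(sample)
unfiltered = select_lemmas(sample, filter=False)
print(filtered)    # ['raamat', 'olema']
print(unfiltered)  # ['raamat', 'olema', 'ja', '.', '2024']
```

With `filter=False` every first candidate is returned unchanged, which mirrors the `not filter or ...` short-circuit in the method above.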

