annif.analyzer.estnltk - Code Metrics - Inspection of "[WIP] EstNLTK analyzer" - NatLibFi/Annif

Passed

Pull Request — main (#818)

by Osma

created 2024-11-12 19:14 UTC

annif.analyzer.estnltk A

↳ Parent: Project

Size/Duplication

Total Lines	24
Duplicated Lines	0 %

Metric	Value
eloc	16
dl	0
loc	24
rs	10
c	0
b	0
f	0
wmc	2

2 Methods

Rating	Name	Duplication	Size	Complexity
A	EstNLTKAnalyzer.__init__()	0	3	1
A	EstNLTKAnalyzer.tokenize_words()	0	9	1

"""EstNLTK analyzer for Annif which uses EstNLTK for lemmatization"""

from __future__ import annotations

from . import analyzer


class EstNLTKAnalyzer(analyzer.Analyzer):
    name = "estnltk"

    def __init__(self, param: str, **kwargs) -> None:
        self.param = param
        super().__init__(**kwargs)

    def tokenize_words(self, text: str, filter: bool = True) -> list[str]:
        import estnltk

        txt = estnltk.Text(text.strip())
        txt.tag_layer()
        return [
            lemma
            for lemma in [lemmas[0] for lemmas in txt.lemma]
            if (not filter or self.is_valid_token(lemma))
        ]
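
The selection logic in tokenize_words() can be tried without installing EstNLTK. The following is a minimal sketch, not the code under inspection: SAMPLE_LEMMAS is a hypothetical stand-in for the per-word lemma candidates that txt.lemma provides, select_lemmas() is an illustrative helper name, and is_valid_token() is an assumed approximation of the base analyzer's check (a minimum length of 3 and at least one letter), which may differ from the actual Annif implementation.

```python
import unicodedata

# Hypothetical stand-in for the per-word lemma candidates read from `txt.lemma`.
# The values are illustrative only, not real EstNLTK output.
SAMPLE_LEMMAS = [["kass"], ["istuma", "istu"], ["."], ["ja"], ["aken"]]


def is_valid_token(word: str, min_length: int = 3) -> bool:
    """Assumed filter: the token is long enough and contains a letter."""
    if len(word) < min_length:
        return False
    return any(unicodedata.category(ch).startswith("L") for ch in word)


def select_lemmas(lemma_lists: list[list[str]], filter: bool = True) -> list[str]:
    """Keep the first lemma candidate of each word, optionally filtering.

    The parameter name `filter` mirrors the inspected method's signature,
    even though it shadows the Python builtin.
    """
    return [
        lemma
        for lemma in (lemmas[0] for lemmas in lemma_lists)
        if not filter or is_valid_token(lemma)
    ]


print(select_lemmas(SAMPLE_LEMMAS))
print(select_lemmas(SAMPLE_LEMMAS, filter=False))
```

With filtering enabled, the punctuation token and the two-letter word are dropped. With filter=False, every first candidate is returned unchanged. Ambiguous words contribute only their first candidate in both cases.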


