Completed
Branch master (db5e7a)
by Osma
09:07 queued 05:09
created

parse_backend_params()   A

Complexity: Conditions 2
Size: Total Lines 9
Duplication: Lines 0, Ratio 0 %
Importance: Changes 0

Metric  Value
cc      2
c       0
b       0
f       0
dl      0
loc     9
rs      9.6666
"""Definitions for command-line (Click) commands for invoking Annif
operations and printing the results to console."""


import collections
import logging
Issue (Unused Code): The import logging seems to be unused.
import sys
import click
import click_log
from flask.cli import FlaskGroup
import annif
import annif.corpus
import annif.eval
import annif.project
from annif import logger

click_log.basic_config(logger)

cli = FlaskGroup(create_app=annif.create_app)
Issue (Coding Style, Naming): The name cli does not conform to the constant naming conventions ((([A-Z_][A-Z0-9_]*)|(__.*__))$). This check looks for invalid names for a range of different identifiers. You can set regular expressions to which the identifiers must conform if the defaults do not match your requirements. If your project includes a Pylint configuration file, the settings contained in that file take precedence.


def get_project(project_id):
    """Helper function to get a project by ID and bail out if it doesn't exist"""
    try:
        return annif.project.get_project(project_id)
    except ValueError:
        click.echo(
            "No projects found with id '{0}'.".format(project_id),
            err=True)
        sys.exit(1)
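
The lookup-or-exit pattern used by get_project can be tried without Annif or Click installed. The sketch below is only an illustration: the `_PROJECTS` registry, `lookup_project` and `get_project_or_exit` are hypothetical stand-ins, and `print(..., file=sys.stderr)` replaces `click.echo(..., err=True)`.

```python
import sys

# Hypothetical in-memory registry standing in for annif.project.
_PROJECTS = {"testproj": {"language": "fi"}}


def lookup_project(project_id):
    """Return the project or raise ValueError for an unknown ID."""
    try:
        return _PROJECTS[project_id]
    except KeyError:
        raise ValueError(project_id) from None


def get_project_or_exit(project_id):
    """Resolve a project; on failure report to stderr and exit with status 1."""
    try:
        return lookup_project(project_id)
    except ValueError:
        print("No projects found with id '{0}'.".format(project_id),
              file=sys.stderr)
        sys.exit(1)
```

Because sys.exit raises SystemExit, callers (and tests) can still intercept the exit and inspect its status code instead of letting the process terminate.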


def parse_backend_params(backend_param):
    """Parse a list of backend parameters given with the --backend-param
    option into a nested dict structure"""
    backend_params = collections.defaultdict(dict)
    for beparam in backend_param:
        backend, param = beparam.split('.', 1)
        key, val = param.split('=', 1)
        backend_params[backend][key] = val
    return backend_params
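
To make the expected `backend.key=value` shape of each --backend-param entry concrete, here is a small self-contained sketch. The function body is repeated from the listing so the example runs on its own; the backend names and parameter names (`tfidf`, `http`, `limit`, and so on) are made up for illustration and are not taken from Annif.

```python
import collections


def parse_backend_params(backend_param):
    """Same parsing logic as in the listing, repeated for a standalone run."""
    backend_params = collections.defaultdict(dict)
    for beparam in backend_param:
        # Split only on the first '.', so values may contain dots.
        backend, param = beparam.split('.', 1)
        # Split only on the first '=', so values may contain '='.
        key, val = param.split('=', 1)
        backend_params[backend][key] = val
    return backend_params


# Illustrative input, as Click would pass it for repeated -b options.
params = parse_backend_params([
    "tfidf.limit=100",
    "tfidf.chunksize=2",
    "http.endpoint=http://example.org/a.b?x=1",
])
```

Two properties follow from the code: all values stay strings (no type conversion is attempted), and an entry without a `.` or without a `=` raises ValueError from the tuple unpacking rather than producing a friendly error message.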


@cli.command('list-projects')
def run_list_projects():
    """
    List available projects.

    Usage: annif list-projects
    """

    template = "{0: <15}{1: <15}"

    header = template.format("Project ID", "Language")
    click.echo(header)
    click.echo("-" * len(header))

    for proj in annif.project.get_projects().values():
        click.echo(template.format(proj.project_id, proj.language))
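
The column layout comes entirely from the format template. A minimal sketch of what it produces, using the `testproj` / `fi` values from the show-project docstring plus one made-up long ID:

```python
template = "{0: <15}{1: <15}"

# Each field is left-aligned and space-padded to 15 characters.
header = template.format("Project ID", "Language")
rule = "-" * len(header)
row = template.format("testproj", "fi")

# A value longer than the field width is not truncated; the row simply
# grows and the second column shifts to the right.
wide = template.format("a-very-long-project-id", "fi")

for line in (header, rule, row, wide):
    print(line)
```

Since the width is a minimum rather than a limit, project IDs longer than 15 characters would break the column alignment of the listing.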


@cli.command('show-project')
@click.argument('project_id')
def run_show_project(project_id):
    """
    Show project information.

    Usage: annif show-project <project_id>

    Outputs a human-readable string representation formatted as follows:

    Project ID:    testproj
    Language:      fi
    """

    proj = get_project(project_id)

    template = "{0:<15}{1}"

    click.echo(template.format('Project ID:', proj.project_id))
    click.echo(template.format('Language:', proj.language))


@cli.command('load')
@click_log.simple_verbosity_option(logger)
@click.argument('project_id')
@click.argument('directory')
def run_load(project_id, directory):
Issue (Coding Style): This function should have a docstring. The coding style of this project requires that you add a docstring to this code element. Below, you find an example for methods:

class SomeClass:
    def some_method(self):
        """Do x and return foo."""

If you would like to know more about docstrings, we recommend reading PEP 257: Docstring Conventions.
    proj = get_project(project_id)
    subjects = annif.corpus.SubjectDirectory(directory)
    proj.load_subjects(subjects)


@cli.command('list-subjects')
@click.argument('project_id')
def run_list_subjects(project_id):
Issue (Coding Style): This function should have a docstring.
    click.echo("TODO")


@cli.command('show-subject')
@click.argument('project_id')
@click.argument('subject_id')
def run_show_subject(project_id, subject_id):
Issue (Coding Style): This function should have a docstring.
Issue (Unused Code): The argument subject_id seems to be unused.
Issue (Unused Code): The argument project_id seems to be unused.
    click.echo("TODO")


@cli.command('create-subject')
@click.argument('project_id')
@click.argument('subject_id')
def run_create_subject(project_id, subject_id):
Issue (Coding Style): This function should have a docstring.
Issue (Unused Code): The argument subject_id seems to be unused.
Issue (Unused Code): The argument project_id seems to be unused.
    click.echo("TODO")


@cli.command('drop-subject')
@click.argument('project_id')
@click.argument('subject_id')
def run_drop_subject(project_id, subject_id):
Issue (Coding Style): This function should have a docstring.
Issue (Unused Code): The argument subject_id seems to be unused.
Issue (Unused Code): The argument project_id seems to be unused.
    click.echo("TODO")


@cli.command('analyze')
@click_log.simple_verbosity_option(logger)
@click.argument('project_id')
@click.option('--limit', default=10)
@click.option('--threshold', default=0.0)
@click.option('--backend-param', '-b', multiple=True)
def run_analyze(project_id, limit, threshold, backend_param):
    """
    Analyze a document.

    USAGE: annif analyze <project_id> [--limit=N] [--threshold=N] <document.txt
    """
    project = get_project(project_id)
    text = sys.stdin.read()
    backend_params = parse_backend_params(backend_param)
    hits = project.analyze(text, limit, threshold, backend_params)
    for hit in hits:
        click.echo("{}\t<{}>\t{}".format(hit.score, hit.uri, hit.label))
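
The analyze command prints one tab-separated line per hit: score, URI in angle brackets, label. The sketch below shows that line format and one way a downstream script might read it back. The `Hit` namedtuple, `format_hit` and `parse_hit_line` are hypothetical helpers for illustration (the real hit class is not shown in this listing), and the URI and label are made-up values.

```python
import collections

# Hypothetical stand-in for a hit object; only the three attributes
# used by the analyze command are modelled.
Hit = collections.namedtuple("Hit", ["score", "uri", "label"])


def format_hit(hit):
    """Render a hit in the same tab-separated form the command prints."""
    return "{}\t<{}>\t{}".format(hit.score, hit.uri, hit.label)


def parse_hit_line(line):
    """Inverse of format_hit: split on tabs and drop the angle brackets."""
    score, uri, label = line.rstrip("\n").split("\t", 2)
    return Hit(float(score), uri[1:-1], label)


line = format_hit(Hit(0.5, "http://example.org/subject/1", "cats"))
hit = parse_hit_line(line)
```

Splitting with a maximum of two splits keeps the parser working even if a label happens to contain a tab character.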


@cli.command('eval')
@click_log.simple_verbosity_option(logger)
@click.argument('project_id')
@click.argument('subject_file')
@click.option('--limit', default=10)
@click.option('--threshold', default=0.0)
@click.option('--backend-param', '-b', multiple=True)
def run_eval(project_id, subject_file, limit, threshold, backend_param):
    """
    Evaluate the analysis result for a document against a gold standard
    given in a subject file.

    USAGE: annif eval <project_id> <subject_file> [--limit=N]
           [--threshold=N] <document.txt
    """
    project = get_project(project_id)
    text = sys.stdin.read()
    backend_params = parse_backend_params(backend_param)
    hits = project.analyze(text, limit, threshold, backend_params)
    with open(subject_file) as subjfile:
        gold_subjects = annif.corpus.SubjectSet(subjfile.read())

    template = "{0:<20}\t{1}"
    for metric, result, merge_function in annif.eval.evaluate_hits(
            hits, gold_subjects):
        click.echo(template.format(metric + ":", result))
Issue (Unused Code): The variable merge_function seems to be unused.


@cli.command('evaldir')
@click_log.simple_verbosity_option(logger)
@click.argument('project_id')
@click.argument('directory')
@click.option('--limit', default=10)
@click.option('--threshold', default=0.0)
@click.option('--backend-param', '-b', multiple=True)
def run_evaldir(project_id, directory, limit, threshold, backend_param):
Issue (Comprehensibility): This function exceeds the maximum number of variables (21/15).
    """
    Evaluate the analysis results for a directory with documents against a
    gold standard given in subject files.

    USAGE: annif evaldir <project_id> <directory> [--limit=N]
           [--threshold=N]
    """
    project = get_project(project_id)
    backend_params = parse_backend_params(backend_param)

    measures = collections.OrderedDict()
    merge_functions = {}
    for docfilename, subjectfilename in annif.corpus.DocumentDirectory(
            directory, require_subjects=True):
        with open(docfilename) as docfile:
            text = docfile.read()
        hits = project.analyze(text, limit, threshold, backend_params)
        with open(subjectfilename) as subjfile:
            gold_subjects = annif.corpus.SubjectSet(subjfile.read())

        for metric, result, merge_function in annif.eval.evaluate_hits(
                hits, gold_subjects):
            measures.setdefault(metric, [])
            measures[metric].append(result)
            merge_functions[metric] = merge_function

    template = "{0:<20}\t{1}"
    for metric, results in measures.items():
        result = merge_functions[metric](results)
        click.echo(template.format(metric + ":", result))
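
The evaldir loop collects per-document results by metric name and then applies each metric's own merge function. The following sketch isolates that accumulation pattern using only the standard library. The `evaluate_hits` function here is a hypothetical stand-in: the metrics and merge functions that annif.eval actually yields are not shown in this listing, so precision, recall, `statistics.mean` and `sum` are assumptions chosen for illustration.

```python
import collections
import statistics


def evaluate_hits(hits, gold_subjects):
    """Hypothetical stand-in yielding (metric, result, merge_function)."""
    hits, gold = set(hits), set(gold_subjects)
    true_pos = len(hits & gold)
    yield "Precision", (true_pos / len(hits) if hits else 0.0), statistics.mean
    yield "Recall", (true_pos / len(gold) if gold else 0.0), statistics.mean
    yield "True positives", true_pos, sum


def evaluate_corpus(documents):
    """Accumulate per-document results, then merge them per metric."""
    measures = collections.OrderedDict()
    merge_functions = {}
    for hits, gold_subjects in documents:
        for metric, result, merge_function in evaluate_hits(
                hits, gold_subjects):
            measures.setdefault(metric, []).append(result)
            merge_functions[metric] = merge_function
    return collections.OrderedDict(
        (metric, merge_functions[metric](results))
        for metric, results in measures.items())


# Two toy documents: (suggested subjects, gold-standard subjects).
merged = evaluate_corpus([(["a", "b"], ["a"]), (["c"], ["c", "d"])])
```

Letting each metric carry its own merge function means counts can be summed while ratios are averaged, without the command needing to know which is which.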


if __name__ == '__main__':
    cli()