e2edutch.metrics.muc()   A

Complexity

Conditions 4

Size

Total Lines 13
Code Lines 12

Duplication

Lines 0
Ratio 0 %

Importance

Changes 0
Metric Value
eloc 12
dl 0
loc 13
rs 9.8
c 0
b 0
f 0
cc 4
nop 2
import numpy as np
Missing module docstring
from collections import Counter
standard import "from collections import Counter" should be placed before "import numpy as np"
from scipy.optimize import linear_sum_assignment as linear_assignment


def f1(p_num, p_den, r_num, r_den, beta=1):
Coding Style (Naming): Function name "f1" doesn't conform to snake_case naming style ('([^\\W\\dA-Z][^\\WA-Z]{2,}|_[^\\WA-Z]*|__[^\\WA-Z\\d_][^\\WA-Z]+__)$' pattern)

This check looks for invalid names for a range of different identifiers.

You can set regular expressions to which the identifiers must conform if the defaults do not match your requirements.

If your project includes a Pylint configuration file, the settings contained in that file take precedence.

To find out more about Pylint, please refer to their site.

Missing function or method docstring
    p = 0 if p_den == 0 else p_num / float(p_den)
Coding Style (Naming): Variable name "p" doesn't conform to snake_case naming style ('([^\\W\\dA-Z][^\\WA-Z]{2,}|_[^\\WA-Z]*|__[^\\WA-Z\\d_][^\\WA-Z]+__)$' pattern)
    r = 0 if r_den == 0 else r_num / float(r_den)
Coding Style (Naming): Variable name "r" doesn't conform to snake_case naming style ('([^\\W\\dA-Z][^\\WA-Z]{2,}|_[^\\WA-Z]*|__[^\\WA-Z\\d_][^\\WA-Z]+__)$' pattern)
    return 0 if p + r == 0 else (1 + beta * beta) * \
        p * r / (beta * beta * p + r)
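As a rough check of the F-measure logic in `f1` above, the sketch below restates it under a hypothetical name, `f_beta`, with longer variable names, and works through one input. The sample counts are made up for illustration and do not come from the project.

```python
def f_beta(p_num, p_den, r_num, r_den, beta=1):
    """F-beta from precision and recall counts (mirrors f1 in the listing)."""
    # Guard both ratios against an empty denominator.
    precision = 0 if p_den == 0 else p_num / float(p_den)
    recall = 0 if r_den == 0 else r_num / float(r_den)
    if precision + recall == 0:
        return 0
    beta_sq = beta * beta
    return (1 + beta_sq) * precision * recall / (beta_sq * precision + recall)


# Illustrative counts only: precision = 3/4, recall = 3/6.
balanced = f_beta(3, 4, 3, 6)
recall_weighted = f_beta(3, 4, 3, 6, beta=2)
empty = f_beta(0, 0, 0, 0)
```

With precision 0.75 and recall 0.5 the balanced score is 0.6. Setting `beta=2` pulls the result towards the (lower) recall, and the all-zero case returns 0 instead of dividing by zero.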


class CorefEvaluator(object):
Missing class docstring
Class 'CorefEvaluator' inherits from object, can be safely removed from bases in python3
    def __init__(self):
        self.evaluators = [Evaluator(m) for m in (muc, b_cubed, ceafe)]

    def update(self, predicted, gold, mention_to_predicted, mention_to_gold):
Missing function or method docstring
        for e in self.evaluators:
Coding Style (Naming): Variable name "e" doesn't conform to snake_case naming style ('([^\\W\\dA-Z][^\\WA-Z]{2,}|_[^\\WA-Z]*|__[^\\WA-Z\\d_][^\\WA-Z]+__)$' pattern)
            e.update(predicted, gold, mention_to_predicted, mention_to_gold)

    def get_f1(self):
Missing function or method docstring
        return sum(e.get_f1() for e in self.evaluators) / len(self.evaluators)

    def get_recall(self):
Missing function or method docstring
        return sum(e.get_recall()
                   for e in self.evaluators) / len(self.evaluators)

    def get_precision(self):
Missing function or method docstring
        return sum(e.get_precision()
                   for e in self.evaluators) / len(self.evaluators)

    def get_prf(self):
Missing function or method docstring
        return self.get_precision(), self.get_recall(), self.get_f1()


class Evaluator(object):
Missing class docstring
Class 'Evaluator' inherits from object, can be safely removed from bases in python3
    def __init__(self, metric, beta=1):
        self.p_num = 0
        self.p_den = 0
        self.r_num = 0
        self.r_den = 0
        self.metric = metric
        self.beta = beta

    def update(self, predicted, gold, mention_to_predicted, mention_to_gold):
Missing function or method docstring
        if self.metric == ceafe:
Comparing against a callable, did you omit the parenthesis?
            pn, pd, rn, rd = self.metric(predicted, gold)
Coding Style (Naming): Variable name "pn" doesn't conform to snake_case naming style ('([^\\W\\dA-Z][^\\WA-Z]{2,}|_[^\\WA-Z]*|__[^\\WA-Z\\d_][^\\WA-Z]+__)$' pattern)
Coding Style (Naming): Variable name "pd" doesn't conform to snake_case naming style ('([^\\W\\dA-Z][^\\WA-Z]{2,}|_[^\\WA-Z]*|__[^\\WA-Z\\d_][^\\WA-Z]+__)$' pattern)
Coding Style (Naming): Variable name "rn" doesn't conform to snake_case naming style ('([^\\W\\dA-Z][^\\WA-Z]{2,}|_[^\\WA-Z]*|__[^\\WA-Z\\d_][^\\WA-Z]+__)$' pattern)
Coding Style (Naming): Variable name "rd" doesn't conform to snake_case naming style ('([^\\W\\dA-Z][^\\WA-Z]{2,}|_[^\\WA-Z]*|__[^\\WA-Z\\d_][^\\WA-Z]+__)$' pattern)
        else:
            pn, pd = self.metric(predicted, mention_to_gold)
Coding Style (Naming): Variable name "pn" doesn't conform to snake_case naming style ('([^\\W\\dA-Z][^\\WA-Z]{2,}|_[^\\WA-Z]*|__[^\\WA-Z\\d_][^\\WA-Z]+__)$' pattern)
Coding Style (Naming): Variable name "pd" doesn't conform to snake_case naming style ('([^\\W\\dA-Z][^\\WA-Z]{2,}|_[^\\WA-Z]*|__[^\\WA-Z\\d_][^\\WA-Z]+__)$' pattern)
            rn, rd = self.metric(gold, mention_to_predicted)
Coding Style (Naming): Variable name "rn" doesn't conform to snake_case naming style ('([^\\W\\dA-Z][^\\WA-Z]{2,}|_[^\\WA-Z]*|__[^\\WA-Z\\d_][^\\WA-Z]+__)$' pattern)
Coding Style (Naming): Variable name "rd" doesn't conform to snake_case naming style ('([^\\W\\dA-Z][^\\WA-Z]{2,}|_[^\\WA-Z]*|__[^\\WA-Z\\d_][^\\WA-Z]+__)$' pattern)
        self.p_num += pn
        self.p_den += pd
        self.r_num += rn
        self.r_den += rd

    def get_f1(self):
Missing function or method docstring
        return f1(self.p_num, self.p_den, self.r_num,
                  self.r_den, beta=self.beta)

    def get_recall(self):
Missing function or method docstring
        return 0 if self.r_num == 0 else self.r_num / float(self.r_den)

    def get_precision(self):
Missing function or method docstring
        return 0 if self.p_num == 0 else self.p_num / float(self.p_den)

    def get_prf(self):
Missing function or method docstring
        return self.get_precision(), self.get_recall(), self.get_f1()

    def get_counts(self):
Missing function or method docstring
        return self.p_num, self.p_den, self.r_num, self.r_den


def evaluate_documents(documents, metric, beta=1):
Missing function or method docstring
    evaluator = Evaluator(metric, beta=beta)
    for document in documents:
        evaluator.update(document)
Bug: It seems like a value for argument gold is missing in the method call.
Bug: It seems like a value for argument mention_to_predicted is missing in the method call.
Bug: It seems like a value for argument mention_to_gold is missing in the method call.
    return evaluator.get_precision(), evaluator.get_recall(), evaluator.get_f1()


def b_cubed(clusters, mention_to_gold):
Missing function or method docstring
    num, dem = 0, 0

    for c in clusters:
Coding Style (Naming): Variable name "c" doesn't conform to snake_case naming style ('([^\\W\\dA-Z][^\\WA-Z]{2,}|_[^\\WA-Z]*|__[^\\WA-Z\\d_][^\\WA-Z]+__)$' pattern)
        if len(c) == 1:
            continue

        gold_counts = Counter()
        correct = 0
        for m in c:
Coding Style (Naming): Variable name "m" doesn't conform to snake_case naming style ('([^\\W\\dA-Z][^\\WA-Z]{2,}|_[^\\WA-Z]*|__[^\\WA-Z\\d_][^\\WA-Z]+__)$' pattern)
            if m in mention_to_gold:
                gold_counts[tuple(mention_to_gold[m])] += 1
Comprehensibility (Best Practice): The variable tuple does not seem to be defined.
        for c2, count in gold_counts.items():
Coding Style (Naming): Variable name "c2" doesn't conform to snake_case naming style ('([^\\W\\dA-Z][^\\WA-Z]{2,}|_[^\\WA-Z]*|__[^\\WA-Z\\d_][^\\WA-Z]+__)$' pattern)
            if len(c2) != 1:
                correct += count * count

        num += correct / float(len(c))
        dem += len(c)

    return num, dem
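To make the `b_cubed` counts above easier to follow, here is a minimal sketch that restates the function under a hypothetical name, `b_cubed_counts`, and runs it on a toy document. The clusters and spans are invented for illustration. Calling it once with predicted clusters and once with gold clusters follows the pattern used in `Evaluator.update`.

```python
from collections import Counter


def b_cubed_counts(clusters, mention_to_gold):
    """Return the (numerator, denominator) pair, mirroring b_cubed above."""
    num, dem = 0, 0
    for cluster in clusters:
        if len(cluster) == 1:
            continue  # singleton clusters are skipped entirely
        gold_counts = Counter()
        for mention in cluster:
            if mention in mention_to_gold:
                gold_counts[tuple(mention_to_gold[mention])] += 1
        # Only overlaps with non-singleton gold clusters count as correct.
        correct = sum(n * n for gold, n in gold_counts.items() if len(gold) != 1)
        num += correct / float(len(cluster))
        dem += len(cluster)
    return num, dem


# Hypothetical toy document: mentions are (start, end) spans.
predicted = [((0, 0), (1, 1), (2, 2)), ((3, 3), (4, 4))]
gold = [((0, 0), (1, 1)), ((2, 2), (3, 3), (4, 4))]
mention_to_gold = {m: c for c in gold for m in c}
mention_to_predicted = {m: c for c in predicted for m in c}

p_num, p_den = b_cubed_counts(predicted, mention_to_gold)   # precision counts
r_num, r_den = b_cubed_counts(gold, mention_to_predicted)   # recall counts
```

On this toy input both calls give a numerator of 11/3 over a denominator of 5, so precision and recall are both roughly 0.73.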


def muc(clusters, mention_to_gold):
Missing function or method docstring
    tp, p = 0, 0
Coding Style (Naming): Variable name "tp" doesn't conform to snake_case naming style ('([^\\W\\dA-Z][^\\WA-Z]{2,}|_[^\\WA-Z]*|__[^\\WA-Z\\d_][^\\WA-Z]+__)$' pattern)
Coding Style (Naming): Variable name "p" doesn't conform to snake_case naming style ('([^\\W\\dA-Z][^\\WA-Z]{2,}|_[^\\WA-Z]*|__[^\\WA-Z\\d_][^\\WA-Z]+__)$' pattern)
    for c in clusters:
Coding Style (Naming): Variable name "c" doesn't conform to snake_case naming style ('([^\\W\\dA-Z][^\\WA-Z]{2,}|_[^\\WA-Z]*|__[^\\WA-Z\\d_][^\\WA-Z]+__)$' pattern)
        p += len(c) - 1
Coding Style (Naming): Variable name "p" doesn't conform to snake_case naming style ('([^\\W\\dA-Z][^\\WA-Z]{2,}|_[^\\WA-Z]*|__[^\\WA-Z\\d_][^\\WA-Z]+__)$' pattern)
        tp += len(c)
Coding Style (Naming): Variable name "tp" doesn't conform to snake_case naming style ('([^\\W\\dA-Z][^\\WA-Z]{2,}|_[^\\WA-Z]*|__[^\\WA-Z\\d_][^\\WA-Z]+__)$' pattern)
        linked = set()
        for m in c:
Coding Style (Naming): Variable name "m" doesn't conform to snake_case naming style ('([^\\W\\dA-Z][^\\WA-Z]{2,}|_[^\\WA-Z]*|__[^\\WA-Z\\d_][^\\WA-Z]+__)$' pattern)
            if m in mention_to_gold:
                linked.add(mention_to_gold[m])
            else:
                tp -= 1
Coding Style (Naming): Variable name "tp" doesn't conform to snake_case naming style ('([^\\W\\dA-Z][^\\WA-Z]{2,}|_[^\\WA-Z]*|__[^\\WA-Z\\d_][^\\WA-Z]+__)$' pattern)
        tp -= len(linked)
Coding Style (Naming): Variable name "tp" doesn't conform to snake_case naming style ('([^\\W\\dA-Z][^\\WA-Z]{2,}|_[^\\WA-Z]*|__[^\\WA-Z\\d_][^\\WA-Z]+__)$' pattern)
    return tp, p
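The link counting in `muc` above can be traced on a small input. The sketch below restates the function under a hypothetical name, `muc_counts`, with longer variable names. The toy clusters are assumptions made for illustration, and the cluster values are tuples so that they can be stored in a set.

```python
def muc_counts(clusters, mention_to_gold):
    """Return (correct_links, total_links), mirroring muc in the listing."""
    correct_links, total_links = 0, 0
    for cluster in clusters:
        total_links += len(cluster) - 1   # links needed to connect the cluster
        correct_links += len(cluster)
        linked = set()
        for mention in cluster:
            if mention in mention_to_gold:
                linked.add(mention_to_gold[mention])
            else:
                correct_links -= 1        # unaligned mention: its own partition
        correct_links -= len(linked)      # one partition per gold cluster touched
    return correct_links, total_links


# Hypothetical toy document: mentions are (start, end) spans.
predicted = [((0, 0), (1, 1), (2, 2)), ((3, 3), (4, 4))]
gold = [((0, 0), (1, 1)), ((2, 2), (3, 3), (4, 4))]
mention_to_gold = {m: c for c in gold for m in c}
mention_to_predicted = {m: c for c in predicted for m in c}

precision_counts = muc_counts(predicted, mention_to_gold)
recall_counts = muc_counts(gold, mention_to_predicted)
```

Here the first predicted cluster spans two gold clusters, so it contributes 3 - 2 = 1 correct link out of 2. The second contributes 1 out of 1, giving counts of (2, 3) for precision. The recall call happens to give (2, 3) as well.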


def phi4(c1, c2):
Coding Style (Naming): Argument name "c1" doesn't conform to snake_case naming style ('([^\\W\\dA-Z][^\\WA-Z]{2,}|_[^\\WA-Z]*|__[^\\WA-Z\\d_][^\\WA-Z]+__)$' pattern)
Coding Style (Naming): Argument name "c2" doesn't conform to snake_case naming style ('([^\\W\\dA-Z][^\\WA-Z]{2,}|_[^\\WA-Z]*|__[^\\WA-Z\\d_][^\\WA-Z]+__)$' pattern)
Missing function or method docstring
    return 2 * len([m for m in c1 if m in c2]) / float(len(c1) + len(c2))


def ceafe(clusters, gold_clusters):
Missing function or method docstring
    clusters = [c for c in clusters if len(c) != 1]
    scores = np.zeros((len(gold_clusters), len(clusters)))
    for i in range(len(gold_clusters)):
unused-code: Consider using enumerate instead of iterating with range and len
        for j in range(len(clusters)):
unused-code: Consider using enumerate instead of iterating with range and len
            scores[i, j] = phi4(gold_clusters[i], clusters[j])
    matching_row, matching_col = linear_assignment(-scores)
    similarity = sum(scores[matching_row, matching_col])
    return similarity, len(clusters), similarity, len(gold_clusters)


def lea(clusters, mention_to_gold):
Missing function or method docstring
    num, dem = 0, 0

    for c in clusters:
Coding Style (Naming): Variable name "c" doesn't conform to snake_case naming style ('([^\\W\\dA-Z][^\\WA-Z]{2,}|_[^\\WA-Z]*|__[^\\WA-Z\\d_][^\\WA-Z]+__)$' pattern)
        if len(c) == 1:
            continue

        common_links = 0
        all_links = len(c) * (len(c) - 1) / 2.0
        for i, m in enumerate(c):
Coding Style (Naming): Variable name "m" doesn't conform to snake_case naming style ('([^\\W\\dA-Z][^\\WA-Z]{2,}|_[^\\WA-Z]*|__[^\\WA-Z\\d_][^\\WA-Z]+__)$' pattern)
            if m in mention_to_gold:
                for m2 in c[i + 1:]:
Coding Style (Naming): Variable name "m2" doesn't conform to snake_case naming style ('([^\\W\\dA-Z][^\\WA-Z]{2,}|_[^\\WA-Z]*|__[^\\WA-Z\\d_][^\\WA-Z]+__)$' pattern)
                    if m2 in mention_to_gold and mention_to_gold[m] == mention_to_gold[m2]:
                        common_links += 1

        num += len(c) * common_links / float(all_links)
        dem += len(c)

    return num, dem
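The pairwise link counting in `lea` above can be sketched the same way. The version below uses a hypothetical name, `lea_counts`, and invented toy clusters. Like the listing, it slices each cluster, so clusters are assumed to be sequences such as tuples or lists.

```python
def lea_counts(clusters, mention_to_gold):
    """Return the (numerator, denominator) pair, mirroring lea above."""
    num, dem = 0, 0
    for cluster in clusters:
        if len(cluster) == 1:
            continue  # singletons carry no coreference links
        size = len(cluster)
        all_links = size * (size - 1) / 2.0
        common_links = 0
        for i, first in enumerate(cluster):
            if first not in mention_to_gold:
                continue
            for second in cluster[i + 1:]:
                # A link is shared when both mentions map to the same gold cluster.
                if (second in mention_to_gold
                        and mention_to_gold[first] == mention_to_gold[second]):
                    common_links += 1
        num += size * common_links / float(all_links)  # size-weighted link ratio
        dem += size
    return num, dem


# Hypothetical toy document: mentions are (start, end) spans.
predicted = [((0, 0), (1, 1), (2, 2)), ((3, 3), (4, 4))]
gold = [((0, 0), (1, 1)), ((2, 2), (3, 3), (4, 4))]
mention_to_gold = {m: c for c in gold for m in c}
mention_to_predicted = {m: c for c in predicted for m in c}

precision_counts = lea_counts(predicted, mention_to_gold)
recall_counts = lea_counts(gold, mention_to_predicted)
```

On this input the three-mention predicted cluster shares one of its three links with gold, and the two-mention cluster shares its only link. The counts are therefore 3.0 over 5, a precision of 0.6.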