Completed
Push — master ( a2486a...0c98d4 )
by Oana
15:29
created

Metrics.annotation_quality_score()   B

Complexity

Conditions 8

Size

Total Lines 56
Code Lines 25

Duplication

Lines 0
Ratio 0 %

Code Coverage

Tests 20
CRAP Score 8

Importance

Changes 0
Metric Value
eloc 25
dl 0
loc 56
ccs 20
cts 20
cp 1
rs 7.3333
c 0
b 0
f 0
cc 8
nop 4
crap 8
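
The `crap` value above is consistent with the commonly used Change Risk Anti-Patterns formula, CRAP = cc^2 * (1 - coverage)^3 + cc. That this report uses exactly that definition, that `cc` is the cyclomatic complexity, and that `cp` is the covered fraction are assumptions. A minimal sketch, with the hypothetical helper name `crap_score`:

```python
def crap_score(complexity, coverage):
    """Change Risk Anti-Patterns score for a single method.

    complexity: cyclomatic complexity (assumed to be the `cc` row).
    coverage: covered fraction in [0, 1] (assumed to be the `cp` row).
    """
    if not 0.0 <= coverage <= 1.0:
        raise ValueError("coverage must be a fraction between 0 and 1")
    # The cubic term penalises untested complexity; it vanishes at full coverage.
    return complexity ** 2 * (1.0 - coverage) ** 3 + complexity


# Values taken from the table above: cc 8, cp 1.
print(crap_score(8, 1.0))
```

With full coverage the score reduces to the complexity itself, which matches the reported `crap 8`.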

Long Method

Small methods make your code easier to understand, in particular if combined with a good name. Besides, if your method is small, finding a good name is usually much easier.

For example, if you find yourself adding comments to a method's body, this is usually a good sign to extract the commented part to a new method, and use the comment as a starting point when coming up with a good name for this new method.
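
As an illustration of this advice, here is a hypothetical before/after sketch. The function names and data are invented for the example and are not part of the inspected code. The comment in the long version becomes the name of the extracted helper.

```python
def report_long(scores):
    # compute the average of the positive scores
    total = 0.0
    count = 0
    for value in scores:
        if value > 0:
            total += value
            count += 1
    average = total / count if count else 0.0
    return "average: {:.2f}".format(average)


def average_of_positive_scores(scores):
    """Extracted helper; its name is derived from the old comment."""
    positive = [value for value in scores if value > 0]
    return sum(positive) / len(positive) if positive else 0.0


def report_short(scores):
    # The body now reads as a single step; the detail lives in the helper.
    return "average: {:.2f}".format(average_of_positive_scores(scores))
```

Both versions return the same string for the same input; only the structure differs.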

Commonly applied refactorings include Extract Method.

"""
Initialization of CrowdTruth metrics
"""
import logging
import math

from collections import Counter

import numpy as np
import pandas as pd

SMALL_NUMBER_CONST = 0.00000001

class Metrics():
    """
    Computes and applies the CrowdTruth metrics for evaluating units, workers and annotations.
    """

    # Unit Quality Score
    @staticmethod
    def unit_quality_score(unit_id, unit_work_ann_dict, wqs, aqs):
        """
        Computes the unit quality score.

        The unit quality score (UQS) is computed as the average cosine similarity between
        all worker vectors for a given unit, weighted by the worker quality (WQS) and the
        annotation quality (AQS). The goal is to capture the degree of agreement in annotating
        the media unit.

        Through the weighted average, workers and annotations with lower quality will have
        less of an impact on the final score.

        To weigh the metrics with the annotation quality, we compute weighted_cosine, the weighted
        version of the cosine similarity.

        Args:
            unit_id: Unit id.
            unit_work_ann_dict: A dictionary that contains all the workers' judgments for the unit.
            wqs: Dict of worker_id (string) that contains the worker quality score (float)
            aqs: Dict of annotation_id (string) that contains the annotation quality score (float)

        Returns:
            The quality score (UQS) of the given unit.
        """

        uqs_numerator = 0.0
        uqs_denominator = 0.0
        worker_ids = list(unit_work_ann_dict[unit_id].keys())

        for worker_i in range(len(worker_ids) - 1):
            for worker_j in range(worker_i + 1, len(worker_ids)):
                numerator = 0.0
                denominator_i = 0.0
                denominator_j = 0.0

                worker_i_vector = unit_work_ann_dict[unit_id][worker_ids[worker_i]]
                worker_j_vector = unit_work_ann_dict[unit_id][worker_ids[worker_j]]

                for ann in worker_i_vector:
                    worker_i_vector_ann = worker_i_vector[ann]
                    worker_j_vector_ann = worker_j_vector[ann]
                    numerator += aqs[ann] * (worker_i_vector_ann * worker_j_vector_ann)
                    denominator_i += aqs[ann] * (worker_i_vector_ann * worker_i_vector_ann)
                    denominator_j += aqs[ann] * (worker_j_vector_ann * worker_j_vector_ann)

                weighted_cosine = numerator / math.sqrt(denominator_i * denominator_j)

                uqs_numerator += weighted_cosine * wqs[worker_ids[worker_i]] * \
                                 wqs[worker_ids[worker_j]]
                uqs_denominator += wqs[worker_ids[worker_i]] * wqs[worker_ids[worker_j]]

        if uqs_denominator < SMALL_NUMBER_CONST:
            uqs_denominator = SMALL_NUMBER_CONST
        return uqs_numerator / uqs_denominator


    # Worker - Unit Agreement
    @staticmethod
    def worker_unit_agreement(worker_id, unit_ann_dict, work_unit_ann_dict, uqs, aqs, wqs):
        """
        Computes the worker agreement on a unit.

        The worker unit agreement (WUA) is the average cosine distance between the annotations
        of a worker i and all the other annotations for the units they have worked on,
        weighted by the unit and annotation quality. It calculates how much a worker disagrees
        with the crowd on a unit basis.

        Through the weighted average, units and annotations with lower quality will have less
        of an impact on the final score.

        Args:
            worker_id: Worker id.
            unit_ann_dict: Dictionary of units and their aggregated annotations.
            work_unit_ann_dict: Dictionary of units (and their annotations) annotated by the worker.
            uqs: Dict unit_id that contains the unit quality scores (float).
            aqs: Dict of annotation_id (string) that contains the annotation quality scores (float).
            wqs: Worker quality score (float) of the given worker; it scales the worker vector.

        Returns:
            The worker unit agreement score for the given worker.
        """

        wsa_numerator = 0.0
        wsa_denominator = 0.0
        work_unit_ann_dict_worker_id = work_unit_ann_dict[worker_id]

        for unit_id in work_unit_ann_dict_worker_id:
            numerator = 0.0
            denominator_w = 0.0
            denominator_s = 0.0

            worker_vector = work_unit_ann_dict[worker_id][unit_id]
            unit_vector = unit_ann_dict[unit_id]

            for ann in worker_vector:
                worker_vector_ann = worker_vector[ann] * wqs
                unit_vector_ann = unit_vector[ann]

                numerator += aqs[ann] * worker_vector_ann * \
                    (unit_vector_ann - worker_vector_ann)
                denominator_w += aqs[ann] * \
                    (worker_vector_ann * worker_vector_ann)
                denominator_s += aqs[ann] * ( \
                    (unit_vector_ann - worker_vector_ann) * \
                    (unit_vector_ann - worker_vector_ann))
            weighted_cosine = None
            if math.sqrt(denominator_w * denominator_s) < SMALL_NUMBER_CONST:
                weighted_cosine = SMALL_NUMBER_CONST
            else:
                weighted_cosine = numerator / math.sqrt(denominator_w * denominator_s)
            wsa_numerator += weighted_cosine * uqs[unit_id]
            wsa_denominator += uqs[unit_id]
        if wsa_denominator < SMALL_NUMBER_CONST:
            wsa_denominator = SMALL_NUMBER_CONST
        return wsa_numerator / wsa_denominator

    # Worker - Worker Agreement
    @staticmethod
    def worker_worker_agreement(worker_id, work_unit_ann_dict, unit_work_ann_dict, wqs, uqs, aqs):
        """
        Computes the agreement between every two workers.

        The worker-worker agreement (WWA) is the average cosine distance between the annotations of
        a worker i and all other workers that have worked on the same media units as worker i,
        weighted by the worker and annotation qualities.

        The metric gives an indication as to whether there are consistently like-minded workers.
        This is useful for identifying communities of thought.

        Through the weighted average, workers and annotations with lower quality will have less
        of an impact on the final score of the given worker.

        Args:
            worker_id: Worker id.
            work_unit_ann_dict: Dictionary of worker annotation vectors on annotated units.
            unit_work_ann_dict: Dictionary of unit annotation vectors.
            wqs: Dict of worker_id (string) that contains the worker quality scores (float).
            uqs: Dict unit_id that contains the unit quality scores (float).
            aqs: Dict of annotation_id (string) that contains the annotation quality scores (float).

        Returns:
            The worker-worker agreement score for the given worker.
        """

        wwa_numerator = 0.0
        wwa_denominator = 0.0

        worker_vector = work_unit_ann_dict[worker_id]
        unit_ids = list(work_unit_ann_dict[worker_id].keys())

        for unit_id in unit_ids:
            wv_unit_id = worker_vector[unit_id]
            unit_work_ann_dict_unit_id = unit_work_ann_dict[unit_id]
            for other_workid in unit_work_ann_dict_unit_id:
                if worker_id != other_workid:
                    numerator = 0.0
                    denominator_w = 0.0
                    denominator_ow = 0.0

                    unit_work_ann_dict_uid_oworkid = unit_work_ann_dict_unit_id[other_workid]
                    for ann in wv_unit_id:
                        unit_work_ann_dict_uid_oworkid_ann = unit_work_ann_dict_uid_oworkid[ann]
                        wv_unit_id_ann = wv_unit_id[ann]

                        numerator += aqs[ann] * (wv_unit_id_ann * \
                                     unit_work_ann_dict_uid_oworkid_ann)

                        denominator_w += aqs[ann] * (wv_unit_id_ann * wv_unit_id_ann)

                        denominator_ow += aqs[ann] * \
                                         (unit_work_ann_dict_uid_oworkid_ann *\
                                          unit_work_ann_dict_uid_oworkid_ann)

                    weighted_cosine = numerator / math.sqrt(denominator_w * denominator_ow)
                    wwa_numerator += weighted_cosine * wqs[other_workid] * uqs[unit_id]
                    wwa_denominator += wqs[other_workid] * uqs[unit_id]
        if wwa_denominator < SMALL_NUMBER_CONST:
            wwa_denominator = SMALL_NUMBER_CONST
        return wwa_numerator / wwa_denominator


    # Unit - Annotation Score (UAS)
    @staticmethod
    def unit_annotation_score(unit_id, annotation, unit_work_annotation_dict, wqs):
        """
        Computes the unit annotation score.

        The unit - annotation score (UAS) calculates the likelihood that annotation a
        is expressed in unit u. It is the ratio of the number of workers that picked
        annotation a over all workers that annotated the unit, weighted by the worker quality.

        Args:
            unit_id: Unit id.
            annotation: Annotation.
            unit_work_annotation_dict: Dictionary of unit annotation vectors.
            wqs: Dict of worker_id (string) that contains the worker quality scores (float).

        Returns:
            The unit annotation score for the given unit and annotation.
        """

        uas_numerator = 0.0
        uas_denominator = 0.0

        worker_ids = unit_work_annotation_dict[unit_id]
        for worker_id in worker_ids:
            uas_numerator += worker_ids[worker_id][annotation] * wqs[worker_id]
            uas_denominator += wqs[worker_id]
        if uas_denominator < SMALL_NUMBER_CONST:
            uas_denominator = SMALL_NUMBER_CONST
        return uas_numerator / uas_denominator

    @staticmethod
    def compute_ann_quality_factors(numerator, denominator, work_unit_ann_dict_worker_i, \
                                    work_unit_ann_dict_worker_j, ann, uqs):
        """
        Computes the factors for each unit annotation.

        Args:
            numerator: Current numerator
            denominator: Current denominator
            work_unit_ann_dict_worker_i: Dict of worker i annotation vectors on annotated units.
            work_unit_ann_dict_worker_j: Dict of worker j annotation vectors on annotated units.
            ann: Annotation value
            uqs: Dict unit_id that contains the unit quality scores (float).

        Returns:
            The annotation quality factors.
        """
        for unit_id, work_unit_ann_dict_work_i_unit in work_unit_ann_dict_worker_i.items():
            if unit_id in work_unit_ann_dict_worker_j:
                work_unit_ann_dict_work_j_unit = work_unit_ann_dict_worker_j[unit_id]

                work_unit_ann_dict_wj_unit_ann = work_unit_ann_dict_work_j_unit[ann]

                def compute_numerator_aqs(unit_id_ann_value, worker_i_ann_value, \
                                                          worker_j_ann_value):
                    """ compute numerator """
                    numerator = unit_id_ann_value * worker_i_ann_value * \
                                                worker_j_ann_value
                    return numerator

                def compute_denominator_aqs(unit_id_ann_value, worker_j_ann_value):
                    """ compute denominator """
                    denominator = unit_id_ann_value * worker_j_ann_value
                    return denominator

                numerator += compute_numerator_aqs(uqs[unit_id], \
                                                    work_unit_ann_dict_work_i_unit[ann], \
                                                    work_unit_ann_dict_wj_unit_ann)
                denominator += compute_denominator_aqs(uqs[unit_id], \
                                                        work_unit_ann_dict_wj_unit_ann)
        return numerator, denominator

    @staticmethod
    def aqs_dict(annotations, aqs_numerator, aqs_denominator):
        """
        Create the dictionary of annotation quality score values.

        Args:
            annotations: Dictionary of annotations.
            aqs_numerator: Annotation numerator.
            aqs_denominator: Annotation denominator.

        Returns:
            The dictionary of annotation quality scores.
        """

        aqs = dict()
        for ann in annotations:
            if aqs_denominator[ann] > SMALL_NUMBER_CONST:
                aqs[ann] = aqs_numerator[ann] / aqs_denominator[ann]
                # prevent division by zero by storing very small value instead
                if aqs[ann] < SMALL_NUMBER_CONST:
                    aqs[ann] = SMALL_NUMBER_CONST
            else:
                aqs[ann] = SMALL_NUMBER_CONST
        return aqs


    # Annotation Quality Score (AQS)
    @staticmethod
    def annotation_quality_score(annotations, work_unit_ann_dict, uqs, wqs):
        """
        Computes the annotation quality score.

        The annotation quality score AQS calculates the agreement of selecting an annotation a,
        over all the units it appears in. Therefore, it is only applicable to closed tasks, where
        the same annotation set is used for all units. It is based on the probability that if a
        worker j annotates annotation a in a unit, worker i will also annotate it.

        The annotation quality score is the weighted average of these probabilities for all possible
        pairs of workers. Through the weighted average, units and workers with lower quality will
        have less of an impact on the final score of the annotation.

        Args:
            annotations: Possible annotations.
            work_unit_ann_dict: Dictionary of worker annotation vectors on annotated units.
            uqs: Dict unit_id that contains the unit quality scores (float).
            wqs: Dict of worker_id (string) that contains the worker quality scores (float).

        Returns:
            A dictionary mapping each annotation to its annotation quality score.
        """

        aqs_numerator = dict()
        aqs_denominator = dict()

        for ann in annotations:
            aqs_numerator[ann] = 0.0
            aqs_denominator[ann] = 0.0

        for worker_i, work_unit_ann_dict_worker_i in work_unit_ann_dict.items():
            work_unit_ann_dict_i_keys = list(work_unit_ann_dict_worker_i.keys())
            for worker_j, work_unit_ann_dict_worker_j in work_unit_ann_dict.items():
                work_unit_ann_dict_j_keys = list(work_unit_ann_dict_worker_j.keys())

                length_keys = len(np.intersect1d(np.array(work_unit_ann_dict_i_keys), \
                                                 np.array(work_unit_ann_dict_j_keys)))

                if worker_i != worker_j and length_keys > 0:
                    for ann in annotations:
                        numerator = 0.0
                        denominator = 0.0

                        numerator, denominator = Metrics.compute_ann_quality_factors(numerator, \
                                                    denominator, work_unit_ann_dict_worker_i, \
                                                    work_unit_ann_dict_worker_j, ann, uqs)

                        if denominator > 0:
                            aqs_numerator[ann] += wqs[worker_i] * wqs[worker_j] * \
                                                        numerator / denominator
                            aqs_denominator[ann] += wqs[worker_i] * wqs[worker_j]

        return Metrics.aqs_dict(annotations, aqs_numerator, aqs_denominator)


    @staticmethod
    def run(results, config, max_delta=0.001):
        '''
        iteratively run the CrowdTruth metrics
        '''

        judgments = results['judgments'].copy()
        units = results['units'].copy()

        # unit_work_ann_dict, work_unit_ann_dict, unit_ann_dict
        # to be done: change to use all vectors in one unit
        col = list(config.output.values())[0]
        unit_ann_dict = dict(units.copy()[col])

        def expanded_vector(worker, unit):
            '''
            expand the vector of a worker on a given unit
            '''
            vector = Counter()
            for ann in unit:
                if ann in worker:
                    vector[ann] = worker[ann]
                else:
                    vector[ann] = 0
            return vector

        # fill judgment vectors with unit keys
        for index, row in judgments.iterrows():
            judgments.at[index, col] = expanded_vector(row[col], units.at[row['unit'], col])

        unit_work_ann_dict = judgments[['unit', 'worker', col]].copy().groupby('unit')
        unit_work_ann_dict = {name : group.set_index('worker')[col].to_dict() \
                                for name, group in unit_work_ann_dict}

        work_unit_ann_dict = judgments[['worker', 'unit', col]].copy().groupby('worker')
        work_unit_ann_dict = {name : group.set_index('unit')[col].to_dict() \
                                for name, group in work_unit_ann_dict}

        #initialize data structures
        uqs_list = list()
        wqs_list = list()
        wwa_list = list()
        wsa_list = list()
        aqs_list = list()

        uqs = dict((unit_id, 1.0) for unit_id in unit_work_ann_dict)
        wqs = dict((worker_id, 1.0) for worker_id in work_unit_ann_dict)
        wwa = dict((worker_id, 1.0) for worker_id in work_unit_ann_dict)
        wsa = dict((worker_id, 1.0) for worker_id in work_unit_ann_dict)

        uqs_list.append(uqs.copy())
        wqs_list.append(wqs.copy())
        wwa_list.append(wwa.copy())
        wsa_list.append(wsa.copy())

        def init_aqs(config, unit_ann_dict):
            """ initialize aqs depending on whether or not it is an open ended task """
            aqs = dict()
            if not config.open_ended_task:
                aqs_keys = list(unit_ann_dict[list(unit_ann_dict.keys())[0]].keys())
                for ann in aqs_keys:
                    aqs[ann] = 1.0
            else:
                for unit_id in unit_ann_dict:
                    for ann in unit_ann_dict[unit_id]:
                        aqs[ann] = 1.0
            return aqs

        aqs = init_aqs(config, unit_ann_dict)
        aqs_list.append(aqs.copy())

        uqs_len = len(list(uqs.keys())) * 1.0
        wqs_len = len(list(wqs.keys())) * 1.0
        aqs_len = len(list(aqs.keys())) * 1.0

        # compute metrics until stable values
        iterations = 0
        while max_delta >= 0.001:
            uqs_new = dict()
            wqs_new = dict()
            wwa_new = dict()
            wsa_new = dict()

            avg_uqs_delta = 0.0
            avg_wqs_delta = 0.0
            avg_aqs_delta = 0.0
            max_delta = 0.0

            def compute_wqs(wwa_new, wsa_new, wqs_new, work_unit_ann_dict, unit_ann_dict, \
                            unit_work_ann_dict, wqs_list, uqs_list, aqs_list, wqs_len, \
                            max_delta, avg_wqs_delta):
                """ compute worker quality score (WQS) """
                for worker_id, _ in work_unit_ann_dict.items():
                    wwa_new[worker_id] = Metrics.worker_worker_agreement( \
                             worker_id, work_unit_ann_dict, \
                             unit_work_ann_dict, \
                             wqs_list[len(wqs_list) - 1], \
                             uqs_list[len(uqs_list) - 1], \
                             aqs_list[len(aqs_list) - 1])
                    # Index wqs_list by its own length: aqs_list does not grow for
                    # open-ended tasks, so len(aqs_list) would pick a stale entry.
                    wsa_new[worker_id] = Metrics.worker_unit_agreement( \
                             worker_id, \
                             unit_ann_dict, \
                             work_unit_ann_dict, \
                             uqs_list[len(uqs_list) - 1], \
                             aqs_list[len(aqs_list) - 1], \
                             wqs_list[len(wqs_list) - 1][worker_id])
                    wqs_new[worker_id] = wwa_new[worker_id] * wsa_new[worker_id]
                    max_delta = max(max_delta, \
                                abs(wqs_new[worker_id] - wqs_list[len(wqs_list) - 1][worker_id]))
                    avg_wqs_delta += abs(wqs_new[worker_id] - \
                                         wqs_list[len(wqs_list) - 1][worker_id])
                avg_wqs_delta /= wqs_len

                return wwa_new, wsa_new, wqs_new, max_delta, avg_wqs_delta

            def reconstruct_unit_ann_dict(unit_ann_dict, work_unit_ann_dict, wqs_new):
                """ reconstruct unit_ann_dict with worker scores """
                new_unit_ann_dict = dict()
                for unit_id, ann_dict in unit_ann_dict.items():
                    new_unit_ann_dict[unit_id] = dict()
                    for ann, _ in ann_dict.items():
                        new_unit_ann_dict[unit_id][ann] = 0.0
                for work_id, srd in work_unit_ann_dict.items():
                    wqs_work_id = wqs_new[work_id]
                    for unit_id, ann_dict in srd.items():
                        for ann, score in ann_dict.items():
                            new_unit_ann_dict[unit_id][ann] += score * wqs_work_id

                return new_unit_ann_dict

            def compute_aqs(aqs, work_unit_ann_dict, uqs_list, wqs_list, aqs_list, aqs_len, max_delta, avg_aqs_delta):
                """ compute annotation quality score (aqs) """
                aqs_new = Metrics.annotation_quality_score(list(aqs.keys()), work_unit_ann_dict, \
                                                        uqs_list[len(uqs_list) - 1], \
                                                        wqs_list[len(wqs_list) - 1])
                for ann, _ in aqs_new.items():
                    max_delta = max(max_delta, abs(aqs_new[ann] - aqs_list[len(aqs_list) - 1][ann]))
                    avg_aqs_delta += abs(aqs_new[ann] - aqs_list[len(aqs_list) - 1][ann])
                avg_aqs_delta /= aqs_len
                return aqs_new, max_delta, avg_aqs_delta

            def compute_uqs(uqs_new, unit_work_ann_dict, wqs_list, aqs_list, uqs_list, uqs_len, max_delta, avg_uqs_delta):
                """ compute unit quality score (uqs) """
                for unit_id, _ in unit_work_ann_dict.items():
                    uqs_new[unit_id] = Metrics.unit_quality_score(unit_id, unit_work_ann_dict, \
                                                                      wqs_list[len(wqs_list) - 1], \
                                                                      aqs_list[len(aqs_list) - 1])
                    max_delta = max(max_delta, \
                                abs(uqs_new[unit_id] - uqs_list[len(uqs_list) - 1][unit_id]))
                    avg_uqs_delta += abs(uqs_new[unit_id] - uqs_list[len(uqs_list) - 1][unit_id])
                avg_uqs_delta /= uqs_len
                return uqs_new, max_delta, avg_uqs_delta

            if not config.open_ended_task:
                # compute annotation quality score (aqs)
                aqs_new, max_delta, avg_aqs_delta = compute_aqs(aqs, work_unit_ann_dict, \
                                uqs_list, wqs_list, aqs_list, aqs_len, max_delta, avg_aqs_delta)

            # compute unit quality score (uqs)
            uqs_new, max_delta, avg_uqs_delta = compute_uqs(uqs_new, unit_work_ann_dict, \
                                    wqs_list, aqs_list, uqs_list, uqs_len, max_delta, avg_uqs_delta)

            # compute worker quality score (WQS)
            wwa_new, wsa_new, wqs_new, max_delta, avg_wqs_delta = compute_wqs(\
                        wwa_new, wsa_new, wqs_new, \
                        work_unit_ann_dict, unit_ann_dict, unit_work_ann_dict, wqs_list, \
                        uqs_list, aqs_list, wqs_len, max_delta, avg_wqs_delta)

            # save results for current iteration
            uqs_list.append(uqs_new.copy())
            wqs_list.append(wqs_new.copy())
            wwa_list.append(wwa_new.copy())
            wsa_list.append(wsa_new.copy())
            if not config.open_ended_task:
                # Inspection note: the variable aqs_new does not seem to be defined for
                # all execution paths. It is assigned above under this same condition.
                aqs_list.append(aqs_new.copy())
            iterations += 1

            unit_ann_dict = reconstruct_unit_ann_dict(unit_ann_dict, work_unit_ann_dict, wqs_new)

            logging.info(str(iterations) + " iterations; max d= " + str(max_delta) + \
                        " ; wqs d= " + str(avg_wqs_delta) + "; uqs d= " + str(avg_uqs_delta) + \
                        "; aqs d= " + str(avg_aqs_delta))

        def save_unit_ann_score(unit_ann_dict, unit_work_ann_dict, iteration_value):
            """ save the unit annotation score for print """
            uas = Counter()
            for unit_id in unit_ann_dict:
                uas[unit_id] = Counter()
                for ann in unit_ann_dict[unit_id]:
                    uas[unit_id][ann] = Metrics.unit_annotation_score(unit_id, \
                                                ann, unit_work_ann_dict, \
                                                iteration_value)
            return uas

        uas = save_unit_ann_score(unit_ann_dict, unit_work_ann_dict, wqs_list[len(wqs_list) - 1])
        uas_initial = save_unit_ann_score(unit_ann_dict, unit_work_ann_dict, wqs_list[0])

        results['units']['uqs'] = pd.Series(uqs_list[-1])
        results['units']['unit_annotation_score'] = pd.Series(uas)
        results['workers']['wqs'] = pd.Series(wqs_list[-1])
        results['workers']['wwa'] = pd.Series(wwa_list[-1])
        results['workers']['wsa'] = pd.Series(wsa_list[-1])
        if not config.open_ended_task:
            results['annotations']['aqs'] = pd.Series(aqs_list[-1])

        results['units']['uqs_initial'] = pd.Series(uqs_list[1])
        results['units']['unit_annotation_score_initial'] = pd.Series(uas_initial)
        results['workers']['wqs_initial'] = pd.Series(wqs_list[1])
        results['workers']['wwa_initial'] = pd.Series(wwa_list[1])
        results['workers']['wsa_initial'] = pd.Series(wsa_list[1])
        if not config.open_ended_task:
            results['annotations']['aqs_initial'] = pd.Series(aqs_list[1])
        return results
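
The three agreement methods in the listing above repeat the same inner loop for an annotation-weighted cosine. The sketch below shows how that loop could be extracted into one helper, in line with the Long Method advice. The helper name `weighted_cosine`, its signature and the zero-denominator fallback are assumptions, not part of the inspected code. The fallback mirrors the guard in `worker_unit_agreement`; `unit_quality_score` and `worker_worker_agreement` divide without a guard, so adopting the helper there would change their behavior for all-zero vectors.

```python
import math

SMALL_NUMBER_CONST = 0.00000001


def weighted_cosine(vector_a, vector_b, weights):
    """Cosine similarity of two annotation vectors, weighted per annotation.

    vector_a, vector_b: dicts mapping annotation -> value (same keys assumed).
    weights: dict mapping annotation -> weight, for example the aqs scores.
    """
    numerator = 0.0
    norm_a = 0.0
    norm_b = 0.0
    for ann, value_a in vector_a.items():
        value_b = vector_b[ann]
        numerator += weights[ann] * value_a * value_b
        norm_a += weights[ann] * value_a * value_a
        norm_b += weights[ann] * value_b * value_b
    denominator = math.sqrt(norm_a * norm_b)
    # Assumption: return the small constant instead of dividing by (near) zero.
    if denominator < SMALL_NUMBER_CONST:
        return SMALL_NUMBER_CONST
    return numerator / denominator


# Hypothetical data: two annotations with equal weight.
aqs = {"a": 1.0, "b": 1.0}
same = weighted_cosine({"a": 1, "b": 0}, {"a": 1, "b": 0}, aqs)
disjoint = weighted_cosine({"a": 1, "b": 0}, {"a": 0, "b": 1}, aqs)
```

Identical vectors score 1.0 and vectors with no shared annotation score 0.0, which is the behavior the UQS docstring describes for worker pairs.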