Conditions: 26
Total Lines: 58
Code Lines: 51
Duplicated Lines: 0
Duplication Ratio: 0 %
Changes: 0
Small methods make your code easier to understand, especially when combined with a good name; and the smaller a method is, the easier it usually is to find a good name for it.
For example, if you find yourself adding comments to a method's body, that is usually a sign that the commented part should be extracted into a new method, with the comment serving as a starting point for the new method's name (see the sketch after the list below).
Commonly applied refactorings include:

- Extract Method

If many parameters/temporary variables are present:

- Replace Temp with Query
- Introduce Parameter Object
- Preserve Whole Object
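To make the first item concrete, here is a minimal, hypothetical sketch of Extract Method (process_order and _validate_shipping_address are invented for illustration and are not Annif code), in which an inline comment becomes the name of the extracted helper:

def process_order(order):
    # Before the refactoring, the checks below sat inline behind a comment
    # like "# validate the shipping address"; Extract Method turns that
    # comment into the helper's name.
    _validate_shipping_address(order)
    return 'accepted'


def _validate_shipping_address(order):
    """Extracted helper, named after the comment it replaced."""
    if not order.get('street') or not order.get('postal_code'):
        raise ValueError('incomplete shipping address')


# Usage: process_order({'street': 'Main St 1', 'postal_code': '00100'})

The calling method now reads as a list of named steps, and each step can be understood and tested on its own.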
Complex classes like annif.eval.EvaluationBatch, whose _evaluate_samples() method is shown below, often do a lot of different things. To break such a class down, we need to identify a cohesive component within it. A common approach to finding such a component is to look for fields/methods that share the same prefixes or suffixes.
Once you have determined the fields that belong together, you can apply the Extract Class refactoring. If the component makes sense as a sub-class, Extract Subclass is also a candidate, and is often faster. One possible decomposition of _evaluate_samples() is sketched after the code listing below.
1 | """Evaluation metrics for Annif""" |
||
98 | def _evaluate_samples(self, y_true, y_pred, metrics='all'): |
||
99 | y_pred_binary = y_pred > 0.0 |
||
100 | |||
101 | # define the available metrics as lazy lambda functions |
||
102 | # so we can execute only the ones actually requested |
||
103 | all_metrics = { |
||
104 | 'Precision (doc avg)': lambda: precision_score( |
||
105 | y_true, y_pred_binary, average='samples'), |
||
106 | 'Recall (doc avg)': lambda: recall_score( |
||
107 | y_true, y_pred_binary, average='samples'), |
||
108 | 'F1 score (doc avg)': lambda: f1_score( |
||
109 | y_true, y_pred_binary, average='samples'), |
||
110 | 'Precision (subj avg)': lambda: precision_score( |
||
111 | y_true, y_pred_binary, average='macro'), |
||
112 | 'Recall (subj avg)': lambda: recall_score( |
||
113 | y_true, y_pred_binary, average='macro'), |
||
114 | 'F1 score (subj avg)': lambda: f1_score( |
||
115 | y_true, y_pred_binary, average='macro'), |
||
116 | 'Precision (weighted subj avg)': lambda: precision_score( |
||
117 | y_true, y_pred_binary, average='weighted'), |
||
118 | 'Recall (weighted subj avg)': lambda: recall_score( |
||
119 | y_true, y_pred_binary, average='weighted'), |
||
120 | 'F1 score (weighted subj avg)': lambda: f1_score( |
||
121 | y_true, y_pred_binary, average='weighted'), |
||
122 | 'Precision (microavg)': lambda: precision_score( |
||
123 | y_true, y_pred_binary, average='micro'), |
||
124 | 'Recall (microavg)': lambda: recall_score( |
||
125 | y_true, y_pred_binary, average='micro'), |
||
126 | 'F1 score (microavg)': lambda: f1_score( |
||
127 | y_true, y_pred_binary, average='micro'), |
||
128 | 'F1@5': lambda: f1_score( |
||
129 | y_true, filter_pred_top_k(y_pred, 5) > 0.0, average='samples'), |
||
130 | 'NDCG': lambda: ndcg_score(y_true, y_pred), |
||
131 | 'NDCG@5': lambda: ndcg_score(y_true, y_pred, limit=5), |
||
132 | 'NDCG@10': lambda: ndcg_score(y_true, y_pred, limit=10), |
||
133 | 'Precision@1': lambda: precision_at_k_score( |
||
134 | y_true, y_pred, limit=1), |
||
135 | 'Precision@3': lambda: precision_at_k_score( |
||
136 | y_true, y_pred, limit=3), |
||
137 | 'Precision@5': lambda: precision_at_k_score( |
||
138 | y_true, y_pred, limit=5), |
||
139 | 'LRAP': lambda: label_ranking_average_precision_score( |
||
140 | y_true, y_pred), |
||
141 | 'True positives': lambda: true_positives( |
||
142 | y_true, y_pred_binary), |
||
143 | 'False positives': lambda: false_positives( |
||
144 | y_true, y_pred_binary), |
||
145 | 'False negatives': lambda: false_negatives( |
||
146 | y_true, y_pred_binary), |
||
147 | } |
||
148 | |||
149 | if metrics == 'all': |
||
150 | metrics = all_metrics.keys() |
||
151 | |||
152 | with warnings.catch_warnings(): |
||
153 | warnings.simplefilter('ignore') |
||
154 | |||
155 | return {metric: all_metrics[metric]() for metric in metrics} |
||
156 | |||
225 |
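Applied to the listing above, one possible decomposition (a sketch only: _metric_functions is a hypothetical helper name, and the metric functions are assumed to be the same ones already defined or imported in annif/eval.py) moves the construction of the lazy metric dictionary out of _evaluate_samples, leaving the method to select and run the requested metrics:

def _metric_functions(self, y_true, y_pred, y_pred_binary):
    """Build the lazy metric lambdas, keyed by their display names."""
    return {
        'Precision (doc avg)': lambda: precision_score(
            y_true, y_pred_binary, average='samples'),
        'NDCG': lambda: ndcg_score(y_true, y_pred),
        # ... the remaining entries from the listing above, unchanged ...
    }

def _evaluate_samples(self, y_true, y_pred, metrics='all'):
    y_pred_binary = y_pred > 0.0
    all_metrics = self._metric_functions(y_true, y_pred, y_pred_binary)

    if metrics == 'all':
        metrics = all_metrics.keys()

    with warnings.catch_warnings():
        warnings.simplefilter('ignore')
        return {metric: all_metrics[metric]() for metric in metrics}

If other parts of EvaluationBatch needed the same metric definitions, the grouping could instead become a small class of its own, which is the Extract Class direction suggested above.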