"""
Demographic Classification Fairness Criteria.

The objective of the demographic classification fairness criteria
is to measure unfairness towards sensitive attribute values.

The metrics have the same interface and behavior as the ones in
:mod:`sklearn.metrics`
(e.g., using ``y_true``, ``y_pred`` and ``y_score``).

One should keep in mind that the criteria are intended
to *measure unfairness, rather than to prove fairness*, as stated in
the paper `Equality of opportunity in supervised learning <https://arxiv.org/abs/1610.02413>`_
by Hardt et al. (2016):

    ... satisfying [the demographic criteria] should not be
    considered a conclusive proof of fairness.
    Similarly, violations of our condition are not meant
    to be a proof of unfairness.
    Rather we envision our framework as providing a reasonable way
    of discovering and measuring potential concerns that require
    further scrutiny. We believe that resolving fairness concerns is
    ultimately impossible without substantial domain-specific
    investigation.

The output of a binary classifier can come in two forms: either
a binary outcome prediction for each input, or
a real-valued score, most commonly the probability
of the positive or negative label
(such as the one returned by the ``predict_proba`` method of an ``Estimator`` in ``sklearn``).
Therefore, the criteria come in two flavors: one for **binary** output
and one for **score** output.
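
As a minimal sketch of how the two forms relate, a score output can be
reduced to a binary output by thresholding. The scores and the ``0.5``
threshold below are illustrative assumptions, not values used by this package:

```python
# Hypothetical scores for five inputs, e.g. probabilities of the positive label.
y_score = [0.91, 0.35, 0.50, 0.08, 0.77]

# Assumed decision threshold; scores at or above it map to the positive label.
threshold = 0.5
y_pred = [1 if score >= threshold else 0 for score in y_score]
```

The **score** criteria work on values like ``y_score``, while the **binary**
criteria work on values like ``y_pred``.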

The fundamental concept for defining the fairness criteria
is `conditional independence <https://en.wikipedia.org/wiki/Conditional_independence>`_.
Using the notation of the book *Machine Learning and Fairness*:

- ``A`` - Sensitive attribute
- ``Y`` - Binary ground truth (correct) target
- ``R`` - Estimated binary targets or score as returned by a classifier

There are three demographic fairness criteria for classification:

1. Independence - R⊥A

2. Separation - R⊥A∣Y

3. Sufficiency - Y⊥A∣R
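
For binary ``R``, each criterion can be checked by comparing per-group rates.
The following is a minimal sketch in plain Python, not this package's
implementation: ``group_rates`` is a hypothetical helper and the data are
made up for illustration.

```python
from collections import defaultdict


def group_rates(y_true, y_pred, attr):
    # Count the confusion-matrix cells separately for each attribute value.
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "tn": 0, "fn": 0})
    for y, r, a in zip(y_true, y_pred, attr):
        cell = ("t" if y == r else "f") + ("p" if r == 1 else "n")
        counts[a][cell] += 1

    def ratio(num, den):
        # Undefined rates (empty denominator) are reported as None.
        return num / den if den else None

    rates = {}
    for a, c in counts.items():
        total = c["tp"] + c["fp"] + c["tn"] + c["fn"]
        rates[a] = {
            # Independence: P(R=1 | A=a) should match across groups.
            "acceptance_rate": ratio(c["tp"] + c["fp"], total),
            # Separation: P(R=1 | Y=y, A=a) should match across groups.
            "tpr": ratio(c["tp"], c["tp"] + c["fn"]),
            "fpr": ratio(c["fp"], c["fp"] + c["tn"]),
            # Sufficiency: P(Y=1 | R=1, A=a) and P(Y=0 | R=0, A=a)
            # should match across groups.
            "ppv": ratio(c["tp"], c["tp"] + c["fp"]),
            "npv": ratio(c["tn"], c["tn"] + c["fn"]),
        }
    return rates


# Illustrative data: two groups of four examples each.
y_true = [1, 0, 1, 0, 1, 0, 1, 0]
y_pred = [1, 0, 1, 1, 1, 0, 0, 0]
attr = ["a", "a", "a", "a", "b", "b", "b", "b"]
rates = group_rates(y_true, y_pred, attr)
```

Large gaps between the groups in ``acceptance_rate`` point to a violation of
independence, gaps in ``tpr`` or ``fpr`` to a violation of separation, and
gaps in ``ppv`` or ``npv`` to a violation of sufficiency.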

"""


from ethically.fairness.metrics.binary import (
    independence_binary, report_binary, separation_binary, sufficiency_binary,
)
from ethically.fairness.metrics.score import (
    independence_score, roc_auc_score_by_attr, roc_curve_by_attr,
    separation_score, sufficiency_score,
)
from ethically.fairness.metrics.visualization import (
    distplot_by, plot_roc_by_attr, plot_roc_curves,
)