Passed
Push — dev ( da1a83...c90b28 )
by Shlomi
01:50
created

BiasWordEmbedding.generate_analogies()   B

Complexity

Conditions 6

Size

Total Lines 70
Code Lines 38

Duplication

Lines 0
Ratio 0 %

Importance

Changes 0
Metric Value
cc 6
eloc 38
nop 5
dl 0
loc 70
rs 8.0346
c 0
b 0
f 0

How to fix: Long Method

Small methods make your code easier to understand, in particular if combined with a good name. Besides, if your method is small, finding a good name is usually much easier.

For example, if you find yourself adding comments to a method's body, this is usually a good sign to extract the commented part to a new method, and use the comment as a starting point when coming up with a good name for this new method.

Commonly applied refactorings include:

Extract Method
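
For example, a minimal sketch (with hypothetical names) of extracting a commented step into its own, well-named method:

    # Before: one long method with a comment marking a logical step
    def report_projections(self, words):
        # keep only the words that exist in the model
        words = [word for word in words if word in self.model]
        ...

    # After: the commented step becomes a small method named after the comment
    def _filter_words_by_model(self, words):
        return [word for word in words if word in self.model]

    def report_projections(self, words):
        words = self._filter_words_by_model(words)
        ...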

1
# pylint: disable=too-many-lines
2
"""
3
Measuring and adjusting bias in word embedding, based on Bolukbasi et al. (2016).
4
5
References:
6
    - Bolukbasi, T., Chang, K. W., Zou, J. Y., Saligrama, V.,
7
      & Kalai, A. T. (2016).
8
      `Man is to computer programmer as woman is to homemaker?
9
      debiasing word embeddings <https://arxiv.org/abs/1607.06520>`_.
10
      In Advances in neural information processing systems
11
      (pp. 4349-4357).
12
13
    - The code and data are based on the GitHub repository:
14
      https://github.com/tolga-b/debiaswe (MIT License).
15
16
    - Gonen, H., & Goldberg, Y. (2019).
17
      `Lipstick on a Pig:
18
      Debiasing Methods Cover up Systematic Gender Biases
19
      in Word Embeddings But do not Remove Them
20
      <https://arxiv.org/abs/1903.03862>`_.
21
      arXiv preprint arXiv:1903.03862.
22
23
    - Nissim, M., van Noord, R., van der Goot, R. (2019).
24
      `Fair is Better than Sensational: Man is to Doctor
25
      as Woman is to Doctor <https://arxiv.org/abs/1905.09866>`_.
26
27
Usage
28
~~~~~
29
30
.. code:: python
31
32
   >>> from ethically.we import GenderBiasWE
33
   >>> from gensim import downloader
34
   >>> w2v_model = downloader.load('word2vec-google-news-300')
35
   >>> w2v_gender_bias_we = GenderBiasWE(w2v_model)
36
   >>> w2v_gender_bias_we.calc_direct_bias()
37
   0.07307904249481942
38
   >>> w2v_gender_bias_we.debias()
39
   >>> w2v_gender_bias_we.calc_direct_bias()
40
   1.7964246601064155e-09
41
42
Types of Bias
43
~~~~~~~~~~~~~
44
45
Direct Bias
46
^^^^^^^^^^^
47
48
1. Associations
49
    Words that are closer to one end (e.g., *he*) than to
50
    the other end (*she*).
51
    For example, occupational stereotypes (page 7).
52
    Calculated by
53
    :meth:`~ethically.we.bias.BiasWordEmbedding.calc_direct_bias`.
54
55
2. Analogies
56
    Analogies of *he:x::she:y*.
57
    For example, analogies exhibiting stereotypes (page 7).
58
    Generated by
59
    :meth:`~ethically.we.bias.BiasWordEmbedding.generate_analogies`.
60
61
62
Indirect Bias
63
^^^^^^^^^^^^^
64
65
The projection of a neutral word onto a direction defined by two other neutral words
66
is explained, to a large extent, by a shared projection onto the bias direction.
67
68
Calculated by
69
:meth:`~ethically.we.bias.BiasWordEmbedding.calc_indirect_bias`
70
and
71
:meth:`~ethically.we.bias.GenderBiasWE.generate_closest_words_indirect_bias`.
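
For example, a sketch of generating analogies and measuring indirect bias
(assuming the ``w2v_gender_bias_we`` object from the usage section above,
and that the example words exist in the model's vocabulary):

.. code:: python

   >>> analogies_df = w2v_gender_bias_we.generate_analogies(n_analogies=10)
   >>> indirect = w2v_gender_bias_we.calc_indirect_bias('softball', 'pitcher')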
72
73
"""
74
75
import copy
76
import warnings
77
78
import matplotlib.pylab as plt
79
import numpy as np
80
import pandas as pd
81
import seaborn as sns
82
from scipy.stats import spearmanr
83
from sklearn.decomposition import PCA
84
from sklearn.metrics.pairwise import euclidean_distances
85
from sklearn.svm import LinearSVC
86
from tqdm import tqdm
87
88
from tabulate import tabulate
89
90
from ..consts import RANDOM_STATE
91
from .benchmark import evaluate_word_embedding
92
from .data import BOLUKBASI_DATA
93
from .utils import (
94
    assert_gensim_keyed_vectors, cosine_similarity, generate_one_word_forms,
95
    generate_words_forms, get_seed_vector, most_similar, normalize,
96
    plot_clustering_as_classification, project_params, project_reject_vector,
97
    project_vector, reject_vector, round_to_extreme,
98
    take_two_sides_extreme_sorted, update_word_vector,
99
)
100
101
102
DIRECTION_METHODS = ['single', 'sum', 'pca']
103
DEBIAS_METHODS = ['neutralize', 'hard', 'soft']
104
FIRST_PC_THRESHOLD = 0.5
105
MAX_NON_SPECIFIC_EXAMPLES = 1000
106
107
__all__ = ['GenderBiasWE', 'BiasWordEmbedding']
108
109
# https://stackoverflow.com/questions/2187269/print-only-the-message-on-warnings
110
formatwarning_orig = warnings.formatwarning
111
# pylint: disable=line-too-long
112
warnings.formatwarning = lambda message, category, filename, lineno, line=None: \
113
    formatwarning_orig(message, category, filename, lineno, line='')
114
115
116
class BiasWordEmbedding:
117
    """Measure and adjust a bias in English word embedding.
118
119
    :param model: Word embedding model of ``gensim.model.KeyedVectors``
120
    :param bool only_lower: Whether the word embedding contains
121
                            only lower case words
122
    :param bool verbose: Set verbosity
123
    :param bool to_normalize: Whether to normalize all the vectors
124
                              (recommended!)
125
    """
126
127
    def __init__(self, model, only_lower=False, verbose=False,
128
                 identify_direction=False, to_normalize=True):
129
        assert_gensim_keyed_vectors(model)
130
131
        # TODO: this is bad Python, ask someone about it
132
        # probably should be a better design
133
        # identify_direction doesn't have any meaning
134
        # for the class BiasWordEmbedding
135
        if self.__class__ == __class__ and identify_direction is not False:
Comprehensibility Best Practice: The variable __class__ does not seem to be defined.
136
            raise ValueError('identify_direction must be False'
137
                             ' for an instance of {}'
138
                             .format(__class__))
139
140
        self.model = model
141
142
        # TODO: write unittest for when it is False
143
        self.only_lower = only_lower
144
145
        self._verbose = verbose
146
147
        self.direction = None
148
        self.positive_end = None
149
        self.negative_end = None
150
151
        if to_normalize:
152
            self.model.init_sims(replace=True)
153
154
    def __copy__(self):
155
        bias_word_embedding = self.__class__(self.model,
156
                                             self.only_lower,
157
                                             self._verbose,
158
                                             identify_direction=False)
159
        bias_word_embedding.direction = copy.deepcopy(self.direction)
160
        bias_word_embedding.positive_end = copy.deepcopy(self.positive_end)
161
        bias_word_embedding.negative_end = copy.deepcopy(self.negative_end)
162
        return bias_word_embedding
163
164
    def __deepcopy__(self, memo):
165
        bias_word_embedding = copy.copy(self)
166
        bias_word_embedding.model = copy.deepcopy(bias_word_embedding.model)
167
        return bias_word_embedding
168
169
    def __getitem__(self, key):
170
        return self.model[key]
171
172
    def __contains__(self, item):
173
        return item in self.model
174
175
    def _filter_words_by_model(self, words):
176
        return [word for word in words if word in self]
177
178
    def _is_direction_identified(self):
179
        if self.direction is None:
180
            raise RuntimeError('The direction was not identified'
181
                               ' for this {} instance'
182
                               .format(self.__class__.__name__))
183
184
    # There is a mistake in the article
185
    # it is written (section 5.1):
186
    # "To identify the gender subspace, we took the ten gender pair difference
187
    # vectors and computed its principal components (PCs)"
188
    # however in the source code:
189
    # https://github.com/tolga-b/debiaswe/blob/10277b23e187ee4bd2b6872b507163ef4198686b/debiaswe/we.py#L235-L245
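    # In practice, each definitional pair contributes two rows to the matrix
    # (each normalized vector minus the pair's center), and the bias subspace
    # is spanned by the leading principal components of that matrix.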
190
    def _identify_subspace_by_pca(self, definitional_pairs, n_components):
191
        matrix = []
192
193
        for word1, word2 in definitional_pairs:
194
            vector1 = normalize(self[word1])
195
            vector2 = normalize(self[word2])
196
197
            center = (vector1 + vector2) / 2
198
199
            matrix.append(vector1 - center)
200
            matrix.append(vector2 - center)
201
202
        pca = PCA(n_components=n_components)
203
        pca.fit(matrix)
204
205
        if self._verbose:
206
            table = enumerate(pca.explained_variance_ratio_, start=1)
207
            headers = ['Principal Component',
208
                       'Explained Variance Ratio']
209
            print(tabulate(table, headers=headers))
210
211
        return pca
212
213
    # TODO: add the SVD method from section 6 step 1
214
    # It seems there is a mistake there, I think it is the same as PCA
215
    # just with replacing it with SVD
216
    def _identify_direction(self, positive_end, negative_end,
217
                            definitional, method='pca'):
218
        if method not in DIRECTION_METHODS:
219
            raise ValueError('method should be one of {}, {} was given'.format(
220
                DIRECTION_METHODS, method))
221
222
        if positive_end == negative_end:
223
            raise ValueError('positive_end and negative_end '
224
                             'should be different, and not the same "{}"'
225
                             .format(positive_end))
226
        if self._verbose:
227
            print('Identify direction using {} method...'.format(method))
228
229
        direction = None
230
231
        if method == 'single':
232
            direction = normalize(normalize(self[definitional[0]])
233
                                  - normalize(self[definitional[1]]))
234
235
        elif method == 'sum':
236
            group1_sum_vector = np.sum([self[word]
237
                                        for word in definitional[0]], axis=0)
238
            group2_sum_vector = np.sum([self[word]
239
                                        for word in definitional[1]], axis=0)
240
241
            diff_vector = (normalize(group1_sum_vector)
242
                           - normalize(group2_sum_vector))
243
244
            direction = normalize(diff_vector)
245
246
        elif method == 'pca':
247
            pca = self._identify_subspace_by_pca(definitional, 10)
248
            if pca.explained_variance_ratio_[0] < FIRST_PC_THRESHOLD:
249
                raise RuntimeError('The explained variance '
250
                                   'of the first principal component should be '
251
                                   'at least {}, but it is {}'
252
                                   .format(FIRST_PC_THRESHOLD,
253
                                           pca.explained_variance_ratio_[0]))
254
            direction = pca.components_[0]
255
256
            # if direction is opposite (e.g. we cannot control
257
            # what the PCA will return)
258
            ends_diff_projection = cosine_similarity((self[positive_end]
259
                                                      - self[negative_end]),
260
                                                     direction)
261
            if ends_diff_projection < 0:
262
                direction = -direction  # pylint: disable=invalid-unary-operand-type
263
264
        self.direction = direction
265
        self.positive_end = positive_end
266
        self.negative_end = negative_end
267
268
    def project_on_direction(self, word):
269
        """Project the normalized vector of the word on the direction.
270
271
        :param str word: The word to project
272
        :return float: The projection scalar
273
        """
274
275
        self._is_direction_identified()
276
277
        vector = self[word]
278
        projection_score = self.model.cosine_similarities(self.direction,
279
                                                          [vector])[0]
280
        return projection_score
281
282
    def _calc_projection_scores(self, words):
283
        self._is_direction_identified()
284
285
        df = pd.DataFrame({'word': words})
286
287
        # TODO: maybe using cosine_similarities on all the vectors?
288
        # it might be faster
289
        df['projection'] = df['word'].apply(self.project_on_direction)
290
        df = df.sort_values('projection', ascending=False)
291
292
        return df
293
294
    def calc_projection_data(self, words):
295
        """
296
        Calculate the projection, projected and rejected vectors of a list of words.
297
298
        :param list words: List of words
299
        :return: :class:`pandas.DataFrame` of the projection,
300
                 projected and rejected vectors of the words list
301
        """
302
        projection_data = []
303
        for word in words:
304
            vector = self[word]
305
            projection = self.project_on_direction(word)
306
            normalized_vector = normalize(vector)
307
308
            (projection,
309
             projected_vector,
310
             rejected_vector) = project_params(normalized_vector,
311
                                               self.direction)
312
313
            projection_data.append({'word': word,
314
                                    'vector': vector,
315
                                    'projection': projection,
316
                                    'projected_vector': projected_vector,
317
                                    'rejected_vector': rejected_vector})
318
319
        return pd.DataFrame(projection_data)
320
321
    def plot_projection_scores(self, words, n_extreme=10,
322
                               ax=None, axis_projection_step=None):
323
        """Plot the projection scalar of words on the direction.
324
325
        :param list words: The words to project
326
        :param int or None n_extreme: The number of extreme words to show
327
        :return: The ax object of the plot
328
        """
329
330
        self._is_direction_identified()
331
332
        projections_df = self._calc_projection_scores(words)
333
        projections_df['projection'] = projections_df['projection'].round(2)
334
335
        if n_extreme is not None:
336
            projections_df = take_two_sides_extreme_sorted(projections_df,
337
                                                           n_extreme=n_extreme)
338
339
        if ax is None:
340
            _, ax = plt.subplots(1)
341
342
        if axis_projection_step is None:
343
            axis_projection_step = 0.1
344
345
        cmap = plt.get_cmap('RdBu')
346
        projections_df['color'] = ((projections_df['projection'] + 0.5)
347
                                   .apply(cmap))
348
349
        most_extream_projection = (projections_df['projection']
350
                                   .abs()
351
                                   .max()
352
                                   .round(1))
353
354
        sns.barplot(x='projection', y='word', data=projections_df,
355
                    palette=projections_df['color'])
356
357
        plt.xticks(np.arange(-most_extream_projection,
358
                             most_extream_projection + axis_projection_step,
359
                             axis_projection_step))
360
        plt.title('← {} {} {} →'.format(self.negative_end,
361
                                        ' ' * 20,
362
                                        self.positive_end))
363
364
        plt.xlabel('Direction Projection')
365
        plt.ylabel('Words')
366
367
        return ax
368
369
    def plot_dist_projections_on_direction(self, word_groups, ax=None):
370
        """Plot the projection scalars distribution on the direction.
371
372
        :param dict word_groups: The word groups to project
373
        :return: The ax object of the plot
374
        """
375
376
        if ax is None:
377
            _, ax = plt.subplots(1)
378
379
        names = sorted(word_groups.keys())
380
381
        for name in names:
382
            words = word_groups[name]
383
            label = '{} (#{})'.format(name, len(words))
384
            vectors = [self[word] for word in words]
385
            projections = self.model.cosine_similarities(self.direction,
386
                                                         vectors)
387
            sns.distplot(projections, hist=False, label=label, ax=ax)
388
389
        plt.axvline(0, color='k', linestyle='--')
390
391
        plt.title('← {} {} {} →'.format(self.negative_end,
392
                                        ' ' * 20,
393
                                        self.positive_end))
394
        plt.xlabel('Direction Projection')
395
        plt.ylabel('Density')
396
        ax.legend(loc='center left', bbox_to_anchor=(1, 0.5))
397
398
        return ax
399
400
    @classmethod
401
    def _calc_bias_across_word_embeddings(cls,
402
                                          word_embedding_bias_dict,
403
                                          words):
404
        """
405
        Calculate the projections and Spearman rho of words for two word embeddings.
406
407
        :param dict word_embedding_bias_dict: ``BiasWordEmbedding`` objects
408
                                               as values,
409
                                               and their names as keys.
410
        :param list words: Words to be projected.
411
        :return tuple: Projections and spearman rho.
412
        """
413
        # pylint: disable=W0212
414
        assert len(word_embedding_bias_dict) == 2, 'Supports only two '\
415
                                                    'word embeddings'
416
417
        intersection_words = [word for word in words
418
                              if all(word in web
419
                                     for web in (word_embedding_bias_dict
420
                                                 .values()))]
421
422
        projections = {name: web._calc_projection_scores(intersection_words)['projection']  # pylint: disable=C0301
423
                       for name, web in word_embedding_bias_dict.items()}
424
425
        df = pd.DataFrame(projections)
426
        df.index = intersection_words
427
428
        rho, _ = spearmanr(*df.transpose().values)
429
        return df, rho
430
431
    @classmethod
432
    def plot_bias_across_word_embeddings(cls, word_embedding_bias_dict,
433
                                         words, ax=None, scatter_kwargs=None):
434
        """
435
        Plot the projections of the same words for two word embeddings.
436
437
        :param dict word_embedding_bias_dict: ``BiasWordEmbedding`` objects
438
                                               as values,
439
                                               and their names as keys.
440
        :param list words: Words to be projected.
441
        :param scatter_kwargs: Kwargs for matplotlib.pylab.scatter.
442
        :type scatter_kwargs: dict or None
443
        :return: The ax object of the plot
444
        """
445
        # pylint: disable=W0212
446
447
        df, rho = cls._calc_bias_across_word_embeddings(word_embedding_bias_dict,  # pylint: disable=C0301
448
                                                        words)
449
450
        if ax is None:
451
            _, ax = plt.subplots(1)
452
453
        if scatter_kwargs is None:
454
            scatter_kwargs = {}
455
456
        name1, name2 = word_embedding_bias_dict.keys()
457
458
        ax.scatter(x=name1, y=name2, data=df, **scatter_kwargs)
459
460
        plt.title('Bias Across Word Embeddings '
461
                  '(Spearman Rho = {:0.2f})'.format(rho))
462
463
        negative_end = word_embedding_bias_dict[name1].negative_end
464
        positive_end = word_embedding_bias_dict[name1].positive_end
465
        plt.xlabel('← {}     {}     {} →'.format(negative_end,
466
                                                 name1,
467
                                                 positive_end))
468
        plt.ylabel('← {}     {}     {} →'.format(negative_end,
469
                                                 name2,
470
                                                 positive_end))
471
472
        ax_min = round_to_extreme(df.values.min())
473
        ax_max = round_to_extreme(df.values.max())
474
        plt.xlim(ax_min, ax_max)
475
        plt.ylim(ax_min, ax_max)
476
477
        return ax
478
479
    # TODO: refactor for speed and clarity
480
    def generate_analogies(self, n_analogies=100, seed='ends',
481
                           multiple=False,
482
                           delta=1., restrict_vocab=30000,
483
                           unrestricted=False):
484
        """
485
        Generate analogies based on a seed vector.
486
487
        x - y ~ seed vector,
488
        or a:x::b:y when a-b ~ seed vector.
489
490
        The seed vector can be defined by two word ends,
491
        or by the bias direction.
492
493
        ``delta`` is used for semantic coherence. The default value of 1
494
        corresponds to an angle <= pi/3.
495
496
497
        There is criticism regarding generating analogies
498
        when used with `unrestricted=False` and not ignoring analogies
499
        with `match` column equal to `False`.
500
        Tolga's technique of generating analogies, as implemented in this
501
        method, is inherently limited to analogies with x != y, which may
502
        force "fake" bias analogies.
503
504
        See:
505
506
        - Nissim, M., van Noord, R., van der Goot, R. (2019).
507
          `Fair is Better than Sensational: Man is to Doctor
508
          as Woman is to Doctor <https://arxiv.org/abs/1905.09866>`_.
509
510
        :param seed: The definition of the seed vector.
511
                     Either by a tuple of two word ends,
512
                     or by `'ends'` for the pre-defined ends,
513
                     or by `'direction'` for the pre-defined direction vector.
514
        :param int n_analogies: Number of analogies to generate.
515
        :param bool multiple: Whether to allow multiple appearances of a word
516
                              in the analogies.
517
        :param float delta: Threshold for semantic similarity.
518
                            The maximal distance between x and y.
519
        :param int restrict_vocab: The vocabulary size to use.
520
        :param bool unrestricted: Whether to validate the generated analogies
521
                                  with unrestricted `most_similar`.
522
        :return: Data Frame of analogies (x, y), their distances,
523
                 and their cosine similarity scores
524
        """
525
        # pylint: disable=C0301,R0914
526
527
        if not unrestricted:
528
            warnings.warn('Not using unrestricted most_similar '
529
                          'may introduce fake biased analogies.')
530
531
        (seed_vector,
532
         positive_end,
533
         negative_end) = get_seed_vector(seed, self)
534
535
        restrict_vocab_vectors = self.model.vectors[:restrict_vocab]
536
537
        normalized_vectors = (restrict_vocab_vectors
538
                              / np.linalg.norm(restrict_vocab_vectors, axis=1)[:, None])
539
540
        pairs_distances = euclidean_distances(normalized_vectors, normalized_vectors)
541
542
        # `pairs_distances` must be non-zero;
543
        # otherwise, x - y will be the zero vector, and every cosine similarity
544
        # will be equal to zero.
545
        # This causes the **limitation** of this method, which enforces
546
        # x and y to be different words.
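        # For unit vectors x and y, ||x - y||^2 = 2 - 2*cos(angle(x, y)),
        # so ||x - y|| < 1 (the default delta) corresponds to
        # cos(angle(x, y)) > 0.5, i.e. an angle smaller than pi/3.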
547
        pairs_mask = (pairs_distances < delta) & (pairs_distances != 0)
548
549
        pairs_indices = np.array(np.nonzero(pairs_mask)).T
550
        x_vectors = np.take(normalized_vectors, pairs_indices[:, 0], axis=0)
551
        y_vectors = np.take(normalized_vectors, pairs_indices[:, 1], axis=0)
552
553
        x_minus_y_vectors = x_vectors - y_vectors
554
        normalized_x_minus_y_vectors = (x_minus_y_vectors
555
                                        / np.linalg.norm(x_minus_y_vectors, axis=1)[:, None])
556
557
        cos_distances = normalized_x_minus_y_vectors @ seed_vector
558
559
        sorted_cos_distances_indices = np.argsort(cos_distances)[::-1]
560
561
        sorted_cos_distances_indices_iter = iter(sorted_cos_distances_indices)
562
563
        analogies = []
564
        generated_words_x = set()
565
        generated_words_y = set()
566
567
        while len(analogies) < n_analogies:
568
            cos_distance_index = next(sorted_cos_distances_indices_iter)
569
            paris_index = pairs_indices[cos_distance_index]
570
            word_x, word_y = [self.model.index2word[index]
571
                              for index in paris_index]
572
573
            if multiple or (not multiple
574
                            and (word_x not in generated_words_x
575
                                 and word_y not in generated_words_y)):
576
577
                analogy = ({positive_end: word_x,
578
                            negative_end: word_y,
579
                            'score': cos_distances[cos_distance_index],
580
                            'distance': pairs_distances[tuple(paris_index)]})
581
582
                generated_words_x.add(word_x)
583
                generated_words_y.add(word_y)
584
585
                if unrestricted:
586
                    most_x = next(word
587
                                  for word, _ in most_similar(self.model,
588
                                                              [word_y, positive_end],
589
                                                              [negative_end]))
590
                    most_y = next(word
591
                                  for word, _ in most_similar(self.model,
592
                                                              [word_x, negative_end],
593
                                                              [positive_end]))
594
595
                    analogy['most_x'] = most_x
596
                    analogy['most_y'] = most_y
597
                    analogy['match'] = ((word_x == most_x)
598
                                        and (word_y == most_y))
599
600
                analogies.append(analogy)
601
602
        df = pd.DataFrame(analogies)
603
604
        columns = [positive_end, negative_end, 'distance', 'score']
605
606
        if unrestricted:
607
            columns.extend(['most_x', 'most_y', 'match'])
608
609
        df = df[columns]
610
611
        return df
612
613
    def calc_direct_bias(self, neutral_words, c=None):
614
        """Calculate the direct bias.
615
616
        Based on the projection of neutral words on the direction.
617
618
        :param list neutral_words: List of neutral words
619
        :param c: Strictness of bias measuring
620
        :type c: float or None
621
        :return: The direct bias
622
        """
623
624
        if c is None:
625
            c = 1
626
627
        projections = self._calc_projection_scores(neutral_words)['projection']
628
        direct_bias_terms = np.abs(projections) ** c
629
        direct_bias = direct_bias_terms.sum() / len(neutral_words)
630
631
        return direct_bias
632
633
    def calc_indirect_bias(self, word1, word2):
634
        """Calculate the indirect bias between two words.
635
636
        Based on the amount of shared projection of the words on the direction.
637
638
        Also called PairBias.
639
        :param str word1: First word
640
        :param str word2: Second word
641
        :return: The indirect bias between the two words
643
        """
644
645
        self._is_direction_identified()
646
647
        vector1 = normalize(self[word1])
648
        vector2 = normalize(self[word2])
649
650
        perpendicular_vector1 = reject_vector(vector1, self.direction)
651
        perpendicular_vector2 = reject_vector(vector2, self.direction)
652
653
        inner_product = vector1 @ vector2
654
        perpendicular_similarity = cosine_similarity(perpendicular_vector1,
655
                                                     perpendicular_vector2)
656
657
        indirect_bias = ((inner_product - perpendicular_similarity)
658
                         / inner_product)
659
        return indirect_bias
660
661
    def generate_closest_words_indirect_bias(self,
662
                                             neutral_positive_end,
663
                                             neutral_negative_end,
664
                                             words=None, n_extreme=5):
665
        """
666
        Generate closest words to a neutral direction and their indirect bias.
667
668
        The direction of the neutral words is used to find
669
        the most extreme words.
670
        The indirect bias is calculated between the most extreme words
671
        and the closest end.
672
673
        :param str neutral_positive_end: A word that defines the positive side
674
                                         of the neutral direction.
675
        :param str neutral_negative_end: A word that defines the negative side
676
                                         of the neutral direction.
677
        :param list words: List of words to project on the neutral direction.
678
        :param int n_extreme: The number of the most extreme words
679
                              (positive and negative) to show.
680
        :return: Data Frame of the most extreme words
681
                 with their projection scores and indirect biases.
682
        """
683
684
        neutral_direction = normalize(self[neutral_positive_end]
685
                                      - self[neutral_negative_end])
686
687
        vectors = [normalize(self[word]) for word in words]
688
        df = (pd.DataFrame([{'word': word,
689
                             'projection': vector @ neutral_direction}
690
                            for word, vector in zip(words, vectors)])
691
              .sort_values('projection', ascending=False))
692
693
        df = take_two_sides_extreme_sorted(df, n_extreme,
694
                                           'end',
695
                                           neutral_positive_end,
696
                                           neutral_negative_end)
697
698
        df['indirect_bias'] = df.apply(lambda r:
699
                                       self.calc_indirect_bias(r['word'],
700
                                                               r['end']),
701
                                       axis=1)
702
703
        df = df.set_index(['end', 'word'])
704
        df = df[['projection', 'indirect_bias']]
705
706
        return df
707
708
    def _extract_neutral_words(self, specific_words):
709
        extended_specific_words = set()
710
711
        # because the specific_full data was trained on a partial word embedding
712
        for word in specific_words:
713
            extended_specific_words.add(word)
714
            extended_specific_words.add(word.lower())
715
            extended_specific_words.add(word.upper())
716
            extended_specific_words.add(word.title())
717
718
        neutral_words = [word for word in self.model.vocab
719
                         if word not in extended_specific_words]
720
721
        return neutral_words
722
723
    def _neutralize(self, neutral_words):
724
        self._is_direction_identified()
725
726
        if self._verbose:
727
            neutral_words_iter = tqdm(neutral_words)
728
        else:
729
            neutral_words_iter = iter(neutral_words)
730
731
        for word in neutral_words_iter:
732
            neutralized_vector = reject_vector(self[word],
733
                                               self.direction)
734
            update_word_vector(self.model, word, neutralized_vector)
735
736
        self.model.init_sims(replace=True)
737
738
    def _equalize(self, equality_sets):
739
        # pylint: disable=R0914
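        # Equalize step: each word in an equality set keeps the set's shared
        # component outside the bias direction (the rejected center) and gets
        # an equal-magnitude component along the bias direction, so that all
        # the words in the set are equally distant from the neutral words.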
740
741
        self._is_direction_identified()
742
743
        if self._verbose:
744
            words_data = []
745
746
        for equality_set_index, equality_set_words in enumerate(equality_sets):
747
            equality_set_vectors = [normalize(self[word])
748
                                    for word in equality_set_words]
749
            center = np.mean(equality_set_vectors, axis=0)
750
            (projected_center,
751
             rejected_center) = project_reject_vector(center,
752
                                                      self.direction)
753
            scaling = np.sqrt(1 - np.linalg.norm(rejected_center)**2)
754
755
            for word, vector in zip(equality_set_words, equality_set_vectors):
756
                projected_vector = project_vector(vector, self.direction)
757
758
                projected_part = normalize(projected_vector - projected_center)
759
760
                # This differs from Bolukbasi's code;
761
                # the two behave the same only for equality_sets
762
                # of size 2 (pairs) - not sure!
763
                # However, this code follows the article.
764
                # equalized_vector = rejected_center + scaling * self.direction
765
                # https://github.com/tolga-b/debiaswe/blob/10277b23e187ee4bd2b6872b507163ef4198686b/debiaswe/debias.py#L36-L37
766
                # For pairs, projected_part_vector1 == -projected_part_vector2,
767
                # and this is the same as
768
                # projected_part_vector1 == self.direction
769
                equalized_vector = rejected_center + scaling * projected_part
770
771
                update_word_vector(self.model, word, equalized_vector)
772
773
                if self._verbose:
774
                    words_data.append({
The variable words_data does not seem to be defined in case self._verbose on line 743 is False. Are you sure this can never be the case?
775
                        'equality_set_index': equality_set_index,
776
                        'word': word,
777
                        'scaling': scaling,
778
                        'projected_scalar': vector @ self.direction,
779
                        'equalized_projected_scalar': (equalized_vector
780
                                                       @ self.direction),
781
                    })
782
783
        if self._verbose:
784
            print('Equalize Words Data '
785
                  '(all equal for 1-dim bias space (direction)):')
786
            words_data_df = (pd.DataFrame(words_data)
787
                             .set_index(['equality_set_index', 'word']))
788
            print(tabulate(words_data_df, headers='keys'))
789
790
        self.model.init_sims(replace=True)
791
792
    def debias(self, method='hard', neutral_words=None, equality_sets=None,
793
               inplace=True):
794
        """Debias the word embedding.
795
796
        :param str method: The method of debiasing.
797
        :param list neutral_words: List of neutral words
798
                                   for the neutralize step
799
        :param list equality_sets: List of equality sets,
800
                                   for the equalize step.
801
                                   The sets represent the direction.
802
        :param bool inplace: Whether to debias the object inplace
803
                             or return a new one
804
805
        .. warning::
806
807
          After calling `debias`,
808
          all the vectors of the word embedding
809
          will be normalized to unit length.
810
811
        """
812
813
        # pylint: disable=W0212
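        # 'neutralize' applies only the neutralize step;
        # 'hard' applies neutralize and then equalize.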
814
        if inplace:
815
            bias_word_embedding = self
816
        else:
817
            bias_word_embedding = copy.deepcopy(self)
818
819
        if method not in DEBIAS_METHODS:
820
            raise ValueError('method should be one of {}, {} was given'.format(
821
                DEBIAS_METHODS, method))
822
823
        if method in ['hard', 'neutralize']:
824
            if self._verbose:
825
                print('Neutralize...')
826
            bias_word_embedding._neutralize(neutral_words)
827
828
        if method == 'hard':
829
            if self._verbose:
830
                print('Equalize...')
831
            bias_word_embedding._equalize(equality_sets)
832
833
        if inplace:
834
            return None
835
        else:
836
            return bias_word_embedding
837
838
    def evaluate_word_embedding(self,
839
                                kwargs_word_pairs=None,
840
                                kwargs_word_analogies=None):
841
        """
842
        Evaluate word pairs tasks and word analogies tasks.
843
844
845
        :param kwargs_word_pairs: Kwargs for
846
                                  evaluate_word_pairs
847
                                  method.
848
        :type kwargs_word_pairs: dict or None
849
        :param kwargs_word_analogies: Kwargs for
850
                                      evaluate_word_analogies
851
                                      method.
852
        :type kwargs_word_analogies: dict or None
853
        :return: Tuple of :class:`pandas.DataFrame`
854
                 for the evaluation results.
855
        """
856
857
        return evaluate_word_embedding(self.model,
858
                                       kwargs_word_pairs,
859
                                       kwargs_word_analogies)
860
861
    def learn_full_specific_words(self, seed_specific_words,
862
                                  max_non_specific_examples=None, debug=None):
863
        """Learn specific words given a list of seed specific wordsself.
864
865
        Using Linear SVM.
866
867
        :param list seed_specific_words: List of seed specific words
868
        :param int max_non_specific_examples: The number of non-specific words
869
                                              to sample for training
870
        :return: List of learned specific words and the classifier object
871
        """
872
873
        if debug is None:
874
            debug = False
875
876
        if max_non_specific_examples is None:
877
            max_non_specific_examples = MAX_NON_SPECIFIC_EXAMPLES
878
879
        data = []
880
        non_specific_example_count = 0
881
882
        for word in self.model.vocab:
883
            is_specific = word in seed_specific_words
884
885
            if not is_specific:
886
                non_specific_example_count += 1
887
                if non_specific_example_count <= max_non_specific_examples:
888
                    data.append((self[word], is_specific))
889
            else:
890
                data.append((self[word], is_specific))
891
892
        np.random.seed(RANDOM_STATE)
893
        np.random.shuffle(data)
894
895
        X, y = zip(*data)
896
897
        X = np.array(X)
898
        X /= np.linalg.norm(X, axis=1)[:, None]
899
900
        y = np.array(y).astype('int')
901
902
        clf = LinearSVC(C=1, class_weight='balanced',
903
                        random_state=RANDOM_STATE)
904
905
        clf.fit(X, y)
906
907
        full_specific_words = []
908
        for word in self.model.vocab:
909
            vector = [normalize(self[word])]
910
            if clf.predict(vector):
911
                full_specific_words.append(word)
912
913
        if not debug:
914
            return full_specific_words, clf
915
916
        return full_specific_words, clf, X, y
917
918
    def _plot_most_biased_one_cluster(self,
919
                                      most_biased_neutral_words, y_bias,
920
                                      random_state=1, ax=None):
921
        most_biased_vectors = [self.model[word]
922
                               for word in most_biased_neutral_words]
923
924
        return plot_clustering_as_classification(most_biased_vectors,
925
                                                 y_bias,
926
                                                 random_state=random_state,
927
                                                 ax=ax)
928
929
    @staticmethod
930
    def plot_most_biased_clustering(biased, debiased,
931
                                    seed='ends', n_extreme=500,
932
                                    random_state=1):
933
        """Plot clustering as classification of biased neutral words.
934
935
        :param biased: Biased word embedding of
936
                       :class:`~ethically.we.bias.BiasWordEmbedding`.
937
        :param debiased: Debiased word embedding of
938
                         :class:`~ethically.we.bias.BiasWordEmbedding`.
939
        :param seed: The definition of the seed vector.
940
                    Either by a tuple of two word ends,
941
                    or by `'ends'` for the pre-defined ends,
942
                    or by `'direction'` for
943
                    the pre-defined direction vector.
944
        :param n_extreme: The number of extreme biased
945
                         neutral words to use.
946
        :return: Tuple of list of ax objects of the plot,
947
                 and a dictionary with the most positive
948
                 and negative words.
949
950
        Based on:
951
952
        - Gonen, H., & Goldberg, Y. (2019).
953
          `Lipstick on a Pig:
954
           Debiasing Methods Cover up Systematic Gender Biases
955
           in Word Embeddings But do not Remove Them
956
           <https://arxiv.org/abs/1903.03862>`_.
957
           arXiv preprint arXiv:1903.03862.
958
959
        - https://github.com/gonenhila/gender_bias_lipstick
960
        """
961
        # pylint: disable=protected-access,too-many-locals,line-too-long
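        # The n_extreme most negatively and most positively projected neutral
        # words (with respect to the seed vector) are labeled False/True, and
        # the clustering-as-classification accuracy is compared between the
        # biased and the debiased embeddings (Gonen & Goldberg, 2019).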
962
963
        assert biased.positive_end == debiased.positive_end, \
964
            'Positive ends should be the same.'
965
        assert biased.negative_end == debiased.negative_end, \
966
            'Negative ends should be the same.'
967
968
        seed_vector, _, _ = get_seed_vector(seed, biased)
969
970
        neutral_words = biased._data['neutral_words']
971
        neutral_word_vectors = (biased[word] for word in neutral_words)
Comprehensibility Best Practice: The variable word does not seem to be defined.
972
        neutral_word_projections = [(normalize(vector) @ seed_vector, word)
973
                                    for word, vector
974
                                    in zip(neutral_words,
975
                                           neutral_word_vectors)]
976
977
        neutral_word_projections.sort()
978
979
        _, most_negative_words = zip(*neutral_word_projections[:n_extreme])
980
        _, most_positive_words = zip(*neutral_word_projections[-n_extreme:])
981
982
        most_biased_neutral_words = most_negative_words + most_positive_words
983
984
        y_bias = [False] * n_extreme + [True] * n_extreme
985
986
        _, axes = plt.subplots(1, 2, figsize=(20, 5))
987
988
        acc_biased = biased._plot_most_biased_one_cluster(most_biased_neutral_words,
989
                                                          y_bias,
990
                                                          random_state=random_state,
991
                                                          ax=axes[0])
992
        axes[0].set_title('Biased - Accuracy={}'.format(acc_biased))
993
994
        acc_debiased = debiased._plot_most_biased_one_cluster(most_biased_neutral_words,
995
                                                              y_bias,
996
                                                              random_state=random_state,
997
                                                              ax=axes[1])
998
        axes[1].set_title('Debiased - Accuracy={}'.format(acc_debiased))
999
1000
        return axes, {biased.positive_end: most_positive_words,
1001
                      biased.negative_end: most_negative_words}
1002
1003
1004
class GenderBiasWE(BiasWordEmbedding):
1005
    """Measure and adjust the Gender Bias in English Word Embedding.
1006
1007
    :param model: Word embedding model of ``gensim.model.KeyedVectors``
1008
    :param bool only_lower: Whether the word embedding contains
1009
                            only lower case words
1010
    :param bool verbose: Set verbosity
1011
    :param bool to_normalize: Whether to normalize all the vectors
1012
                              (recommended!)
1013
    """
1014
1015
    def __init__(self, model, only_lower=False, verbose=False,
1016
                 identify_direction=True, to_normalize=True):
1017
        super().__init__(model=model,
1018
                         only_lower=only_lower,
1019
                         verbose=verbose,
1020
                         to_normalize=True)
1021
        self._initialize_data()
1022
        if identify_direction:
1023
            self._identify_direction('she', 'he',
1024
                                     self._data['definitional_pairs'],
1025
                                     'pca')
1026
1027
    def _initialize_data(self):
1028
        self._data = copy.deepcopy(BOLUKBASI_DATA['gender'])
1029
1030
        if not self.only_lower:
1031
            self._data['specific_full_with_definitional_equalize'] = \
1032
                generate_words_forms(self
1033
                                     ._data['specific_full_with_definitional_equalize'])  # pylint: disable=C0301
1034
1035
        for key in self._data['word_group_keys']:
1036
            self._data[key] = (self._filter_words_by_model(self
1037
                                                           ._data[key]))
1038
1039
        self._data['neutral_words'] = self._extract_neutral_words(self
1040
                                                                  ._data['specific_full_with_definitional_equalize'])  # pylint: disable=C0301
1041
        self._data['neutral_words'].sort()
1042
        self._data['word_group_keys'].append('neutral_words')
1043
1044
    def plot_projection_scores(self, words='professions', n_extreme=10,
1045
                               ax=None, axis_projection_step=None):
1046
        if words == 'professions':
1047
            words = self._data['profession_names']
1048
1049
        return super().plot_projection_scores(words, n_extreme,
1050
                                              ax, axis_projection_step)
1051
1052
    def plot_dist_projections_on_direction(self, word_groups='bolukbasi',
1053
                                           ax=None):
1054
        if word_groups == 'bolukbasi':
1055
            word_groups = {key: self._data[key]
1056
                           for key in self._data['word_group_keys']}
1057
1058
        return super().plot_dist_projections_on_direction(word_groups, ax)
1059
1060
    @classmethod
1061
    def plot_bias_across_word_embeddings(cls, word_embedding_bias_dict,
1062
                                         ax=None, scatter_kwargs=None):
1063
        # pylint: disable=W0221
1064
        words = BOLUKBASI_DATA['gender']['neutral_profession_names']
1065
        # TODO: is it correct for inheritance of class method?
1066
        super(cls, cls).plot_bias_across_word_embeddings(word_embedding_bias_dict,  # pylint: disable=C0301
1067
                                                         words,
1068
                                                         ax,
1069
                                                         scatter_kwargs)
1070
1071
    def calc_direct_bias(self, neutral_words='professions', c=None):
1072
        if isinstance(neutral_words, str) and neutral_words == 'professions':
1073
            return super().calc_direct_bias(
1074
                self._data['neutral_profession_names'], c)
1075
        else:
1076
            return super().calc_direct_bias(neutral_words)
1077
1078
    def generate_closest_words_indirect_bias(self,
1079
                                             neutral_positive_end,
1080
                                             neutral_negative_end,
1081
                                             words='professions', n_extreme=5):
1082
        # pylint: disable=C0301
1083
1084
        if words == 'professions':
1085
            words = self._data['profession_names']
1086
1087
        return super().generate_closest_words_indirect_bias(neutral_positive_end,
1088
                                                            neutral_negative_end,
1089
                                                            words,
1090
                                                            n_extreme=n_extreme)
1091
1092
    def debias(self, method='hard', neutral_words=None, equality_sets=None,
1093
               inplace=True):
1094
        # pylint: disable=C0301
1095
        if method in ['hard', 'neutralize']:
1096
            if neutral_words is None:
1097
                neutral_words = self._data['neutral_words']
1098
1099
        if method == 'hard' and equality_sets is None:
1100
            equality_sets = self._data['definitional_pairs']
1101
1102
            if not self.only_lower:
1103
                assert all(len(equality_set) == 2
1104
                           for equality_set in equality_sets), 'currently supporting only equality pairs if only_lower is False'
1105
                # TODO: refactor
1106
                equality_sets = {(candidate1, candidate2)
1107
                                 for word1, word2 in equality_sets
1108
                                 for candidate1, candidate2 in zip(generate_one_word_forms(word1),
1109
                                                                   generate_one_word_forms(word2))}
1110
1111
        return super().debias(method, neutral_words, equality_sets,
1112
                              inplace)
1113
1114
    def learn_full_specific_words(self, seed_specific_words='bolukbasi',
1115
                                  max_non_specific_examples=None,
1116
                                  debug=None):
1117
        if seed_specific_words == 'bolukbasi':
1118
            seed_specific_words = self._data['specific_seed']
1119
1120
        return super().learn_full_specific_words(seed_specific_words,
1121
                                                 max_non_specific_examples,
1122
                                                 debug)
1123