| 
                    1
                 | 
                                    
                                                     | 
                
                 | 
                # pylint: disable=too-many-lines  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    2
                 | 
                                    
                                                     | 
                
                 | 
                """  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    3
                 | 
                                    
                                                     | 
                
                 | 
                Measuring and adjusting bias in word embedding by Bolukbasi (2016).  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    4
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    5
                 | 
                                    
                                                     | 
                
                 | 
                References:  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    6
                 | 
                                    
                                                     | 
                
                 | 
                    - Bolukbasi, T., Chang, K. W., Zou, J. Y., Saligrama, V.,  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    7
                 | 
                                    
                                                     | 
                
                 | 
                      & Kalai, A. T. (2016).  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    8
                 | 
                                    
                                                     | 
                
                 | 
                      `Man is to computer programmer as woman is to homemaker?  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    9
                 | 
                                    
                                                     | 
                
                 | 
                      debiasing word embeddings <https://arxiv.org/abs/1607.06520>`_.  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    10
                 | 
                                    
                                                     | 
                
                 | 
                      In Advances in neural information processing systems  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    11
                 | 
                                    
                                                     | 
                
                 | 
                      (pp. 4349-4357).  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    12
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    13
                 | 
                                    
                                                     | 
                
                 | 
                    - The code and data is based on the GitHub repository:  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    14
                 | 
                                    
                                                     | 
                
                 | 
                      https://github.com/tolga-b/debiaswe (MIT License).  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    15
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    16
                 | 
                                    
                                                     | 
                
                 | 
                    - Gonen, H., & Goldberg, Y. (2019).  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    17
                 | 
                                    
                                                     | 
                
                 | 
                      `Lipstick on a Pig:  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    18
                 | 
                                    
                                                     | 
                
                 | 
                      Debiasing Methods Cover up Systematic Gender Biases  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    19
                 | 
                                    
                                                     | 
                
                 | 
                      in Word Embeddings But do not Remove Them  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    20
                 | 
                                    
                                                     | 
                
                 | 
                      <https://arxiv.org/abs/1903.03862>`_.  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    21
                 | 
                                    
                                                     | 
                
                 | 
                      arXiv preprint arXiv:1903.03862.  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    22
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    23
                 | 
                                    
                                                     | 
                
                 | 
                    - Nissim, M., van Noord, R., van der Goot, R. (2019).  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    24
                 | 
                                    
                                                     | 
                
                 | 
                      `Fair is Better than Sensational: Man is to Doctor  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    25
                 | 
                                    
                                                     | 
                
                 | 
                      as Woman is to Doctor <https://arxiv.org/abs/1905.09866>`_.  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    26
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    27
                 | 
                                    
                                                     | 
                
                 | 
                Usage  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    28
                 | 
                                    
                                                     | 
                
                 | 
                ~~~~~  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    29
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    30
                 | 
                                    
                                                     | 
                
                 | 
                .. code:: python  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    31
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    32
                 | 
                                    
                                                     | 
                
                 | 
                   >>> from ethically.we import GenderBiasWE  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    33
                 | 
                                    
                                                     | 
                
                 | 
                   >>> from gensim import downloader  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    34
                 | 
                                    
                                                     | 
                
                 | 
                   >>> w2v_model = downloader.load('word2vec-google-news-300') | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    35
                 | 
                                    
                                                     | 
                
                 | 
                   >>> w2v_gender_bias_we = GenderBiasWE(w2v_model)  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    36
                 | 
                                    
                                                     | 
                
                 | 
                   >>> w2v_gender_bias_we.calc_direct_bias()  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    37
                 | 
                                    
                                                     | 
                
                 | 
                   0.07307904249481942  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    38
                 | 
                                    
                                                     | 
                
                 | 
                   >>> w2v_gender_bias_we.debias()  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    39
                 | 
                                    
                                                     | 
                
                 | 
                   >>> w2v_gender_bias_we.calc_direct_bias()  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    40
                 | 
                                    
                                                     | 
                
                 | 
                   1.7964246601064155e-09  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    41
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    42
                 | 
                                    
                                                     | 
                
                 | 
                Types of Bias  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    43
                 | 
                                    
                                                     | 
                
                 | 
                ~~~~~~~~~~~~~  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    44
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    45
                 | 
                                    
                                                     | 
                
                 | 
                Direct Bias  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    46
                 | 
                                    
                                                     | 
                
                 | 
                ^^^^^^^^^^^  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    47
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    48
                 | 
                                    
                                                     | 
                
                 | 
                1. Associations  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    49
                 | 
                                    
                                                     | 
                
                 | 
                    Words that are closer to one end (e.g., *he*) than to  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    50
                 | 
                                    
                                                     | 
                
                 | 
                    the other end (*she*).  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    51
                 | 
                                    
                                                     | 
                
                 | 
                    For example, occupational stereotypes (page 7).  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    52
                 | 
                                    
                                                     | 
                
                 | 
                    Calculated by  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    53
                 | 
                                    
                                                     | 
                
                 | 
                    :meth:`~ethically.we.bias.BiasWordEmbedding.calc_direct_bias`.  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    54
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    55
                 | 
                                    
                                                     | 
                
                 | 
                2. Analogies  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    56
                 | 
                                    
                                                     | 
                
                 | 
                    Analogies of *he:x::she:y*.  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    57
                 | 
                                    
                                                     | 
                
                 | 
                    For example analogies exhibiting stereotypes (page 7).  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    58
                 | 
                                    
                                                     | 
                
                 | 
                    Generated by  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    59
                 | 
                                    
                                                     | 
                
                 | 
                    :meth:`~ethically.we.bias.BiasWordEmbedding.generate_analogies`.  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    60
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    61
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    62
                 | 
                                    
                                                     | 
                
                 | 
                Indirect Bias  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    63
                 | 
                                    
                                                     | 
                
                 | 
                ^^^^^^^^^^^^^  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    64
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    65
                 | 
                                    
                                                     | 
                
                 | 
                Projection of a neutral words into a two neutral words direction  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    66
                 | 
                                    
                                                     | 
                
                 | 
                is explained in a great portion by a shared bias direction projection.  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    67
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    68
                 | 
                                    
                                                     | 
                
                 | 
                Calculated by  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    69
                 | 
                                    
                                                     | 
                
                 | 
                :meth:`~ethically.we.bias.BiasWordEmbedding.calc_indirect_bias`  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    70
                 | 
                                    
                                                     | 
                
                 | 
                and  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    71
                 | 
                                    
                                                     | 
                
                 | 
                :meth:`~ethically.we.bias.GenderBiasWE.generate_closest_words_indirect_bias`.  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    72
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    73
                 | 
                                    
                                                     | 
                
                 | 
                """  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    74
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    75
                 | 
                                    
                                                     | 
                
                 | 
                import copy  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    76
                 | 
                                    
                                                     | 
                
                 | 
                import warnings  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    77
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    78
                 | 
                                    
                                                     | 
                
                 | 
                import matplotlib.pylab as plt  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    79
                 | 
                                    
                                                     | 
                
                 | 
                import numpy as np  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    80
                 | 
                                    
                                                     | 
                
                 | 
                import pandas as pd  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    81
                 | 
                                    
                                                     | 
                
                 | 
                import seaborn as sns  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    82
                 | 
                                    
                                                     | 
                
                 | 
                from scipy.stats import spearmanr  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    83
                 | 
                                    
                                                     | 
                
                 | 
                from sklearn.decomposition import PCA  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    84
                 | 
                                    
                                                     | 
                
                 | 
                from sklearn.metrics.pairwise import euclidean_distances  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    85
                 | 
                                    
                                                     | 
                
                 | 
                from sklearn.svm import LinearSVC  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    86
                 | 
                                    
                                                     | 
                
                 | 
                from tqdm import tqdm  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    87
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    88
                 | 
                                    
                                                     | 
                
                 | 
                from tabulate import tabulate  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    89
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    90
                 | 
                                    
                                                     | 
                
                 | 
                from ..consts import RANDOM_STATE  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    91
                 | 
                                    
                                                     | 
                
                 | 
                from .benchmark import evaluate_word_embedding  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    92
                 | 
                                    
                                                     | 
                
                 | 
                from .data import BOLUKBASI_DATA  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    93
                 | 
                                    
                                                     | 
                
                 | 
                from .utils import (  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    94
                 | 
                                    
                                                     | 
                
                 | 
                    assert_gensim_keyed_vectors, cosine_similarity, generate_one_word_forms,  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    95
                 | 
                                    
                                                     | 
                
                 | 
                    generate_words_forms, get_seed_vector, most_similar, normalize,  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    96
                 | 
                                    
                                                     | 
                
                 | 
                    plot_clustering_as_classification, project_params, project_reject_vector,  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    97
                 | 
                                    
                                                     | 
                
                 | 
                    project_vector, reject_vector, round_to_extreme,  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    98
                 | 
                                    
                                                     | 
                
                 | 
                    take_two_sides_extreme_sorted, update_word_vector,  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    99
                 | 
                                    
                                                     | 
                
                 | 
                )  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    100
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    101
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    102
                 | 
                                    
                                                     | 
                
                 | 
                DIRECTION_METHODS = ['single', 'sum', 'pca']  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    103
                 | 
                                    
                                                     | 
                
                 | 
                DEBIAS_METHODS = ['neutralize', 'hard', 'soft']  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    104
                 | 
                                    
                                                     | 
                
                 | 
                FIRST_PC_THRESHOLD = 0.5  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    105
                 | 
                                    
                                                     | 
                
                 | 
                MAX_NON_SPECIFIC_EXAMPLES = 1000  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    106
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    107
                 | 
                                    
                                                     | 
                
                 | 
                __all__ = ['GenderBiasWE', 'BiasWordEmbedding']  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    108
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    109
                 | 
                                    
                                                     | 
                
                 | 
                # https://stackoverflow.com/questions/2187269/print-only-the-message-on-warnings  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    110
                 | 
                                    
                                                     | 
                
                 | 
                formatwarning_orig = warnings.formatwarning  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    111
                 | 
                                    
                                                     | 
                
                 | 
                # pylint: disable=line-too-long  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    112
                 | 
                                    
                                                     | 
                
                 | 
                warnings.formatwarning = lambda message, category, filename, lineno, line=None: \  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    113
                 | 
                                    
                                                     | 
                
                 | 
                    formatwarning_orig(message, category, filename, lineno, line='')  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    114
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    115
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    116
                 | 
                                    
                                                     | 
                
                 | 
                class BiasWordEmbedding:  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    117
                 | 
                                    
                                                     | 
                
                 | 
                    """Measure and adjust a bias in English word embedding.  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    118
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    119
                 | 
                                    
                                                     | 
                
                 | 
                    :param model: Word embedding model of ``gensim.model.KeyedVectors``  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    120
                 | 
                                    
                                                     | 
                
                 | 
                    :param bool only_lower: Whether the word embedding contrains  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    121
                 | 
                                    
                                                     | 
                
                 | 
                                            only lower case words  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    122
                 | 
                                    
                                                     | 
                
                 | 
                    :param bool verbose: Set verbosity  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    123
                 | 
                                    
                                                     | 
                
                 | 
                    :param bool to_normalize: Whether to normalize all the vectors  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    124
                 | 
                                    
                                                     | 
                
                 | 
                                              (recommended!)  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    125
                 | 
                                    
                                                     | 
                
                 | 
                    """  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    126
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    127
                 | 
                                    
                                                     | 
                
                 | 
                    def __init__(self, model, only_lower=False, verbose=False,  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    128
                 | 
                                    
                                                     | 
                
                 | 
                                 identify_direction=False, to_normalize=True):  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    129
                 | 
                                    
                                                     | 
                
                 | 
                        assert_gensim_keyed_vectors(model)  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    130
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    131
                 | 
                                    
                                                     | 
                
                 | 
                        # TODO: this is bad Python, ask someone about it  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    132
                 | 
                                    
                                                     | 
                
                 | 
                        # probably should be a better design  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    133
                 | 
                                    
                                                     | 
                
                 | 
                        # identify_direction doesn't have any meaning  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    134
                 | 
                                    
                                                     | 
                
                 | 
                        # for the class BiasWordEmbedding  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    135
                 | 
                                    
                                                     | 
                
                 | 
                        if self.__class__ == __class__ and identify_direction is not False:  | 
            
                            
                    | 
                        
                     | 
                     | 
                     | 
                    
                                                                                                    
                        
                         
                                                                                        
                                                                                     
                     | 
                
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    136
                 | 
                                    
                                                     | 
                
                 | 
                            raise ValueError('identify_direction must be False' | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    137
                 | 
                                    
                                                     | 
                
                 | 
                                             ' for an instance of {}' | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    138
                 | 
                                    
                                                     | 
                
                 | 
                                             .format(__class__))  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    139
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    140
                 | 
                                    
                                                     | 
                
                 | 
                        self.model = model  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    141
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    142
                 | 
                                    
                                                     | 
                
                 | 
                        # TODO: write unitest for when it is False  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    143
                 | 
                                    
                                                     | 
                
                 | 
                        self.only_lower = only_lower  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    144
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    145
                 | 
                                    
                                                     | 
                
                 | 
                        self._verbose = verbose  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    146
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    147
                 | 
                                    
                                                     | 
                
                 | 
                        self.direction = None  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    148
                 | 
                                    
                                                     | 
                
                 | 
                        self.positive_end = None  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    149
                 | 
                                    
                                                     | 
                
                 | 
                        self.negative_end = None  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    150
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    151
                 | 
                                    
                                                     | 
                
                 | 
                        if to_normalize:  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    152
                 | 
                                    
                                                     | 
                
                 | 
                            self.model.init_sims(replace=True)  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    153
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    154
                 | 
                                    
                                                     | 
                
                 | 
                    def __copy__(self):  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    155
                 | 
                                    
                                                     | 
                
                 | 
                        bias_word_embedding = self.__class__(self.model,  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    156
                 | 
                                    
                                                     | 
                
                 | 
                                                             self.only_lower,  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    157
                 | 
                                    
                                                     | 
                
                 | 
                                                             self._verbose,  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    158
                 | 
                                    
                                                     | 
                
                 | 
                                                             identify_direction=False)  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    159
                 | 
                                    
                                                     | 
                
                 | 
                        bias_word_embedding.direction = copy.deepcopy(self.direction)  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    160
                 | 
                                    
                                                     | 
                
                 | 
                        bias_word_embedding.positive_end = copy.deepcopy(self.positive_end)  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    161
                 | 
                                    
                                                     | 
                
                 | 
                        bias_word_embedding.negative_end = copy.deepcopy(self.negative_end)  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    162
                 | 
                                    
                                                     | 
                
                 | 
                        return bias_word_embedding  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    163
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    164
                 | 
                                    
                                                     | 
                
                 | 
                    def __deepcopy__(self, memo):  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    165
                 | 
                                    
                                                     | 
                
                 | 
                        bias_word_embedding = copy.copy(self)  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    166
                 | 
                                    
                                                     | 
                
                 | 
                        bias_word_embedding.model = copy.deepcopy(bias_word_embedding.model)  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    167
                 | 
                                    
                                                     | 
                
                 | 
                        return bias_word_embedding  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    168
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    169
                 | 
                                    
                                                     | 
                
                 | 
                    def __getitem__(self, key):  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    170
                 | 
                                    
                                                     | 
                
                 | 
                        return self.model[key]  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    171
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    172
                 | 
                                    
                                                     | 
                
                 | 
                    def __contains__(self, item):  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    173
                 | 
                                    
                                                     | 
                
                 | 
                        return item in self.model  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    174
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    175
                 | 
                                    
                                                     | 
                
                 | 
                    def _filter_words_by_model(self, words):  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    176
                 | 
                                    
                                                     | 
                
                 | 
                        return [word for word in words if word in self]  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    177
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    178
                 | 
                                    
                                                     | 
                
                 | 
                    def _is_direction_identified(self):  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    179
                 | 
                                    
                                                     | 
                
                 | 
                        if self.direction is None:  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    180
                 | 
                                    
                                                     | 
                
                 | 
                            raise RuntimeError('The direction was not identified' | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    181
                 | 
                                    
                                                     | 
                
                 | 
                                               ' for this {} instance' | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    182
                 | 
                                    
                                                     | 
                
                 | 
                                               .format(self.__class__.__name__))  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    183
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    184
                 | 
                                    
                                                     | 
                
                 | 
                    # There is a mistake in the article  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    185
                 | 
                                    
                                                     | 
                
                 | 
                    # it is written (section 5.1):  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    186
                 | 
                                    
                                                     | 
                
                 | 
                    # "To identify the gender subspace, we took the ten gender pair difference  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    187
                 | 
                                    
                                                     | 
                
                 | 
                    # vectors and computed its principal components (PCs)"  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    188
                 | 
                                    
                                                     | 
                
                 | 
                    # however in the source code:  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    189
                 | 
                                    
                                                     | 
                
                 | 
                    # https://github.com/tolga-b/debiaswe/blob/10277b23e187ee4bd2b6872b507163ef4198686b/debiaswe/we.py#L235-L245  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    190
                 | 
                                    
                                                     | 
                
                 | 
                    def _identify_subspace_by_pca(self, definitional_pairs, n_components):  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    191
                 | 
                                    
                                                     | 
                
                 | 
                        matrix = []  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    192
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    193
                 | 
                                    
                                                     | 
                
                 | 
                        for word1, word2 in definitional_pairs:  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    194
                 | 
                                    
                                                     | 
                
                 | 
                            vector1 = normalize(self[word1])  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    195
                 | 
                                    
                                                     | 
                
                 | 
                            vector2 = normalize(self[word2])  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    196
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    197
                 | 
                                    
                                                     | 
                
                 | 
                            center = (vector1 + vector2) / 2  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    198
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    199
                 | 
                                    
                                                     | 
                
                 | 
                            matrix.append(vector1 - center)  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    200
                 | 
                                    
                                                     | 
                
                 | 
                            matrix.append(vector2 - center)  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    201
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    202
                 | 
                                    
                                                     | 
                
                 | 
                        pca = PCA(n_components=n_components)  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    203
                 | 
                                    
                                                     | 
                
                 | 
                        pca.fit(matrix)  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    204
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    205
                 | 
                                    
                                                     | 
                
                 | 
                        if self._verbose:  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    206
                 | 
                                    
                                                     | 
                
                 | 
                            table = enumerate(pca.explained_variance_ratio_, start=1)  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    207
                 | 
                                    
                                                     | 
                
                 | 
                            headers = ['Principal Component',  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    208
                 | 
                                    
                                                     | 
                
                 | 
                                       'Explained Variance Ratio']  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    209
                 | 
                                    
                                                     | 
                
                 | 
                            print(tabulate(table, headers=headers))  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    210
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    211
                 | 
                                    
                                                     | 
                
                 | 
                        return pca  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    212
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    213
                 | 
                                    
                                                     | 
                
                 | 
                    # TODO: add the SVD method from section 6 step 1  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    214
                 | 
                                    
                                                     | 
                
                 | 
                    # It seems there is a mistake there, I think it is the same as PCA  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    215
                 | 
                                    
                                                     | 
                
                 | 
                    # just with replacing it with SVD  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    216
                 | 
                                    
                                                     | 
                
                 | 
                    def _identify_direction(self, positive_end, negative_end,  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    217
                 | 
                                    
                                                     | 
                
                 | 
                                            definitional, method='pca'):  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    218
                 | 
                                    
                                                     | 
                
                 | 
                        if method not in DIRECTION_METHODS:  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    219
                 | 
                                    
                                                     | 
                
                 | 
                            raise ValueError('method should be one of {}, {} was given'.format( | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    220
                 | 
                                    
                                                     | 
                
                 | 
                                DIRECTION_METHODS, method))  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    221
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    222
                 | 
                                    
                                                     | 
                
                 | 
                        if positive_end == negative_end:  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    223
                 | 
                                    
                                                     | 
                
                 | 
                            raise ValueError('positive_end and negative_end' | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    224
                 | 
                                    
                                                     | 
                
                 | 
                                             'should be different, and not the same "{}"' | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    225
                 | 
                                    
                                                     | 
                
                 | 
                                             .format(positive_end))  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    226
                 | 
                                    
                                                     | 
                
                 | 
                        if self._verbose:  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    227
                 | 
                                    
                                                     | 
                
                 | 
                            print('Identify direction using {} method...'.format(method)) | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    228
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    229
                 | 
                                    
                                                     | 
                
                 | 
                        direction = None  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    230
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    231
                 | 
                                    
                                                     | 
                
                 | 
                        if method == 'single':  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    232
                 | 
                                    
                                                     | 
                
                 | 
                            direction = normalize(normalize(self[definitional[0]])  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    233
                 | 
                                    
                                                     | 
                
                 | 
                                                  - normalize(self[definitional[1]]))  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    234
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    235
                 | 
                                    
                                                     | 
                
                 | 
                        elif method == 'sum':  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    236
                 | 
                                    
                                                     | 
                
                 | 
                            group1_sum_vector = np.sum([self[word]  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    237
                 | 
                                    
                                                     | 
                
                 | 
                                                        for word in definitional[0]], axis=0)  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    238
                 | 
                                    
                                                     | 
                
                 | 
                            group2_sum_vector = np.sum([self[word]  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    239
                 | 
                                    
                                                     | 
                
                 | 
                                                        for word in definitional[1]], axis=0)  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    240
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    241
                 | 
                                    
                                                     | 
                
                 | 
                            diff_vector = (normalize(group1_sum_vector)  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    242
                 | 
                                    
                                                     | 
                
                 | 
                                           - normalize(group2_sum_vector))  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    243
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    244
                 | 
                                    
                                                     | 
                
                 | 
                            direction = normalize(diff_vector)  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    245
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    246
                 | 
                                    
                                                     | 
                
                 | 
                        elif method == 'pca':  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    247
                 | 
                                    
                                                     | 
                
                 | 
                            pca = self._identify_subspace_by_pca(definitional, 10)  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    248
                 | 
                                    
                                                     | 
                
                 | 
                            if pca.explained_variance_ratio_[0] < FIRST_PC_THRESHOLD:  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    249
                 | 
                                    
                                                     | 
                
                 | 
                                raise RuntimeError('The Explained variance' | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    250
                 | 
                                    
                                                     | 
                
                 | 
                                                   'of the first principal component should be'  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    251
                 | 
                                    
                                                     | 
                
                 | 
                                                   'at least {}, but it is {}' | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    252
                 | 
                                    
                                                     | 
                
                 | 
                                                   .format(FIRST_PC_THRESHOLD,  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    253
                 | 
                                    
                                                     | 
                
                 | 
                                                           pca.explained_variance_ratio_[0]))  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    254
                 | 
                                    
                                                     | 
                
                 | 
                            direction = pca.components_[0]  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    255
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    256
                 | 
                                    
                                                     | 
                
                 | 
                            # if direction is opposite (e.g. we cannot control  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    257
                 | 
                                    
                                                     | 
                
                 | 
                            # what the PCA will return)  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    258
                 | 
                                    
                                                     | 
                
                 | 
                            ends_diff_projection = cosine_similarity((self[positive_end]  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    259
                 | 
                                    
                                                     | 
                
                 | 
                                                                      - self[negative_end]),  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    260
                 | 
                                    
                                                     | 
                
                 | 
                                                                     direction)  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    261
                 | 
                                    
                                                     | 
                
                 | 
                            if ends_diff_projection < 0:  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    262
                 | 
                                    
                                                     | 
                
                 | 
                                direction = -direction  # pylint: disable=invalid-unary-operand-type  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    263
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    264
                 | 
                                    
                                                     | 
                
                 | 
                        self.direction = direction  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    265
                 | 
                                    
                                                     | 
                
                 | 
                        self.positive_end = positive_end  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    266
                 | 
                                    
                                                     | 
                
                 | 
                        self.negative_end = negative_end  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    267
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    268
                 | 
                                    
                                                     | 
                
                 | 
                    def project_on_direction(self, word):  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    269
                 | 
                                    
                                                     | 
                
                 | 
                        """Project the normalized vector of the word on the direction.  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    270
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    271
                 | 
                                    
                                                     | 
                
                 | 
                        :param str word: The word tor project  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    272
                 | 
                                    
                                                     | 
                
                 | 
                        :return float: The projection scalar  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    273
                 | 
                                    
                                                     | 
                
                 | 
                        """  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    274
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    275
                 | 
                                    
                                                     | 
                
                 | 
                        self._is_direction_identified()  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    276
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    277
                 | 
                                    
                                                     | 
                
                 | 
                        vector = self[word]  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    278
                 | 
                                    
                                                     | 
                
                 | 
                        projection_score = self.model.cosine_similarities(self.direction,  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    279
                 | 
                                    
                                                     | 
                
                 | 
                                                                          [vector])[0]  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    280
                 | 
                                    
                                                     | 
                
                 | 
                        return projection_score  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    281
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    282
                 | 
                                    
                                                     | 
                
                 | 
                    def _calc_projection_scores(self, words):  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    283
                 | 
                                    
                                                     | 
                
                 | 
                        self._is_direction_identified()  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    284
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    285
                 | 
                                    
                                                     | 
                
                 | 
                        df = pd.DataFrame({'word': words}) | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    286
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    287
                 | 
                                    
                                                     | 
                
                 | 
                        # TODO: maybe using cosine_similarities on all the vectors?  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    288
                 | 
                                    
                                                     | 
                
                 | 
                        # it might be faster  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    289
                 | 
                                    
                                                     | 
                
                 | 
                        df['projection'] = df['word'].apply(self.project_on_direction)  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    290
                 | 
                                    
                                                     | 
                
                 | 
                        df = df.sort_values('projection', ascending=False) | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    291
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    292
                 | 
                                    
                                                     | 
                
                 | 
                        return df  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    293
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    294
                 | 
                                    
                                                     | 
                
                 | 
                    def calc_projection_data(self, words):  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    295
                 | 
                                    
                                                     | 
                
                 | 
                        """  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    296
                 | 
                                    
                                                     | 
                
                 | 
                        Calculate projection, projected and rejected vectors of a words list.  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    297
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    298
                 | 
                                    
                                                     | 
                
                 | 
                        :param list words: List of words  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    299
                 | 
                                    
                                                     | 
                
                 | 
                        :return: :class:`pandas.DataFrame` of the projection,  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    300
                 | 
                                    
                                                     | 
                
                 | 
                                 projected and rejected vectors of the words list  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    301
                 | 
                                    
                                                     | 
                
                 | 
                        """  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    302
                 | 
                                    
                                                     | 
                
                 | 
                        projection_data = []  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    303
                 | 
                                    
                                                     | 
                
                 | 
                        for word in words:  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    304
                 | 
                                    
                                                     | 
                
                 | 
                            vector = self[word]  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    305
                 | 
                                    
                                                     | 
                
                 | 
                            projection = self.project_on_direction(word)  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    306
                 | 
                                    
                                                     | 
                
                 | 
                            normalized_vector = normalize(vector)  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    307
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    308
                 | 
                                    
                                                     | 
                
                 | 
                            (projection,  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    309
                 | 
                                    
                                                     | 
                
                 | 
                             projected_vector,  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    310
                 | 
                                    
                                                     | 
                
                 | 
                             rejected_vector) = project_params(normalized_vector,  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    311
                 | 
                                    
                                                     | 
                
                 | 
                                                               self.direction)  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    312
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    313
                 | 
                                    
                                                     | 
                
                 | 
                            projection_data.append({'word': word, | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    314
                 | 
                                    
                                                     | 
                
                 | 
                                                    'vector': vector,  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    315
                 | 
                                    
                                                     | 
                
                 | 
                                                    'projection': projection,  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    316
                 | 
                                    
                                                     | 
                
                 | 
                                                    'projected_vector': projected_vector,  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    317
                 | 
                                    
                                                     | 
                
                 | 
                                                    'rejected_vector': rejected_vector})  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    318
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    319
                 | 
                                    
                                                     | 
                
                 | 
                        return pd.DataFrame(projection_data)  | 
            
            
                                                                                                            
                                                                
            
                                    
            
            
                | 
                    320
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                        
                            
            
                                    
            
            
                | 
                    321
                 | 
                                    
                                                     | 
                
                 | 
                    def plot_projection_scores(self, words, n_extreme=10,  | 
            
            
                                                                        
                            
            
                                    
            
            
                | 
                    322
                 | 
                                    
                                                     | 
                
                 | 
                                               ax=None, axis_projection_step=None):  | 
            
            
                                                                        
                            
            
                                    
            
            
                | 
                    323
                 | 
                                    
                                                     | 
                
                 | 
                        """Plot the projection scalar of words on the direction.  | 
            
            
                                                                        
                            
            
                                    
            
            
                | 
                    324
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                        
                            
            
                                    
            
            
                | 
                    325
                 | 
                                    
                                                     | 
                
                 | 
                        :param list words: The words tor project  | 
            
            
                                                                        
                            
            
                                    
            
            
                | 
                    326
                 | 
                                    
                                                     | 
                
                 | 
                        :param int or None n_extreme: The number of extreme words to show  | 
            
            
                                                                        
                            
            
                                    
            
            
                | 
                    327
                 | 
                                    
                                                     | 
                
                 | 
                        :return: The ax object of the plot  | 
            
            
                                                                        
                            
            
                                    
            
            
                | 
                    328
                 | 
                                    
                                                     | 
                
                 | 
                        """  | 
            
            
                                                                        
                            
            
                                    
            
            
                | 
                    329
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                        
                            
            
                                    
            
            
                | 
                    330
                 | 
                                    
                                                     | 
                
                 | 
                        self._is_direction_identified()  | 
            
            
                                                                        
                            
            
                                    
            
            
                | 
                    331
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                        
                            
            
                                    
            
            
                | 
                    332
                 | 
                                    
                                                     | 
                
                 | 
                        projections_df = self._calc_projection_scores(words)  | 
            
            
                                                                        
                            
            
                                    
            
            
                | 
                    333
                 | 
                                    
                                                     | 
                
                 | 
                        projections_df['projection'] = projections_df['projection'].round(2)  | 
            
            
                                                                        
                            
            
                                    
            
            
                | 
                    334
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                        
                            
            
                                    
            
            
                | 
                    335
                 | 
                                    
                                                     | 
                
                 | 
                        if n_extreme is not None:  | 
            
            
                                                                        
                            
            
                                    
            
            
                | 
                    336
                 | 
                                    
                                                     | 
                
                 | 
                            projections_df = take_two_sides_extreme_sorted(projections_df,  | 
            
            
                                                                        
                            
            
                                    
            
            
                | 
                    337
                 | 
                                    
                                                     | 
                
                 | 
                                                                           n_extreme=n_extreme)  | 
            
            
                                                                        
                            
            
                                    
            
            
                | 
                    338
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                        
                            
            
                                    
            
            
                | 
                    339
                 | 
                                    
                                                     | 
                
                 | 
                        if ax is None:  | 
            
            
                                                                        
                            
            
                                    
            
            
                | 
                    340
                 | 
                                    
                                                     | 
                
                 | 
                            _, ax = plt.subplots(1)  | 
            
            
                                                                        
                            
            
                                    
            
            
                | 
                    341
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                        
                            
            
                                    
            
            
                | 
                    342
                 | 
                                    
                                                     | 
                
                 | 
                        if axis_projection_step is None:  | 
            
            
                                                                        
                            
            
                                    
            
            
                | 
                    343
                 | 
                                    
                                                     | 
                
                 | 
                            axis_projection_step = 0.1  | 
            
            
                                                                        
                            
            
                                    
            
            
                | 
                    344
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                        
                            
            
                                    
            
            
                | 
                    345
                 | 
                                    
                                                     | 
                
                 | 
                        cmap = plt.get_cmap('RdBu') | 
            
            
                                                                        
                            
            
                                    
            
            
                | 
                    346
                 | 
                                    
                                                     | 
                
                 | 
                        projections_df['color'] = ((projections_df['projection'] + 0.5)  | 
            
            
                                                                        
                            
            
                                    
            
            
                | 
                    347
                 | 
                                    
                                                     | 
                
                 | 
                                                   .apply(cmap))  | 
            
            
                                                                        
                            
            
                                    
            
            
                | 
                    348
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                        
                            
            
                                    
            
            
                | 
                    349
                 | 
                                    
                                                     | 
                
                 | 
                        most_extream_projection = (projections_df['projection']  | 
            
            
                                                                        
                            
            
                                    
            
            
                | 
                    350
                 | 
                                    
                                                     | 
                
                 | 
                                                   .abs()  | 
            
            
                                                                        
                            
            
                                    
            
            
                | 
                    351
                 | 
                                    
                                                     | 
                
                 | 
                                                   .max()  | 
            
            
                                                                        
                            
            
                                    
            
            
                | 
                    352
                 | 
                                    
                                                     | 
                
                 | 
                                                   .round(1))  | 
            
            
                                                                        
                            
            
                                    
            
            
                | 
                    353
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                        
                            
            
                                    
            
            
                | 
                    354
                 | 
                                    
                                                     | 
                
                 | 
                        sns.barplot(x='projection', y='word', data=projections_df,  | 
            
            
                                                                        
                            
            
                                    
            
            
                | 
                    355
                 | 
                                    
                                                     | 
                
                 | 
                                    palette=projections_df['color'])  | 
            
            
                                                                        
                            
            
                                    
            
            
                | 
                    356
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                        
                            
            
                                    
            
            
                | 
                    357
                 | 
                                    
                                                     | 
                
                 | 
                        plt.xticks(np.arange(-most_extream_projection,  | 
            
            
                                                                        
                            
            
                                    
            
            
                | 
                    358
                 | 
                                    
                                                     | 
                
                 | 
                                             most_extream_projection + axis_projection_step,  | 
            
            
                                                                        
                            
            
                                    
            
            
                | 
                    359
                 | 
                                    
                                                     | 
                
                 | 
                                             axis_projection_step))  | 
            
            
                                                                        
                            
            
                                    
            
            
                | 
                    360
                 | 
                                    
                                                     | 
                
                 | 
                        plt.title('← {} {} {} →'.format(self.negative_end, | 
            
            
                                                                        
                            
            
                                    
            
            
                | 
                    361
                 | 
                                    
                                                     | 
                
                 | 
                                                        ' ' * 20,  | 
            
            
                                                                        
                            
            
                                    
            
            
                | 
                    362
                 | 
                                    
                                                     | 
                
                 | 
                                                        self.positive_end))  | 
            
            
                                                                        
                            
            
                                    
            
            
                | 
                    363
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                        
                            
            
                                    
            
            
                | 
                    364
                 | 
                                    
                                                     | 
                
                 | 
                        plt.xlabel('Direction Projection') | 
            
            
                                                                        
                            
            
                                    
            
            
                | 
                    365
                 | 
                                    
                                                     | 
                
                 | 
                        plt.ylabel('Words') | 
            
            
                                                                        
                            
            
                                    
            
            
                | 
                    366
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                        
                            
            
                                    
            
            
                | 
                    367
                 | 
                                    
                                                     | 
                
                 | 
                        return ax  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    368
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    369
                 | 
                                    
                                                     | 
                
                 | 
                    def plot_dist_projections_on_direction(self, word_groups, ax=None):  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    370
                 | 
                                    
                                                     | 
                
                 | 
                        """Plot the projection scalars distribution on the direction.  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    371
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    372
                 | 
                                    
                                                     | 
                
                 | 
                        :param dict word_groups word: The groups to projects  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    373
                 | 
                                    
                                                     | 
                
                 | 
                        :return float: The ax object of the plot  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    374
                 | 
                                    
                                                     | 
                
                 | 
                        """  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    375
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    376
                 | 
                                    
                                                     | 
                
                 | 
                        if ax is None:  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    377
                 | 
                                    
                                                     | 
                
                 | 
                            _, ax = plt.subplots(1)  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    378
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    379
                 | 
                                    
                                                     | 
                
                 | 
                        names = sorted(word_groups.keys())  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    380
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    381
                 | 
                                    
                                                     | 
                
                 | 
                        for name in names:  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    382
                 | 
                                    
                                                     | 
                
                 | 
                            words = word_groups[name]  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    383
                 | 
                                    
                                                     | 
                
                 | 
                            label = '{} (#{})'.format(name, len(words)) | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    384
                 | 
                                    
                                                     | 
                
                 | 
                            vectors = [self[word] for word in words]  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    385
                 | 
                                    
                                                     | 
                
                 | 
                            projections = self.model.cosine_similarities(self.direction,  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    386
                 | 
                                    
                                                     | 
                
                 | 
                                                                         vectors)  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    387
                 | 
                                    
                                                     | 
                
                 | 
                            sns.distplot(projections, hist=False, label=label, ax=ax)  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    388
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    389
                 | 
                                    
                                                     | 
                
                 | 
                        plt.axvline(0, color='k', linestyle='--')  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    390
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    391
                 | 
                                    
                                                     | 
                
                 | 
                        plt.title('← {} {} {} →'.format(self.negative_end, | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    392
                 | 
                                    
                                                     | 
                
                 | 
                                                        ' ' * 20,  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    393
                 | 
                                    
                                                     | 
                
                 | 
                                                        self.positive_end))  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    394
                 | 
                                    
                                                     | 
                
                 | 
                        plt.xlabel('Direction Projection') | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    395
                 | 
                                    
                                                     | 
                
                 | 
                        plt.ylabel('Density') | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    396
                 | 
                                    
                                                     | 
                
                 | 
                        ax.legend(loc='center left', bbox_to_anchor=(1, 0.5))  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    397
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    398
                 | 
                                    
                                                     | 
                
                 | 
                        return ax  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    399
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    400
                 | 
                                    
                                                     | 
                
                 | 
                    @classmethod  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    401
                 | 
                                    
                                                     | 
                
                 | 
                    def _calc_bias_across_word_embeddings(cls,  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    402
                 | 
                                    
                                                     | 
                
                 | 
                                                          word_embedding_bias_dict,  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    403
                 | 
                                    
                                                     | 
                
                 | 
                                                          words):  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    404
                 | 
                                    
                                                     | 
                
                 | 
                        """  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    405
                 | 
                                    
                                                     | 
                
                 | 
                        Calculate to projections and rho of words for two word embeddings.  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    406
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    407
                 | 
                                    
                                                     | 
                
                 | 
                        :param dict word_embedding_bias_dict: ``WordsEmbeddingBias`` objects  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    408
                 | 
                                    
                                                     | 
                
                 | 
                                                               as values,  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    409
                 | 
                                    
                                                     | 
                
                 | 
                                                               and their names as keys.  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    410
                 | 
                                    
                                                     | 
                
                 | 
                        :param list words: Words to be projected.  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    411
                 | 
                                    
                                                     | 
                
                 | 
                        :return tuple: Projections and spearman rho.  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    412
                 | 
                                    
                                                     | 
                
                 | 
                        """  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    413
                 | 
                                    
                                                     | 
                
                 | 
                        # pylint: disable=W0212  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    414
                 | 
                                    
                                                     | 
                
                 | 
                        assert len(word_embedding_bias_dict) == 2, 'Support only in two'\  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    415
                 | 
                                    
                                                     | 
                
                 | 
                                                                    'word embeddings'  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    416
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    417
                 | 
                                    
                                                     | 
                
                 | 
                        intersection_words = [word for word in words  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    418
                 | 
                                    
                                                     | 
                
                 | 
                                              if all(word in web  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    419
                 | 
                                    
                                                     | 
                
                 | 
                                                     for web in (word_embedding_bias_dict  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    420
                 | 
                                    
                                                     | 
                
                 | 
                                                                 .values()))]  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    421
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    422
                 | 
                                    
                                                     | 
                
                 | 
                        projections = {name: web._calc_projection_scores(intersection_words)['projection']  # pylint: disable=C0301 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    423
                 | 
                                    
                                                     | 
                
                 | 
                                       for name, web in word_embedding_bias_dict.items()}  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    424
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    425
                 | 
                                    
                                                     | 
                
                 | 
                        df = pd.DataFrame(projections)  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    426
                 | 
                                    
                                                     | 
                
                 | 
                        df.index = intersection_words  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    427
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    428
                 | 
                                    
                                                     | 
                
                 | 
                        rho, _ = spearmanr(*df.transpose().values)  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    429
                 | 
                                    
                                                     | 
                
                 | 
                        return df, rho  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    430
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    431
                 | 
                                    
                                                     | 
                
                 | 
                    @classmethod  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    432
                 | 
                                    
                                                     | 
                
                 | 
                    def plot_bias_across_word_embeddings(cls, word_embedding_bias_dict,  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    433
                 | 
                                    
                                                     | 
                
                 | 
                                                         words, ax=None, scatter_kwargs=None):  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    434
                 | 
                                    
                                                     | 
                
                 | 
                        """  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    435
                 | 
                                    
                                                     | 
                
                 | 
                        Plot the projections of same words of two word mbeddings.  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    436
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    437
                 | 
                                    
                                                     | 
                
                 | 
                        :param dict word_embedding_bias_dict: ``WordsEmbeddingBias`` objects  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    438
                 | 
                                    
                                                     | 
                
                 | 
                                                               as values,  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    439
                 | 
                                    
                                                     | 
                
                 | 
                                                               and their names as keys.  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    440
                 | 
                                    
                                                     | 
                
                 | 
                        :param list words: Words to be projected.  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    441
                 | 
                                    
                                                     | 
                
                 | 
                        :param scatter_kwargs: Kwargs for matplotlib.pylab.scatter.  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    442
                 | 
                                    
                                                     | 
                
                 | 
                        :type scatter_kwargs: dict or None  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    443
                 | 
                                    
                                                     | 
                
                 | 
                        :return: The ax object of the plot  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    444
                 | 
                                    
                                                     | 
                
                 | 
                        """  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    445
                 | 
                                    
                                                     | 
                
                 | 
                        # pylint: disable=W0212  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    446
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    447
                 | 
                                    
                                                     | 
                
                 | 
                        df, rho = cls._calc_bias_across_word_embeddings(word_embedding_bias_dict,  # pylint: disable=C0301  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    448
                 | 
                                    
                                                     | 
                
                 | 
                                                                        words)  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    449
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    450
                 | 
                                    
                                                     | 
                
                 | 
                        if ax is None:  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    451
                 | 
                                    
                                                     | 
                
                 | 
                            _, ax = plt.subplots(1)  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    452
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    453
                 | 
                                    
                                                     | 
                
                 | 
                        if scatter_kwargs is None:  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    454
                 | 
                                    
                                                     | 
                
                 | 
                            scatter_kwargs = {} | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    455
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    456
                 | 
                                    
                                                     | 
                
                 | 
                        name1, name2 = word_embedding_bias_dict.keys()  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    457
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    458
                 | 
                                    
                                                     | 
                
                 | 
                        ax.scatter(x=name1, y=name2, data=df, **scatter_kwargs)  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    459
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    460
                 | 
                                    
                                                     | 
                
                 | 
                        plt.title('Bias Across Word Embeddings' | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    461
                 | 
                                    
                                                     | 
                
                 | 
                                  '(Spearman Rho = {:0.2f})'.format(rho)) | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    462
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    463
                 | 
                                    
                                                     | 
                
                 | 
                        negative_end = word_embedding_bias_dict[name1].negative_end  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    464
                 | 
                                    
                                                     | 
                
                 | 
                        positive_end = word_embedding_bias_dict[name1].positive_end  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    465
                 | 
                                    
                                                     | 
                
                 | 
                        plt.xlabel('← {}     {}     {} →'.format(negative_end, | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    466
                 | 
                                    
                                                     | 
                
                 | 
                                                                 name1,  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    467
                 | 
                                    
                                                     | 
                
                 | 
                                                                 positive_end))  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    468
                 | 
                                    
                                                     | 
                
                 | 
                        plt.ylabel('← {}     {}     {} →'.format(negative_end, | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    469
                 | 
                                    
                                                     | 
                
                 | 
                                                                 name2,  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    470
                 | 
                                    
                                                     | 
                
                 | 
                                                                 positive_end))  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    471
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    472
                 | 
                                    
                                                     | 
                
                 | 
                        ax_min = round_to_extreme(df.values.min())  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    473
                 | 
                                    
                                                     | 
                
                 | 
                        ax_max = round_to_extreme(df.values.max())  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    474
                 | 
                                    
                                                     | 
                
                 | 
                        plt.xlim(ax_min, ax_max)  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    475
                 | 
                                    
                                                     | 
                
                 | 
                        plt.ylim(ax_min, ax_max)  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    476
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    477
                 | 
                                    
                                                     | 
                
                 | 
                        return ax  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    478
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    479
                 | 
                                    
                                                     | 
                
                 | 
                    # TODO: refactor for speed and clarity  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    480
                 | 
                                    
                                                     | 
                
                 | 
                    def generate_analogies(self, n_analogies=100, seed='ends',  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    481
                 | 
                                    
                                                     | 
                
                 | 
                                           multiple=False,  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    482
                 | 
                                    
                                                     | 
                
                 | 
                                           delta=1., restrict_vocab=30000,  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    483
                 | 
                                    
                                                     | 
                
                 | 
                                           unrestricted=False):  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    484
                 | 
                                    
                                                     | 
                
                 | 
                        """  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    485
                 | 
                                    
                                                     | 
                
                 | 
                        Generate analogies based on a seed vector.  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    486
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    487
                 | 
                                    
                                                     | 
                
                 | 
                        x - y ~ seed vector.  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    488
                 | 
                                    
                                                     | 
                
                 | 
                        or a:x::b:y when a-b ~ seed vector.  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    489
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    490
                 | 
                                    
                                                     | 
                
                 | 
                        The seed vector can be defined by two word ends,  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    491
                 | 
                                    
                                                     | 
                
                 | 
                        or by the bias direction.  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    492
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    493
                 | 
                                    
                                                     | 
                
                 | 
                        ``delta`` is used for semantically coherent. Default vale of 1  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    494
                 | 
                                    
                                                     | 
                
                 | 
                        corresponds to an angle <= pi/3.  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    495
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    496
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    497
                 | 
                                    
                                                     | 
                
                 | 
                        There is criticism regarding generating analogies  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    498
                 | 
                                    
                                                     | 
                
                 | 
                        when used with `unstricted=False` and not ignoring analogies  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    499
                 | 
                                    
                                                     | 
                
                 | 
                        with `match` column equal to `False`.  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    500
                 | 
                                    
                                                     | 
                
                 | 
                        Tolga's technique of generating analogies, as implemented in this  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    501
                 | 
                                    
                                                     | 
                
                 | 
                        method, is limited inherently to analogies with x != y, which may  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    502
                 | 
                                    
                                                     | 
                
                 | 
                        be force "fake" bias analogies.  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    503
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    504
                 | 
                                    
                                                     | 
                
                 | 
                        See:  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    505
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    506
                 | 
                                    
                                                     | 
                
                 | 
                        - Nissim, M., van Noord, R., van der Goot, R. (2019).  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    507
                 | 
                                    
                                                     | 
                
                 | 
                          `Fair is Better than Sensational: Man is to Doctor  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    508
                 | 
                                    
                                                     | 
                
                 | 
                          as Woman is to Doctor <https://arxiv.org/abs/1905.09866>`_.  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    509
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    510
                 | 
                                    
                                                     | 
                
                 | 
                        :param seed: The definition of the seed vector.  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    511
                 | 
                                    
                                                     | 
                
                 | 
                                     Either by a tuple of two word ends,  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    512
                 | 
                                    
                                                     | 
                
                 | 
                                     or by `'ends` for the pre-defined ends  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    513
                 | 
                                    
                                                     | 
                
                 | 
                                     or by `'direction'` for the pre-defined direction vector.  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    514
                 | 
                                    
                                                     | 
                
                 | 
                        :param int n_analogies: Number of analogies to generate.  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    515
                 | 
                                    
                                                     | 
                
                 | 
                        :param bool multiple: Whether to allow multiple appearances of a word  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    516
                 | 
                                    
                                                     | 
                
                 | 
                                              in the analogies.  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    517
                 | 
                                    
                                                     | 
                
                 | 
                        :param float delta: Threshold for semantic similarity.  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    518
                 | 
                                    
                                                     | 
                
                 | 
                                            The maximal distance between x and y.  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    519
                 | 
                                    
                                                     | 
                
                 | 
                        :param int restrict_vocab: The vocabulary size to use.  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    520
                 | 
                                    
                                                     | 
                
                 | 
                        :param bool unrestricted: Whether to validate the generated analogies  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    521
                 | 
                                    
                                                     | 
                
                 | 
                                                  with unrestricted `most_similar`.  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    522
                 | 
                                    
                                                     | 
                
                 | 
                        :return: Data Frame of analogies (x, y), their distances,  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    523
                 | 
                                    
                                                     | 
                
                 | 
                                 and their cosine similarity scores  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    524
                 | 
                                    
                                                     | 
                
                 | 
                        """  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    525
                 | 
                                    
                                                     | 
                
                 | 
                        # pylint: disable=C0301,R0914  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    526
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    527
                 | 
                                    
                                                     | 
                
                 | 
                        if not unrestricted:  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    528
                 | 
                                    
                                                     | 
                
                 | 
                            warnings.warn('Not Using unrestricted most_similar ' | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    529
                 | 
                                    
                                                     | 
                
                 | 
                                          'may introduce fake biased analogies.')  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    530
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    531
                 | 
                                    
                                                     | 
                
                 | 
                        (seed_vector,  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    532
                 | 
                                    
                                                     | 
                
                 | 
                         positive_end,  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    533
                 | 
                                    
                                                     | 
                
                 | 
                         negative_end) = get_seed_vector(seed, self)  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    534
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    535
                 | 
                                    
                                                     | 
                
                 | 
                        restrict_vocab_vectors = self.model.vectors[:restrict_vocab]  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    536
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    537
                 | 
                                    
                                                     | 
                
                 | 
                        normalized_vectors = (restrict_vocab_vectors  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    538
                 | 
                                    
                                                     | 
                
                 | 
                                              / np.linalg.norm(restrict_vocab_vectors, axis=1)[:, None])  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    539
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    540
                 | 
                                    
                                                     | 
                
                 | 
                        pairs_distances = euclidean_distances(normalized_vectors, normalized_vectors)  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    541
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    542
                 | 
                                    
                                                     | 
                
                 | 
                        # `pairs_distances` must be not-equal to zero  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    543
                 | 
                                    
                                                     | 
                
                 | 
                        # otherwise, x-y will be the zero vector, and every cosine similarity  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    544
                 | 
                                    
                                                     | 
                
                 | 
                        # will be equal to zero.  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    545
                 | 
                                    
                                                     | 
                
                 | 
                        # This cause to the **limitation** of this method which enforce a not-same  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    546
                 | 
                                    
                                                     | 
                
                 | 
                        # words for x and y.  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    547
                 | 
                                    
                                                     | 
                
                 | 
                        pairs_mask = (pairs_distances < delta) & (pairs_distances != 0)  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    548
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    549
                 | 
                                    
                                                     | 
                
                 | 
                        pairs_indices = np.array(np.nonzero(pairs_mask)).T  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    550
                 | 
                                    
                                                     | 
                
                 | 
                        x_vectors = np.take(normalized_vectors, pairs_indices[:, 0], axis=0)  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    551
                 | 
                                    
                                                     | 
                
                 | 
                        y_vectors = np.take(normalized_vectors, pairs_indices[:, 1], axis=0)  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    552
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    553
                 | 
                                    
                                                     | 
                
                 | 
                        x_minus_y_vectors = x_vectors - y_vectors  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    554
                 | 
                                    
                                                     | 
                
                 | 
                        normalized_x_minus_y_vectors = (x_minus_y_vectors  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    555
                 | 
                                    
                                                     | 
                
                 | 
                                                        / np.linalg.norm(x_minus_y_vectors, axis=1)[:, None])  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    556
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    557
                 | 
                                    
                                                     | 
                
                 | 
                        cos_distances = normalized_x_minus_y_vectors @ seed_vector  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    558
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    559
                 | 
                                    
                                                     | 
                
                 | 
                        sorted_cos_distances_indices = np.argsort(cos_distances)[::-1]  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    560
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    561
                 | 
                                    
                                                     | 
                
                 | 
                        sorted_cos_distances_indices_iter = iter(sorted_cos_distances_indices)  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    562
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    563
                 | 
                                    
                                                     | 
                
                 | 
                        analogies = []  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    564
                 | 
                                    
                                                     | 
                
                 | 
                        generated_words_x = set()  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    565
                 | 
                                    
                                                     | 
                
                 | 
                        generated_words_y = set()  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    566
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    567
                 | 
                                    
                                                     | 
                
                 | 
                        while len(analogies) < n_analogies:  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    568
                 | 
                                    
                                                     | 
                
                 | 
                            cos_distance_index = next(sorted_cos_distances_indices_iter)  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    569
                 | 
                                    
                                                     | 
                
                 | 
                            paris_index = pairs_indices[cos_distance_index]  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    570
                 | 
                                    
                                                     | 
                
                 | 
                            word_x, word_y = [self.model.index2word[index]  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    571
                 | 
                                    
                                                     | 
                
                 | 
                                              for index in paris_index]  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    572
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    573
                 | 
                                    
                                                     | 
                
                 | 
                            if multiple or (not multiple  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    574
                 | 
                                    
                                                     | 
                
                 | 
                                            and (word_x not in generated_words_x  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    575
                 | 
                                    
                                                     | 
                
                 | 
                                                 and word_y not in generated_words_y)):  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    576
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    577
                 | 
                                    
                                                     | 
                
                 | 
                                analogy = ({positive_end: word_x, | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    578
                 | 
                                    
                                                     | 
                
                 | 
                                            negative_end: word_y,  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    579
                 | 
                                    
                                                     | 
                
                 | 
                                            'score': cos_distances[cos_distance_index],  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    580
                 | 
                                    
                                                     | 
                
                 | 
                                            'distance': pairs_distances[tuple(paris_index)]})  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    581
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    582
                 | 
                                    
                                                     | 
                
                 | 
                                generated_words_x.add(word_x)  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    583
                 | 
                                    
                                                     | 
                
                 | 
                                generated_words_y.add(word_y)  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    584
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    585
                 | 
                                    
                                                     | 
                
                 | 
                                if unrestricted:  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    586
                 | 
                                    
                                                     | 
                
                 | 
                                    most_x = next(word  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    587
                 | 
                                    
                                                     | 
                
                 | 
                                                  for word, _ in most_similar(self.model,  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    588
                 | 
                                    
                                                     | 
                
                 | 
                                                                              [word_y, positive_end],  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    589
                 | 
                                    
                                                     | 
                
                 | 
                                                                              [negative_end]))  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    590
                 | 
                                    
                                                     | 
                
                 | 
                                    most_y = next(word  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    591
                 | 
                                    
                                                     | 
                
                 | 
                                                  for word, _ in most_similar(self.model,  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    592
                 | 
                                    
                                                     | 
                
                 | 
                                                                              [word_x, negative_end],  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    593
                 | 
                                    
                                                     | 
                
                 | 
                                                                              [positive_end]))  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    594
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    595
                 | 
                                    
                                                     | 
                
                 | 
                                    analogy['most_x'] = most_x  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    596
                 | 
                                    
                                                     | 
                
                 | 
                                    analogy['most_y'] = most_y  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    597
                 | 
                                    
                                                     | 
                
                 | 
                                    analogy['match'] = ((word_x == most_x)  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    598
                 | 
                                    
                                                     | 
                
                 | 
                                                        and (word_y == most_y))  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    599
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    600
                 | 
                                    
                                                     | 
                
                 | 
                                analogies.append(analogy)  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    601
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    602
                 | 
                                    
                                                     | 
                
                 | 
                        df = pd.DataFrame(analogies)  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    603
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    604
                 | 
                                    
                                                     | 
                
                 | 
                        columns = [positive_end, negative_end, 'distance', 'score']  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    605
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    606
                 | 
                                    
                                                     | 
                
                 | 
                        if unrestricted:  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    607
                 | 
                                    
                                                     | 
                
                 | 
                            columns.extend(['most_x', 'most_y', 'match'])  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    608
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    609
                 | 
                                    
                                                     | 
                
                 | 
                        df = df[columns]  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    610
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    611
                 | 
                                    
                                                     | 
                
                 | 
                        return df  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    612
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    613
                 | 
                                    
                                                     | 
                
                 | 
                    def calc_direct_bias(self, neutral_words, c=None):  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    614
                 | 
                                    
                                                     | 
                
                 | 
                        """Calculate the direct bias.  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    615
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    616
                 | 
                                    
                                                     | 
                
                 | 
                        Based on the projection of neutral words on the direction.  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    617
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    618
                 | 
                                    
                                                     | 
                
                 | 
                        :param list neutral_words: List of neutral words  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    619
                 | 
                                    
                                                     | 
                
                 | 
                        :param c: Strictness of bias measuring  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    620
                 | 
                                    
                                                     | 
                
                 | 
                        :type c: float or None  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    621
                 | 
                                    
                                                     | 
                
                 | 
                        :return: The direct bias  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    622
                 | 
                                    
                                                     | 
                
                 | 
                        """  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    623
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    624
                 | 
                                    
                                                     | 
                
                 | 
                        if c is None:  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    625
                 | 
                                    
                                                     | 
                
                 | 
                            c = 1  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    626
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    627
                 | 
                                    
                                                     | 
                
                 | 
                        projections = self._calc_projection_scores(neutral_words)['projection']  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    628
                 | 
                                    
                                                     | 
                
                 | 
                        direct_bias_terms = np.abs(projections) ** c  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    629
                 | 
                                    
                                                     | 
                
                 | 
                        direct_bias = direct_bias_terms.sum() / len(neutral_words)  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    630
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    631
                 | 
                                    
                                                     | 
                
                 | 
                        return direct_bias  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    632
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    633
                 | 
                                    
                                                     | 
                
                 | 
                    def calc_indirect_bias(self, word1, word2):  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    634
                 | 
                                    
                                                     | 
                
                 | 
                        """Calculate the indirect bias between two words.  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    635
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    636
                 | 
                                    
                                                     | 
                
                 | 
                        Based on the amount of shared projection of the words on the direction.  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    637
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    638
                 | 
                                    
                                                     | 
                
                 | 
                        Also called PairBias.  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    639
                 | 
                                    
                                                     | 
                
                 | 
                        :param str word1: First word  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    640
                 | 
                                    
                                                     | 
                
                 | 
                        :param str word2: Second word  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    641
                 | 
                                    
                                                     | 
                
                 | 
                        :type c: float or None  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    642
                 | 
                                    
                                                     | 
                
                 | 
                        :return The indirect bias between the two words  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    643
                 | 
                                    
                                                     | 
                
                 | 
                        """  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    644
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    645
                 | 
                                    
                                                     | 
                
                 | 
                        self._is_direction_identified()  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    646
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    647
                 | 
                                    
                                                     | 
                
                 | 
                        vector1 = normalize(self[word1])  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    648
                 | 
                                    
                                                     | 
                
                 | 
                        vector2 = normalize(self[word2])  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    649
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    650
                 | 
                                    
                                                     | 
                
                 | 
                        perpendicular_vector1 = reject_vector(vector1, self.direction)  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    651
                 | 
                                    
                                                     | 
                
                 | 
                        perpendicular_vector2 = reject_vector(vector2, self.direction)  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    652
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    653
                 | 
                                    
                                                     | 
                
                 | 
                        inner_product = vector1 @ vector2  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    654
                 | 
                                    
                                                     | 
                
                 | 
                        perpendicular_similarity = cosine_similarity(perpendicular_vector1,  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    655
                 | 
                                    
                                                     | 
                
                 | 
                                                                     perpendicular_vector2)  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    656
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    657
                 | 
                                    
                                                     | 
                
                 | 
                        indirect_bias = ((inner_product - perpendicular_similarity)  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    658
                 | 
                                    
                                                     | 
                
                 | 
                                         / inner_product)  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    659
                 | 
                                    
                                                     | 
                
                 | 
                        return indirect_bias  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    660
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    661
                 | 
                                    
                                                     | 
                
                 | 
                    def generate_closest_words_indirect_bias(self,  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    662
                 | 
                                    
                                                     | 
                
                 | 
                                                             neutral_positive_end,  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    663
                 | 
                                    
                                                     | 
                
                 | 
                                                             neutral_negative_end,  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    664
                 | 
                                    
                                                     | 
                
                 | 
                                                             words=None, n_extreme=5):  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    665
                 | 
                                    
                                                     | 
                
                 | 
                        """  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    666
                 | 
                                    
                                                     | 
                
                 | 
                        Generate closest words to a neutral direction and their indirect bias.  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    667
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    668
                 | 
                                    
                                                     | 
                
                 | 
                        The direction of the neutral words is used to find  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    669
                 | 
                                    
                                                     | 
                
                 | 
                        the most extreme words.  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    670
                 | 
                                    
                                                     | 
                
                 | 
                        The indirect bias is calculated between the most extreme words  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    671
                 | 
                                    
                                                     | 
                
                 | 
                        and the closest end.  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    672
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    673
                 | 
                                    
                                                     | 
                
                 | 
                        :param str neutral_positive_end: A word that define the positive side  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    674
                 | 
                                    
                                                     | 
                
                 | 
                                                         of the neutral direction.  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    675
                 | 
                                    
                                                     | 
                
                 | 
                        :param str neutral_negative_end: A word that define the negative side  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    676
                 | 
                                    
                                                     | 
                
                 | 
                                                         of the neutral direction.  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    677
                 | 
                                    
                                                     | 
                
                 | 
                        :param list words: List of words to project on the neutral direction.  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    678
                 | 
                                    
                                                     | 
                
                 | 
                        :param int n_extreme: The number for the most extreme words  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    679
                 | 
                                    
                                                     | 
                
                 | 
                                              (positive and negative) to show.  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    680
                 | 
                                    
                                                     | 
                
                 | 
                        :return: Data Frame of the most extreme words  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    681
                 | 
                                    
                                                     | 
                
                 | 
                                 with their projection scores and indirect biases.  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    682
                 | 
                                    
                                                     | 
                
                 | 
                        """  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    683
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    684
                 | 
                                    
                                                     | 
                
                 | 
                        neutral_direction = normalize(self[neutral_positive_end]  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    685
                 | 
                                    
                                                     | 
                
                 | 
                                                      - self[neutral_negative_end])  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    686
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    687
                 | 
                                    
                                                     | 
                
                 | 
                        vectors = [normalize(self[word]) for word in words]  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    688
                 | 
                                    
                                                     | 
                
                 | 
                        df = (pd.DataFrame([{'word': word, | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    689
                 | 
                                    
                                                     | 
                
                 | 
                                             'projection': vector @ neutral_direction}  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    690
                 | 
                                    
                                                     | 
                
                 | 
                                            for word, vector in zip(words, vectors)])  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    691
                 | 
                                    
                                                     | 
                
                 | 
                              .sort_values('projection', ascending=False)) | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    692
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    693
                 | 
                                    
                                                     | 
                
                 | 
                        df = take_two_sides_extreme_sorted(df, n_extreme,  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    694
                 | 
                                    
                                                     | 
                
                 | 
                                                           'end',  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    695
                 | 
                                    
                                                     | 
                
                 | 
                                                           neutral_positive_end,  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    696
                 | 
                                    
                                                     | 
                
                 | 
                                                           neutral_negative_end)  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    697
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    698
                 | 
                                    
                                                     | 
                
                 | 
                        df['indirect_bias'] = df.apply(lambda r:  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    699
                 | 
                                    
                                                     | 
                
                 | 
                                                       self.calc_indirect_bias(r['word'],  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    700
                 | 
                                    
                                                     | 
                
                 | 
                                                                               r['end']),  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    701
                 | 
                                    
                                                     | 
                
                 | 
                                                       axis=1)  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    702
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    703
                 | 
                                    
                                                     | 
                
                 | 
                        df = df.set_index(['end', 'word'])  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    704
                 | 
                                    
                                                     | 
                
                 | 
                        df = df[['projection', 'indirect_bias']]  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    705
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    706
                 | 
                                    
                                                     | 
                
                 | 
                        return df  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    707
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    708
                 | 
                                    
                                                     | 
                
                 | 
                    def _extract_neutral_words(self, specific_words):  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    709
                 | 
                                    
                                                     | 
                
                 | 
                        extended_specific_words = set()  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    710
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    711
                 | 
                                    
                                                     | 
                
                 | 
                        # because or specific_full data was trained on partial word embedding  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    712
                 | 
                                    
                                                     | 
                
                 | 
                        for word in specific_words:  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    713
                 | 
                                    
                                                     | 
                
                 | 
                            extended_specific_words.add(word)  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    714
                 | 
                                    
                                                     | 
                
                 | 
                            extended_specific_words.add(word.lower())  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    715
                 | 
                                    
                                                     | 
                
                 | 
                            extended_specific_words.add(word.upper())  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    716
                 | 
                                    
                                                     | 
                
                 | 
                            extended_specific_words.add(word.title())  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    717
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    718
                 | 
                                    
                                                     | 
                
                 | 
                        neutral_words = [word for word in self.model.vocab  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    719
                 | 
                                    
                                                     | 
                
                 | 
                                         if word not in extended_specific_words]  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    720
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    721
                 | 
                                    
                                                     | 
                
                 | 
                        return neutral_words  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    722
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    723
                 | 
                                    
                                                     | 
                
                 | 
                    def _neutralize(self, neutral_words):  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    724
                 | 
                                    
                                                     | 
                
                 | 
                        self._is_direction_identified()  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    725
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    726
                 | 
                                    
                                                     | 
                
                 | 
                        if self._verbose:  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    727
                 | 
                                    
                                                     | 
                
                 | 
                            neutral_words_iter = tqdm(neutral_words)  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    728
                 | 
                                    
                                                     | 
                
                 | 
                        else:  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    729
                 | 
                                    
                                                     | 
                
                 | 
                            neutral_words_iter = iter(neutral_words)  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    730
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    731
                 | 
                                    
                                                     | 
                
                 | 
                        for word in neutral_words_iter:  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    732
                 | 
                                    
                                                     | 
                
                 | 
                            neutralized_vector = reject_vector(self[word],  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    733
                 | 
                                    
                                                     | 
                
                 | 
                                                               self.direction)  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    734
                 | 
                                    
                                                     | 
                
                 | 
                            update_word_vector(self.model, word, neutralized_vector)  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    735
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    736
                 | 
                                    
                                                     | 
                
                 | 
                        self.model.init_sims(replace=True)  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    737
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    738
                 | 
                                    
                                                     | 
                
                 | 
                    def _equalize(self, equality_sets):  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    739
                 | 
                                    
                                                     | 
                
                 | 
                        # pylint: disable=R0914  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    740
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    741
                 | 
                                    
                                                     | 
                
                 | 
                        self._is_direction_identified()  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    742
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    743
                 | 
                                    
                                                     | 
                
                 | 
                        if self._verbose:  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    744
                 | 
                                    
                                                     | 
                
                 | 
                            words_data = []  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    745
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    746
                 | 
                                    
                                                     | 
                
                 | 
                        for equality_set_index, equality_set_words in enumerate(equality_sets):  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    747
                 | 
                                    
                                                     | 
                
                 | 
                            equality_set_vectors = [normalize(self[word])  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    748
                 | 
                                    
                                                     | 
                
                 | 
                                                    for word in equality_set_words]  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    749
                 | 
                                    
                                                     | 
                
                 | 
                            center = np.mean(equality_set_vectors, axis=0)  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    750
                 | 
                                    
                                                     | 
                
                 | 
                            (projected_center,  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    751
                 | 
                                    
                                                     | 
                
                 | 
                             rejected_center) = project_reject_vector(center,  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    752
                 | 
                                    
                                                     | 
                
                 | 
                                                                      self.direction)  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    753
                 | 
                                    
                                                     | 
                
                 | 
                            scaling = np.sqrt(1 - np.linalg.norm(rejected_center)**2)  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    754
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    755
                 | 
                                    
                                                     | 
                
                 | 
                            for word, vector in zip(equality_set_words, equality_set_vectors):  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    756
                 | 
                                    
                                                     | 
                
                 | 
                                projected_vector = project_vector(vector, self.direction)  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    757
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    758
                 | 
                                    
                                                     | 
                
                 | 
                                projected_part = normalize(projected_vector - projected_center)  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    759
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    760
                 | 
                                    
                                                     | 
                
                 | 
                                # In the code it is different of Bolukbasi  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    761
                 | 
                                    
                                                     | 
                
                 | 
                                # It behaves the same only for equality_sets  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    762
                 | 
                                    
                                                     | 
                
                 | 
                                # with size of 2 (pairs) - not sure!  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    763
                 | 
                                    
                                                     | 
                
                 | 
                                # However, my code is the same as the article  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    764
                 | 
                                    
                                                     | 
                
                 | 
                                # equalized_vector = rejected_center + scaling * self.direction  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    765
                 | 
                                    
                                                     | 
                
                 | 
                                # https://github.com/tolga-b/debiaswe/blob/10277b23e187ee4bd2b6872b507163ef4198686b/debiaswe/debias.py#L36-L37  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    766
                 | 
                                    
                                                     | 
                
                 | 
                                # For pairs, projected_part_vector1 == -projected_part_vector2,  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    767
                 | 
                                    
                                                     | 
                
                 | 
                                # and this is the same as  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    768
                 | 
                                    
                                                     | 
                
                 | 
                                # projected_part_vector1 == self.direction  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    769
                 | 
                                    
                                                     | 
                
                 | 
                                equalized_vector = rejected_center + scaling * projected_part  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    770
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    771
                 | 
                                    
                                                     | 
                
                 | 
                                update_word_vector(self.model, word, equalized_vector)  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    772
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    773
                 | 
                                    
                                                     | 
                
                 | 
                                if self._verbose:  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    774
                 | 
                                    
                                                     | 
                
                 | 
                                    words_data.append({ | 
            
                            
                    | 
                        
                     | 
                     | 
                     | 
                    
                                                                                                    
                        
                         
                                                                                        
                                                                                     
                     | 
                
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    775
                 | 
                                    
                                                     | 
                
                 | 
                                        'equality_set_index': equality_set_index,  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    776
                 | 
                                    
                                                     | 
                
                 | 
                                        'word': word,  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    777
                 | 
                                    
                                                     | 
                
                 | 
                                        'scaling': scaling,  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    778
                 | 
                                    
                                                     | 
                
                 | 
                                        'projected_scalar': vector @ self.direction,  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    779
                 | 
                                    
                                                     | 
                
                 | 
                                        'equalized_projected_scalar': (equalized_vector  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    780
                 | 
                                    
                                                     | 
                
                 | 
                                                                       @ self.direction),  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    781
                 | 
                                    
                                                     | 
                
                 | 
                                    })  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    782
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    783
                 | 
                                    
                                                     | 
                
                 | 
                        if self._verbose:  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    784
                 | 
                                    
                                                     | 
                
                 | 
                            print('Equalize Words Data ' | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    785
                 | 
                                    
                                                     | 
                
                 | 
                                  '(all equal for 1-dim bias space (direction):')  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    786
                 | 
                                    
                                                     | 
                
                 | 
                            words_data_df = (pd.DataFrame(words_data)  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    787
                 | 
                                    
                                                     | 
                
                 | 
                                             .set_index(['equality_set_index', 'word']))  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    788
                 | 
                                    
                                                     | 
                
                 | 
                            print(tabulate(words_data_df, headers='keys'))  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    789
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    790
                 | 
                                    
                                                     | 
                
                 | 
                        self.model.init_sims(replace=True)  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    791
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    792
                 | 
                                    
                                                     | 
                
                 | 
                    def debias(self, method='hard', neutral_words=None, equality_sets=None,  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    793
                 | 
                                    
                                                     | 
                
                 | 
                               inplace=True):  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    794
                 | 
                                    
                                                     | 
                
                 | 
                        """Debias the word embedding.  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    795
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    796
                 | 
                                    
                                                     | 
                
                 | 
                        :param str method: The method of debiasing.  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    797
                 | 
                                    
                                                     | 
                
                 | 
                        :param list neutral_words: List of neutral words  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    798
                 | 
                                    
                                                     | 
                
                 | 
                                                   for the neutralize step  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    799
                 | 
                                    
                                                     | 
                
                 | 
                        :param list equality_sets: List of equality sets,  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    800
                 | 
                                    
                                                     | 
                
                 | 
                                                   for the equalize step.  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    801
                 | 
                                    
                                                     | 
                
                 | 
                                                   The sets represent the direction.  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    802
                 | 
                                    
                                                     | 
                
                 | 
                        :param bool inplace: Whether to debias the object inplace  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    803
                 | 
                                    
                                                     | 
                
                 | 
                                             or return a new one  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    804
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    805
                 | 
                                    
                                                     | 
                
                 | 
                        .. warning::  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    806
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    807
                 | 
                                    
                                                     | 
                
                 | 
                          After calling `debias`,  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    808
                 | 
                                    
                                                     | 
                
                 | 
                          all the vectors of the word embedding  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    809
                 | 
                                    
                                                     | 
                
                 | 
                          will be normalized to unit length.  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    810
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    811
                 | 
                                    
                                                     | 
                
                 | 
                        """  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    812
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    813
                 | 
                                    
                                                     | 
                
                 | 
                        # pylint: disable=W0212  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    814
                 | 
                                    
                                                     | 
                
                 | 
                        if inplace:  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    815
                 | 
                                    
                                                     | 
                
                 | 
                            bias_word_embedding = self  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    816
                 | 
                                    
                                                     | 
                
                 | 
                        else:  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    817
                 | 
                                    
                                                     | 
                
                 | 
                            bias_word_embedding = copy.deepcopy(self)  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    818
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    819
                 | 
                                    
                                                     | 
                
                 | 
                        if method not in DEBIAS_METHODS:  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    820
                 | 
                                    
                                                     | 
                
                 | 
                            raise ValueError('method should be one of {}, {} was given'.format( | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    821
                 | 
                                    
                                                     | 
                
                 | 
                                DEBIAS_METHODS, method))  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    822
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    823
                 | 
                                    
                                                     | 
                
                 | 
                        if method in ['hard', 'neutralize']:  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    824
                 | 
                                    
                                                     | 
                
                 | 
                            if self._verbose:  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    825
                 | 
                                    
                                                     | 
                
                 | 
                                print('Neutralize...') | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    826
                 | 
                                    
                                                     | 
                
                 | 
                            bias_word_embedding._neutralize(neutral_words)  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    827
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    828
                 | 
                                    
                                                     | 
                
                 | 
                        if method == 'hard':  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    829
                 | 
                                    
                                                     | 
                
                 | 
                            if self._verbose:  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    830
                 | 
                                    
                                                     | 
                
                 | 
                                print('Equalize...') | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    831
                 | 
                                    
                                                     | 
                
                 | 
                            bias_word_embedding._equalize(equality_sets)  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    832
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    833
                 | 
                                    
                                                     | 
                
                 | 
                        if inplace:  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    834
                 | 
                                    
                                                     | 
                
                 | 
                            return None  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    835
                 | 
                                    
                                                     | 
                
                 | 
                        else:  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    836
                 | 
                                    
                                                     | 
                
                 | 
                            return bias_word_embedding  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    837
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    838
                 | 
                                    
                                                     | 
                
                 | 
                    def evaluate_word_embedding(self,  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    839
                 | 
                                    
                                                     | 
                
                 | 
                                                kwargs_word_pairs=None,  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    840
                 | 
                                    
                                                     | 
                
                 | 
                                                kwargs_word_analogies=None):  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    841
                 | 
                                    
                                                     | 
                
                 | 
                        """  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    842
                 | 
                                    
                                                     | 
                
                 | 
                        Evaluate word pairs tasks and word analogies tasks.  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    843
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    844
                 | 
                                    
                                                     | 
                
                 | 
                        :param model: Word embedding.  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    845
                 | 
                                    
                                                     | 
                
                 | 
                        :param kwargs_word_pairs: Kwargs for  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    846
                 | 
                                    
                                                     | 
                
                 | 
                                                  evaluate_word_pairs  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    847
                 | 
                                    
                                                     | 
                
                 | 
                                                  method.  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    848
                 | 
                                    
                                                     | 
                
                 | 
                        :type kwargs_word_pairs: dict or None  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    849
                 | 
                                    
                                                     | 
                
                 | 
                        :param kwargs_word_analogies: Kwargs for  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    850
                 | 
                                    
                                                     | 
                
                 | 
                                                      evaluate_word_analogies  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    851
                 | 
                                    
                                                     | 
                
                 | 
                                                      method.  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    852
                 | 
                                    
                                                     | 
                
                 | 
                        :type evaluate_word_analogies: dict or None  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    853
                 | 
                                    
                                                     | 
                
                 | 
                        :return: Tuple of :class:`pandas.DataFrame`  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    854
                 | 
                                    
                                                     | 
                
                 | 
                                 for the evaluation results.  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    855
                 | 
                                    
                                                     | 
                
                 | 
                        """  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    856
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    857
                 | 
                                    
                                                     | 
                
                 | 
                        return evaluate_word_embedding(self.model,  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    858
                 | 
                                    
                                                     | 
                
                 | 
                                                       kwargs_word_pairs,  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    859
                 | 
                                    
                                                     | 
                
                 | 
                                                       kwargs_word_analogies)  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    860
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    861
                 | 
                                    
                                                     | 
                
                 | 
                    def learn_full_specific_words(self, seed_specific_words,  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    862
                 | 
                                    
                                                     | 
                
                 | 
                                                  max_non_specific_examples=None, debug=None):  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    863
                 | 
                                    
                                                     | 
                
                 | 
                        """Learn specific words given a list of seed specific wordsself.  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    864
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    865
                 | 
                                    
                                                     | 
                
                 | 
                        Using Linear SVM.  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    866
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    867
                 | 
                                    
                                                     | 
                
                 | 
                        :param list seed_specific_words: List of seed specific words  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    868
                 | 
                                    
                                                     | 
                
                 | 
                        :param int max_non_specific_examples: The number of non-specific words  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    869
                 | 
                                    
                                                     | 
                
                 | 
                                                              to sample for training  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    870
                 | 
                                    
                                                     | 
                
                 | 
                        :return: List of learned specific words and the classifier object  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    871
                 | 
                                    
                                                     | 
                
                 | 
                        """  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    872
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    873
                 | 
                                    
                                                     | 
                
                 | 
                        if debug is None:  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    874
                 | 
                                    
                                                     | 
                
                 | 
                            debug = False  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    875
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    876
                 | 
                                    
                                                     | 
                
                 | 
                        if max_non_specific_examples is None:  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    877
                 | 
                                    
                                                     | 
                
                 | 
                            max_non_specific_examples = MAX_NON_SPECIFIC_EXAMPLES  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    878
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    879
                 | 
                                    
                                                     | 
                
                 | 
                        data = []  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    880
                 | 
                                    
                                                     | 
                
                 | 
                        non_specific_example_count = 0  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    881
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    882
                 | 
                                    
                                                     | 
                
                 | 
                        for word in self.model.vocab:  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    883
                 | 
                                    
                                                     | 
                
                 | 
                            is_specific = word in seed_specific_words  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    884
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    885
                 | 
                                    
                                                     | 
                
                 | 
                            if not is_specific:  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    886
                 | 
                                    
                                                     | 
                
                 | 
                                non_specific_example_count += 1  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    887
                 | 
                                    
                                                     | 
                
                 | 
                                if non_specific_example_count <= max_non_specific_examples:  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    888
                 | 
                                    
                                                     | 
                
                 | 
                                    data.append((self[word], is_specific))  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    889
                 | 
                                    
                                                     | 
                
                 | 
                            else:  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    890
                 | 
                                    
                                                     | 
                
                 | 
                                data.append((self[word], is_specific))  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    891
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    892
                 | 
                                    
                                                     | 
                
                 | 
                        np.random.seed(RANDOM_STATE)  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    893
                 | 
                                    
                                                     | 
                
                 | 
                        np.random.shuffle(data)  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    894
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    895
                 | 
                                    
                                                     | 
                
                 | 
                        X, y = zip(*data)  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    896
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    897
                 | 
                                    
                                                     | 
                
                 | 
                        X = np.array(X)  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    898
                 | 
                                    
                                                     | 
                
                 | 
                        X /= np.linalg.norm(X, axis=1)[:, None]  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    899
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    900
                 | 
                                    
                                                     | 
                
                 | 
                        y = np.array(y).astype('int') | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    901
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    902
                 | 
                                    
                                                     | 
                
                 | 
                        clf = LinearSVC(C=1, class_weight='balanced',  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    903
                 | 
                                    
                                                     | 
                
                 | 
                                        random_state=RANDOM_STATE)  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    904
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    905
                 | 
                                    
                                                     | 
                
                 | 
                        clf.fit(X, y)  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    906
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    907
                 | 
                                    
                                                     | 
                
                 | 
                        full_specific_words = []  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    908
                 | 
                                    
                                                     | 
                
                 | 
                        for word in self.model.vocab:  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    909
                 | 
                                    
                                                     | 
                
                 | 
                            vector = [normalize(self[word])]  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    910
                 | 
                                    
                                                     | 
                
                 | 
                            if clf.predict(vector):  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    911
                 | 
                                    
                                                     | 
                
                 | 
                                full_specific_words.append(word)  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    912
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    913
                 | 
                                    
                                                     | 
                
                 | 
                        if not debug:  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    914
                 | 
                                    
                                                     | 
                
                 | 
                            return full_specific_words, clf  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    915
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    916
                 | 
                                    
                                                     | 
                
                 | 
                        return full_specific_words, clf, X, y  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    917
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    918
                 | 
                                    
                                                     | 
                
                 | 
                    def _plot_most_biased_one_cluster(self,  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    919
                 | 
                                    
                                                     | 
                
                 | 
                                                      most_biased_neutral_words, y_bias,  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    920
                 | 
                                    
                                                     | 
                
                 | 
                                                      random_state=1, ax=None):  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    921
                 | 
                                    
                                                     | 
                
                 | 
                        most_biased_vectors = [self.model[word]  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    922
                 | 
                                    
                                                     | 
                
                 | 
                                               for word in most_biased_neutral_words]  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    923
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    924
                 | 
                                    
                                                     | 
                
                 | 
                        return plot_clustering_as_classification(most_biased_vectors,  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    925
                 | 
                                    
                                                     | 
                
                 | 
                                                                 y_bias,  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    926
                 | 
                                    
                                                     | 
                
                 | 
                                                                 random_state=random_state,  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    927
                 | 
                                    
                                                     | 
                
                 | 
                                                                 ax=ax)  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    928
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    929
                 | 
                                    
                                                     | 
                
                 | 
                    @staticmethod  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    930
                 | 
                                    
                                                     | 
                
                 | 
                    def plot_most_biased_clustering(biased, debiased,  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    931
                 | 
                                    
                                                     | 
                
                 | 
                                                    seed='ends', n_extreme=500,  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    932
                 | 
                                    
                                                     | 
                
                 | 
                                                    random_state=1):  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    933
                 | 
                                    
                                                     | 
                
                 | 
                        """Plot clustering as classification of biased neutral words.  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    934
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    935
                 | 
                                    
                                                     | 
                
                 | 
                        :param biased: Biased word embedding of  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    936
                 | 
                                    
                                                     | 
                
                 | 
                                       :class:`~ethically.we.bias.BiasWordEmbedding`.  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    937
                 | 
                                    
                                                     | 
                
                 | 
                        :param debiased: Debiased word embedding of  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    938
                 | 
                                    
                                                     | 
                
                 | 
                                         :class:`~ethically.we.bias.BiasWordEmbedding`.  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    939
                 | 
                                    
                                                     | 
                
                 | 
                        :param seed: The definition of the seed vector.  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    940
                 | 
                                    
                                                     | 
                
                 | 
                                    Either by a tuple of two word ends,  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    941
                 | 
                                    
                                                     | 
                
                 | 
                                    or by `'ends` for the pre-defined ends  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    942
                 | 
                                    
                                                     | 
                
                 | 
                                    or by `'direction'` for  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    943
                 | 
                                    
                                                     | 
                
                 | 
                                    the pre-defined direction vector.  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    944
                 | 
                                    
                                                     | 
                
                 | 
                        :param n_extrem: The number of extreme biased  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    945
                 | 
                                    
                                                     | 
                
                 | 
                                         neutral words to use.  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    946
                 | 
                                    
                                                     | 
                
                 | 
                        :return: Tuple of list of ax objects of the plot,  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    947
                 | 
                                    
                                                     | 
                
                 | 
                                 and a dictionary with the most positive  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    948
                 | 
                                    
                                                     | 
                
                 | 
                                 and negative words.  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    949
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    950
                 | 
                                    
                                                     | 
                
                 | 
                        Based on:  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    951
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    952
                 | 
                                    
                                                     | 
                
                 | 
                        - Gonen, H., & Goldberg, Y. (2019).  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    953
                 | 
                                    
                                                     | 
                
                 | 
                          `Lipstick on a Pig:  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    954
                 | 
                                    
                                                     | 
                
                 | 
                           Debiasing Methods Cover up Systematic Gender Biases  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    955
                 | 
                                    
                                                     | 
                
                 | 
                           in Word Embeddings But do not Remove Them  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    956
                 | 
                                    
                                                     | 
                
                 | 
                           <https://arxiv.org/abs/1903.03862>`_.  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    957
                 | 
                                    
                                                     | 
                
                 | 
                           arXiv preprint arXiv:1903.03862.  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    958
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    959
                 | 
                                    
                                                     | 
                
                 | 
                        - https://github.com/gonenhila/gender_bias_lipstick  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    960
                 | 
                                    
                                                     | 
                
                 | 
                        """  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    961
                 | 
                                    
                                                     | 
                
                 | 
                        # pylint: disable=protected-access,too-many-locals,line-too-long  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    962
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    963
                 | 
                                    
                                                     | 
                
                 | 
                        assert biased.positive_end == debiased.positive_end, \  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    964
                 | 
                                    
                                                     | 
                
                 | 
                            'Postive ends should be the same.'  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    965
                 | 
                                    
                                                     | 
                
                 | 
                        assert biased.negative_end == debiased.negative_end, \  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    966
                 | 
                                    
                                                     | 
                
                 | 
                            'Negative ends should be the same.'  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    967
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    968
                 | 
                                    
                                                     | 
                
                 | 
                        seed_vector, _, _ = get_seed_vector(seed, biased)  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    969
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    970
                 | 
                                    
                                                     | 
                
                 | 
                        neutral_words = biased._data['neutral_words']  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    971
                 | 
                                    
                                                     | 
                
                 | 
                        neutral_word_vectors = (biased[word] for word in neutral_words)  | 
            
                            
                    | 
                        
                     | 
                     | 
                     | 
                    
                                                                                                    
                        
                         
                                                                                        
                                                                                     
                     | 
                
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    972
                 | 
                                    
                                                     | 
                
                 | 
                        neutral_word_projections = [(normalize(vector) @ seed_vector, word)  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    973
                 | 
                                    
                                                     | 
                
                 | 
                                                    for word, vector  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    974
                 | 
                                    
                                                     | 
                
                 | 
                                                    in zip(neutral_words,  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    975
                 | 
                                    
                                                     | 
                
                 | 
                                                           neutral_word_vectors)]  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    976
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    977
                 | 
                                    
                                                     | 
                
                 | 
                        neutral_word_projections.sort()  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    978
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    979
                 | 
                                    
                                                     | 
                
                 | 
                        _, most_negative_words = zip(*neutral_word_projections[:n_extreme])  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    980
                 | 
                                    
                                                     | 
                
                 | 
                        _, most_positive_words = zip(*neutral_word_projections[-n_extreme:])  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    981
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    982
                 | 
                                    
                                                     | 
                
                 | 
                        most_biased_neutral_words = most_negative_words + most_positive_words  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    983
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    984
                 | 
                                    
                                                     | 
                
                 | 
                        y_bias = [False] * n_extreme + [True] * n_extreme  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    985
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    986
                 | 
                                    
                                                     | 
                
                 | 
                        _, axes = plt.subplots(1, 2, figsize=(20, 5))  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    987
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    988
                 | 
                                    
                                                     | 
                
                 | 
                        acc_biased = biased._plot_most_biased_one_cluster(most_biased_neutral_words,  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    989
                 | 
                                    
                                                     | 
                
                 | 
                                                                          y_bias,  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    990
                 | 
                                    
                                                     | 
                
                 | 
                                                                          random_state=random_state,  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    991
                 | 
                                    
                                                     | 
                
                 | 
                                                                          ax=axes[0])  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    992
                 | 
                                    
                                                     | 
                
                 | 
                        axes[0].set_title('Biased - Accuracy={}'.format(acc_biased)) | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    993
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    994
                 | 
                                    
                                                     | 
                
                 | 
                        acc_debiased = debiased._plot_most_biased_one_cluster(most_biased_neutral_words,  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    995
                 | 
                                    
                                                     | 
                
                 | 
                                                                              y_bias,  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    996
                 | 
                                    
                                                     | 
                
                 | 
                                                                              random_state=random_state,  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    997
                 | 
                                    
                                                     | 
                
                 | 
                                                                              ax=axes[1])  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    998
                 | 
                                    
                                                     | 
                
                 | 
                        axes[1].set_title('Debiased - Accuracy={}'.format(acc_debiased)) | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    999
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1000
                 | 
                                    
                                                     | 
                
                 | 
                        return axes, {biased.positive_end: most_positive_words, | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1001
                 | 
                                    
                                                     | 
                
                 | 
                                      biased.negative_end: most_negative_words}  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1002
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1003
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1004
                 | 
                                    
                                                     | 
                
                 | 
                class GenderBiasWE(BiasWordEmbedding):  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1005
                 | 
                                    
                                                     | 
                
                 | 
                    """Measure and adjust the Gender Bias in English Word Embedding.  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1006
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1007
                 | 
                                    
                                                     | 
                
                 | 
                    :param model: Word embedding model of ``gensim.model.KeyedVectors``  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1008
                 | 
                                    
                                                     | 
                
                 | 
                    :param bool only_lower: Whether the word embedding contrains  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1009
                 | 
                                    
                                                     | 
                
                 | 
                                            only lower case words  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1010
                 | 
                                    
                                                     | 
                
                 | 
                    :param bool verbose: Set verbosity  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1011
                 | 
                                    
                                                     | 
                
                 | 
                    :param bool to_normalize: Whether to normalize all the vectors  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1012
                 | 
                                    
                                                     | 
                
                 | 
                                              (recommended!)  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1013
                 | 
                                    
                                                     | 
                
                 | 
                    """  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1014
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1015
                 | 
                                    
                                                     | 
                
                 | 
                    def __init__(self, model, only_lower=False, verbose=False,  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1016
                 | 
                                    
                                                     | 
                
                 | 
                                 identify_direction=True, to_normalize=True):  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1017
                 | 
                                    
                                                     | 
                
                 | 
                        super().__init__(model=model,  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1018
                 | 
                                    
                                                     | 
                
                 | 
                                         only_lower=only_lower,  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1019
                 | 
                                    
                                                     | 
                
                 | 
                                         verbose=verbose,  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1020
                 | 
                                    
                                                     | 
                
                 | 
                                         to_normalize=True)  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1021
                 | 
                                    
                                                     | 
                
                 | 
                        self._initialize_data()  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1022
                 | 
                                    
                                                     | 
                
                 | 
                        if identify_direction:  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1023
                 | 
                                    
                                                     | 
                
                 | 
                            self._identify_direction('she', 'he', | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1024
                 | 
                                    
                                                     | 
                
                 | 
                                                     self._data['definitional_pairs'],  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1025
                 | 
                                    
                                                     | 
                
                 | 
                                                     'pca')  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1026
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1027
                 | 
                                    
                                                     | 
                
                 | 
                    def _initialize_data(self):  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1028
                 | 
                                    
                                                     | 
                
                 | 
                        self._data = copy.deepcopy(BOLUKBASI_DATA['gender'])  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1029
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1030
                 | 
                                    
                                                     | 
                
                 | 
                        if not self.only_lower:  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1031
                 | 
                                    
                                                     | 
                
                 | 
                            self._data['specific_full_with_definitional_equalize'] = \  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1032
                 | 
                                    
                                                     | 
                
                 | 
                                generate_words_forms(self  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1033
                 | 
                                    
                                                     | 
                
                 | 
                                                     ._data['specific_full_with_definitional_equalize'])  # pylint: disable=C0301  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1034
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1035
                 | 
                                    
                                                     | 
                
                 | 
                        for key in self._data['word_group_keys']:  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1036
                 | 
                                    
                                                     | 
                
                 | 
                            self._data[key] = (self._filter_words_by_model(self  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1037
                 | 
                                    
                                                     | 
                
                 | 
                                                                           ._data[key]))  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1038
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1039
                 | 
                                    
                                                     | 
                
                 | 
                        self._data['neutral_words'] = self._extract_neutral_words(self  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1040
                 | 
                                    
                                                     | 
                
                 | 
                                                                                  ._data['specific_full_with_definitional_equalize'])  # pylint: disable=C0301  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1041
                 | 
                                    
                                                     | 
                
                 | 
                        self._data['neutral_words'].sort()  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1042
                 | 
                                    
                                                     | 
                
                 | 
                        self._data['word_group_keys'].append('neutral_words') | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1043
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1044
                 | 
                                    
                                                     | 
                
                 | 
                    def plot_projection_scores(self, words='professions', n_extreme=10,  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1045
                 | 
                                    
                                                     | 
                
                 | 
                                               ax=None, axis_projection_step=None):  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1046
                 | 
                                    
                                                     | 
                
                 | 
                        if words == 'professions':  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1047
                 | 
                                    
                                                     | 
                
                 | 
                            words = self._data['profession_names']  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1048
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1049
                 | 
                                    
                                                     | 
                
                 | 
                        return super().plot_projection_scores(words, n_extreme,  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1050
                 | 
                                    
                                                     | 
                
                 | 
                                                              ax, axis_projection_step)  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1051
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1052
                 | 
                                    
                                                     | 
                
                 | 
                    def plot_dist_projections_on_direction(self, word_groups='bolukbasi',  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1053
                 | 
                                    
                                                     | 
                
                 | 
                                                           ax=None):  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1054
                 | 
                                    
                                                     | 
                
                 | 
                        if word_groups == 'bolukbasi':  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1055
                 | 
                                    
                                                     | 
                
                 | 
                            word_groups = {key: self._data[key] | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1056
                 | 
                                    
                                                     | 
                
                 | 
                                           for key in self._data['word_group_keys']}  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1057
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1058
                 | 
                                    
                                                     | 
                
                 | 
                        return super().plot_dist_projections_on_direction(word_groups, ax)  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1059
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1060
                 | 
                                    
                                                     | 
                
                 | 
                    @classmethod  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1061
                 | 
                                    
                                                     | 
                
                 | 
                    def plot_bias_across_word_embeddings(cls, word_embedding_bias_dict,  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1062
                 | 
                                    
                                                     | 
                
                 | 
                                                         ax=None, scatter_kwargs=None):  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1063
                 | 
                                    
                                                     | 
                
                 | 
                        # pylint: disable=W0221  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1064
                 | 
                                    
                                                     | 
                
                 | 
                        words = BOLUKBASI_DATA['gender']['neutral_profession_names']  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1065
                 | 
                                    
                                                     | 
                
                 | 
                        # TODO: is it correct for inheritance of class method?  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1066
                 | 
                                    
                                                     | 
                
                 | 
                        super(cls, cls).plot_bias_across_word_embeddings(word_embedding_bias_dict,  # pylint: disable=C0301  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1067
                 | 
                                    
                                                     | 
                
                 | 
                                                                         words,  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1068
                 | 
                                    
                                                     | 
                
                 | 
                                                                         ax,  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1069
                 | 
                                    
                                                     | 
                
                 | 
                                                                         scatter_kwargs)  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1070
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1071
                 | 
                                    
                                                     | 
                
                 | 
                    def calc_direct_bias(self, neutral_words='professions', c=None):  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1072
                 | 
                                    
                                                     | 
                
                 | 
                        if isinstance(neutral_words, str) and neutral_words == 'professions':  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1073
                 | 
                                    
                                                     | 
                
                 | 
                            return super().calc_direct_bias(  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1074
                 | 
                                    
                                                     | 
                
                 | 
                                self._data['neutral_profession_names'], c)  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1075
                 | 
                                    
                                                     | 
                
                 | 
                        else:  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1076
                 | 
                                    
                                                     | 
                
                 | 
                            return super().calc_direct_bias(neutral_words)  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1077
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1078
                 | 
                                    
                                                     | 
                
                 | 
                    def generate_closest_words_indirect_bias(self,  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1079
                 | 
                                    
                                                     | 
                
                 | 
                                                             neutral_positive_end,  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1080
                 | 
                                    
                                                     | 
                
                 | 
                                                             neutral_negative_end,  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1081
                 | 
                                    
                                                     | 
                
                 | 
                                                             words='professions', n_extreme=5):  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1082
                 | 
                                    
                                                     | 
                
                 | 
                        # pylint: disable=C0301  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1083
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1084
                 | 
                                    
                                                     | 
                
                 | 
                        if words == 'professions':  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1085
                 | 
                                    
                                                     | 
                
                 | 
                            words = self._data['profession_names']  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1086
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1087
                 | 
                                    
                                                     | 
                
                 | 
                        return super().generate_closest_words_indirect_bias(neutral_positive_end,  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1088
                 | 
                                    
                                                     | 
                
                 | 
                                                                            neutral_negative_end,  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1089
                 | 
                                    
                                                     | 
                
                 | 
                                                                            words,  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1090
                 | 
                                    
                                                     | 
                
                 | 
                                                                            n_extreme=n_extreme)  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1091
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1092
                 | 
                                    
                                                     | 
                
                 | 
                    def debias(self, method='hard', neutral_words=None, equality_sets=None,  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1093
                 | 
                                    
                                                     | 
                
                 | 
                               inplace=True):  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1094
                 | 
                                    
                                                     | 
                
                 | 
                        # pylint: disable=C0301  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1095
                 | 
                                    
                                                     | 
                
                 | 
                        if method in ['hard', 'neutralize']:  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1096
                 | 
                                    
                                                     | 
                
                 | 
                            if neutral_words is None:  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1097
                 | 
                                    
                                                     | 
                
                 | 
                                neutral_words = self._data['neutral_words']  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1098
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1099
                 | 
                                    
                                                     | 
                
                 | 
                        if method == 'hard' and equality_sets is None:  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1100
                 | 
                                    
                                                     | 
                
                 | 
                            equality_sets = self._data['definitional_pairs']  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1101
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1102
                 | 
                                    
                                                     | 
                
                 | 
                            if not self.only_lower:  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1103
                 | 
                                    
                                                     | 
                
                 | 
                                assert all(len(equality_set) == 2  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1104
                 | 
                                    
                                                     | 
                
                 | 
                                           for equality_set in equality_sets), 'currently supporting only equality pairs if only_lower is False'  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1105
                 | 
                                    
                                                     | 
                
                 | 
                                # TODO: refactor  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1106
                 | 
                                    
                                                     | 
                
                 | 
                                equality_sets = {(candidate1, candidate2) | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1107
                 | 
                                    
                                                     | 
                
                 | 
                                                 for word1, word2 in equality_sets  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1108
                 | 
                                    
                                                     | 
                
                 | 
                                                 for candidate1, candidate2 in zip(generate_one_word_forms(word1),  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1109
                 | 
                                    
                                                     | 
                
                 | 
                                                                                   generate_one_word_forms(word2))}  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1110
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1111
                 | 
                                    
                                                     | 
                
                 | 
                        return super().debias(method, neutral_words, equality_sets,  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1112
                 | 
                                    
                                                     | 
                
                 | 
                                              inplace)  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1113
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1114
                 | 
                                    
                                                     | 
                
                 | 
                    def learn_full_specific_words(self, seed_specific_words='bolukbasi',  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1115
                 | 
                                    
                                                     | 
                
                 | 
                                                  max_non_specific_examples=None,  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1116
                 | 
                                    
                                                     | 
                
                 | 
                                                  debug=None):  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1117
                 | 
                                    
                                                     | 
                
                 | 
                        if seed_specific_words == 'bolukbasi':  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1118
                 | 
                                    
                                                     | 
                
                 | 
                            seed_specific_words = self._data['specific_seed']  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1119
                 | 
                                    
                                                     | 
                
                 | 
                 | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1120
                 | 
                                    
                                                     | 
                
                 | 
                        return super().learn_full_specific_words(seed_specific_words,  | 
            
            
                                                                                                            
                            
            
                                    
            
            
                | 
                    1121
                 | 
                                    
                                                     | 
                
                 | 
                                                                 max_non_specific_examples,  | 
            
            
                                                                                                            
                                                                
            
                                    
            
            
                | 
                    1122
                 | 
                                    
                                                     | 
                
                 | 
                                                                 debug)  | 
            
            
                                                        
            
                                    
            
            
                | 
                    1123
                 | 
                                    
                                                     | 
                
                 | 
                 |