GitHub Access Token became invalid

It seems that the GitHub access token used to retrieve details about this repository from GitHub has become invalid. This may prevent certain types of inspections from running (in particular, everything related to pull requests).
Please ask an admin of your repository to renew the access token on this website.

Issues (4082)

Orange/classification/mlp.py (14 issues)

  1  import numpy as np

The import numpy could not be resolved.

This can be caused by one of the following:

1. Missing Dependencies

This error could indicate a Pylint configuration issue. Make sure that your libraries are available by adding the necessary install commands to your build configuration:

# .scrutinizer.yml
before_commands:
    - sudo pip install abc # Python2
    - sudo pip3 install abc # Python3
Tip: We are currently not using virtualenv to run Pylint, so when installing your modules, make sure to use the command for the correct Python version.

2. Missing __init__.py files

This error could also result from missing __init__.py files in your module folders. Make sure that you place one file in each sub-folder.
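For example, a package layout like the one below (shown only as an illustration) lets Pylint resolve Orange.classification.mlp; an empty __init__.py file is sufficient:

Orange/
    __init__.py
    classification/
        __init__.py
        mlp.py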

  2  from scipy.optimize import fmin_l_bfgs_b

The import scipy.optimize could not be resolved.

This can be caused by the same missing dependencies or missing __init__.py files described for the numpy import above.
  3
  4  from Orange.classification import Learner, Model
  5
  6  __all__ = ["MLPLearner"]
  7
  8
  9  def sigmoid(x):
 10      return 1.0 / (1.0 + np.exp(-x))
 11
 12
 13  class MLPLearner(Learner):
 14      """Multilayer perceptron (feedforward neural network)
 15
 16      This model uses stochastic gradient descent and the
 17      backpropagation algorithm to train the weights of a feedforward
 18      neural network. The network uses the sigmoid activation
 19      function, except for the last layer, which computes the softmax
 20      activation function. The network can be used for binary and
 21      multiclass classification. Stochastic gradient descent minimizes
 22      the L2-regularized categorical cross-entropy cost function. The
 23      topology of the network can be customized by setting the layers
 24      attribute. When using this model you should:
 25
 26      - Choose a suitable:
 27          * topology (layers)
 28          * regularization parameter (lambda\_)
 29          * dropout (values of 0.2 for the input layer and 0.5 for the hidden
 30            layers usually work well)
 31          * number of epochs of stochastic gradient descent (num_epochs)
 32          * learning rate of stochastic gradient descent (learning_rate)
 33      - Continuize all discrete attributes
 34      - Transform the data set so that the columns are on a similar scale
 35
 36      layers : list
 37          The topology of the network. A network with
 38          layers=[10, 100, 100, 3] has two hidden layers with 100 neurons each,
 39          10 features and a class value with 3 distinct values.
 40
 41      lambda\_ : float, optional (default = 1.0)
 42          The regularization parameter. Higher values of lambda\_
 43          force the coefficients to be small.
 44
 45      dropout : list, optional (default = None)
 46          The dropout rate for each but the last
 47          layer. The list should have one element fewer than the layers parameter.
 48          Values of 0.2 for the input layer and 0.5 for the hidden layers usually
 49          work well.
 50
 51      num_epochs : int, optional
 52          The number of epochs of stochastic gradient descent
 53
 54      learning_rate : float, optional
 55          The learning rate of stochastic gradient descent
 56
 57      batch_size : int, optional
 58          The batch size of stochastic gradient descent
 59      """
 60
 61      name = 'mlp'
 62
 63      def __init__(self, layers, lambda_=1.0, dropout=None, preprocessors=None,
Comprehensibility Bug: lambda_ is re-defining a name which is already available in the outer scope (previously defined on line 249).

It is generally a bad practice to shadow variables from the outer scope. In most cases, this is done unintentionally and might lead to unexpected behavior:

param = 5

class Foo:
    def __init__(self, param):   # "param" would be flagged here
        self.param = param
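Here the outer-scope name is the module-level loop variable lambda_ in the __main__ block (line 249). One way to resolve the warning, shown only as a sketch and not applied in the repository, is to rename that loop variable so it no longer collides with the constructor parameter:

# Hypothetical rename of the loop variable at line 249; the parameter keeps its name.
for reg_lambda in [0.03, 0.1, 0.3, 1, 3]:
    m = MLPLearner([4, 20, 20, 3], lambda_=reg_lambda, num_epochs=1000,
                   learning_rate=0.1)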
 64               **opt_args):
 65          super().__init__(preprocessors=preprocessors)
 66          if dropout is None:
 67              dropout = [0] * (len(layers) - 1)
 68          assert len(dropout) == len(layers) - 1
 69
 70          self.layers = layers
 71          self.lambda_ = lambda_
 72          self.dropout = dropout
 73          self.opt_args = opt_args
 74
 75      def unfold_params(self, params):
 76          T, b = [], []
 77          acc = 0
 78          for l1, l2 in zip(self.layers, self.layers[1:]):
 79              b.append(params[acc:acc + l2])
 80              acc += l2
 81              T.append(params[acc:acc + l1 * l2].reshape((l2, l1)))
 82              acc += l1 * l2
 83          return T, b
 84
 85      def cost_grad(self, params, X, Y):
 86          T, b = self.unfold_params(params)
 87
 88          # forward pass
 89          a, z = [], []
 90
 91          dropout_mask = []
 92          for i in range(len(T)):
 93              if self.dropout is None or self.dropout[0] < 1e-7:
 94                  dropout_mask.append(1)
 95              else:
 96                  dropout_mask.append(
 97                      np.random.binomial(1, 1 - self.dropout[0],
 98                                         (X.shape[0], self.layers[i])))
 99
100          a.append(X * dropout_mask[0])
101          for i in range(len(self.layers) - 2):
102              z.append(a[i].dot(T[i].T) + b[i])
103              a.append(sigmoid(z[i]) * dropout_mask[i + 1])
104
105          # softmax last layer
106          z.append(a[-1].dot(T[-1].T) + b[-1])
107          P = np.exp(z[-1] - np.max(z[-1], axis=1)[:, None])
108          P /= np.sum(P, axis=1)[:, None]
109          a.append(P)
110
111          # cost
112          cost = -np.sum(np.log(a[-1] + 1e-15) * Y)
113          for theta in T:
114              cost += self.lambda_ * np.dot(theta.flat, theta.flat) / 2.0
115          cost /= X.shape[0]
116
117          # gradient
118          params = []
119          for i in range(len(self.layers) - 1):
120              if i == 0:
121                  d = a[-1] - Y
Comprehensibility Bug: d is re-defining a name which is already available in the outer scope (previously defined on line 234). See the note on shadowing at line 63 above.
122              else:
123                  d = d.dot(T[-i]) * a[-i - 1] * (1 - a[-i - 1])
124              dT = (a[-i - 2] * dropout_mask[-i - 1]).T.dot(d).T + self.lambda_\
125                  * T[-i - 1]
126              db = np.sum(d, axis=0)
127
128              params.extend([dT.flat, db.flat])
129          grad = np.concatenate(params[::-1]) / X.shape[0]
130
131          return cost, grad
132
133      def fit_bfgs(self, params, X, Y):
134          params, j, ret = fmin_l_bfgs_b(self.cost_grad, params,
The variable ret seems to be unused.
The variable j seems to be unused.
135                                         args=(X, Y), **self.opt_args)
136          return params
137
138      def fit_sgd(self, params, X, Y, num_epochs=1000, batch_size=100,
139                  learning_rate=0.1):
140          # shuffle examples
141          inds = np.random.permutation(X.shape[0])
142          X = X[inds]
143          Y = Y[inds]
144
145          # split training and validation set
146          num_tr = int(X.shape[0] * 0.8)
147
148          X_tr, Y_tr = X[:num_tr], Y[:num_tr]
149          X_va, Y_va = X[num_tr:], Y[num_tr:]
150
151          early_stop = 100
152
153          best_params = None
The variable best_params seems to be unused.
154          best_cost = np.inf
155
156          for epoch in range(num_epochs):
157              for i in range(0, num_tr, batch_size):
The variable i seems to be unused.
158                  cost, grad = self.cost_grad(params, X_tr, Y_tr)
159                  params -= learning_rate * grad
160
161              # test on validation set
162              T, b = self.unfold_params(params)
163              P_va = MLPModel(T, b, self.dropout).predict(X_va)
164              cost = -np.sum(np.log(P_va + 1e-15) * Y_va)
165
166              if cost < best_cost:
167                  best_cost = cost
168                  best_params = np.copy(params)
169                  early_stop *= 2
170
171              if epoch > early_stop:
172                  break
173
174          return params
175
176      def fit(self, X, Y, W):
Bug / Best Practice: the signature differs from the overridden 'fit' method.

It is generally good practice to use signatures that are compatible with the Liskov substitution principle. This allows instances of the child class to be passed anywhere instances of the super-class/interface would be accepted.
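As an illustration only (Base, Compatible and Incompatible are hypothetical classes, not the actual Orange Learner API), a compatible override keeps every parameter that the base method accepts:

class Base:
    def fit(self, X, Y, W=None):
        raise NotImplementedError

class Compatible(Base):
    # Same parameters in the same order: a Compatible instance can be used
    # anywhere a Base instance is expected.
    def fit(self, X, Y, W=None):
        return self

class Incompatible(Base):
    # Drops the W parameter: code that passes W to Base.fit breaks here.
    def fit(self, X, Y):
        return self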
177          if np.isnan(np.sum(X)) or np.isnan(np.sum(Y)):
178              raise ValueError('MLP does not support unknown values')
179
180          if Y.shape[1] == 1:
181              num_classes = np.unique(Y).size
182              Y = np.eye(num_classes)[Y.ravel().astype(int)]
183
184          params = []
185          num_params = 0
The variable num_params seems to be unused.
186          for l1, l2 in zip(self.layers, self.layers[1:]):
187              num_params += l1 * l2 + l2
188              i = 4.0 * np.sqrt(6.0 / (l1 + l2))
189              params.append(np.random.uniform(-i, i, l1 * l2))
190              params.append(np.zeros(l2))
191
192          params = np.concatenate(params)
193
194          #params = self.fit_bfgs(params, X, Y)
195          params = self.fit_sgd(params, X, Y, **self.opt_args)
196
197          T, b = self.unfold_params(params)
198          return MLPModel(T, b, self.dropout)
199
200
201  class MLPModel(Model):
202      def __init__(self, T, b, dropout):
The __init__ method of the super-class ModelClassification is not called.

It is generally advisable to initialize the super-class by calling its __init__ method:

class SomeParent:
    def __init__(self):
        self.x = 1

class SomeChild(SomeParent):
    def __init__(self):
        # Initialize the super class
        SomeParent.__init__(self)
203          self.T = T
204          self.b = b
205          self.dropout = dropout
206
207      def predict(self, X):
208          a = X
209          for i in range(len(self.T) - 1):
210              d = 1 - self.dropout[i]
Comprehensibility Bug: d is re-defining a name which is already available in the outer scope (previously defined on line 234). See the note on shadowing at line 63 above.
211              a = sigmoid(a.dot(self.T[i].T * d) + self.b[i])
212          z = a.dot(self.T[-1].T) + self.b[-1]
213          P = np.exp(z - np.max(z, axis=1)[:, None])
214          P /= np.sum(P, axis=1)[:, None]
215          return P
216
217  if __name__ == '__main__':
218      import Orange.data
219      import sklearn.cross_validation as skl_cross_validation
The import sklearn.cross_validation could not be resolved. This can be caused by the same missing dependencies or missing __init__.py files described for the numpy import above.
220
221      np.random.seed(42)
222
223      def numerical_grad(f, params, e=1e-4):
This code seems to be duplicated in your project.

Duplicated code is one of the most pungent code smells. If you need to duplicate the same code in three or more different places, we strongly encourage you to look into extracting the code into a single class or operation.

You can also find more detailed suggestions in the “Code” section of your repository.

224          grad = np.zeros_like(params)
225          perturb = np.zeros_like(params)
226          for i in range(params.size):
227              perturb[i] = e
228              j1 = f(params - perturb)
229              j2 = f(params + perturb)
230              grad[i] = (j2 - j1) / (2.0 * e)
231              perturb[i] = 0
232          return grad
233
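The numerical_grad helper defined above is the code flagged as duplicated. One possible extraction, shown only as a sketch with a hypothetical module location, moves the helper into a shared utility that every gradient check imports:

# Hypothetical shared module, e.g. Orange/tests/gradient_check.py
import numpy as np

def numerical_grad(f, params, e=1e-4):
    # Central-difference approximation of the gradient of f at params.
    grad = np.zeros_like(params)
    perturb = np.zeros_like(params)
    for i in range(params.size):
        perturb[i] = e
        grad[i] = (f(params + perturb) - f(params - perturb)) / (2.0 * e)
        perturb[i] = 0
    return grad

Callers would then import the helper instead of re-defining it locally.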
234      d = Orange.data.Table('iris')
235
236  #    # gradient check
237  #    m = MLPLearner([4, 20, 20, 3], dropout=[0.0, 0.5, 0.0], lambda_=0.0)
238  #    params = np.random.randn(5 * 20 + 21 * 20 + 21 * 3) * 0.1
239  #    Y = np.eye(3)[d.Y.ravel().astype(int)]
240  #
241  #    ga = m.cost_grad(params, d.X, Y)[1]
242  #    gm = numerical_grad(lambda t: m.cost_grad(t, d.X, Y)[0], params)
243  #
244  #    print(np.sum((ga - gm)**2))
245
246  #    m = MLPLearner([4, 20, 3], dropout=[0.0, 0.0], lambda_=1.0)
247  #    m(d)
248
249      for lambda_ in [0.03, 0.1, 0.3, 1, 3]:
250          m = MLPLearner([4, 20, 20, 3], lambda_=lambda_, num_epochs=1000,
251                         learning_rate=0.1)
252          scores = []
253          for tr_ind, te_ind in skl_cross_validation.StratifiedKFold(d.Y.ravel()):
254              s = np.mean(m(d[tr_ind])(d[te_ind]) == d[te_ind].Y.ravel())
255              scores.append(s)
256          print(np.mean(scores), lambda_)
257
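For reference, a minimal usage sketch that ties the docstring parameters to the __main__ block above (assuming the module is importable as Orange.classification.mlp and that the data has been continuized and scaled; all parameter values are examples only):

import Orange.data
from Orange.classification.mlp import MLPLearner

data = Orange.data.Table('iris')                # 4 continuous features, 3 classes
learner = MLPLearner(layers=[4, 20, 20, 3],     # input, two hidden layers, output
                     lambda_=1.0,               # L2 regularization strength
                     dropout=[0.2, 0.5, 0.5],   # input layer 0.2, hidden layers 0.5
                     num_epochs=1000,
                     learning_rate=0.1)
model = learner(data)                           # train; returns an MLPModel
probabilities = model.predict(data.X)           # row-wise class probabilities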