Issues (4082)

Orange/base.py (10 issues)

import inspect

import numpy as np
The import numpy could not be resolved.

This can be caused by one of the following:

1. Missing Dependencies

This error could indicate a configuration issue of Pylint. Make sure that your libraries are available by adding the necessary commands.

# .scrutinizer.yml
before_commands:
    - sudo pip install abc # Python2
    - sudo pip3 install abc # Python3
Tip: We are currently not using virtualenv to run pylint. When installing your modules, make sure to use the command for the correct Python version.

2. Missing __init__.py files

This error could also result from missing __init__.py files in your module folders. Make sure that you place one file in each sub-folder.
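
To check for the second cause locally, a minimal sketch such as the following could list folders that contain Python files but no __init__.py. The function name find_missing_init is hypothetical and is not part of Pylint or of this website; the project root itself is reported as "." if it qualifies.

```python
import os


def find_missing_init(root):
    """Return folders under `root` that hold .py files but no __init__.py."""
    missing = []
    for folder, _subdirs, files in os.walk(root):
        has_python = any(name.endswith(".py") for name in files)
        if has_python and "__init__.py" not in files:
            # Report paths relative to the starting folder.
            missing.append(os.path.relpath(folder, root))
    return sorted(missing)
```

Calling find_missing_init(".") from the repository root returns the folders that may need an __init__.py file.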

import scipy
The import scipy could not be resolved.

import bottlechest as bn
The import bottlechest could not be resolved.

from Orange.data import Table, Storage, Instance, Value
from Orange.preprocess import Continuize, RemoveNaNColumns, SklImpute
from Orange.misc.wrapper_meta import WrapperMeta

__all__ = ["Learner", "Model", "SklLearner", "SklModel"]


class Learner:
    supports_multiclass = False
    supports_weights = False
    name = 'learner'
    #: A sequence of data preprocessors to apply on data prior to
    #: fitting the model
    preprocessors = ()
    learner_adequacy_err_msg = ''

    def __init__(self, preprocessors=None):
        if preprocessors is None:
            preprocessors = type(self).preprocessors
        self.preprocessors = list(preprocessors)

    def fit(self, X, Y, W=None):
        raise NotImplementedError(
            "Descendants of Learner must overload method fit")

    def fit_storage(self, data):
        return self.fit(data.X, data.Y, data.W)

    def __call__(self, data):
        if not self.check_learner_adequacy(data.domain):
            raise ValueError(self.learner_adequacy_err_msg)

        origdomain = data.domain

        if isinstance(data, Instance):
            data = Table(data.domain, [data])
        data = self.preprocess(data)

        if len(data.domain.class_vars) > 1 and not self.supports_multiclass:
            raise TypeError("%s doesn't support multiple class variables" %
                            self.__class__.__name__)

        self.domain = data.domain
The attribute domain was defined outside __init__.

It is generally a good practice to initialize all attributes to default values in the __init__ method:

class Foo:
    def __init__(self, x=None):
        self.x = x
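
Applied to the class in this listing, the same idea might look like the sketch below. It is a trimmed-down stand-in for Learner, not the project's actual code, and it assumes that `data` exposes a `domain` attribute.

```python
class Learner:
    """Trimmed-down stand-in for the Learner class shown in this listing."""

    preprocessors = ()

    def __init__(self, preprocessors=None):
        if preprocessors is None:
            preprocessors = type(self).preprocessors
        self.preprocessors = list(preprocessors)
        # Declared here so the attribute exists before the first call.
        self.domain = None

    def __call__(self, data):
        # `data` is assumed to expose a `domain` attribute.
        self.domain = data.domain
        return self
```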

        if type(self).fit is Learner.fit:
            model = self.fit_storage(data)
        else:
            X, Y, W = data.X, data.Y, data.W if data.has_weights() else None
            model = self.fit(X, Y, W)
        model.domain = data.domain
        model.supports_multiclass = self.supports_multiclass
        model.name = self.name
        model.original_domain = origdomain
        return model

    def preprocess(self, data):
        """
        Apply the `preprocessors` to the data.
        """
        for pp in self.preprocessors:
            data = pp(data)
        return data

    def __repr__(self):
        return self.name

    def check_learner_adequacy(self, domain):
The argument domain seems to be unused.
This method could be written as a function/class method.

If a method does not access any attributes of the instance, it could also be implemented as a function, a static method, or a class method. This can improve readability. For example,

class Foo:
    def some_method(self, x, y):
        return x + y

could be written as

class Foo:
    @classmethod
    def some_method(cls, x, y):
        return x + y
        return True


class Model:
    supports_multiclass = False
    supports_weights = False
    Value = 0
    Probs = 1
    ValueProbs = 2

    def __init__(self, domain=None):
        if isinstance(self, Learner):
            domain = None
        elif not domain:
            raise ValueError("unspecified domain")
        self.domain = domain

    def predict(self, X):
        # Compare on the class, as Learner.__call__ does for fit: a bound
        # method never compares equal to the plain function.
        if type(self).predict_storage is Model.predict_storage:
            raise TypeError("Descendants of Model must overload method predict")
        else:
            Y = np.zeros((len(X), len(self.domain.class_vars)))
            Y[:] = np.nan
            table = Table(self.domain, X, Y)
            return self.predict_storage(table)

    def predict_storage(self, data):
        if isinstance(data, Storage):
            return self.predict(data.X)
        elif isinstance(data, Instance):
            return self.predict(np.atleast_2d(data.x))
        raise TypeError("Unrecognized argument (instance of '{}')".format(
                        type(data).__name__))

    def __call__(self, data, ret=Value):
        if not 0 <= ret <= 2:
            raise ValueError("invalid value of argument 'ret'")
        if (ret > 0
            and any(v.is_continuous for v in self.domain.class_vars)):
            raise ValueError("cannot predict continuous distributions")

        # Call the predictor
        if isinstance(data, np.ndarray):
            prediction = self.predict(np.atleast_2d(data))
        elif isinstance(data, scipy.sparse.csr.csr_matrix):
            prediction = self.predict(data)
        elif isinstance(data, Instance):
            if data.domain != self.domain:
                data = Instance(self.domain, data)
            data = Table(data.domain, [data])
            prediction = self.predict_storage(data)
        elif isinstance(data, Table):
            if data.domain != self.domain:
                data = data.from_table(self.domain, data)
            prediction = self.predict_storage(data)
        elif isinstance(data, (list, tuple)):
            if not isinstance(data[0], (list, tuple)):
                data = [data]
            data = Table(self.original_domain, data)
            data = Table(self.domain, data)
            prediction = self.predict_storage(data)
        else:
            raise TypeError("Unrecognized argument (instance of '{}')".format(
                            type(data).__name__))

        # Parse the result into value and probs
        multitarget = len(self.domain.class_vars) > 1
        if isinstance(prediction, tuple):
            value, probs = prediction
        elif prediction.ndim == 1 + multitarget:
            value, probs = prediction, None
        elif prediction.ndim == 2 + multitarget:
            value, probs = None, prediction
        else:
            # Format the message with %; passing ndim as a second argument
            # would leave the placeholder unfilled.
            raise TypeError("model returned a %i-dimensional array" %
                            prediction.ndim)

        # Ensure that we have what we need to return
        if ret != Model.Probs and value is None:
            value = np.argmax(probs, axis=-1)
        if ret != Model.Value and probs is None:
            if multitarget:
                max_card = max(len(c.values)
                               for c in self.domain.class_vars)
                probs = np.zeros(value.shape + (max_card,), float)
                for i, cvar in enumerate(self.domain.class_vars):
The variable cvar seems to be unused.
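
When only the index is needed, one option is to loop over the indices directly. The sketch below illustrates the pattern in plain Python; column_counts is a hypothetical helper that stands in for the per-column counting and does not use or reproduce bottlechest's bincount.

```python
def column_counts(values, n_columns, max_card):
    """Count how often each class index occurs in every column of `values`."""
    counts = [[0] * max_card for _ in range(n_columns)]
    # Iterate over column indices only, so no unused loop variable is bound.
    for i in range(n_columns):
        for row in values:
            counts[i][row[i]] += 1
    return counts
```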
                    probs[:, i, :], _ = bn.bincount(np.atleast_2d(value[:, i]),
                                                    max_card - 1)
            else:
                probs, _ = bn.bincount(np.atleast_2d(value),
                                       len(self.domain.class_var.values) - 1)
            if ret == Model.ValueProbs:
                return value, probs
            else:
                return probs

        # Return what we need to
        if ret == Model.Probs:
            return probs
        if isinstance(data, Instance) and not multitarget:
            value = Value(self.domain.class_var, value[0])
        if ret == Model.Value:
            return value
        else:  # ret == Model.ValueProbs
            return value, probs

    def __repr__(self):
        return self.name


class SklModel(Model, metaclass=WrapperMeta):
    used_vals = None

    def __init__(self, skl_model):
The __init__ method of the super-class Model is not called.

It is generally advisable to initialize the super-class by calling its __init__ method:

class SomeParent:
    def __init__(self):
        self.x = 1

class SomeChild(SomeParent):
    def __init__(self):
        # Initialize the super class
        SomeParent.__init__(self)
        self.skl_model = skl_model

    def predict(self, X):
        value = self.skl_model.predict(X)
        if hasattr(self.skl_model, "predict_proba"):
            probs = self.skl_model.predict_proba(X)
            return value, probs
        return value

    def __repr__(self):
        return '{} {}'.format(self.name, self.params)


class SklLearner(Learner, metaclass=WrapperMeta):
    """
    ${skldoc}
    Additional Orange parameters

    preprocessors : list, optional (default=[Continuize(), RemoveNaNColumns(), SklImpute(force=False)])
        An ordered list of preprocessors applied to data before
        training or testing.
    """
    __wraps__ = None
    __returns__ = SklModel
    _params = {}

    name = 'skl learner'
    preprocessors = [Continuize(),
                     RemoveNaNColumns(),
                     SklImpute(force=False)]

    @property
    def params(self):
        return self._params

    @params.setter
    def params(self, value):
        self._params = self._get_sklparams(value)

    def _get_sklparams(self, values):
This code seems to be duplicated in your project.

Duplicated code is one of the most pungent code smells. If you need to duplicate the same code in three or more different places, we strongly encourage you to look into extracting the code into a single class or operation.

You can also find more detailed suggestions in the “Code” section of your repository.

        skllearner = self.__wraps__
        if skllearner is not None:
            spec = inspect.getargs(skllearner.__init__.__code__)
            # first argument is 'self'
            assert spec.args[0] == "self"
            params = {name: values[name] for name in spec.args[1:]
                      if name in values}
        else:
            raise TypeError("Wrapper does not define '__wraps__'")
        return params

    def preprocess(self, data):
        data = super().preprocess(data)

        if any(v.is_discrete and len(v.values) > 2
               for v in data.domain.attributes):
            raise ValueError("Wrapped scikit-learn methods do not support " +
                             "multinomial variables.")

        return data

    def __call__(self, data):
        m = super().__call__(data)
        m.used_vals = [np.unique(y) for y in data.Y[:, None].T]
        m.params = self.params
        return m

    def fit(self, X, Y, W):
Signature differs from overridden 'fit' method

It is generally a good practice to use signatures that are compatible with the Liskov substitution principle.

This allows instances of the child class to be passed anywhere instances of the super-class/interface would be acceptable.
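
One way to make the signatures compatible is to give the override the same default for W as the base method. The sketch below uses simplified stand-ins for both classes; the body of the overriding fit is a placeholder, not the wrapper's real training logic.

```python
class Learner:
    def fit(self, X, Y, W=None):
        raise NotImplementedError(
            "Descendants of Learner must overload method fit")


class SklLearner(Learner):
    # Keeping W optional matches the base-class signature, so callers
    # written against Learner.fit(X, Y) also work with the subclass.
    def fit(self, X, Y, W=None):
        # Placeholder result standing in for the fitted model.
        return ("fitted", len(X), W)
```

With this signature, SklLearner().fit(X, Y) is accepted wherever Learner.fit(X, Y) would be.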

        clf = self.__wraps__(**self.params)
        Y = Y.reshape(-1)
        if W is None or not self.supports_weights:
            return self.__returns__(clf.fit(X, Y))
        return self.__returns__(clf.fit(X, Y, sample_weight=W.reshape(-1)))

    def __repr__(self):
        return '{} {}'.format(self.name, self.params)