Passed
Push — master ( d03f6d...d5da96 )
by Simon
01:24
created

SklearnPipeline   A

Complexity

Total Complexity 3

Size/Duplication

Total Lines 50
Duplicated Lines 26 %

Importance

Changes 0
Metric Value
wmc 3
eloc 33
dl 13
loc 50
rs 10
c 0
b 0
f 0

3 Functions

Rating   Name          Duplication   Size   Complexity
A        pipeline1()   0             2      1
A        model()       13            13     1
A        pipeline2()   0             2      1

Duplicated Code

Duplicate code is one of the most pungent code smells. A commonly used rule is to restructure code once it is duplicated in three or more places.

Common duplication problems and their corresponding solutions are:

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.pipeline import Pipeline

from hyperactive import Hyperactive

data = load_breast_cancer()
X, y = data.data, data.target


def pipeline1(filter_, gbc):
    return Pipeline([("filter_", filter_), ("gbc", gbc)])


def pipeline2(filter_, gbc):
    return gbc


# Analyzer note: this code seems to be duplicated in your project.
def model(para, X, y):
    gbc = GradientBoostingClassifier(
        n_estimators=para["n_estimators"],
        max_depth=para["max_depth"],
        min_samples_split=para["min_samples_split"],
        min_samples_leaf=para["min_samples_leaf"],
    )
    filter_ = SelectKBest(f_classif, k=para["k"])
    model_ = para["pipeline"](filter_, gbc)

    scores = cross_val_score(model_, X, y, cv=3)

    return scores.mean()


search_config = {
    model: {
        "k": range(2, 30),
        "n_estimators": range(10, 200, 10),
        "max_depth": range(2, 12),
        "min_samples_split": range(2, 12),
        "min_samples_leaf": range(1, 11),
        "pipeline": [pipeline1, pipeline2],
    }
}


opt = Hyperactive(X, y)
opt.search(search_config, n_iter=100)
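
One possible way to reduce the duplication flagged in model() is to move the repeated estimator construction into a shared helper that every copy of the function can call. The sketch below is only an illustration under assumptions: the names GBC_KEYS, gbc_kwargs and build_estimator are hypothetical and do not appear in the analyzed project, and the report does not show where the other copies of the code live.

```python
# Hypothetical helper names (GBC_KEYS, gbc_kwargs, build_estimator) are
# illustrative assumptions; they are not part of the analyzed file.

# Hyperparameters that model() passes to GradientBoostingClassifier.
GBC_KEYS = (
    "n_estimators",
    "max_depth",
    "min_samples_split",
    "min_samples_leaf",
)


def gbc_kwargs(para):
    """Pick the shared gradient-boosting hyperparameters out of one search point."""
    return {key: para[key] for key in GBC_KEYS}


def build_estimator(para, estimator_cls):
    """Instantiate estimator_cls with the shared hyperparameters."""
    return estimator_cls(**gbc_kwargs(para))


if __name__ == "__main__":
    para = {
        "n_estimators": 50,
        "max_depth": 3,
        "min_samples_split": 2,
        "min_samples_leaf": 1,
        "k": 5,
    }
    # dict stands in for GradientBoostingClassifier here so the sketch
    # runs without scikit-learn installed.
    print(build_estimator(para, dict))
```

With such a helper, each duplicated function body could shrink to a call like build_estimator(para, GradientBoostingClassifier), so a change to the hyperparameter list would only need to be made in one place.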