Passed
Push — master (c241e4...b050e9) by Simon
created 01:57

nsga_iii_sampler_example.main()   Rating: A

Complexity

Conditions 3

Size

Total Lines 139
Code Lines 33

Duplication

Lines 139
Ratio 100 %

Importance

Changes 0
Metric                           Value
eloc (effective lines of code)      33
dl (duplicated lines)              139
loc (total lines)                  139
rs                              9.0879
c                                    0
b                                    0
f                                    0
cc (cyclomatic complexity)           3
nop (number of parameters)           0

How to fix: Long Method

Small methods make your code easier to understand, particularly when combined with a good name. And when a method is small, finding a good name is usually much easier.

For example, if you find yourself adding comments to a method's body, that is usually a sign that the commented part should be extracted into a new method; the comment is a good starting point for naming it.

The most commonly applied refactoring here is Extract Method, sketched below.
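A minimal sketch of Extract Method (hypothetical names, not taken from the file under review):

# Before: one method whose comments announce each step
def report(orders):
    # compute the total
    total = sum(order.amount for order in orders)
    # format the summary line
    return f"{len(orders)} orders, {total:.2f} total"

# After: each commented step becomes a small, well-named method
def compute_total(orders):
    return sum(order.amount for order in orders)

def format_summary(orders, total):
    return f"{len(orders)} orders, {total:.2f} total"

def report(orders):
    return format_summary(orders, compute_total(orders))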

"""
NSGAIIISampler Example - Many-objective Optimization with NSGA-III

NSGA-III is an extension of NSGA-II specifically designed for many-objective
optimization problems (typically 3+ objectives). It uses reference points
to maintain diversity and selection pressure in high-dimensional objective spaces.

Characteristics:
- Many-objective evolutionary algorithm (3+ objectives)
- Reference point-based selection mechanism
- Better performance than NSGA-II for many objectives
- Maintains diversity through structured reference points
- Elitist approach with improved selection pressure
- Population-based search with normalization

Note: For demonstration, we'll create a many-objective problem optimizing
accuracy, complexity, training time, and model interpretability.
"""

import time

import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

from hyperactive.experiment.integrations import SklearnCvExperiment
from hyperactive.opt.optuna import NSGAIIISampler


class ManyObjectiveExperiment:
    """Many-objective experiment: optimize multiple conflicting goals."""

    def __init__(self, X, y):
        self.X = X
        self.y = y

    def __call__(self, **params):
        # Create model with parameters
        model = DecisionTreeClassifier(random_state=42, **params)

        # Objective 1: Maximize accuracy (return negative for minimization)
        start_time = time.time()
        scores = cross_val_score(model, self.X, self.y, cv=3)
        training_time = time.time() - start_time
        accuracy = np.mean(scores)

        # Objective 2: Minimize model complexity (tree depth)
        complexity = params.get("max_depth", 20)

        # Objective 3: Minimize training time
        time_objective = training_time

        # Objective 4: Maximize interpretability (minimize tree size)
        # Approximate tree size based on parameters
        max_leaf_nodes = params.get("max_leaf_nodes", 100)
        interpretability = max_leaf_nodes / 100.0  # Normalized

        # Return all objectives for minimization (negative accuracy for maximization)
        return [
            -accuracy,  # Minimize negative accuracy (maximize accuracy)
            complexity / 20.0,  # Minimize complexity (normalized)
            time_objective,  # Minimize training time
            interpretability,  # Minimize tree size (maximize interpretability)
        ]


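# Illustrative objective vector for hypothetical parameter values:
#   ManyObjectiveExperiment(X, y)(max_depth=5, max_leaf_nodes=50)
#   -> [-accuracy, 5 / 20.0, training_time, 50 / 100.0]
#   (all four entries are minimized; negating accuracy turns it into a maximization)

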
def nsga_iii_theory():
    """Explain NSGA-III algorithm theory."""
    # NSGA-III Algorithm (Many-objective Optimization):
    #
    # 1. Many-objective Challenge:
    #    - With 3+ objectives, most solutions become non-dominated
    #    - Traditional Pareto ranking loses selection pressure
    #    - Crowding distance becomes less effective
    #    - Need structured diversity preservation
    #
    # 2. NSGA-III Innovations:
    #    - Reference points on normalized hyperplane
    #    - Associate solutions with reference points
    #    - Select solutions to maintain balanced distribution
    #    - Adaptive normalization for different objective scales
    #
    # 3. Reference Point Strategy:
    #    - Systematic placement on unit simplex
    #    - Each reference point guides search direction
    #    - Solutions clustered around reference points
    #    - Maintains diversity across objective space
    #
    # 4. Selection Mechanism:
    #    - Non-dominated sorting (like NSGA-II)
    #    - Reference point association
    #    - Niche count balancing
    #    - Preserve solutions near each reference point
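    #
    # 5. Reference Point Count (illustrative; Das-Dennis construction):
    #    With M objectives and p divisions per objective, the systematic
    #    approach places H = C(M + p - 1, p) points on the unit simplex.
    #    For this example's M = 4 objectives and, say, p = 6 divisions:
    #    H = C(9, 6) = 84 reference points.

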
def main():
    # === NSGAIIISampler Example ===
    # Many-objective Optimization with NSGA-III

    nsga_iii_theory()

    # Load dataset
    X, y = load_breast_cancer(return_X_y=True)
    print(
        f"Dataset: Breast cancer classification ({X.shape[0]} samples, {X.shape[1]} features)"
    )

    # Create many-objective experiment
    experiment = ManyObjectiveExperiment(X, y)

    # Many-objective Problem (4 objectives):
    #   Objective 1: Maximize classification accuracy
    #   Objective 2: Minimize model complexity (tree depth)
    #   Objective 3: Minimize training time
    #   Objective 4: Maximize interpretability (smaller trees)
    #   → Complex trade-offs between multiple conflicting goals

    # Define search space
    param_space = {
        "max_depth": (1, 20),  # Tree depth
        "min_samples_split": (2, 50),  # Minimum samples to split
        "min_samples_leaf": (1, 20),  # Minimum samples per leaf
        "max_leaf_nodes": (10, 200),  # Maximum leaf nodes
        "criterion": ["gini", "entropy"],  # Split criterion
    }

    # Search Space:
    # for param, space in param_space.items():
    #     print(f"  {param}: {space}")

    # Configure NSGAIIISampler
    optimizer = NSGAIIISampler(
        param_space=param_space,
        n_trials=60,  # More trials needed for many objectives
        random_state=42,
        experiment=experiment,
        population_size=24,  # Larger population for many objectives
        mutation_prob=0.1,  # Mutation probability
        crossover_prob=0.9,  # Crossover probability
    )
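
    # Budget note (illustrative): assuming each generation consumes
    # population_size trials, the 60 trials above cover only two full
    # generations (48 trials) plus part of a third; production runs on
    # many-objective problems typically need a much larger budget.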

    # NSGAIIISampler Configuration:
    # n_trials: configured above
    # population_size: larger for many objectives
    # mutation_prob: mutation probability
    # crossover_prob: crossover probability
    # Selection: Reference point-based diversity preservation

    # Note: NSGA-III is designed for 3+ objectives.
    # For 2 objectives, NSGA-II is typically preferred.
    # This example demonstrates the interface for many-objective problems.

    # Run optimization
    # Running NSGA-III many-objective optimization...

    try:
        best_params = optimizer.run()

        # Results
        print("\n=== Results ===")
        print(f"Best parameters: {best_params}")
        print(f"Best score: {optimizer.best_score_:.4f}")
        print()

        # NSGA-III produces a diverse set of solutions across the 4D Pareto front:
        # - High accuracy, complex, slower models
        # - Balanced accuracy/complexity trade-offs
        # - Fast, simple, interpretable models
        # - Various combinations optimizing different objectives

    except Exception as e:
        print(f"Many-objective optimization example: {e}")
        print("Note: This demonstrates the interface for many-objective problems.")
        return None, None
    # NSGA-III vs NSGA-II for Many Objectives:
    #
    # NSGA-II Limitations (3+ objectives):
    # - Most solutions become non-dominated
    # - Crowding distance loses effectiveness
    # - Selection pressure decreases
    # - Uneven distribution in objective space

    # NSGA-III Advantages:
    # - Reference points guide search directions
    # - Maintains diversity across all objectives
    # - Better selection pressure in many objectives
    # - Structured exploration of objective space
    # - Adaptive normalization handles different scales

    # Reference Point Mechanism:
    # - Systematic placement on normalized hyperplane
    # - Each point represents a different objective priority
    # - Solutions associated with nearest reference points
    # - Selection maintains balance across all points
    # - Prevents clustering in limited objective regions

    # Many-objective Problem Characteristics:
    #
    # Challenges:
    # - Exponential growth of non-dominated solutions
    # - Difficulty visualizing high-dimensional trade-offs
    # - User preference articulation becomes complex
    # - Increased computational requirements

    # Best Use Cases:
    # - Engineering design with multiple constraints
    # - Multi-criteria decision making (3+ criteria)
    # - Resource allocation problems
    # - System optimization with conflicting requirements
    # - When objective interactions are complex
    #
    # Algorithm Selection Guide:
    #
    # Use NSGA-III when:
    # - 3 or more objectives
    # - Objectives are truly conflicting
    # - Comprehensive trade-off analysis is needed
    # - Reference point guidance is beneficial
    #
    # Use NSGA-II when:
    # - 2 objectives
    # - Simpler Pareto front structure
    # - Established performance for bi-objective problems
    #
    # Use single-objective methods when:
    # - The problem can be formulated as a weighted combination
    # - There is a clear primary objective with constraints
    # - Computational efficiency is critical

    # best_params is always bound here: the except branch above returns early
    return best_params, optimizer.best_score_


if __name__ == "__main__":
    best_params, best_score = main()
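
Applied to the flagged main() above, a minimal Extract Method pass might look like the sketch below, reusing the imports and classes defined in the file. The helper names load_dataset and build_optimizer are hypothetical; the NSGAIIISampler arguments are copied unchanged from the example. Moving such helpers into a module shared with the sibling sampler examples would also address the 100 % duplication reported above.

def load_dataset():
    """Load the demo dataset and print its shape."""
    X, y = load_breast_cancer(return_X_y=True)
    print(
        f"Dataset: Breast cancer classification ({X.shape[0]} samples, {X.shape[1]} features)"
    )
    return X, y


def build_optimizer(experiment):
    """Configure the NSGA-III sampler for the 4-objective problem."""
    param_space = {
        "max_depth": (1, 20),
        "min_samples_split": (2, 50),
        "min_samples_leaf": (1, 20),
        "max_leaf_nodes": (10, 200),
        "criterion": ["gini", "entropy"],
    }
    return NSGAIIISampler(
        param_space=param_space,
        n_trials=60,
        random_state=42,
        experiment=experiment,
        population_size=24,
        mutation_prob=0.1,
        crossover_prob=0.9,
    )


def main():
    """Run the example: each step now reads as a named method."""
    nsga_iii_theory()
    X, y = load_dataset()
    optimizer = build_optimizer(ManyObjectiveExperiment(X, y))
    try:
        best_params = optimizer.run()
        print(f"Best parameters: {best_params}")
        return best_params, optimizer.best_score_
    except Exception as e:
        print(f"Many-objective optimization example: {e}")
        return None, None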