Passed
Push — master (c241e4...b050e9) by Simon, created 01:57

nsga_ii_sampler_example.main() (rated A)

Complexity
Conditions: 3

Size
Total Lines: 127
Code Lines: 32

Duplication
Lines: 127
Ratio: 100 %

Importance
Changes: 0

Metric  Value
eloc    32     (effective lines of code)
dl      127    (duplicated lines)
loc     127    (lines of code)
rs      9.112
c       0
b       0
f       0
cc      3      (cyclomatic complexity)
nop     0      (number of parameters)
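
These are standard static metrics. As a hedged sketch, comparable loc/cc-style numbers can be reproduced locally with the radon package (an assumption; radon is not necessarily what produced this report):

    from radon.complexity import cc_visit
    from radon.raw import analyze

    source = open("nsga_ii_sampler_example.py").read()
    print(analyze(source))              # raw metrics: loc, lloc, sloc, comments, blank
    for block in cc_visit(source):      # cyclomatic complexity per function/class
        print(block.name, block.complexity)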

How to fix: Long Method

Small methods make your code easier to understand, particularly when combined with a good name. Besides, if your method is small, finding a good name is usually much easier.

For example, if you find yourself adding comments to a method's body, that is usually a good sign that you should extract the commented part into a new method and use the comment as a starting point for naming the new method.

Commonly applied refactorings include Extract Method.
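
For illustration, a minimal Extract Method sketch (all names here are invented for the example):

    # Before: a comment labels a block inside a longer method
    def report(orders):
        total = sum(order.price for order in orders)
        # print the report banner
        print("*" * 40)
        print("Order report")
        print("*" * 40)
        print(f"Total: {total}")

    # After: the commented block becomes its own well-named method
    def print_banner():
        print("*" * 40)
        print("Order report")
        print("*" * 40)

    def report(orders):
        total = sum(order.price for order in orders)
        print_banner()
        print(f"Total: {total}")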

"""
NSGAIISampler Example - Multi-objective Optimization with NSGA-II

NSGA-II (Non-dominated Sorting Genetic Algorithm II) is designed for
multi-objective optimization problems where you want to optimize multiple
conflicting objectives simultaneously. It finds a Pareto front of solutions.

Characteristics:
- Multi-objective evolutionary algorithm
- Finds Pareto-optimal solutions (non-dominated set)
- Balances multiple conflicting objectives
- Population-based search with selection pressure
- Elitist approach preserving best solutions
- Crowding distance for diversity preservation

Note: For demonstration, we create a multi-objective problem from
a single-objective one by optimizing both performance and model complexity.
"""

import numpy as np
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

from hyperactive.opt.optuna import NSGAIISampler


class MultiObjectiveExperiment:
    """Multi-objective experiment: maximize accuracy, minimize complexity."""

    def __init__(self, X, y):
        self.X = X
        self.y = y

    def __call__(self, **params):
        # Create model with parameters
        model = RandomForestClassifier(random_state=42, **params)

        # Objective 1: Maximize accuracy (we return it negated for minimization)
        scores = cross_val_score(model, self.X, self.y, cv=3)
        accuracy = np.mean(scores)

        # Objective 2: Minimize model complexity (number of parameters)
        # For Random Forest: roughly n_estimators × max_depth × n_features
        complexity = (
            params["n_estimators"] * params.get("max_depth", 10) * self.X.shape[1]
        )

        # NSGA-II minimizes objectives, so we return both as minimization
        # Note: This is a simplified multi-objective setup for demonstration
        return [-accuracy, complexity / 10000]  # Scale complexity for better balance
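
# Illustrative usage (a hedged sketch, not exercised by the optimizer run):
# on the digits data (64 features), n_estimators=50 and max_depth=5 give
# complexity = 50 * 5 * 64 = 16000, i.e. 1.6 after the /10000 scaling:
#
#   X, y = load_digits(return_X_y=True)
#   experiment = MultiObjectiveExperiment(X, y)
#   experiment(n_estimators=50, max_depth=5, min_samples_split=2,
#              min_samples_leaf=1, max_features="sqrt")
#   # -> [-cross_val_accuracy, 1.6]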

def nsga_ii_theory():
    """Explain NSGA-II algorithm theory."""
    # NSGA-II Algorithm (Multi-objective Optimization):
    #
    # 1. Core Concepts:
    #    - Pareto Dominance: Solution A dominates B if A is no worse in all
    #      objectives and strictly better in at least one
    #    - Pareto Front: Set of non-dominated solutions
    #    - Trade-offs: Improving one objective may worsen another
    #
    # 2. NSGA-II Process:
    #    - Initialize population randomly
    #    - For each generation:
    #      a) Fast non-dominated sorting (rank solutions by dominance)
    #      b) Crowding distance calculation (preserve diversity)
    #      c) Selection based on rank and crowding distance
    #      d) Crossover and mutation to create offspring
    #
    # 3. Selection Criteria:
    #    - Primary: Non-domination rank (prefer better fronts)
    #    - Secondary: Crowding distance (prefer diverse solutions)
    #    - Elitist: Best solutions always survive
    #
    # 4. Output:
    #    - Set of Pareto-optimal solutions
    #    - User chooses final solution based on preferences
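

# Illustrative helper (a hedged sketch; the name is ours, not part of the
# example's API): the Pareto-dominance test that fast non-dominated sorting
# builds on, for objective vectors that are being minimized.
def dominates(a, b):
    """Return True if a dominates b: no worse in every objective and
    strictly better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))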


def main():
    # === NSGAIISampler Example ===
    # Multi-objective Optimization with NSGA-II

    nsga_ii_theory()

    # Load dataset
    X, y = load_digits(return_X_y=True)
    print(f"Dataset: Handwritten digits ({X.shape[0]} samples, {X.shape[1]} features)")

    # Create multi-objective experiment
    experiment = MultiObjectiveExperiment(X, y)

    # Multi-objective Problem:
    #   Objective 1: Maximize classification accuracy
    #   Objective 2: Minimize model complexity
    #   → Trade-off between performance and simplicity

    # Define search space
    param_space = {
        "n_estimators": (10, 200),  # Number of trees
        "max_depth": (1, 20),  # Tree depth (complexity)
        "min_samples_split": (2, 20),  # Minimum samples to split
        "min_samples_leaf": (1, 10),  # Minimum samples per leaf
        "max_features": ["sqrt", "log2", None],  # Feature sampling
    }

    # Search space overview (commented out):
    # for param, space in param_space.items():
    #     print(f"  {param}: {space}")

    # Configure NSGAIISampler
    optimizer = NSGAIISampler(
        param_space=param_space,
        n_trials=50,  # Total evaluations; the population evolves over generations
        random_state=42,
        experiment=experiment,
        population_size=20,  # Population size for the genetic algorithm
        mutation_prob=0.1,  # Mutation probability
        crossover_prob=0.9,  # Crossover probability
    )

    # NSGAIISampler configuration notes: parameters are documented inline above;
    # selection uses non-dominated sorting plus crowding distance.

    # Note: This example demonstrates the interface.
    # In practice, NSGA-II returns multiple Pareto-optimal solutions.
    # For single-objective problems, consider TPE or GP samplers instead.

    # Run the optimization
    try:
        best_params = optimizer.run()

        # Results
        print("\n=== Results ===")
        print(f"Best parameters: {best_params}")
        print(f"Best score: {optimizer.best_score_:.4f}")
        print()

        # NSGA-II typically returns multiple solutions along the Pareto front:
        #   - High accuracy, high complexity models
        #   - Medium accuracy, medium complexity models
        #   - Lower accuracy, low complexity models
        #   - User selects based on preferences/constraints

    except Exception as e:
        print(f"Multi-objective optimization example: {e}")
        print("Note: This demonstrates the interface for multi-objective problems.")
        return None, None

    # NSGA-II Evolution Process:
    #
    # Generation 1: Random initialization
    #   - Diverse population across parameter space
    #   - Wide range of accuracy/complexity trade-offs
    #
    # Generations 2-N: Evolutionary improvement
    #   - Non-dominated sorting identifies the best fronts
    #   - Crowding distance maintains solution diversity
    #   - Crossover combines good solutions
    #   - Mutation explores new parameter regions
    #
    # Final Population: Pareto front approximation
    #   - Multiple non-dominated solutions
    #   - Represents optimal trade-offs
    #   - User chooses based on domain requirements

    # Key Advantages:
    #   - Handles multiple conflicting objectives naturally
    #   - Finds a diverse set of optimal trade-offs
    #   - No need to specify objective weights a priori
    #   - Provides insight into objective relationships
    #   - Robust to objective scaling differences

    # Best Use Cases:
    #   - True multi-objective problems (accuracy vs speed, cost vs quality)
    #   - When trade-offs between objectives are important
    #   - Robustness analysis with multiple criteria
    #   - When a single-objective formulation is unclear

    # Limitations:
    #   - More complex than single-objective methods
    #   - Requires more evaluations (population-based)
    #   - May be overkill for single-objective problems
    #   - Final solution selection is still required

    # When to Use NSGA-II vs Single-objective Methods:
    # Use NSGA-II when:
    #   - Multiple objectives genuinely conflict
    #   - Trade-off analysis is valuable
    #   - Objective weights are unknown
    #
    # Use TPE/GP when:
    #   - There is a single clear objective
    #   - The computational budget is limited
    #   - Faster convergence is needed

    # The except clause above returns early, so best_params is always bound here.
    return best_params, optimizer.best_score_


if __name__ == "__main__":
    best_params, best_score = main()
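
For comparison, a minimal sketch of the same two-objective setup written directly against Optuna's NSGAIISampler (assuming the optuna, scikit-learn, and numpy packages; hyperactive is not used here). In a genuine multi-objective study, study.best_trials holds the Pareto front rather than a single best result:

    import numpy as np
    import optuna
    from sklearn.datasets import load_digits
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import cross_val_score

    X, y = load_digits(return_X_y=True)

    def objective(trial):
        params = {
            "n_estimators": trial.suggest_int("n_estimators", 10, 200),
            "max_depth": trial.suggest_int("max_depth", 1, 20),
        }
        model = RandomForestClassifier(random_state=42, **params)
        accuracy = np.mean(cross_val_score(model, X, y, cv=3))
        complexity = params["n_estimators"] * params["max_depth"] * X.shape[1]
        return -accuracy, complexity / 10000  # both objectives are minimized

    sampler = optuna.samplers.NSGAIISampler(
        population_size=20, mutation_prob=0.1, crossover_prob=0.9, seed=42
    )
    study = optuna.create_study(directions=["minimize", "minimize"], sampler=sampler)
    study.optimize(objective, n_trials=50)

    # study.best_trials is the Pareto front; pick one, e.g. the most accurate:
    best = min(study.best_trials, key=lambda t: t.values[0])
    print(best.params, best.values)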