| Metric | Value |
| --- | --- |
| Conditions | 1 |
| Total Lines | 128 |
| Code Lines | 25 |
| Lines | 0 |
| Ratio | 0 % |
| Changes | 0 |
Small methods make your code easier to understand, especially when combined with a good name. And the smaller the method, the easier it usually is to find a good name for it.

For example, if you find yourself adding comments to a method's body, that is usually a sign that the commented part should be extracted into a new method, with the comment serving as a starting point for the new method's name.
Commonly applied refactorings include:

- Extract Method

If many parameters/temporary variables are present and get in the way of extracting:

- Replace Temp with Query
- Introduce Parameter Object
- Preserve Whole Object

A before/after sketch of Extract Method follows this list.
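As a minimal sketch of Extract Method (all names here are invented for illustration): the comment inside the method body becomes the name of the new method.

```python
# Before: a comment labels a block of logic inside the method.
def order_total_before(items):
    total = sum(price * qty for price, qty in items)
    # apply bulk discount for large orders
    if total > 1000:
        total *= 0.95
    return total


# After Extract Method: the commented block becomes a small, well-named
# method, and the inline comment is no longer needed.
def apply_bulk_discount(total, threshold=1000, rate=0.05):
    return total * (1 - rate) if total > threshold else total


def order_total_after(items):
    total = sum(price * qty for price, qty in items)
    return apply_bulk_discount(total)


print(order_total_before([(200.0, 3), (150.0, 4)]))  # 1140.0
print(order_total_after([(200.0, 3), (150.0, 4)]))   # 1140.0
```

As a side effect, the extracted method gives the magic numbers (threshold, rate) a home as named, defaulted parameters.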
1 | """ |
||
83 | def main(): |
||
84 | # === QMCSampler Example === |
||
85 | # Quasi-Monte Carlo Low-Discrepancy Sampling |
||
86 | |||
87 | qmc_theory() |
||
88 | demonstrate_space_filling() |
||
89 | |||
90 | # Load dataset |
||
91 | X, y = load_wine(return_X_y=True) |
||
92 | print(f"Dataset: Wine classification ({X.shape[0]} samples, {X.shape[1]} features)") |
||
93 | |||
94 | # Create experiment |
||
95 | estimator = LogisticRegression(random_state=42, max_iter=1000) |
||
96 | experiment = SklearnCvExperiment(estimator=estimator, X=X, y=y, cv=5) |
||
97 | |||
98 | # Define search space |
||
99 | param_space = { |
||
100 | "C": (0.001, 100), # Regularization strength |
||
101 | "l1_ratio": (0.0, 1.0), # Elastic net mixing parameter |
||
102 | "solver": ["liblinear", "saga"], # Solver algorithm |
||
103 | "penalty": ["l1", "l2", "elasticnet"], # Regularization type |
||
104 | } |
||
105 | |||
106 | # Search Space: |
||
107 | # C: (0.001, 100) - Regularization strength |
||
108 | # l1_ratio: (0.0, 1.0) - Elastic net mixing parameter |
||
109 | # solver: ['liblinear', 'saga'] - Solver algorithm |
||
110 | # penalty: ['l1', 'l2', 'elasticnet'] - Regularization type |
||
111 | |||
112 | # Configure QMCSampler |
||
113 | optimizer = QMCSampler( |
||
114 | param_space=param_space, |
||
115 | n_trials=32, # Power of 2 often works well for QMC |
||
116 | random_state=42, |
||
117 | experiment=experiment, |
||
118 | qmc_type="sobol", # Sobol or Halton sequences |
||
119 | scramble=True, # Randomized QMC (Owen scrambling) |
||
120 | ) |
||
121 | |||
122 | # QMCSampler Configuration: |
||
123 | # n_trials: 32 (power of 2 for better QMC properties) |
||
124 | # qmc_type: 'sobol' sequence |
||
125 | # scramble: True (randomized QMC) |
||
126 | # Deterministic low-discrepancy sampling |
||
127 | |||
128 | # QMC Sequence Types: |
||
129 | # Sobol: Excellent for moderate dimensions, binary-based |
||
130 | # Halton: Good for low dimensions, prime-based |
||
131 | # Scrambling: Adds randomization while preserving uniformity |
||
132 | |||
133 | # Run optimization |
||
134 | # Running QMC sampling optimization... |
||
135 | best_params = optimizer.run() |
||
136 | |||
137 | # Results |
||
138 | print("\n=== Results ===") |
||
139 | print(f"Best parameters: {best_params}") |
||
140 | print(f"Best score: {optimizer.best_score_:.4f}") |
||
141 | print() |
||
142 | |||
143 | # QMC behavior analysis: |
||
144 | # |
||
145 | # QMC Sampling Analysis: |
||
146 | # |
||
147 | # Sequence Properties: |
||
148 | # Deterministic generation (reproducible with same seed) |
||
149 | # Low-discrepancy (uniform distribution approximation) |
||
150 | # Space-filling (systematic coverage of parameter space) |
||
151 | # Stratification (even coverage of all regions) |
||
152 | # No clustering or large gaps |
||
153 | # |
||
154 | # Sobol Sequence Characteristics: |
||
155 | # Binary-based construction |
||
156 | # Good equidistribution properties |
||
157 | # Effective for dimensions up to ~40 |
||
158 | # Popular choice for QMC sampling |
||
159 | # |
||
160 | # Scrambling Benefits (when enabled): |
||
161 | # Breaks regularity patterns |
||
162 | # Provides Monte Carlo error estimates |
||
163 | # Maintains low-discrepancy properties |
||
164 | # Reduces correlation artifacts |
||
165 | # |
||
166 | # QMC vs Other Sampling Methods: |
||
167 | # |
||
168 | # vs Pure Random Sampling: |
||
169 | # + Better space coverage with fewer samples |
||
170 | # + More consistent performance |
||
171 | # + Faster convergence for integration-like problems |
||
172 | # - No true randomness (if needed for some applications) |
||
173 | # |
||
174 | # vs Grid Search: |
||
175 | # + Works well in higher dimensions |
||
176 | # + No exponential scaling |
||
177 | # + Covers continuous spaces naturally |
||
178 | # - No systematic guarantee of finding grid optimum |
||
179 | # |
||
180 | # vs Adaptive Methods (TPE, GP): |
||
181 | # + No assumptions about objective function |
||
182 | # + Embarrassingly parallel |
||
183 | # + Consistent performance regardless of function type |
||
184 | # - No learning from previous evaluations |
||
185 | # - May waste evaluations in clearly suboptimal regions |
||
186 | # |
||
187 | # Best Use Cases: |
||
188 | # Design of experiments (DoE) |
||
189 | # Initial exploration phase |
||
190 | # Baseline for comparing adaptive methods |
||
191 | # Integration and sampling problems |
||
192 | # When function evaluations are parallelizable |
||
193 | # Robustness testing across parameter space |
||
194 | # |
||
195 | # Implementation Considerations: |
||
196 | # Use powers of 2 for n_trials with Sobol sequences |
||
197 | # Consider scrambling for better statistical properties |
||
198 | # Choose sequence type based on dimensionality: |
||
199 | # - Sobol: Good general choice |
||
200 | # - Halton: Better for low dimensions (< 6) |
||
201 | # QMC works best with transformed uniform parameters |
||
202 | # |
||
203 | # Practical Recommendations: |
||
204 | # 1. Use QMC for initial exploration (first 20-50 evaluations) |
||
205 | # 2. Switch to adaptive methods (TPE/GP) for focused search |
||
206 | # 3. Use for sensitivity analysis across full parameter space |
||
207 | # 4. Good choice when unsure about objective function properties |
||
208 | # 5. Ideal for parallel evaluation scenarios |
||
209 | |||
210 | return best_params, optimizer.best_score_ |
||
211 | |||
215 |
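To make the "low discrepancy" claim concrete, here is a minimal, self-contained sketch using scipy.stats.qmc (an assumption: the QMCSampler above may be backed by a different implementation). It draws the same kind of 32 Owen-scrambled Sobol points as the configuration above, compares their discrepancy against plain pseudo-random points, and applies the uniform transform the implementation considerations recommend: log-uniform for C, since it spans five orders of magnitude.

```python
import numpy as np
from scipy.stats import qmc

# 2^5 = 32 Owen-scrambled Sobol points in the unit square, reproducible
# via the seed. Sobol points come in balanced blocks of 2^m, which is
# why powers of 2 are recommended for n_trials.
sobol = qmc.Sobol(d=2, scramble=True, seed=42)
qmc_points = sobol.random_base2(m=5)

rng_points = np.random.default_rng(42).random((32, 2))

# Centered discrepancy measures deviation from the uniform distribution;
# lower means more even coverage. Sobol typically scores well below random.
print(f"Sobol discrepancy:  {qmc.discrepancy(qmc_points):.5f}")
print(f"Random discrepancy: {qmc.discrepancy(rng_points):.5f}")

# Map the unit square onto the two continuous parameters: log-uniform
# for C in (0.001, 100), linear for l1_ratio in (0.0, 1.0).
bounds_lo, bounds_hi = [np.log10(0.001), 0.0], [np.log10(100), 1.0]
scaled = qmc.scale(qmc_points, bounds_lo, bounds_hi)
candidates = np.column_stack([10 ** scaled[:, 0], scaled[:, 1]])
print(candidates[:3])  # first three (C, l1_ratio) trial points
```

The categorical dimensions (solver, penalty) are not shown here; a sampler typically handles them by partitioning an additional uniform coordinate into equal bins, one per category.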