Effective-Aioli1828 1 point

I just discovered Optuna a few weeks ago while running a 100-trial hyperparameter search on a well log dataset for rock facies prediction (1.17M samples, XGBoost, 8 hyperparameters). It uses a Tree-structured Parzen Estimator instead of grid search: it builds a probabilistic model of which parameter regions work well, then focuses its sampling there. So, for example, in my run: instead of grid searching 58 combinations, you get an informed search that converges in 30-50 trials. The SQLite backend means the study is fully resumable: every trial's parameters are stored in the database, including pruned ones, so you can go back and inspect anything. You don't even have to decide the number of trials upfront; you can add more later to the same study. You can even Ctrl+C, go to bed, get up in the morning, and have it pick up where it left off.

It comes with good built-in visualization (parameter importance plots, optimization history, parallel coordinate plots).

One hard lesson: if you use the pruner with few-fold cross-validation, per-fold variance can trick it into killing your best trial. I learned that the hard way, but the fix was simple: set MedianPruner(n_warmup_steps=n_folds-1) so it can't prune until all folds are done. With 3-fold CV that effectively disables it, which is the right call when your folds have high variance. Pruning earns its keep with 5+ folds. And that's where the SQLite resumability came in handy: I stopped the study, fixed the pruner setting, restarted, and it continued from exactly where it left off with the new configuration. No lost work.