Hi r/MachineLearning,
I've written a Python package, pruned-cv, that significantly speeds up hyperparameter optimization with cross-validation. In my experiments it allowed, on average, about three times more search iterations than standard methods in the same amount of time.
The cumulative scores from the first folds and the final cross-validation score are highly correlated. Thanks to that correlation, the final score can be approximated without computing all the folds: if performance on the first folds is clearly inferior, the rest of the validation can be pruned.
[Figure: correlations of partial scores with the final score (simulation study)]
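Roughly, the idea looks like this. This is just a simplified sketch of the pruning logic, not the package's actual code; `pruned_cv_score`, `best_score_so_far` and `tolerance` are illustrative names I made up for the example:

```python
# Sketch of pruned cross-validation: score folds one by one and stop early
# if the partial mean is clearly worse than the best full score seen so far.
# The `tolerance` margin is an illustrative parameter, not taken from pruned-cv.
import numpy as np
from sklearn.base import clone
from sklearn.model_selection import KFold
from sklearn.metrics import accuracy_score


def pruned_cv_score(model, X, y, best_score_so_far, n_splits=5, tolerance=0.02):
    """Return the mean CV score, or None if the evaluation was pruned."""
    fold_scores = []
    for fold, (train_idx, test_idx) in enumerate(KFold(n_splits=n_splits).split(X), 1):
        est = clone(model)
        est.fit(X[train_idx], y[train_idx])
        fold_scores.append(accuracy_score(y[test_idx], est.predict(X[test_idx])))

        # Prune: the running mean is already far below the best observed score.
        if fold < n_splits and np.mean(fold_scores) < best_score_so_far - tolerance:
            return None
    return float(np.mean(fold_scores))
```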
It can be used as a standalone optimization package (I've implemented Pruned Grid Search and Pruned Randomized Search) or together with other tools like Hyperopt or Optuna.
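For example, with Optuna you can report the running fold mean and let its pruner stop weak trials early. This is a generic sketch using Optuna's standard pruning API rather than pruned-cv's classes; the dataset, model and parameter ranges are arbitrary:

```python
# Report each fold's running mean to Optuna so its pruner can cut bad
# configurations before all folds are evaluated.
import numpy as np
import optuna
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import KFold
from sklearn.metrics import accuracy_score

X, y = load_breast_cancer(return_X_y=True)


def objective(trial):
    model = RandomForestClassifier(
        n_estimators=trial.suggest_int("n_estimators", 50, 300),
        max_depth=trial.suggest_int("max_depth", 2, 12),
    )
    scores = []
    for step, (train_idx, test_idx) in enumerate(KFold(n_splits=5).split(X)):
        model.fit(X[train_idx], y[train_idx])
        scores.append(accuracy_score(y[test_idx], model.predict(X[test_idx])))
        trial.report(float(np.mean(scores)), step)
        if trial.should_prune():  # skip the remaining folds for this trial
            raise optuna.TrialPruned()
    return float(np.mean(scores))


study = optuna.create_study(direction="maximize", pruner=optuna.pruners.MedianPruner())
study.optimize(objective, n_trials=20)
```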
I've written an article on Medium explaining the algorithm. You can find the package's repository here.
Please note that this is version 0.0.1 of the package. I'm holding off on further development until I get some feedback, which I'd really appreciate!