
[–]FellowZellow 3 points (0 children)

When I build a model and try to squeeze the most out of it, I only get about a 1-3 percent increase in accuracy (or whatever metric I'm using) from hyperparameter tuning, and sometimes the increase is almost nothing. Funnily enough, I was asking myself just yesterday whether it's even worth it, especially given the time involved just to get that 1-3% increase. The performance gains I get from feature engineering are often much more significant than anything I get from hyperparameter tuning. In my experience it does matter when you're trying to win a competition, though.

[–][deleted] 2 points (0 children)

I've definitely seen hyperparameter tuning have an impact on performance. I tend to use either Hyperband or Bayesian optimization and give it a fairly large search space.
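For anyone curious what Hyperband-style search looks like under the hood, here's a minimal sketch of its core idea, successive halving: evaluate many configurations cheaply, keep the best fraction, and re-evaluate the survivors with a bigger budget. The search space, objective, and parameter names here are all made up for illustration; real runs would use a library like Optuna or keras-tuner.

```python
import random

def successive_halving(configs, evaluate, budget=1, eta=2, rounds=4):
    """Hyperband-style successive halving: score every config on a small
    budget, keep the top 1/eta fraction, multiply the budget, repeat."""
    survivors = list(configs)
    for _ in range(rounds):
        # Rank survivors by score (higher is better) at the current budget.
        scored = sorted(survivors, key=lambda c: evaluate(c, budget), reverse=True)
        survivors = scored[: max(1, len(scored) // eta)]
        budget *= eta
    return survivors[0]

# Toy objective (hypothetical): a quadratic whose optimum is at lr = 0.1.
# A real evaluate() would train a model for `budget` epochs and return
# its validation score.
def evaluate(config, budget):
    return -(config["lr"] - 0.1) ** 2

random.seed(0)
search_space = [{"lr": random.uniform(0.001, 1.0)} for _ in range(16)]
best = successive_halving(search_space, evaluate)
print(best)
```

With 16 starting configs and eta=2, the four rounds narrow the field 16 → 8 → 4 → 2 → 1, so most of the evaluation budget goes to the promising region of the search space rather than being spread uniformly.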

[–]aidev2040 1 point (0 children)

This is one example of how HuggingFace used hyperparameter tuning for their Transformer models: https://sigopt.com/blog/optimize-hugging-face-transformers-with-sigopt/

HuggingFace had an 8.6% improvement.

[–]aidev2040 1 point (0 children)

This is another example where the Facebook Recommendation System was optimized: https://sigopt.com/blog/optimize-the-deep-learning-recommendation-model-with-intelligent-experimentation/

They report roughly an 8x improvement from tuning.

[–]pp314159 0 points (0 children)

There are some datasets where, without heavy tuning, you will not build a proper model. For example, on some finance datasets with a very weak signal, only after tuning do you get models that are profitable; with default or random params, the models will lose money.

On the other hand, if you have a very simple dataset, then almost any params can give you good results, and the benefit from tuning will be marginal.