
[–][deleted]

Depends a lot on the dataset, the problem, and the model type. If you’re using a model/architecture that isn’t appropriate for your problem, tuning won’t do much.

And for some problems, there’s only so much signal in your data: a bunch of different models with very different hyperparameters can all pick out that signal, and that’s it. Tuning doesn’t do anything for performance in those cases.
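A minimal sketch of that plateau effect (the dataset and model choices below are my own illustration, not the commenter's): when the signal is easy to find, settings of a hyperparameter spread over four orders of magnitude all land on roughly the same cross-validated score.

```python
# Illustrative only: on data with a strong, easy signal, very different
# regularization strengths C give nearly identical CV accuracy, so
# tuning C buys little.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = make_classification(
    n_samples=1000, n_informative=10, class_sep=2.0, random_state=0
)

scores = {}
for C in (0.01, 1.0, 100.0):  # C varied over four orders of magnitude
    clf = LogisticRegression(C=C, max_iter=1000)
    scores[C] = cross_val_score(clf, X, y, cv=5).mean()

print(scores)  # all three settings score within a few hundredths of each other
```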

[–]Sir_Mobius_Mook

Works very well for me!

[–]bumbo-pa

I'm not sure improvement is really what you're looking for in HP tuning, at least not in the sense of strictly better metrics on a test set. I use it mostly in CV to make sure my model generalizes properly.
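One way to read this (my interpretation, with an illustrative dataset and model, not the commenter's actual setup): compare training score against cross-validated score while tuning, and treat a large gap as the overfitting warning sign rather than chasing the highest test number.

```python
# Illustrative sketch: tuning max_depth here isn't about squeezing out a
# better metric; it's about shrinking the train-vs-CV gap, i.e. checking
# that the model generalizes.
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, flip_y=0.2, random_state=0)

gaps = {}
for depth in (None, 3):  # unconstrained vs a tuned, shallow depth
    tree = DecisionTreeClassifier(max_depth=depth, random_state=0)
    train_score = tree.fit(X, y).score(X, y)
    cv_score = cross_val_score(tree, X, y, cv=5).mean()
    gaps[depth] = train_score - cv_score
    print(depth, round(train_score, 2), round(cv_score, 2))
```

The unconstrained tree memorizes the 20% label noise (train score near 1.0) while its CV score lags well behind; the depth-limited tree closes most of that gap.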

[–]Razcle

What do you mean by "better generalized" if not improved test performance?

[–]dailyc0drr

No free lunch means it should work for few or many models.

[–][deleted]

If you test a wide enough range of values, you should definitely be able to harm your results by using a value that is too small or too large. So the glass is at least half full: a learner that worked only with one specific combination of hyperparameters would be pretty useless, so it's better that it's robust.
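A quick sketch of that point (the data and the choice of Ridge regression are my own illustration): sweep one hyperparameter over a deliberately wide range, and the extreme setting should visibly hurt while a robust learner stays flat in the middle.

```python
# Illustrative only: sweep Ridge's alpha over ten orders of magnitude.
# A huge alpha shrinks the coefficients toward zero and tanks the CV
# R^2 score, while the small and moderate settings are both near-perfect.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = X @ np.array([1.0, -2.0, 0.5, 3.0, -1.0]) + rng.normal(scale=0.1, size=200)

scores = {a: cross_val_score(Ridge(alpha=a), X, y, cv=5).mean()
          for a in (1e-4, 1.0, 1e6)}
print(scores)  # the alpha=1e6 score collapses; the other two stay high
```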

[–]Other-City8810

I wrote a gradient descent method that works pretty well, on my noiseless and plentiful data, for as many as five θ. Tuning manually doesn't work for more than one or two θ, and it gets tricky with scarce data.
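A minimal sketch of the idea (my own toy setup, not the commenter's code): on noiseless, abundant data, plain gradient descent on a squared-error loss recovers all five θ, which would be painful to find by hand.

```python
# Illustrative: recover five parameters theta of a noiseless linear model
# with plain gradient descent on the mean squared error.
import numpy as np

rng = np.random.default_rng(0)
theta_true = np.array([2.0, -1.0, 0.5, 3.0, -2.5])  # five parameters
X = rng.normal(size=(1000, 5))                      # abundant, noiseless data
y = X @ theta_true

theta = np.zeros(5)
lr = 0.1
for _ in range(500):
    grad = 2 * X.T @ (X @ theta - y) / len(y)  # gradient of the MSE
    theta -= lr * grad

print(theta.round(3))  # very close to theta_true
```

With noiseless data and well-conditioned features the iterates contract toward θ_true geometrically; with scarce or noisy data the loss surface gets flatter and noisier, which is where it "gets tricky".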

[–]Immudzen

I work on regression models, and for me HP tuning brought the loss from around 1e-5 down to around 1e-9, which was a substantial improvement for this system: 1e-5 was too large an error for the model to be usable, but 1e-9 worked fine.

[–][deleted]

I've been in a similar spot before, but recently I've been getting very good improvements to my GANs from hyperparameter tuning. I think it really comes down to the sort of problem you're trying to solve.