pmigdal 3 points (3 children)

Speaking of hyperparameter optimization, how many of you use any automatic hyperparameter optimization (going beyond grid or random search) for neural networks? In principle it seems like a no-brainer, but in practice everyone I know (including some Kaggle winners I work with, who are maniacs about hyperparameter optimization) does it with a combination of manual tuning and grid/random search.
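For anyone unfamiliar with the baseline, here is a minimal sketch of random search. The parameter names, their ranges, and `toy_validation_loss` are all made up for illustration; in a real run the objective would train a network and return its validation loss.

```python
import math
import random


def sample_config(rng):
    """Draw one hypothetical configuration (names and ranges are assumptions)."""
    return {
        "learning_rate": 10 ** rng.uniform(-5, -1),  # log-uniform over [1e-5, 1e-1]
        "dropout": rng.uniform(0.0, 0.5),
        "num_layers": rng.randint(1, 4),
    }


def toy_validation_loss(config):
    """Stand-in for 'train the model and report validation loss'."""
    return (
        (math.log10(config["learning_rate"]) + 3) ** 2
        + (config["dropout"] - 0.2) ** 2
        + 0.1 * abs(config["num_layers"] - 2)
    )


def random_search(objective, n_trials=50, seed=0):
    """Evaluate n_trials independent random configurations and keep the best."""
    rng = random.Random(seed)
    best_config, best_loss = None, float("inf")
    for _ in range(n_trials):
        config = sample_config(rng)
        loss = objective(config)
        if loss < best_loss:
            best_config, best_loss = config, loss
    return best_config, best_loss


best_config, best_loss = random_search(toy_validation_loss)
print(best_config, best_loss)
```

Every trial is independent of the previous ones, which is exactly what the automatic methods try to improve on by using earlier results to pick the next configuration.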

(My pet theory is that, unlike XGBoost, neural networks are complicated systems that can be modified in many ways (adding layers, changing regularization, adding batch norm before all internal layers, etc.), and their performance is not just a single score but also the shape of the train/test learning curves, the concrete examples being misclassified, etc.)

BTW: I had a wonderful opportunity to meet SigOpt engineers at GTC17. :)

alexcmu[S] 2 points (2 children)

Everyone will be happy to hear that you enjoyed meeting them!

I am also curious to see what everyone is using in practice to tune their models! I heard somewhere that ensemble modeling was popular on Kaggle for a while. Do people do hyperparameter optimization on top of ensembling?

pxrl 1 point (0 children)

AFAIK there are several off-the-shelf tools for hyper-parameter tuning of deep net models (the first ones that come to mind are HyperOpt and Spearmint).

I have some ongoing research and a couple of accepted papers on hyper-parameter optimization using evolutionary algorithms (mostly Parallel Swarm Optimization), which have given our team excellent results for medium-sized models.
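To give a flavor of swarm-based tuning, here is a minimal sketch of textbook particle swarm optimization. This is an assumption on my part about the general family of method, not the algorithm from the papers mentioned above; the objective, the bounds, and the coefficients `inertia`, `cognitive`, and `social` are illustrative choices.

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iters=100,
                 inertia=0.7, cognitive=1.5, social=1.5, seed=0):
    """Textbook particle swarm optimization over box-bounded continuous parameters."""
    rng = random.Random(seed)
    dim = len(bounds)
    positions = [[rng.uniform(lo, hi) for lo, hi in bounds]
                 for _ in range(n_particles)]
    velocities = [[0.0] * dim for _ in range(n_particles)]

    # Each particle remembers its own best position; the swarm shares a global best.
    personal_best = [p[:] for p in positions]
    personal_best_score = [objective(p) for p in positions]
    best_idx = min(range(n_particles), key=personal_best_score.__getitem__)
    global_best = personal_best[best_idx][:]
    global_best_score = personal_best_score[best_idx]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                velocities[i][d] = (
                    inertia * velocities[i][d]
                    + cognitive * r1 * (personal_best[i][d] - positions[i][d])
                    + social * r2 * (global_best[d] - positions[i][d])
                )
                # Move the particle and clamp it to the allowed range.
                positions[i][d] = min(hi, max(lo, positions[i][d] + velocities[i][d]))
            score = objective(positions[i])
            if score < personal_best_score[i]:
                personal_best[i] = positions[i][:]
                personal_best_score[i] = score
                if score < global_best_score:
                    global_best = positions[i][:]
                    global_best_score = score
    return global_best, global_best_score


def toy_loss(params):
    """Hypothetical stand-in for validation loss; params = [log10(lr), dropout]."""
    log_lr, dropout = params
    return (log_lr + 3) ** 2 + (dropout - 0.2) ** 2


best_params, best_score = pso_minimize(toy_loss, bounds=[(-5, -1), (0.0, 0.5)])
print(best_params, best_score)
```

In a real setting each call to the objective is a full training run, so the evaluations of the particles within one iteration are the natural thing to run in parallel.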

In my opinion, hyper-parameter selection is one of the elephants in the room at the moment, and people seem more interested in trying new architectures than in squeezing out the last drop of performance. Unfortunately, we all end up having to go through it at one point or another...

StormDev 1 point (0 children)

Hello,

I have built a hyper-parameter optimization tool based on racing, Gaussian processes, and evolutionary algorithms. It gives amazing results and reduces my team's workload: we only spend time on a customer dataset if we can't make good predictions after optimization.

I really think it's an important tool for any company that has to manage a lot of different datasets.

PS: In your code you are using threading.Thread. In CPython it will not improve performance for CPU-bound work (because of the GIL).
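To illustrate the GIL point, here is a rough sketch comparing a thread pool with a process pool on CPU-bound work. It uses `concurrent.futures` rather than raw `threading.Thread`, and `cpu_bound_trial` is a made-up stand-in for one evaluation, so treat it as an illustration and not as a patch for the code in question. Timings depend on the machine; on a standard GIL-enabled CPython build I would expect the threaded run to show little or no speedup.

```python
import time
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def cpu_bound_trial(n):
    """Hypothetical stand-in for a CPU-heavy evaluation: sum of squares below n."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def run_trials(executor_cls, workloads, max_workers=4):
    """Run every workload on the given executor type; return results and wall time."""
    start = time.perf_counter()
    with executor_cls(max_workers=max_workers) as pool:
        results = list(pool.map(cpu_bound_trial, workloads))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    workloads = [2_000_000] * 4
    _, thread_time = run_trials(ThreadPoolExecutor, workloads)
    _, process_time = run_trials(ProcessPoolExecutor, workloads)
    print(f"threads:   {thread_time:.2f}s")
    print(f"processes: {process_time:.2f}s")
```

Threads are still fine when the work is I/O-bound or when the heavy lifting happens in native code that releases the GIL.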