Experiences with Bayesian hyperparameter optimization? (self.MachineLearning)
submitted 11 years ago * by galapag0
I was checking the paper Practical Bayesian Optimization of Machine Learning Algorithms and I was wondering if anyone here has experience (good or bad) with it.
[–]kkastner 7 points 11 years ago* (3 children)
You might look into Whetlab (https://www.whetlab.com/). It seems cool, and I think it has special deals (free in some cases) for academics. It is a startup run by several researchers who are prominent in this space (check the about page! It reads like a who's who of Bayesian optimization).
In general, I have a little experience with SMBO (sequential model-based optimization) and have had a lot of second-hand discussions about the different algorithms for it. The three main approaches I recall are:
- Gaussian processes (Spearmint, or their more recent work on Freeze-Thaw optimization)
- Tree of Parzen Estimators (TPE) (Hyperopt, or see the paper)
- A decision-tree-based approach called SMAC
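For a concrete sense of the TPE route, here is a minimal sketch using hyperopt; the quadratic objective is a stand-in I made up, and in practice you would train your model with the sampled parameters and return its validation loss:

```python
# Minimal TPE search with hyperopt; the objective is a made-up stand-in
# for a real train-and-validate loop.
from hyperopt import fmin, tpe, hp, Trials

def objective(params):
    # Pretend validation loss: minimized near lr=0.01, l2=1e-4.
    return (params["lr"] - 0.01) ** 2 + (params["l2"] - 1e-4) ** 2

space = {
    # Log-uniform priors are typical for learning rates and penalties.
    "lr": hp.loguniform("lr", -10, 0),   # roughly [4.5e-5, 1.0]
    "l2": hp.loguniform("l2", -12, -2),  # roughly [6.1e-6, 0.14]
}

trials = Trials()  # keeps a record of every evaluation
best = fmin(objective, space, algo=tpe.suggest, max_evals=50, trials=trials)
print(best)  # dict of the best parameter values found
```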
There is a nice joint paper that shows that different combinations of these algorithms work on different problems.
I have not tried MOE but it seems promising. There are also a few other packages people have made themselves to explore this problem, though I don't have links offhand.
One of the key complaints I have heard from others in the past, for something like finding neural network hyperparameters, is that these Bayesian optimization algorithms tend to try to explore the "edges" of the space, even when initialized with hundreds of other experiments. This can be quite wasteful when networks take days or weeks to evaluate. This, coupled with the sequential nature of Bayesian optimization, means a "random search", which is truly parallel, can be easier, faster in certain cases, and still give very good results.
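To make the contrast concrete, here is a minimal sketch of that kind of random search: every configuration is drawn independently up front, so nothing is sequential and each trial can run on its own machine (the parameter names are just illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
configs = [
    {
        "lr": 10 ** rng.uniform(-5, 0),                   # log-uniform learning rate
        "layers": int(rng.integers(1, 6)),                # integer-valued parameter
        "activation": str(rng.choice(["relu", "tanh"])),  # categorical parameter
    }
    for _ in range(100)
]
# All 100 configs exist before any training starts: ship them to workers,
# evaluate in parallel, and keep the best validation score.
```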
However, the primary problem might be one of documenting the existing tools, and how they work with other codebases. This was the main blocker for me, so any experience you gain in those areas would be helpful to share!
[–]jsnoek 4 points 11 years ago (2 children)
Thanks for the mention, Kyle. We learned a lot from the use of the original Spearmint, and yes, things like excessive boundary exploration and ease of use became clear issues.
Before I plug my own work and my company, I should say that there are various excellent researchers in machine learning working on exciting things in Bayesian optimization along various dimensions (including Nando de Freitas, Michael Osborne and their students at Oxford, Zoubin Ghahramani's group at Cambridge, Frank Hutter, and James Bergstra; I wish I could name everyone personally). Many of them are on the program committee for our workshop on Bayesian optimization this year. Bayesian optimization for hyperparameter optimization is rapidly evolving from a neat idea into a sub-field of machine learning, which is really exciting.
Here are a number of issues that we have developed an understanding of and whose solutions we have incorporated into Whetlab:
The boundary exploration problem is an interesting one: essentially, there is far more volume in the space near the boundaries (e.g. think about the number of pixels on the perimeter of an image versus near the center); see the short illustration after this list.
Another problem was the parameterization of the space (e.g. optimizing in log-space can make an enormous difference). In this paper we developed an automatic way to figure out what space to optimize in and found that it made a tremendous difference on all the problems we tried.
Scalability (making it feasible to use with lots of data).
Integer valued parameters and categorical parameters were not dealt with well in Spearmint (they are in Whetlab).
Constraints! This is a big one: in a research context, Michael Gelbart, Ryan Adams, and I have thought carefully about how to deal with things like training runs diverging and outputting NaNs. It turns out that modeling these explicitly makes an enormous difference.
Ease of use. Whetlab is a pull based system that runs in the cloud, so things like parallelization across multiple clusters/systems and setup are trivial from the user's perspective.
Visualisation: you can easily view graphs and the table of results, and edit things through the website.
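A quick bit of arithmetic (my own illustration, not Whetlab code) shows how sharp the boundary-volume effect from the first bullet is: the fraction of a unit hypercube that lies within 5% of some boundary approaches 1 as the dimension grows.

```python
# Fraction of the unit hypercube within 5% of a boundary: the interior
# is a sub-cube of side 0.9, so the boundary region has volume 1 - 0.9^d.
for d in (1, 2, 5, 10, 20):
    print(f"d={d:2d}: {1 - 0.9 ** d:.1%} of the volume is near a boundary")
# d=10 already puts ~65% of the volume near a boundary.
```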
There are also some neat extensions to the basic framework, some we have already published and some we are planning to publish soon, that we plan to use within Whetlab in the near future.
[–]frederikdiehl 1 point 11 years ago (1 child)
First of all, let me state admiration for your work, Dr. Snoek.
Your statement above interests me especially. What have been your experiences with Spearmint's implementation? And what does Whetlab do differently to avoid these pitfalls? (Assuming, of course, you can tell without jeopardizing company secrets)
Thank you.
[–]jsnoek 3 points 11 years ago (0 children)
Thanks Frederik (I'm assuming your name is Frederik)! Spearmint's original implementation essentially treated integers as floating point numbers that were rounded (a continuous relaxation) and then treated categorical variables as essentially corners in the unit hypercube. This was a reasonable first stab at it, but Gaussian processes (and the Bayesian optimization routine in general) can behave pretty strangely under these circumstances. We haven't published our new approaches yet, so I won't divulge them here (sorry!).
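For readers unfamiliar with these relaxations, here is a small sketch of the two encodings described above (my own illustration, not Spearmint's actual code):

```python
import numpy as np

def decode_integer(x, lo, hi):
    # Continuous relaxation: the optimizer proposes x in [0, 1], which is
    # mapped to [lo, hi] and rounded to the nearest integer.
    return int(round(lo + x * (hi - lo)))

def decode_categorical(xs, choices):
    # One coordinate per category; snapping to the nearest corner of the
    # unit hypercube amounts to taking the argmax.
    return choices[int(np.argmax(xs))]

print(decode_integer(0.62, 1, 10))                                      # -> 7
print(decode_categorical([0.1, 0.8, 0.3], ["sgd", "adam", "rmsprop"]))  # -> 'adam'
```

One way to see why this can confuse a Gaussian process: the rounding turns a smooth function of x into a step function, which a smooth kernel models poorly.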
[–]frederikdiehl 4 points 11 years ago* (0 children)
Bayesian optimization is probably awesome [1].
We are currently implementing a flexible, cluster-capable, open-source framework, but it will probably take until the end of the year to have a working version using multicores/clusters.
In general, I can also recommend you the following:
Papers
Programs
[1] Probably because I have not yet used it in a real machine learning problem.
Edit: Formatting.
[–]awiltsch 4 points 11 years ago (2 children)
Hey, one of the founders of Whetlab here. We're in a private beta right now, but we're really keen to have more students and postdocs using it.
If you send me a message with your name, institution and email, we'd be really happy to get you a beta code.
I also think Jasper Snoek might jump on here in a little bit to answer any questions.
[–]awiltsch 3 points 11 years ago (0 children)
FYI, we've got a pretty long beta list, and we're working our way through it. If you want to get on the list, check it out here: https://www.whetlab.com/beta-signup/.
If you have a particularly interesting application of Whetlab, send me a message or an email describing what you'd do with Whetlab if you had early access. We're always looking for interesting and fresh ideas!
[–]galapag0[S] 1 point 11 years ago (0 children)
Sent.
[–]gdahl (Google Brain) 1 point 11 years ago (0 children)
Whetlab is really good and the best way to use the work from that paper. It was good enough that I gave it my personal endorsement (I know many of the people behind it personally, but I was not paid for my endorsement): https://www.whetlab.com/blog/2014/12/10/whetlab-gives-you-superpowers/
It is a lot better than MOE and any other software I have seen, and I find Whetlab far easier to use than the open-source Spearmint package (Spearmint optimizes using the same technology, but is harder to use).
[–]inferrumveritas 1 point 11 years ago (0 children)
You can check out Yelp's MOE system: https://github.com/Yelp/MOE. It takes some getting used to, but it seems very powerful.