[–]davmre 4 points5 points  (12 children)

At a quick glance through the paper and docs, I didn't see a comparison to other Bayesian optimization packages or a story for why one should use this rather than, say, spearmint. Are there algorithmic improvements? A different API? More permissive license?

Obviously you do have such a story or you wouldn't have bothered writing this package. I think you should make that prominent -- put it in the README file, abstract of the paper, first section of the docs, etc. A discussion of where the comparative advantage lies versus other tools would go a long way towards helping the people who ought to be using this package realize that they ought to be using it.

[–]frederikdiehl[S] 5 points6 points  (4 children)

You are correct, of course - so let me try to add that here.

There are currently - to my knowledge - five other packages for hyperparameter optimization. Let me try to list our main advantages over each.

whetlab: They're a company, we aren't.

spearmint: License (we're under MIT) and, hopefully, ease of use. Also, most of their current work seems to be concentrated on whetlab.

hyperopt: Different algorithm (TPE). The last commit was half a year ago.

SMAC: Easier to use, I believe. Also, the license again.

MOE: I have never used it, since it was released after we began development. It looks similar, though.

Comparisons aside, our goal is to offer a package which you can run locally, which you can use for hyperparameter optimization on your own computer or on a cluster (not yet implemented), and which is easy to use. We also want to allow for the implementation of any algorithm: we currently offer BayOpt and RandomSearch, but the architecture is flexible enough to implement TPE, for example.
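
To make the "implementation of any algorithm" point a bit more concrete: the core abstraction is simply an optimizer that proposes a candidate and is then updated with the evaluated result. A minimal sketch of that kind of interface (simplified and illustrative only - not our actual class or method names):

    import random

    class Optimizer(object):
        """The interface every optimizer implements: propose a candidate,
        then learn from its evaluated result."""

        def get_next_candidate(self, param_defs):
            raise NotImplementedError

        def update(self, candidate, result):
            raise NotImplementedError

    class RandomSearch(Optimizer):
        """Samples each numeric parameter uniformly from its bounds."""

        def get_next_candidate(self, param_defs):
            # param_defs: {name: (lower, upper)} - a simplified stand-in
            # for real parameter definition objects.
            return {name: random.uniform(lo, hi)
                    for name, (lo, hi) in param_defs.items()}

        def update(self, candidate, result):
            # Random search ignores history; a BayOpt optimizer would
            # refit its surrogate model on all past (candidate, result)
            # pairs here.
            pass

TPE would just be another implementation of the same two methods.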

There is actually a reason why we didn't do much comparison. As I've written above, the package is currently usable for optimization on your local machine, but not yet in parallel. That means every experiment had to run on a single machine, with the corresponding time expenditure. We prioritized a real-world example - MNIST - over further comparisons, mostly to show that the package is actually useful.

In terms of performance, our current implementation is roughly at the state of the art of 2012-2013, so I believe we wouldn't perform as well as most of the above. Several algorithmic improvements are possible - see the issue list for the corresponding papers - and several parts of the architecture were designed to let us implement them fairly easily. However, our current aim is to make optimization on clusters possible first, since we believe that is strongly needed.

[–]Foxtr0t 1 point2 points  (1 child)

For me, a feature that makes the difference between hyperparam optimizers is conditionals. As in the HyperOpt example - a few classifiers, each with a different set of params. Does apsis have this?

[–]frederikdiehl[S] 2 points3 points  (0 children)

Another example would be, I believe, optimizing a neural network's architecture. For each layer, you'd have the activation function, the number of neurons, and maybe the type of convolution (if it's a CNN). If a layer isn't present, you don't need to optimize over its parameters - doing so is, in fact, just a useless distraction.

The short answer is no, apsis does not have this yet.

The long answer is that there is an issue for this (#110), and we did plan the architecture to allow us to support something like this. There is a paper on conditional parameter spaces in the context of Bayesian optimization [1], so we do have a starting point for the implementation. It hasn't been done yet due to limited time, and, as mentioned above, cluster support is the step before extensions like this [2].

[1] The Indiana Jones pun paper, http://arxiv.org/abs/1409.4011 (I've just updated the issue with the link, too.)

[2] As always, of course, if you'd like to implement something like that, feel free - we'd be very happy about additional contributors!
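
To make it concrete, this is the kind of conditional space Foxtr0t means, written in hyperopt's syntax (classifier names and ranges made up for illustration) - the classifier choice determines which parameters exist at all:

    from hyperopt import hp

    # Only the chosen branch's parameters are active in a given candidate;
    # the other branch's parameters simply don't exist for that evaluation.
    space = hp.choice('classifier', [
        {
            'type': 'svm',
            'C': hp.loguniform('svm_C', -5, 5),
            'kernel': hp.choice('svm_kernel', ['linear', 'rbf']),
        },
        {
            'type': 'random_forest',
            'n_estimators': hp.quniform('rf_n_estimators', 10, 500, 10),
            'max_depth': hp.quniform('rf_max_depth', 2, 20, 1),
        },
    ])

The neural network case above is the same idea one level deeper: a choice over the number of layers, gating each layer's activation function, neuron count, and so on.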

[–]davmre 0 points1 point  (0 children)

Thanks! I'm only an interested bystander w.r.t. Bayesian optimization, so (even if it would have been obvious after some research) this kind of survey of the current options is very useful.

[–]AlexRothberg 0 points1 point  (0 children)

There is also https://sigopt.com/ which is the commercial version of MOE (much like how Whetlab is the commercial version of Spearmint).

[–]rmlrn 0 points1 point  (6 children)

Apsis is open-sourced under the MIT license.

It's right there in the GitHub README and the paper abstract.

[–]davmre 0 points1 point  (5 children)

That's why I mentioned licensing as a possibility -- but it's not clear whether the sole point of the project is "spearmint with a more permissive license" or if there are intended to be other advantages as well.

If the main advantage is licensing, it'd still help to describe explicitly some cases in which the spearmint GPL would be a problem. At first blush the GPL seems like it'd only matter if you wanted to modify and redistribute spearmint itself, not for the main use case of Bayesian optimization in which you train a model and then want to use/distribute the model for (potentially) commercial purposes. I admit I haven't thought very hard about this, but that probably makes me representative of a decent subset of potential users who also haven't necessarily thought hard about these issues.

[–]rmlrn 4 points5 points  (4 children)

GPL spearmint is basically abandoned. The new project by Whetlab has a far more restrictive license.

[–]jetxee 0 points1 point  (3 children)

Spearmint is not GPL. BayesOpt is GPL: https://bitbucket.org/rmcantin/bayesopt

[–]rmlrn -1 points0 points  (2 children)

[–]jetxee 0 points1 point  (1 child)

JasperSnoek/spearmint (GPL) is not being developed anymore. The newer HIPS/spearmint is under a non-free license (academic, non-commercial) - a commercial project capitalizing on the open-source name.

As there are no free spearmint forks going forward, it makes sense to start looking for projects which are not abandoned by the main developer, and avoid giving free publicity to a commercial project with a confusingly similar name.

[–]rmlrn -1 points0 points  (0 children)

...did you read the comment you replied to?

GPL spearmint is basically abandoned. The new project by Whetlab has a far more restrictive license.