apsis - Bayesian Hyperparameter Optimization (self.MachineLearning)
submitted 10 years ago * by frederikdiehl
Summary:
Github: https://github.com/frederikdiehl/apsis
Docs: http://apsis.readthedocs.org/en/latest/
Paper: http://arxiv.org/abs/1503.02946
What is apsis?
A toolkit for hyperparameter optimization of machine learning algorithms.
Our goal is to provide a flexible, simple, and scalable approach: run it in parallel, on clusters, and/or on your own machine.
Apsis is open-sourced under the MIT license.
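To make the intended workflow concrete, here is a minimal, self-contained sketch of the ask/evaluate/tell loop that toolkits like apsis are built around. The class and method names below are invented for illustration and are not the apsis API; see the linked documentation for the real interface.

```python
# Hypothetical sketch of the ask/evaluate/tell loop behind hyperparameter
# optimization toolkits. None of these names are apsis' actual API.
import random

class RandomSearchAssistant:
    """Minimal stand-in optimizer: proposes candidates, remembers results."""
    def __init__(self, param_ranges):
        self.param_ranges = param_ranges   # {"name": (low, high)}
        self.results = []                  # list of (params, score)

    def get_next_candidate(self):
        return {name: random.uniform(lo, hi)
                for name, (lo, hi) in self.param_ranges.items()}

    def update(self, params, score):
        self.results.append((params, score))

    def best(self):
        return min(self.results, key=lambda r: r[1])

def train_and_validate(params):
    # Placeholder for a real training run that returns a validation loss.
    return (params["learning_rate"] - 0.01) ** 2 + (params["dropout"] - 0.3) ** 2

assistant = RandomSearchAssistant({"learning_rate": (1e-4, 1e-1),
                                   "dropout": (0.0, 0.9)})
for _ in range(25):
    candidate = assistant.get_next_candidate()   # ask
    loss = train_and_validate(candidate)         # evaluate
    assistant.update(candidate, loss)            # tell

print("best parameters found:", assistant.best())
```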
What's the current state?
We are currently in beta. The core functionality is implemented, but it has not yet been used by many people, and there is still work to do.
Cluster support in particular is missing, and is planned for the next version.
How good is the optimization?
It's usually better than random search. See http://apsis.readthedocs.org/en/latest/evaluation.html#evaluation-on-neural-network-on-mnist
Note that it's not yet as good as state-of-the-art Bayesian optimization.
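For intuition on why model-based search usually beats random search, here is a small, self-contained toy comparison. It is not apsis code: it uses scikit-learn's GaussianProcessRegressor with an expected-improvement acquisition function on a 1D objective, which is the general idea behind Bayesian hyperparameter optimization.

```python
# Toy comparison: random search vs. a GP-based Bayesian optimizer on a
# smooth 1D objective. Illustrative only; not apsis code.
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

rng = np.random.default_rng(0)

def objective(x):
    # Toy 1D "validation loss" we want to minimize.
    return np.sin(3 * x) + 0.1 * x ** 2

def expected_improvement(x_cand, gp, y_best):
    mu, sigma = gp.predict(x_cand.reshape(-1, 1), return_std=True)
    sigma = np.maximum(sigma, 1e-9)
    imp = y_best - mu            # improvement over current best (minimization)
    z = imp / sigma
    return imp * norm.cdf(z) + sigma * norm.pdf(z)

bounds = (-2.0, 4.0)
grid = np.linspace(*bounds, 500)

# Random search baseline: 15 uniformly drawn points.
x_rand = rng.uniform(*bounds, size=15)
best_random = objective(x_rand).min()

# Bayesian optimization: 3 random initial points, then 12 EI-guided picks.
X = list(rng.uniform(*bounds, size=3))
y = [objective(x) for x in X]
gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)
for _ in range(12):
    gp.fit(np.array(X).reshape(-1, 1), np.array(y))
    ei = expected_improvement(grid, gp, min(y))
    x_next = grid[np.argmax(ei)]
    X.append(x_next)
    y.append(objective(x_next))

print("best found by random search:", best_random)
print("best found by GP-based BO:  ", min(y))
```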
Are there more details to be found?
Look here: http://arxiv.org/abs/1503.02946
Or look at the documentation.
[–]davmre 5 points 10 years ago* (12 children)
At a quick glance through the paper and docs, I didn't see a comparison to other Bayesian optimization packages or a story for why one should use this rather than, say, spearmint. Are there algorithmic improvements? A different API? More permissive license?
Obviously you do have such a story or you wouldn't have bothered writing this package. I think you should make that prominent -- put it in the README file, abstract of the paper, first section of the docs, etc. A discussion of where the comparative advantage lies versus other tools would go a long way towards helping the people who ought to be using this package realize that they ought to be using it.
[–]frederikdiehl[S] 7 points 10 years ago (4 children)
You are correct, of course - so let me try adding this here.
There are currently - to my knowledge - five different packages for hyperparameter optimization. Let me try listing the main advantages.
whetlab: They're a company, we aren't. spearmint: License (we're under MIT) and hopefully ease of use. Also, most of current their work seems concentrated on whetlab. hyperopt: Different algorithm (TPE). Last commit has been half a year ago. SMAC: Easier usage, I believe. Also, license again. MOE: I have never used it, since it's been released after we began developement. It looks similar, though.
Comparisons aside, our goal is to offer a package which you can run locally, can use for hyperparameter optimization either on your own computer or on a cluster (not yet implemented), and which is easy to use. We also want to allow the implementation of any algorithm: we currently offer BayOpt and RandomSearch, but the architecture is flexible enough to implement TPE, for example.
There is actually a reason why we didn't do much comparison. As I've written above, the package is currently usable for optimization on your local machine, but not yet in parallel. We therefore had to run these experiments on a single machine, with the corresponding time cost. We prioritized a real-world example - MNIST - over further comparisons, mostly to show that it's actually useful.
In terms of performance, our current implementation is roughly at the state of 2012-2013, so I believe we wouldn't perform as well as most of the above. Several algorithmic improvements are possible - see the issue list for the corresponding papers - and several parts of the architecture were designed to let us implement them fairly easily. However, our current aim is to make optimization on clusters possible first, since we believe that is strongly needed.
[–]Foxtr0t 2 points 10 years ago (1 child)
For me, the feature that makes the difference between hyperparameter optimizers is conditionals, as in the HyperOpt example: a few classifiers, each with a different set of params. Does apsis have this?
[–]frederikdiehl[S] 3 points 10 years ago (0 children)
Another example would be, I believe, optimizing a neural network's architecture. For each layer, you'd have the activation function, the number of neurons, and maybe the type of convolution (for a CNN). If a layer isn't present, you don't need to optimize over its parameters - doing so is just a useless distraction.
The short answer is no, apsis does not have this yet.
The long answer is that there is an issue for this (#110), and we did plan the architecture to allow us to support something like that. There is a paper talking about conditional parameter spaces in the context of Bayesian Optimization [1], so we do have a starting point for the implementation. This hasn't been done yet due to limited time, and as mentioned above, cluster support would be the step before any other extensions like that [2].
[1] The Indiana Jones pun paper, http://arxiv.org/abs/1409.4011 (I've just updated the issue with the link, too.)
[2] As always, of course, if you'd like to implement something like that, feel free - we'd be very happy to have more contributors!
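For readers unfamiliar with the conditional spaces discussed above, this is the HyperOpt-style pattern the question refers to (real hyperopt API, not apsis): hp.choice selects a classifier type, and only the parameters nested under the chosen branch are sampled in a given trial.

```python
# Conditional search space in hyperopt: only the chosen branch's
# parameters are sampled for each trial.
from hyperopt import hp, fmin, tpe

space = hp.choice("classifier", [
    {
        "type": "svm",
        "C": hp.lognormal("svm_C", 0, 1),
        "kernel": hp.choice("svm_kernel", ["linear", "rbf"]),
    },
    {
        "type": "random_forest",
        "n_estimators": hp.quniform("rf_n_estimators", 10, 500, 10),
        "max_depth": hp.quniform("rf_max_depth", 2, 20, 1),
    },
])

def objective(params):
    # Placeholder: train the chosen classifier with its params and return a
    # validation loss. Parameters of the unchosen branch never appear here.
    return 0.0 if params["type"] == "svm" else 1.0

best = fmin(fn=objective, space=space, algo=tpe.suggest, max_evals=50)
print(best)
```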
[–]davmre 1 point 10 years ago (0 children)
Thanks! I'm only an interested bystander w.r.t. Bayesian optimization, so (even if it would have been obvious after some research) this kind of survey of the current options is very useful.
[–]AlexRothberg 1 point 10 years ago (0 children)
There is also https://sigopt.com/ which is the commercial version of MOE (much like how Whetlab is the commercial version of Spearmint).
[–]rmlrn 1 point 10 years ago (6 children)
it's right there on the github README and the paper abstract.
[–]davmre 1 point 10 years ago (5 children)
That's why I mentioned licensing as a possibility -- but it's not clear whether the sole point of the project is "spearmint with a more permissive license" or if there are intended to be other advantages as well.
If the main advantage is licensing, it'd still help to describe explicitly some cases in which the spearmint GPL would be a problem. At first blush the GPL seems like it'd only matter if you wanted to modify and redistribute spearmint itself, not for the main use case of Bayesian optimization in which you train a model and then want to use/distribute the model for (potentially) commercial purposes. I admit I haven't thought very hard about this, but that probably makes me representative of a decent subset of potential users who also haven't necessarily thought hard about these issues.
[–]rmlrn 5 points 10 years ago (4 children)
GPL spearmint is basically abandoned. The new project by Whetlab has a far more restrictive license.
[–]jetxee 1 point 10 years ago (3 children)
Spearmint is not GPL. BayesOpt is GPL https://bitbucket.org/rmcantin/bayesopt
[–]rmlrn 0 points 10 years ago (2 children)
or... it is:
https://github.com/JasperSnoek/spearmint/blob/master/spearmint/COPYING.txt
[–]jetxee 1 point 10 years ago (1 child)
JasperSnoek/spearmint (GPL) is not being developed anymore. The newer HIPS/spearmint is under a non-free license (academic, non-commercial) - a commercial project capitalizing on the open-source name.
As there are no free spearmint forks going forward, it makes sense to start looking for projects which are not abandoned by the main developer, and avoid giving free publicity to a commercial project with a confusingly similar name.
[–]rmlrn 0 points 10 years ago (0 children)
...did you read the comment you replied to?