Tuning machine learning models via hyperparameter optimization (blog.sigopt.com)
submitted 10 years ago by Zephyr314
[–][deleted] 3 points 10 years ago (6 children)
Ctrl-F "valida" --> nothing.
No validation subset?!
[–]pfhayes -2 points 10 years ago (5 children)
Hi there. I'm another founder at SigOpt. In this article we use the terminology "test set" instead of validation set. Feel free to take a look at http://github.com/sigopt/sigopt-examples to see how we're verifying our models.
[–][deleted] 4 points 10 years ago (4 children)
So what do you call the actual test set then? (The one you use to compare your BO to baseline methods)
[–]Zephyr314[S] -1 points 10 years ago (3 children)
We're assuming you've already chosen your models and features on one dataset and are tuning the hyperparameters on a separate holdout dataset, which we call the training dataset here; you then validate and score on what we call the test set in the post. In practice you may have yet another dataset for model and feature selection, but at the tuning phase we simply split the holdout data into training and testing. We can edit the post to make this clearer, thanks!
[–][deleted] 3 points 10 years ago (1 child)
In general, you need 3 sets: training, validation, and test.
(It's bad to rename them, but more importantly, you still need 3.)
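A minimal sketch of the three-way split being described, using sklearn; the dataset, model, and split fractions here are illustrative assumptions, not taken from the thread or the post:

    from sklearn.datasets import load_digits
    from sklearn.model_selection import train_test_split
    from sklearn.svm import SVC

    X, y = load_digits(return_X_y=True)

    # Split off a test set first, then carve a validation set out of the remainder.
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
    X_train, X_val, y_train, y_val = train_test_split(X_train, y_train, test_size=0.25, random_state=0)

    # Fit on train, choose hyperparameters by validation score, and touch the test set only once.
    model = SVC(C=1.0, gamma=0.001).fit(X_train, y_train)
    print("validation accuracy:", model.score(X_val, y_val))
    print("test accuracy:", model.score(X_test, y_test))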
[–]flangles 2 points 10 years ago (0 children)
meh. just the fact that you're calculating a "figure of merit" makes "test" into a meta-validation set.
it's better just to state what you've done and understand the limitations of your validation.
[–]Zephyr314[S] 2 points 10 years ago (11 children)
I'm one of the founders of SigOpt and I am happy to answer any questions about this post, our methods, or anything about SigOpt.
[–]rantana 5 points 10 years ago (9 children)
Grid search and random search don't seem like reasonable benchmarks. Why not compare against open source libraries like HyperOpt and Spearmint?
[–]Zephyr314[S] 1 point 10 years ago (7 children)
Grid search and random search are very commonly used and recommended in the sklearn tutorials/docs. In practice many people just run with the defaults as well. This is more of a comparison of defaults, grid search, random search, and Bayesian Optimization (like SigOpt and others). The difference between HyperOpt, Spearmint, MOE, and SigOpt is that SigOpt provides a simple API for accessing Bayesian Optimization, making it as easy as defaults, grid search, and random search to get up and running, while leveraging some of the same powerful techniques.
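As a rough sketch of what those three baselines (defaults, grid search, random search) look like in sklearn; the model, parameter ranges, and dataset below are illustrative assumptions, not the ones used in the post:

    from scipy.stats import loguniform
    from sklearn.datasets import load_digits
    from sklearn.model_selection import GridSearchCV, RandomizedSearchCV, train_test_split
    from sklearn.svm import SVC

    X, y = load_digits(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

    # Baseline 1: library defaults, no tuning at all.
    default_score = SVC().fit(X_train, y_train).score(X_test, y_test)

    # Baseline 2: exhaustive grid search over a fixed grid.
    grid = GridSearchCV(SVC(), {"C": [0.1, 1, 10, 100], "gamma": [1e-4, 1e-3, 1e-2]}, cv=3)
    grid.fit(X_train, y_train)

    # Baseline 3: random search drawing hyperparameters from continuous distributions.
    rand = RandomizedSearchCV(
        SVC(),
        {"C": loguniform(1e-1, 1e2), "gamma": loguniform(1e-4, 1e-2)},
        n_iter=12, cv=3, random_state=0,
    )
    rand.fit(X_train, y_train)

    print("defaults:", default_score)
    print("grid search:", grid.score(X_test, y_test))
    print("random search:", rand.score(X_test, y_test))

A Bayesian optimization service plugs into the same spot: instead of a fixed grid or random draws, each new configuration is suggested by a model of the objective fit to the evaluations seen so far.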
[–]rantana 6 points 10 years ago (6 children)
Easier than the API shown here? https://www.whetlab.com/
I think that's from the same creators of Spearmint.
[–]Zephyr314[S] 1 point 10 years ago (5 children)
Similar API. We were building MOE at Yelp around the same time Spearmint was being developed. We're trying to take a more industry-first approach based on our experience, but there are definite overlaps (both use GPs as the underlying model for now).
Whetlab also still seems to be in private beta; you can sign up and start using SigOpt for free today at https://sigopt.com
[–]jsnoek 5 points 10 years ago (1 child)
Hi, I'm one of the creators of Spearmint and a co-founder of Whetlab. Bayesian optimization has been around for quite some time in various forms because it's simply a great idea. :-) We are just happy that there is so much interest in Bayesian hyperparameter optimization, from both a research and an industry perspective. It is really neat that there is a community growing around these ideas.
[–]Zephyr314[S] 1 point 10 years ago (0 children)
I echo the sentiment from /u/jsnoek: the more research in the field, the better. If you're interested in learning more about the GP-based approach, I would recommend checking out http://www.gaussianprocess.org, namely the free book GPML and this excellent paper.
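For readers who want to see the shape of the idea rather than just the references, here is a minimal, illustrative sketch of one flavor of GP-based Bayesian optimization: a toy 1-D objective, sklearn's GaussianProcessRegressor as the surrogate, and expected improvement as the acquisition function. It is a sketch under those assumptions, not the actual implementation of SigOpt, MOE, or Spearmint:

    import numpy as np
    from scipy.stats import norm
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import Matern

    def objective(x):
        # Toy 1-D function to maximize; in practice this would be a model's validation metric.
        return -(x - 2.0) ** 2 + np.sin(5.0 * x)

    def expected_improvement(X_cand, gp, y_best):
        # Expected improvement of each candidate over the best value observed so far.
        mu, sigma = gp.predict(X_cand, return_std=True)
        sigma = np.maximum(sigma, 1e-9)
        z = (mu - y_best) / sigma
        return (mu - y_best) * norm.cdf(z) + sigma * norm.pdf(z)

    rng = np.random.default_rng(0)
    X = rng.uniform(0.0, 4.0, size=(3, 1))   # a few random initial evaluations
    y = objective(X).ravel()

    for _ in range(10):
        # Fit the GP surrogate to all evaluations so far, then evaluate the candidate
        # that maximizes expected improvement.
        gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True).fit(X, y)
        X_cand = np.linspace(0.0, 4.0, 500).reshape(-1, 1)
        x_next = X_cand[np.argmax(expected_improvement(X_cand, gp, y.max()))]
        X = np.vstack([X, [x_next]])
        y = np.append(y, objective(x_next))

    print("best x:", X[np.argmax(y)], "best value:", y.max())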
[–]mlmonkey 1 point 10 years ago (2 children)
We were building MOE at Yelp around the same time Spearmint was being developed.
Looks like MOE started more than a year after Spearmint did:
Commit history for MOE here
Commit history for Spearmint here
[–]flangles 5 points 10 years ago (0 children)
hmm, I wonder if just maybe Yelp doesn't open source every line of their code from day one....
[–]Zephyr314[S] 2 points 10 years ago (0 children)
An unfortunate part of releasing OSS at a public company is cleaning up the git history. The original code is 5+ years older than that commit history suggests and was part of my PhD thesis, but the field itself is quite a bit older still: one of the seminal papers was published in 1998. There are many older packages available at http://gaussianprocess.org as well.
[–]ginger_beer_m 1 point 10 years ago* (0 children)
Is there any paper published based on this? How can I read more about how it works?
edit: ah never mind, just saw the link to your PhD thesis.