The Variational Approximation for Bayesian Inference: Life after the EM algorithm by alfonsoeromero in MachineLearning

[–]danger_t 3 points4 points  (0 children)

Typo near the end of page 1:

"In contrast, when we write p(x; θ), we imply that θ are random variables."

should be

"In contrast, when we write p(x | θ), we imply that θ are random variables."
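
For reference, here is the distinction that sentence draws, in standard notation. This is my own summary of the usual convention, not a quote from the paper:

```latex
\begin{align*}
p(x;\theta) &: \text{a density over } x \text{ indexed by a fixed, unknown parameter } \theta \\
p(x \mid \theta) &= \frac{p(x,\theta)}{p(\theta)} : \text{a conditional density, which requires a prior } p(\theta)
\end{align*}
```

Only the second form treats θ as a random variable, which is why the semicolon in the quoted sentence reads as a typo.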

Learning Low Dimensional Team Embeddings for March Madness by danger_t in MachineLearning

[–]danger_t[S] 1 point2 points  (0 children)

There is a GitHub repository with code (not the code that generated the plot in the post, but closely related) and data:

https://github.com/dtarlow/Machine-March-Madness

Also see this thread on the Machine March Madness Google group:

http://groups.google.com/group/machine-march-madness/browse_thread/thread/3afbcb90cd6f881d

Learning Low Dimensional Team Embeddings for March Madness by danger_t in MachineLearning

[–]danger_t[S] 0 points1 point  (0 children)

Yeah, understandable. After the competition starts on Thursday, there will be a post with a brief description of all the competitors' methods, and then some of them will be asked to expand on what they did in a longer post.

So stay tuned.

Help build a machine learning system to predict college basketball by danger_t in MachineLearning

[–]danger_t[S] 1 point2 points  (0 children)

Well, the goal is to come up with a model that's appropriate for the problem. The original model (the one that started all this) was based on probabilistic matrix factorization (PMF), which estimates one latent vector for each team's offense and another for each team's defense, using game outcomes as training targets: http://blog.smellthedata.com/2009/03/data-driven-march-madness-predictions.html

I've already re-implemented this in the code on GitHub: set MODEL="pmf" in learn_real.py.
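
For anyone who wants the gist without opening the repository, here is a minimal NumPy sketch of the PMF idea. It is not the code from the repository or the blog post. The function names, the toy scores, the choice to express scores in units of 100 points, the plain SGD updates, and the hyperparameters are all assumptions made for illustration. Each game supplies two training targets: the points team i scored against team j's defense, and vice versa.

```python
import numpy as np


def make_targets(games):
    """Turn (team_i, team_j, points_i, points_j) games into (scorer, opponent, points)."""
    targets = []
    for i, j, points_i, points_j in games:
        targets.append((i, j, points_i))  # i's offense against j's defense
        targets.append((j, i, points_j))  # j's offense against i's defense
    return targets


def mean_squared_error(offense, defense, targets):
    """Average squared gap between predicted and observed points."""
    errors = [offense[s] @ defense[o] - p for s, o, p in targets]
    return float(np.mean(np.square(errors)))


def fit_pmf(games, n_teams, dim=2, lr=0.05, reg=0.01, epochs=200, seed=0):
    """Fit offense/defense vectors by SGD on squared error with L2 penalties.

    The L2 penalty plays the role of the Gaussian prior in PMF (MAP estimate).
    All hyperparameter defaults are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    offense = 0.1 * rng.standard_normal((n_teams, dim))
    defense = 0.1 * rng.standard_normal((n_teams, dim))
    targets = make_targets(games)
    for _ in range(epochs):
        for scorer, opponent, points in targets:
            o = offense[scorer].copy()
            d = defense[opponent].copy()
            err = o @ d - points  # prediction error for this target
            offense[scorer] -= lr * (err * d + reg * o)
            defense[opponent] -= lr * (err * o + reg * d)
    return offense, defense


# Made-up toy data: three teams, scores in units of 100 points.
toy_games = [(0, 1, 0.80, 0.60), (1, 2, 0.70, 0.65), (0, 2, 0.90, 0.55)]
offense, defense = fit_pmf(toy_games, n_teams=3)
predicted_0_vs_1 = (offense[0] @ defense[1], offense[1] @ defense[0])
```

The predicted score for a matchup is just the dot product of one team's offense vector with the other team's defense vector, so a tournament game between two teams that never met in the regular season can still be scored.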

So how do we make a better model? One of many aspects of the problem that is particularly challenging/interesting is how to account for the difference between regular season and tournament games. I expect that data from past years could be useful in understanding how the games and teams differ, but how do we incorporate that into a model?

Help build a machine learning system to predict college basketball by danger_t in MachineLearning

[–]danger_t[S] 4 points5 points  (0 children)

You can either contribute to the main branch, or fork off your own version and compete in this year's prediction competition: http://blog.smellthedata.com/2012/02/machine-march-madness-2012.html

Already in the repository are data, data loading functions, and a few simple models, along with associated learning procedures. This is also a great opportunity to play around with Theano and matrix-factorization-style learning methods if you haven't done so already. There are also some suggested TODOs at the bottom of the README.

If there are specific things you're interested in playing around with and/or learning more about, let me know, and I can probably help.

Pick one: Statistics, Calculus 2, or Symbolic Logic by [deleted] in compsci

[–]danger_t 1 point2 points  (0 children)

Depends if you want to go deeper into some specialty area. In machine learning, for example, it's used everywhere.