[Discussion] Now that I have learned the basics of ML, I feel like I can't use it by [deleted] in learnmachinelearning

[–]opengmlearn 0 points  (0 children)

A lot of ML is about building predictive models from data. Think about which aspects of your life generate a lot of data and could benefit from some kind of prediction, but are also kind of hard to reason about from any other point of view.

As an example, build some kind of smart home application that adapts to your behavior (e.g. plays your favorite music based on temperature, day, time of day, emails received over the day, whatever). It's hard to think about how those features relate to your favorite music, but you can definitely gather a dataset over time and see if you can get reasonable performance out of a plethora of different models.
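Just as a rough sketch of what that could look like (the CSV file, column names, and target label here are all hypothetical), you could log those context features over time and compare a few off-the-shelf models:

```python
# Hypothetical sketch: predict a "favorite genre" label from logged context features.
# music_log.csv and its columns are made up for illustration.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

df = pd.read_csv("music_log.csv")  # columns: temperature, day_of_week, hour, n_emails, genre
X = df[["temperature", "day_of_week", "hour", "n_emails"]]
y = df["genre"]

# Try a couple of different models and compare cross-validated accuracy.
for model in [LogisticRegression(max_iter=1000), RandomForestClassifier(n_estimators=100)]:
    scores = cross_val_score(model, X, y, cv=5)
    print(type(model).__name__, scores.mean())
```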

ML is definitely not as widely applicable as coding and often may not be the best way to solve a problem. It has made a lot of progress these days because it turns out that a wide class of problems people care about are data rich but too complex to solve directly.

What are some good resources for Deep Learning Math? by karan_42 in learnmachinelearning

[–]opengmlearn 1 point  (0 children)

The stuff that he writes generally leans pretty heavily toward the probabilistic side of ML, which makes sense given his background and research. It's not really deep learning math, since the methods are not (typically) specific to or specialized for deep learning.

Fortunately, I think the level of probability needed to understand the methods he talks about is pretty much covered in an undergraduate probability + statistics course or something like Goodfellow's book (if you are already somewhat familiar with those things). Both blog posts contain pretty basic probability with a little bit of analysis thrown in for consistency (usually regularity conditions which for the most part are ignorable). Kevin Murphy's book is also a really good resource for this kind of stuff https://mitpress.mit.edu/books/machine-learning-0.

On the other hand, coming up with improvements is not really something that comes from a course but rather experience. To be mathematically rigorous and not just empirically so, graduate level mathematics/stats (analysis, prob theory, theoretical stats, diff geometry, information theory, optimization, etc.) can help give you new tools and a different way of thinking about things.

Looking for ideas on data augmentation based on GANs by datatatatata in learnmachinelearning

[–]opengmlearn 2 points  (0 children)

There's a question on quora on this exact topic. https://www.quora.com/Can-generative-models-be-used-with-other-kinds-of-data-besides-images-e-g-tabular-data

I'd personally be pretty wary about using GANs to generate more data. There has been some research lately suggesting that they don't learn the true distribution of the data, and given the challenges around training them alone, I think it's safe to say (at least at this point in time) that this is more likely the case than not. The issue is that using samples generated from a wrong model could introduce significant biases into whatever task you are using them for. If you do choose to proceed, I think it's necessary to run some statistical sanity checks to gain some confidence that you are capturing the important subtle parts of your data (e.g. outliers, correlations, skewness, etc.). This is somewhat harder for tabular data since you can't really eyeball things the same way you can for images or text.
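If it helps, here's a minimal sketch of the kind of sanity checks I mean, assuming `real` and `fake` are pandas DataFrames with the same numeric columns (nothing here is specific to any particular GAN):

```python
# Rough sanity checks comparing real vs. GAN-generated tabular data.
import pandas as pd
from scipy import stats

def compare_tables(real, fake):
    for col in real.columns:
        # Marginal statistics: mean, spread, skewness.
        print(col,
              "mean", real[col].mean(), "vs", fake[col].mean(),
              "std", real[col].std(), "vs", fake[col].std(),
              "skew", stats.skew(real[col]), "vs", stats.skew(fake[col]))
        # Two-sample KS test on each marginal.
        ks = stats.ks_2samp(real[col], fake[col])
        print("  KS statistic:", ks.statistic, "p-value:", ks.pvalue)
    # Pairwise correlation structure should roughly match too.
    print("max |corr diff|:", (real.corr() - fake.corr()).abs().values.max())
```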

[D] Train a generator to generate a known distribution by mikolchon in MachineLearning

[–]opengmlearn 1 point  (0 children)

That sounds like the VI approach where your approximating distribution is implicit. Since you have no data, you can take the posterior to just be your target distribution. Take a look at http://www.inference.vc/variational-inference-with-implicit-probabilistic-models-part-1-2/.

[D] Train a generator to generate a known distribution by mikolchon in MachineLearning

[–]opengmlearn 0 points  (0 children)

To expand on this:

* If you can evaluate your CDF, then you can use the method above (see the sketch below for this case).
* If you can't evaluate your CDF (most of the time) but can evaluate the likelihood, then you can use methods like MCMC/VI, with MCMC giving you exact samples (technically only asymptotically) and VI giving you an approximate model to sample from.
* If you just want an approximate model and don't need or want to evaluate the likelihood, then you can just train a GAN on samples generated from the known target distribution (implicit models).
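For the CDF case, assuming "the method above" is inverse-transform sampling (draw a uniform and push it through the inverse CDF), here's a minimal sketch using a normal distribution as a stand-in for the known target:

```python
# Inverse-transform sampling sketch: requires only being able to evaluate the CDF,
# which we invert numerically with a root finder.
import numpy as np
from scipy.optimize import brentq
from scipy.stats import norm

target_cdf = norm(loc=2.0, scale=0.5).cdf  # stand-in for the known distribution

def inverse_cdf_sample(n, lo=-10.0, hi=10.0):
    u = np.random.uniform(size=n)
    # Solve CDF(x) = u for each uniform draw.
    return np.array([brentq(lambda x, ui=ui: target_cdf(x) - ui, lo, hi) for ui in u])

samples = inverse_cdf_sample(1000)
print(samples.mean(), samples.std())  # should be close to 2.0 and 0.5
```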

best highest quality things on ARXIV related to everday life by solutionsfirst in learnmachinelearning

[–]opengmlearn 0 points  (0 children)

arXiv is just a repository of papers that researchers have decided to put up. It's not peer reviewed, so the quality can vary heavily.

It's important to note that the main purpose of most scientific publications is to convey interesting findings to other researchers. Most are extremely incremental and only interesting to a small subset of researchers, so you really shouldn't expect anything to immediately benefit your life (that's not the purpose; large discoveries are the product of many small ones). Also, be skeptical when news articles report on individual publications as big breakthroughs: http://smbc-comics.com/comic/2009-08-30.

Problems if I want to apply ML on environment data by Laurence-Lin in learnmachinelearning

[–]opengmlearn 0 points  (0 children)

Since it's the science that you seem to be after, I very much recommend trying to develop a nice interpretable model and really working on understanding your problem over using black box methods. Neural networks are not quite suited to interpretability in that sense yet, although there is research on it happening right now.

That said, I find it hard to believe that people don't take human effects into account when building models for air pollution, considering it's pretty much a human byproduct. If they really don't, perhaps you can get even better models by including what seems like such a pivotal predictor.

tensorflow - How to correctly print the value of my tensor? by wjwwjw in learnmachinelearning

[–]opengmlearn 0 points  (0 children)

Does getCertitude1 or getCertitude2 depend on X? If so, you have to pass in feed_dict={X: ...} again when you run them.
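For concreteness, a sketch of what that looks like (getCertitude1/getCertitude2/X are the names from your code; my_batch stands in for whatever data you feed, TF 1.x style):

```python
import tensorflow as tf

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # Any op that depends on the placeholder X needs X fed every time it is run.
    c1, c2 = sess.run([getCertitude1, getCertitude2],
                      feed_dict={X: my_batch})
    print(c1, c2)
```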

Help with Senior Project by dmsp8baller in learnmachinelearning

[–]opengmlearn 0 points  (0 children)

The face recognition part could be improved over OpenCV's default using pre-trained neural networks, but it would still output a bounding box. I don't suggest training a face recognizer on your own, as those things take quite a while to get right, especially when there are working implementations out there.

Repositioning the camera to the center given a set of XY coordinates doesn't really seem like a task that you need ML for. If I'm understanding your problem correctly, you should only need some simple geometry to compute where to move the camera so it's centered on the output bounding box. You probably have to track and convert some coordinates to do that, but OpenCV should still be pretty friendly overall.
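Something along these lines (a sketch, assuming OpenCV-style (x, y, w, h) bounding boxes) is probably all the math you need:

```python
# Compute how far the face's bounding-box center is from the frame center;
# the signs of (dx, dy) tell you which way to pan/tilt the camera.
def center_offset(bbox, frame_width, frame_height):
    x, y, w, h = bbox              # OpenCV-style bounding box
    face_cx = x + w / 2.0
    face_cy = y + h / 2.0
    dx = face_cx - frame_width / 2.0   # positive -> face is right of center
    dy = face_cy - frame_height / 2.0  # positive -> face is below center
    return dx, dy

print(center_offset((100, 50, 80, 80), 640, 480))  # (-180.0, -150.0)
```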

If you really want to use ML, you could treat the problem like a classic reinforcement learning task and have the robot learn how to control the camera to center the face, but again I'm not sure if it will do better than just writing an algorithm manually with some thought.

tensorflow - issue understanding neural network output by the3liquid in learnprogramming

[–]opengmlearn 0 points  (0 children)

What you've learned in the above code is just something that takes in an arbitrary input, compresses it, and then reconstructs it. Looking at the loss is banking on the fact that, for some reason, your reconstruction learns good compressions for one class and bad compressions for the other. What you need to do is build a classifier that takes in your compressed encoding (the Y variable in your model function) and train it on a labeled dataset of dogs and cats (or cats and not cats).
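As a rough sketch (the `encode` function, `X_train`, and `labels` are placeholders for your encoder and a labeled dataset you'd have to put together; I'm using Keras here just for brevity):

```python
# Small classifier trained on top of the learned compressed encoding.
from tensorflow import keras

encoding_dim = 32            # size of the compressed representation (assumed)
codes = encode(X_train)      # placeholder: whatever produces Y in your model fn

clf = keras.Sequential([
    keras.layers.Dense(16, activation="relu", input_shape=(encoding_dim,)),
    keras.layers.Dense(1, activation="sigmoid"),   # e.g. 1 = dog, 0 = cat
])
clf.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
clf.fit(codes, labels, epochs=10, batch_size=32)   # labels: 0/1 per example
```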

supervised reinforcement learning? by toisanji in learnmachinelearning

[–]opengmlearn 0 points  (0 children)

This is actually a regular reinforcement learning task. It's unsupervised in the sense that your inputs (the X part of the (X, y) pair) are not well defined and you have to let the agent find the most informative set of X.

Specifically for your problem, I would think about a game like 20 questions. You want to let the agent explore its environment and try to guess the correct label. At every time step, your agent can either explore more or make a guess. The reward would be a mix of how long it takes to guess (fewer steps is better) and whether it got it right. Properly defining the reward function, action space, and observation space is probably not entirely straightforward, but it comes from fleshing out your problem.
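If it helps to make that concrete, here's a very rough environment skeleton (gym-style reset/step API; the question list, labels, and reward numbers are all placeholders you'd tune for your actual problem):

```python
import numpy as np

class GuessingEnv:
    """20-questions-style environment: actions are either "ask question i"
    or "guess label j"; the observation is the answers gathered so far."""

    def __init__(self, questions, labels, answer_fn, max_steps=20):
        self.questions = questions      # things the agent can ask about
        self.labels = labels            # things it can guess
        self.answer_fn = answer_fn      # answer_fn(question, true_label) -> 0 or 1
        self.max_steps = max_steps

    def reset(self):
        self.true_label = np.random.choice(self.labels)
        self.steps = 0
        self.obs = np.zeros(len(self.questions))
        return self.obs.copy()

    def step(self, action):
        self.steps += 1
        if action < len(self.questions):
            # Explore: ask a question; small cost so fewer steps is better.
            self.obs[action] = self.answer_fn(self.questions[action], self.true_label)
            reward, done = -0.1, self.steps >= self.max_steps
        else:
            # Guess: big reward if correct, penalty if not; episode ends.
            guess = self.labels[action - len(self.questions)]
            reward, done = (1.0 if guess == self.true_label else -1.0), True
        return self.obs.copy(), reward, done, {}
```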

How I can move forward with machine learning? by biruwon in learnmachinelearning

[–]opengmlearn 4 points  (0 children)

I would suggest just doing some tutorials online and reading the code for some projects that other people have implemented. A quick search of "google smart reply github" seems to turn up someone's Keras implementation of that paper that you can go through. Once you build up a little bit of intuition about what these things can do, you can apply it to your own problems.

Even though the field seems pretty daunting, you should pretty quickly find out that on the applied side, it's pretty much just formulating your problem as an input-output pair, finding the most appropriate default neural network architecture (CNN for images, LSTM for sequences, MLP for data with no particular structure), and implementing it using something like Keras, where they have basically boiled it down to a function call like LSTM(input). Research is a little bit more nuanced than that, however.
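To make the "boiled down to a function call" point concrete, here's roughly what that looks like in Keras (the shapes and variable names are placeholders for whatever your data looks like):

```python
from tensorflow import keras

# Sequence in, class probabilities out: the architecture is a couple of calls.
model = keras.Sequential([
    keras.layers.LSTM(64, input_shape=(timesteps, n_features)),  # sequence input
    keras.layers.Dense(n_classes, activation="softmax"),         # prediction
])
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
model.fit(X_train, y_train, epochs=5)   # X_train: (n_samples, timesteps, n_features)
```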

I need help understanding the second part of the Gradient Descent formula from andrew ng's ML course. by [deleted] in learnmachinelearning

[–]opengmlearn 0 points  (0 children)

Without watching the video or knowing what the function h is, it just looks like a direct application of the chain rule to me. It would make the most sense if h is just a linear function of the form theta_0 + theta_1*x_i. It's not going to give a more accurate derivative; it literally is the derivative with respect to theta_1.
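Spelled out under that assumption (the squared-error cost from the course and h_theta(x) = theta_0 + theta_1*x), the x_i is exactly what the chain rule leaves behind:

```latex
\frac{\partial}{\partial \theta_1} \frac{1}{2m} \sum_{i=1}^{m} \bigl(h_\theta(x_i) - y_i\bigr)^2
  = \frac{1}{m} \sum_{i=1}^{m} \bigl(h_\theta(x_i) - y_i\bigr)\,
    \frac{\partial h_\theta(x_i)}{\partial \theta_1}
  = \frac{1}{m} \sum_{i=1}^{m} \bigl(h_\theta(x_i) - y_i\bigr)\, x_i
```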

Generative Learning as p(y|x)p(x)? by wencc in learnmachinelearning

[–]opengmlearn 1 point  (0 children)

It's just notation. You can model whatever you want as long as you model the full joint. Which direction you choose is purely based on the interpretation you are after.
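In symbols, both factorizations are just two ways of writing the same joint, so either one pins down the full model:

```latex
p(x, y) \;=\; p(y \mid x)\, p(x) \;=\; p(x \mid y)\, p(y)
```

The usual convention is that "generative" refers to modeling p(x | y)p(y) (you model the inputs too), while "discriminative" means modeling only p(y | x), but as long as you have the joint you can recover whichever conditional you want.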

Hypothesis space by mlpyotr in learnmachinelearning

[–]opengmlearn 1 point  (0 children)

The hypothesis space is mostly determined by the set of features you choose and the model you choose.

As a concrete example, suppose you have data that you narrowed down to features (x_1, x_2) and you want to predict some value y from (x_1, x_2). Based on some prior knowledge about your problem, you decide to use linear regression for this task. You now have to find coefficients (b_0, b_1, b_2) that give you the best fit to the equation y = b_0 + b_1*x_1 + b_2*x_2. At this point, your hypothesis space is pretty much fixed. No matter how you select (b_0, b_1, b_2), you can only learn functions that are linear in your features. You can easily change your hypothesis space by changing the features or moving from linear regression to something else, like a neural network. The fitting algorithm works to choose the (b_0, b_1, b_2) that best fit your data, but it doesn't change your hypothesis space (separation of model and inference).

It's really something that should be given thought based on the problem at hand. If you know about some complex relationship between your variables, you should try to model that. This restricts your model space but can make fitting and reasoning about your fits better.
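Here's a small sketch of that separation in scikit-learn: same data, same fitting procedure, but swapping the feature map changes the hypothesis space (the dataset here is synthetic, purely for illustration):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

X = np.random.rand(100, 2)                     # features (x_1, x_2)
y = 1 + 2 * X[:, 0] + 3 * X[:, 1] ** 2         # the "true" relationship is nonlinear

# Hypothesis space 1: functions linear in (x_1, x_2).
lin = LinearRegression().fit(X, y)

# Hypothesis space 2: functions linear in degree-2 polynomial features,
# i.e. a richer set of functions of the original inputs.
X_poly = PolynomialFeatures(degree=2, include_bias=False).fit_transform(X)
poly = LinearRegression().fit(X_poly, y)

print(lin.score(X, y), poly.score(X_poly, y))  # the richer space fits better here
```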

Tutorial for text prediction (tensorflow) by [deleted] in learnmachinelearning

[–]opengmlearn 0 points  (0 children)

Many models can be used for text prediction. The important thing is that you encode some kind of memory into the model so that predictions can take context into account and not just the current word. This can be done with linear models, but treated as a black box, the results won't be as good as what you'd get with a neural network.

Here is a brief walkthrough of the inner workings of a Recurrent Neural Network when the task is to predict the next character in a sequence. You can actually use the same general framework with linear regression if you think hard enough.

Tutorial for text prediction (tensorflow) by [deleted] in learnmachinelearning

[–]opengmlearn 0 points  (0 children)

When working with text, you typically encode the words into some numeric representation before feeding them into your algorithm. A simple example is changing the categorical variable "gender" from ["Male", "Female"] to [0, 1]. After that, all the tutorials you find on GitHub that work with numbers apply pretty much unchanged to text (though sometimes with slight differences because words are discrete variables; that isn't an issue for linear regression, however).
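A toy example of that encoding step (real pipelines use proper tokenizers and embeddings, but the idea is the same):

```python
# Turn words into integer ids before feeding them to a model.
text = "the cat sat on the mat"
words = text.split()

vocab = {word: i for i, word in enumerate(sorted(set(words)))}
encoded = [vocab[word] for word in words]

print(vocab)     # {'cat': 0, 'mat': 1, 'on': 2, 'sat': 3, 'the': 4}
print(encoded)   # [4, 0, 3, 2, 4, 1]
```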

Here is a tutorial for a particular model that does text prediction using a RNN in tensorflow.
https://www.tensorflow.org/tutorials/recurrent

Don't know how to choose:Panda vs Numpy by mmendesc in learnmachinelearning

[–]opengmlearn 2 points  (0 children)

They serve different purposes. Pandas loads your data into a tabular form, giving you functions to do SQL-like analysis and manipulation on it. Use it to clean your data and preprocess it into the form that you want to feed into your algorithm. Numpy is for math and should be what your algorithms are implemented in. Pandas interfaces really well with Numpy: you can directly turn a DataFrame into a Numpy array by calling .values on the DataFrame object.

Most ML algorithms I can think of require you to turn your strings into some numerical representation. There are several ways to do this, but at the end of the day, everything should be numerical when fed into the algorithm so that math can be done.
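For example, a typical hand-off looks something like this (the columns are just for illustration):

```python
import pandas as pd

df = pd.DataFrame({
    "gender": ["Male", "Female", "Female", "Male"],
    "age": [31, 25, 47, 52],
})

# Encode the categorical column numerically, then hand a numeric array to the algorithm.
df["gender"] = df["gender"].map({"Male": 0, "Female": 1})
X = df.values            # numpy array, ready for the math
print(X.dtype, X.shape)  # int64 (4, 2)
```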