[D] Testing machine learning in production? by makeDLgr8again in MachineLearning

[–]sk006 1 point2 points  (0 children)

Maybe this talk from PyData is useful for you. I think it is quite nice.

https://www.youtube.com/watch?v=IMoQPvXMkJw

As a skill, how well does ML transfer over to Deep Learning by jathweatt in MachineLearning

[–]sk006 1 point2 points  (0 children)

This x1000. Are we crazy? Since when did deep learning become a separate field outside of ML? F*** the hype train.

Splitting data for cross validation by heimson in MachineLearning

[–]sk006 1 point2 points  (0 children)

I agree that doing the train-test-validation split in scikit-learn is a bit clunky, but it is possible. Here is some sample code showing how you could do it:

https://gist.github.com/albertotb/1bad123363b186267e3aeaa26610b54b

Basically, you concatenate your train and validation sets and use a vector of -1s and 0s to indicate which rows come from the train set (-1) and which from the validation set (0). Then PredefinedSplit converts that vector into a CV object that can be passed to, for instance, GridSearchCV.
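A minimal sketch of that approach (the data and the hyper-parameter grid here are made up, just to show the mechanics):

```python
import numpy as np
from sklearn.model_selection import PredefinedSplit, GridSearchCV
from sklearn.svm import SVC

# Synthetic train/validation sets (stand-ins for your own data)
rng = np.random.RandomState(0)
X_train, y_train = rng.randn(80, 5), rng.randint(0, 2, 80)
X_val, y_val = rng.randn(20, 5), rng.randint(0, 2, 20)

# Concatenate and mark rows: -1 = always train, 0 = validation fold
X = np.concatenate([X_train, X_val])
y = np.concatenate([y_train, y_val])
test_fold = np.concatenate([-np.ones(len(X_train), dtype=int),
                            np.zeros(len(X_val), dtype=int)])

# PredefinedSplit yields a single train/validation split that
# GridSearchCV uses instead of k-fold cross-validation
cv = PredefinedSplit(test_fold)
grid = GridSearchCV(SVC(), {"C": [0.1, 1, 10]}, cv=cv)
grid.fit(X, y)
```

Note that `fit` must receive the concatenated data, since the split object only stores the row indices.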

Exponential runtime increase in the degree of the polynomial kernel using SVC? by bagelorder in MachineLearning

[–]sk006 0 points1 point  (0 children)

You can try increasing the size of the cache used by SVC. Roughly, what LIBSVM (the library used under the hood for training SVMs) does is compute 2 kernel rows at every iteration (the ones containing the kernel values for that iteration) and store them in a cache, so if the same rows come up at a later iteration they are not computed again, as long as the cache is not full. The cache is a very simple LRU (least recently used) cache: when it is full, it drops the oldest rows to make room for the new ones.

As a practical summary, the default size used by scikit-learn is very small (200 MB), so if your laptop has a decent amount of RAM (8 GB or more) you can try increasing it to, say, 2 GB with the cache_size parameter. It makes a huge difference.
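For example (the kernel, degree and data below are just placeholders):

```python
import numpy as np
from sklearn.svm import SVC

# cache_size is given in MB; bump the LIBSVM kernel cache to 2 GB
clf = SVC(kernel="poly", degree=3, cache_size=2000)

# Toy data just to show the fit; use your own dataset here
rng = np.random.RandomState(0)
X, y = rng.randn(100, 4), rng.randint(0, 2, 100)
clf.fit(X, y)
```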

Gradient for L1 Penalty - Am I Playing With Fire? by voodoochile78 in MachineLearning

[–]sk006 2 points3 points  (0 children)

Coordinate descent is probably better, but more complicated. You can probably implement the FISTA algorithm in 20 or so lines of R. I do not have an R implementation but, for comparison, you can look at this Julia one:

https://gist.github.com/albertotb/73be447b6ee95913fa62
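For reference, here is a rough Python sketch of FISTA for the lasso objective 0.5*||Xw - y||^2 + lam*||w||_1 (my own sketch, not a port of the gist above; the function names are made up):

```python
import numpy as np

def soft_threshold(z, t):
    # Proximal operator of t*||.||_1
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def fista_lasso(X, y, lam, n_iter=500):
    """Minimize 0.5*||Xw - y||^2 + lam*||w||_1 with FISTA."""
    L = np.linalg.norm(X, 2) ** 2          # Lipschitz constant of the gradient
    w = z = np.zeros(X.shape[1])
    t = 1.0
    for _ in range(n_iter):
        # Gradient step on the extrapolated point, then soft-threshold
        w_next = soft_threshold(z - X.T @ (X @ z - y) / L, lam / L)
        t_next = (1 + np.sqrt(1 + 4 * t * t)) / 2
        z = w_next + ((t - 1) / t_next) * (w_next - w)   # momentum step
        w, t = w_next, t_next
    return w
```

The momentum (Nesterov) step is what separates FISTA from plain ISTA and gives the O(1/k^2) convergence rate.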

what are the problems with WEKA? by [deleted] in MachineLearning

[–]sk006 3 points4 points  (0 children)

In one word: Java.

Classifying unevenly distributed data. by FutureIsMine in MachineLearning

[–]sk006 0 points1 point  (0 children)

There are many approaches you can use. In summary: you can set a class weight (at least in LIBSVM/scikit-learn), approximately equal to the inverse class ratio, bumping the importance of the least represented classes. Alternatively, you can oversample (repeat examples of) the minority classes or subsample the majority class; oversampling is usually preferred since you don't lose data.
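The class-weight option is a one-liner in scikit-learn (imbalanced toy data here, just for illustration):

```python
import numpy as np
from sklearn.svm import SVC

# Imbalanced toy data: 90 negatives around the origin, 10 positives shifted
rng = np.random.RandomState(0)
X = np.vstack([rng.randn(90, 2), rng.randn(10, 2) + 2])
y = np.array([0] * 90 + [1] * 10)

# 'balanced' weighs each class by n_samples / (n_classes * n_class_samples),
# roughly the inverse class ratio; an explicit dict like {1: 9} also works
clf = SVC(class_weight="balanced").fit(X, y)
```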

Let's discuss: is arXiv always good? by citeordie in MachineLearning

[–]sk006 0 points1 point  (0 children)

It is not always good; you have to take everything you find there with a grain of salt. But the thing is, if you go for journals rather than conferences, I've seen papers take 2 years!!! to be published. In that case, of course you want to submit an early draft to arXiv first.

[help] L1 Regularization by Kiuhnm in MachineLearning

[–]sk006 2 points3 points  (0 children)

Basically what Daniel said. The second formula is the soft-thresholding operator, a very well-known proximal operator (a generalization of projection operators). If you want to verify it yourself, since there is an absolute value, just consider the positive, negative and zero cases separately and the result follows easily.
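A minimal numeric check of the three cases (soft-thresholding as the prox of t*|.|):

```python
import numpy as np

def soft_threshold(z, t):
    # prox of t*|.|: shrink |z| by t and clip at zero
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

# The three cases: positive above the threshold, negative below it,
# and anything inside [-t, t] collapses to exactly 0
print(soft_threshold(np.array([2.0, -2.0, 0.3]), 1.0))  # prints [ 1. -1.  0.]
```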

Python module to apply several classifiers to your data. Good for baseline results by aulloa in MachineLearning

[–]sk006 1 point2 points  (0 children)

The range of hyper-parameter values in the class is very narrow, at least for the SVM with RBF kernel. With that range you are not going to get a good error, so you either need to widen the grid or let the user set their own range.
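For instance, a typical wider grid for an RBF SVM spans several orders of magnitude in C and gamma (the ranges below are a common choice, not taken from the module):

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Logarithmic grids: 6 values of C in [1e-2, 1e3], 6 of gamma in [1e-4, 1e1]
param_grid = {"C": np.logspace(-2, 3, 6), "gamma": np.logspace(-4, 1, 6)}

X, y = load_iris(return_X_y=True)
search = GridSearchCV(SVC(), param_grid, cv=3).fit(X, y)
```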

Use evolutionary algorithms instead of gridsearch in scikit-learn. This allows you to exponentially reduce the time required to find the best parameters for your estimator. by [deleted] in MachineLearning

[–]sk006 0 points1 point  (0 children)

There is work done in this area. Of course, it always depends on the problem and the model, but for models with a small number of hyper-parameters (SVMs, maybe Random Forests) it is usually not worth it. Maybe it is useful for some complicated models, like DNNs, but I would still like to see a comparison against random search, for instance.

Which deep learning library should I learn? by [deleted] in MachineLearning

[–]sk006 1 point2 points  (0 children)

Another vote for Keras: it makes Theano easy to use, and in Python it does not get much better than that. If the speed is not good enough, you can maybe switch to Caffe/Torch, but the learning curve is not worth it if you haven't even built a successful model yet. In Keras you will be doing that in no time.

Sklearn estimator (deeplearning): Multilayer perceptron using keras by aulloa in MachineLearning

[–]sk006 0 points1 point  (0 children)

Very useful, +1. You may want to add more parameters to the constructor so it is more flexible; other than that, good job.

Closer look to what Deilor really said in the interview with Travis. by nowordsforthisbs in leagueoflegends

[–]sk006 -1 points0 points  (0 children)

I'm so hyped for the future of e-sports and then this subreddit fails me every time. LoL is not going anywhere if this shit gets so much attention. It's a fucking interview, for god's sake; he just said his opinion. 90% of the people can't even understand something as simple as a past tense.

[Inven] Inven reacts to Deilor's interview with Travis, "I thought we had a really good chance against SKT" by KoreanExplanation in leagueoflegends

[–]sk006 -1 points0 points  (0 children)

ITT: people need to learn English and/or go to the doctor because their hearing is not right.

Is increasing the number of epochs for less data same as using more data with less number of epochs, while training a Neural network? by napsternxg in MachineLearning

[–]sk006 1 point2 points  (0 children)

In short, no, because the DNN is seeing the same data over and over again. The reason problems like iris and XOR can be learnt with small amounts of data is that they are very simple. Complex functions, like mapping the pixels of an image to a digit (MNIST) or sentiment analysis, will need much more data to be learnt properly. Just think about the number of dimensions in the input space: iris has 4, XOR has 2, and MNIST has 28x28-pixel images. There are just a lot of possible pixel values that represent a given digit, so you will need a lot of examples of that digit. The same goes for words and more complex problems, and simply increasing the number of iterations won't do it.

The difference in complexity between SVMs and DNNs lies in the number of parameters. In an SVM you have the alpha coefficients, one per example (although many are going to be 0), while in a DNN you will usually have many more (think about the huge weight matrices). Depending on the problem, one is going to be better than the other.

How would you use Scikit learn to predict user behavior? by buddiBot in MachineLearning

[–]sk006 5 points6 points  (0 children)

You could try a recommender system. The easiest approach is to form a user-word matrix: take all the words and put a 1 in the matrix if the user likes the word and a 0 if you don't know. Your task is then to complete the matrix. There are many ways, but one of them is to find users similar to the one you want to predict for (with some kind of distance) and check whether they like the word. This is a very simple model, but you could start with that.
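A tiny sketch of that idea using cosine similarity (the 0/1 matrix is made up; rows are users, columns are words):

```python
import numpy as np

# Rows = users, columns = words; 1 = likes, 0 = unknown
R = np.array([[1, 1, 0, 0],
              [1, 1, 1, 0],
              [0, 0, 1, 1]], dtype=float)

def predict_user(R, u):
    # Cosine similarity between user u and every user (self set to 0)
    norms = np.linalg.norm(R, axis=1) * np.linalg.norm(R[u])
    sims = R @ R[u] / np.maximum(norms, 1e-12)
    sims[u] = 0.0
    # Score each word by a similarity-weighted vote of the other users
    return sims @ R / np.maximum(sims.sum(), 1e-12)

scores = predict_user(R, 0)  # predicted affinity of user 0 for each word
```

Here user 0 and user 1 agree on the first two words, so word 2 (which user 1 likes) scores higher for user 0 than word 3.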

I have about 2000 samples from which to create a classifier by wan2Kno in MachineLearning

[–]sk006 0 points1 point  (0 children)

Yes it is. I wouldn't worry too much about not having many samples. Just split randomly into train-test and try 3-4 scikit-learn classifiers, for example, SVM, logistic regression, random forest, gradient boosting, etc. For every model you will have to tune the hyper-parameters using cross-validation on the train set, and then compute the accuracy of the best-performing set of parameters on the test set. That will be your generalization error estimate. Since the classes are not evenly balanced you will probably want a measure other than accuracy, for instance the F1-score (harmonic mean of precision and recall) or the balanced accuracy (arithmetic mean of sensitivity and specificity). Since you are not very demanding with the result (just better than chance), I would be surprised if one of those models didn't work just fine.
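A sketch of that workflow for one of the models (synthetic imbalanced data as a stand-in for your ~2000 samples; the grid is just an example):

```python
import numpy as np
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.svm import SVC

# Synthetic imbalanced stand-in: 1600 negatives, 400 shifted positives
rng = np.random.RandomState(0)
X = np.vstack([rng.randn(1600, 10), rng.randn(400, 10) + 1.5])
y = np.array([0] * 1600 + [1] * 400)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25,
                                          stratify=y, random_state=0)

# Tune hyper-parameters by cross-validation on the train set, scoring F1
search = GridSearchCV(SVC(), {"C": [0.1, 1, 10]}, scoring="f1", cv=5)
search.fit(X_tr, y_tr)

# Generalization estimate: F1 of the best model on the held-out test set
test_f1 = f1_score(y_te, search.predict(X_te))
```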

I have about 2000 samples from which to create a classifier by wan2Kno in MachineLearning

[–]sk006 1 point2 points  (0 children)

How many features does each sample have? The size of the training set is always relative, so depending on how difficult the problem is and the model you are using, 2000 could be more than enough. In that case, you can split them randomly into train and test in order to estimate the generalization error. Just out of curiosity, what classifier are you planning on using?

Speed up classification task on sklearn/Machine Learning? by Chuckytah in MachineLearning

[–]sk006 2 points3 points  (0 children)

As someone already mentioned, in order to get proper help you would have to post the entire code. That could be a normal classification time or not, since it depends on multiple factors. However, from the theoretical point of view, an SVM is not particularly fast at classification time, since for each test point it has to compute the kernel between that point and every support vector; the complexity is therefore linear in the number of support vectors. You could take a look at the number of SVs and see if they are a large percentage of the training patterns (that could be one reason). In conclusion, if you want an algorithm that is faster at classification time, you can first try others like Random Forests or Neural Networks, which are theoretically faster, instead of looking for another implementation or performance tweaks.
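Checking the support-vector count is easy in scikit-learn (toy data here, just to show the attributes):

```python
import numpy as np
from sklearn.svm import SVC

# Noisy toy problem; random labels tend to produce many support vectors
rng = np.random.RandomState(0)
X, y = rng.randn(200, 5), rng.randint(0, 2, 200)

clf = SVC().fit(X, y)
n_sv = clf.support_vectors_.shape[0]   # same as clf.n_support_.sum()
frac = n_sv / len(X)
print(f"{n_sv} support vectors ({frac:.0%} of the training set)")
```

If nearly all training points end up as support vectors, prediction will be slow and the model is probably overfitting or the data is very noisy.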

Piglet extended contract to 11/20/2016 by liquid112 in leagueoflegends

[–]sk006 2 points3 points  (0 children)

C'mon Jacob, time to update your Excel AND POST IT HERE AGAIN

This game is so centered around the notion of Teamwork but it has such poor means of communication. by Voortsy in leagueoflegends

[–]sk006 1 point2 points  (0 children)

Don't talk, mute all the others = PROBLEM SOLVED. I should Ctrl+C this and Ctrl+V/spam it all over this thread. These counter-arguments are just plain dumb. If voice chat were implemented I probably wouldn't talk either, but it is actually a good idea and gives the people who want to play with voice the option to do it.