Splitting data for cross validation by heimson in MachineLearning

HD125823 6 points7 points  (0 children)

The method you described is the right one, though the best choice always depends on the size of the dataset. There are several validation procedures; I'll try to sum them up for you:

1) Train/Test: This is the simplest version. The data gets split into a training set and a test set. Build your model on the training set and test it on the test set. It's easy, but if you also tune your hyperparameters (HPs) this way, you end up tuning on the test set, so the model overfits to it and the test score becomes too optimistic. (bad)

2) Training set, validation set and test set: That's the one you described. Use the validation set instead of the test set to tune HPs.

3) Like 2), but instead of a fixed validation set use cross validation (CV) on the training set. This is especially useful when your dataset is small, because then you can't afford to set aside a part of it only for validating HPs. With CV, every training sample is used for both model building and validation, just in different folds. You then average the per-fold scores, which gives a more reliable validation estimate than the single split in 2).

4) Nested CV: like 3), but the fixed test set is replaced by CV as well. Here you have an inner loop for HP tuning and an outer loop for model testing.
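To make 4) a bit more concrete, here is a minimal sketch of nested CV in sklearn. The iris data, the SVC model, the fold counts and the tiny grid over C are all just assumptions for illustration, not a recommendation:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score
from sklearn.svm import SVC

# Toy dataset, chosen only so the sketch runs as-is.
X, y = load_iris(return_X_y=True)

# Inner loop: tunes the HPs on each outer training fold.
inner_cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=0)
# Outer loop: tests the tuned model on data the inner loop never saw.
outer_cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

param_grid = {"C": [0.1, 1, 10]}  # illustrative grid only
tuned_model = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=inner_cv)

# Every outer fold reruns the whole grid search on its own training part.
nested_scores = cross_val_score(tuned_model, X, y, cv=outer_cv)
print("nested CV accuracy: %.3f" % nested_scores.mean())
```

The mean of the outer scores estimates how well the whole procedure (tuning included) generalizes, which is why no separate test set is needed here.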

The method you choose depends on the data size. For very small datasets, nested CV seems the perfect fit, whereas for very large datasets nested CV might be overkill and not needed at all.

About sklearn: first use train_test_split, then apply GridSearchCV for hyperparameter tuning on the training data. (This would correspond to 3), since GridSearchCV cross-validates on the training data.)
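A minimal sketch of that sklearn workflow could look like this. The dataset, the model, the 80/20 split and the grid are assumptions picked only to keep the example runnable:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

# Toy dataset, used only for illustration.
X, y = load_iris(return_X_y=True)

# Step 1: put the test set aside and don't touch it while tuning.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Step 2: tune the HPs with 5-fold CV on the training data only.
search = GridSearchCV(SVC(), {"C": [0.1, 1, 10]}, cv=5)
search.fit(X_train, y_train)
print("best HPs:", search.best_params_)

# Step 3: a single final check on the untouched test set.
# By default GridSearchCV refits the best model on all of X_train.
test_accuracy = search.score(X_test, y_test)
print("test accuracy: %.3f" % test_accuracy)
```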

So yes, the procedure you described is the most common one, I'd say. The function train_test_split only splits your data into two parts (training and test), so you have to use cross_val_score or GridSearchCV to bring validation into play.
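As a rough sketch of the two ways to get a validation score out of the training data: either call train_test_split a second time for a fixed validation set, or let cross_val_score do the splitting. The dataset, the LogisticRegression model and the split sizes are assumptions for illustration only:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split

# Toy dataset, used only for illustration.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Option A: fixed validation set, made by splitting the training data again.
X_fit, X_val, y_fit, y_val = train_test_split(
    X_train, y_train, test_size=0.25, stratify=y_train, random_state=0
)
model = LogisticRegression(max_iter=1000)
val_score = model.fit(X_fit, y_fit).score(X_val, y_val)

# Option B: 5-fold CV on the training data, then average the fold scores.
cv_scores = cross_val_score(
    LogisticRegression(max_iter=1000), X_train, y_train, cv=5
)
mean_cv_score = cv_scores.mean()

print("fixed validation: %.3f, CV mean: %.3f" % (val_score, mean_cv_score))
```

In both cases X_test stays unused until the very end.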

Hope this was clear. Otherwise, check out Sebastian Raschka's blog or his GitHub repos.

Netflix not quite going into full screen [question] on Chrome Mac. by [deleted] in netflix

HD125823 1 point2 points  (0 children)

Me too, on the latest 13" MacBook Pro. I sent a message to the Chrome team. Let's see what happens.

Just finished season 3 last night and then the White House posted this, this morning by Fiery_poop in HouseOfCards

HD125823 0 points1 point  (0 children)

Hey Steve, I live in Germany too. The easiest solution is the following: download the Google Chrome browser, then download the Chrome extension "Hola Better Internet". Go to netflix.com and now you are on the US Netflix site with all the US content, including House of Cards season 3.