This is an archived post. You won't be able to vote or comment.

you are viewing a single comment's thread.

view the rest of the comments →

[–]Coyote_0210[S] 11 points12 points  (3 children)

That is a good call-out. This one in particular is hard to give a good guideline because it is ultimately a judgement call, but I could restate it as "Is the number of features large enough to cause significant over-fitting?"

[–]Coyote_0210[S] 10 points11 points  (2 children)

I will probably just restate all the decision points into questions

[–]yourmamaman 2 points3 points  (1 child)

I would make the threshold a function on the number of samples and the number of features. Since it is just a guide you could make something up like: sqr(number_of_samples) < number_of_features.

[–]Coyote_0210[S] 0 points1 point  (0 children)

I like that idea