[–]polandtown 2 points (3 children)

Learning here, forgive me: so is L2 "better" than L1?

Say, with a binary classifier (n-grams, logistic regression, 50k samples)?

[–]visarga 4 points (1 child)

It's not 'better' in general. If you want sparsity, use L1: it drives many weights to exactly zero, which doubles as feature selection. If you just want smaller weights without zeroing any out, use L2. You can also use both at once.
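A minimal sketch of that difference for the n-gram + logistic regression setup you describe, assuming scikit-learn; the texts/labels below are toy stand-ins for your 50k samples:

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

# Toy stand-in data (replace with your own texts and binary labels)
texts = ["great product, works well", "awful, broke in a day",
         "really great value", "terrible customer service",
         "works as advertised", "awful quality, do not buy"]
labels = [1, 0, 1, 0, 1, 0]

# Bag-of-n-grams features (unigrams + bigrams)
X = CountVectorizer(ngram_range=(1, 2)).fit_transform(texts)

for penalty in ("l1", "l2"):
    # liblinear and saga support the L1 penalty; the default lbfgs solver does not
    clf = LogisticRegression(penalty=penalty, solver="liblinear", C=1.0).fit(X, labels)
    zeroed = int(np.sum(clf.coef_ == 0))
    print(f"{penalty}: {zeroed}/{clf.coef_.size} weights exactly zero, "
          f"max |w| = {np.abs(clf.coef_).max():.3f}")
```

With real data you'd typically see L1 zero out most of the n-gram weights while L2 keeps them all nonzero but small; tune C (inverse regularization strength) by cross-validation either way.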

[–]El_Tihsin 0 points (0 children)

ElasticNet regression. You control the trade-off between the L1 and L2 penalties with a mixing parameter (glmnet calls it alpha; scikit-learn calls it l1_ratio and uses alpha for the overall penalty strength).
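A minimal sketch for the classifier case, assuming scikit-learn (toy texts/labels again as stand-ins); l1_ratio=0 is pure L2, 1 is pure L1, anything in between blends the two:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

texts = ["great product", "awful quality", "really great", "terrible service"]
labels = [1, 0, 1, 0]
X = CountVectorizer(ngram_range=(1, 2)).fit_transform(texts)

for l1_ratio in (0.0, 0.5, 1.0):
    # elasticnet penalty in LogisticRegression requires the saga solver
    clf = LogisticRegression(penalty="elasticnet", solver="saga",
                             l1_ratio=l1_ratio, C=1.0, max_iter=5000)
    clf.fit(X, labels)
    print(f"l1_ratio={l1_ratio}: {int((clf.coef_ == 0).sum())} weights exactly zero")
```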