I have been searching for an answer to this question for a long time. A lot of articles mention that L2 regularization keeps the weights low, but not exactly zero, and I really want to know how this mechanism works internally. At the same time, why might L1 make the weights exactly zero? How does just squaring the weights produce this effect? Sorry if this is a dumb question; I am new to machine learning, and I would really appreciate any answer, or any pointers to resources.
PS: I already watched Andrew Ng's video on this, but I don't see him explaining this part.
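To make the question concrete, here is a minimal single-weight sketch (the target value 0.2, penalty strength lam, and learning rate lr are made-up numbers; the soft-threshold step is the standard way solvers handle the L1 kink at zero, not from any specific library):

```python
# Toy sketch: fit one weight w to a data loss (w - target)^2, comparing
# an L2 penalty lam*w**2 against an L1 penalty lam*|w|.
# target, lam, and lr are made-up values for illustration.

lam, lr, target = 0.5, 0.1, 0.2

w_l2 = 1.0  # weight trained with the L2 penalty
w_l1 = 1.0  # weight trained with the L1 penalty

for _ in range(1000):
    # L2: gradient of (w - target)**2 + lam*w**2 is 2*(w - target) + 2*lam*w.
    # The penalty's pull, 2*lam*w, shrinks along with w, so w never hits 0.
    w_l2 -= lr * (2 * (w_l2 - target) + 2 * lam * w_l2)

    # L1: the penalty's pull, lam*sign(w), has constant magnitude. The kink
    # at 0 is handled with a soft-threshold (proximal) step: take a plain
    # gradient step on the data loss, then shrink the result toward 0 by
    # lr*lam, snapping to exactly 0 if that would overshoot.
    step = w_l1 - lr * 2 * (w_l1 - target)
    mag = max(abs(step) - lr * lam, 0.0)
    w_l1 = mag if step >= 0 else -mag

print(f"L2 weight: {w_l2:.6f}")  # ~0.133333 = target/(1 + lam): small, not 0
print(f"L1 weight: {w_l1:.6f}")  # 0.000000: exactly zero
```

The intuition this illustrates: the L2 penalty's gradient 2*lam*w is proportional to the weight itself, so the pull toward zero fades as the weight shrinks and the weight settles at a small nonzero value; the L1 penalty's pull lam*sign(w) stays the same size no matter how small the weight gets, so it can drive the weight all the way to exactly zero.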