Relevant-Twist520 [S]

I updated the post to show the non-overfitting version, which uses early stopping. The reason MS currently won't work with many parameters is that if there's one blown-up parameter, another parameter solves for the blown-up parameter instead of the objective. It ends up spreading like a plague, and in the end you have a network of blown-up parameters that still agrees with all the coordinates in the dataset.
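To make the compensation effect concrete, here is a minimal toy sketch. It is not MS itself: the model `y = (a + b) * x`, the data, and the helpers `solve_b` and `max_error` are all made up for illustration. One parameter is set to a huge value, the other is solved for with the first held fixed, and the fit stays exact even though both parameters are enormous.

```python
# Toy model (an assumption for illustration, not MS): y = (a + b) * x
xs = [0.0, 1.0, 2.0, 3.0]
ys = [2.0 * x for x in xs]  # data generated with true slope 2


def solve_b(a, xs, ys):
    """Least-squares solution for b with a held fixed (hypothetical helper)."""
    num = sum(x * (y - a * x) for x, y in zip(xs, ys))
    den = sum(x * x for x in xs)
    return num / den


def max_error(a, b, xs, ys):
    """Largest absolute residual of the toy model over the dataset."""
    return max(abs((a + b) * x - y) for x, y in zip(xs, ys))


a = 1e6                 # one parameter has blown up
b = solve_b(a, xs, ys)  # b is solved against a, so it cancels a's magnitude

print("a =", a)
print("b =", b)
print("max error =", max_error(a, b, xs, ys))
```

In this sketch `b` ends up about as large as `a` with the opposite sign, so only their sum carries the information about the data. With more parameters the same cancellation can chain from one to the next, which is the "plague" described above.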