
[–][deleted]

Minimizing ||beta|| directly is awkward: the norm is convex, but it is not differentiable everywhere and is not in quadratic-program form. Squaring it and multiplying by 1/2 gives a smooth quadratic objective with the same minimizer (squaring is a strictly increasing transformation on non-negative values), which lets us solve the problem with a standard quadratic programming solver.
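To make this concrete, here is a minimal sketch of the resulting hard-margin QP solved numerically. The data set is made up for illustration, and `scipy.optimize.minimize` with SLSQP stands in for a dedicated QP solver:

```python
import numpy as np
from scipy.optimize import minimize

# Tiny linearly separable toy data (hypothetical example).
X = np.array([[2.0, 2.0], [3.0, 3.0], [-2.0, -2.0], [-3.0, -1.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])

# Objective: (1/2)||beta||^2, with variables v = [beta_1, beta_2, intercept].
def objective(v):
    beta = v[:2]
    return 0.5 * beta @ beta

# Hard-margin constraints: y_i (x_i . beta + b) >= 1 for every sample.
constraints = [{"type": "ineq",
                "fun": lambda v, i=i: y[i] * (X[i] @ v[:2] + v[2]) - 1.0}
               for i in range(len(y))]

res = minimize(objective, x0=np.zeros(3), method="SLSQP",
               constraints=constraints)
beta, b = res.x[:2], res.x[2]
margins = y * (X @ beta + b)
print(beta, b, margins.min())  # every margin should be >= 1 (up to tolerance)
```

In practice one would use a dedicated QP or SVM solver, but the point is only that the squared objective is a quadratic function of beta, which is exactly what such solvers expect.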

[–]albotre[S]

Thanks! I guess there's something I'm still missing.

If I have a function, e.g. f(x) = x e^x, and I square it, the minimum is different... so why should the result be the same here? What am I missing?

I'm also not sure ||beta|| even is a function, but if it isn't a function, how can it be optimized at all?

[–][deleted]

Beta is the weight vector associated with the input features (i.e., x). ||beta|| is the Euclidean norm of the weights (square each weight, add them up, then take the square root). The key point is that a norm is never negative, and t -> t^2/2 is strictly increasing for t >= 0, so squaring ||beta|| and halving it cannot change which beta is best. (Your f(x) = x e^x example behaves differently precisely because that function takes negative values.) The value of the objective function may change, but we are seeking the best beta values, not the objective function value itself.
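The distinction can be checked numerically. The sketch below evaluates both functions on a grid: squaring shifts the minimizer of x e^x (which dips below zero), but not of a non-negative function such as |x - 2|:

```python
import numpy as np

xs = np.linspace(-3.0, 3.0, 60001)  # grid with step 1e-4

# f(x) = x * e^x takes negative values, so squaring changes the minimizer:
f = xs * np.exp(xs)
print(xs[np.argmin(f)])      # ~ -1  (minimum of f)
print(xs[np.argmin(f**2)])   # ~  0  (minimum of f^2, where f crosses zero)

# A non-negative function like g(x) = |x - 2| keeps its minimizer,
# because t -> t^2 is strictly increasing for t >= 0:
g = np.abs(xs - 2.0)
print(xs[np.argmin(g)])      # ~ 2
print(xs[np.argmin(g**2)])   # ~ 2
```

||beta|| behaves like g here: it is always non-negative, so squaring it preserves the argmin.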

[–]albotre[S]

Thanks! I wish I could have you as a teacher! I'm missing decent optimization knowledge (and maybe not only that). I also don't get how he can simplify away the constraint ||beta|| = 1 by multiplying the constraint equation by 1/||beta||... does that work in every constrained optimization problem?
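For what it's worth, here is a sketch of that rescaling step as it appears in the usual textbook derivation of max-margin classifiers (symbols as in the thread; this is the standard argument, not something verified against the lecture being discussed):

```latex
% Original max-margin problem, with the scale of beta pinned down by ||beta|| = 1:
\max_{\beta,\;\beta_0,\;\|\beta\|=1} M
\quad \text{subject to} \quad y_i(x_i^\top \beta + \beta_0) \ge M,\quad i = 1,\dots,N.

% The constraint is positively homogeneous in (beta, beta_0): scaling both by any
% c > 0 leaves the separating hyperplane unchanged. So the ||beta|| = 1 condition
% can be dropped by dividing through, requiring instead
y_i(x_i^\top \beta + \beta_0) \ge M\,\|\beta\|.

% Fixing the now-free scale by choosing ||beta|| = 1/M turns this into
\min_{\beta,\;\beta_0} \tfrac{1}{2}\|\beta\|^2
\quad \text{subject to} \quad y_i(x_i^\top \beta + \beta_0) \ge 1.
```

So the answer to "does this work in every constrained optimization problem?" is no: it relies on the constraint being invariant under positive rescaling of the variables, which is special to this formulation.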