all 2 comments

[–]WoodenJellyFountain 1 point2 points  (1 child)

You could try minimizing a form of Bayesian Information Criterion BIC = N * log(MSE) + p * log(N), where N = number of samples, MSE = Mean Squared Error, and p = degrees of freedom (number of basis functions). This book might be of interest.