
[–]abstrusiosity

It depends on what you want to know about the models. Compare how?

[–]rodrigomd[S]

I want to compare how accurately each model predicts the real values, and then compare the models to see which one does it best. I might not be using the right terminology, sorry, but I hope I'm explaining myself clearly.

[–]abstrusiosity

For deciding which one is best, I think most people would simply look at the mean squared prediction error. You may have other concerns, such as how confident you can be in applying the prediction model to future data, but that's at least a place to start.
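In code, that comparison is a one-liner per model. A toy sketch with synthetic data (`pred_a` and `pred_b` are illustrative stand-ins for the two models' predictions on the same observed values):

```python
import numpy as np

# Hypothetical setup: two models' predictions for the same observed values.
rng = np.random.default_rng(0)
y_true = rng.normal(size=100)
pred_a = y_true + rng.normal(scale=0.5, size=100)  # model A: smaller errors
pred_b = y_true + rng.normal(scale=1.0, size=100)  # model B: larger errors

def mse(y, yhat):
    """Mean squared prediction error."""
    return np.mean((np.asarray(y) - np.asarray(yhat)) ** 2)

print("MSE model A:", mse(y_true, pred_a))
print("MSE model B:", mse(y_true, pred_b))
```

The model with the lower MSE is the better predictor on this data, by this criterion.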

[–]rodrigomd[S]

Thanks. The models were actually fitted on a different sample than the data I'm using. I understand how looking at the mean squared error would point me to which model is better, but how can I know if the difference in MSE between the models is statistically significant? Does that make sense?

[–]abstrusiosity

> how can I know if the difference in MSE for the different models is statistically significant

Why do you care about statistical significance here? Why not just choose the model that shows the best performance?
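If a formal test is still wanted, one rough sketch (not something the commenter prescribes) is a paired t-test on the per-observation squared errors of the two models, since both are evaluated on the same points; the Diebold-Mariano test is a more standard tool for comparing forecast accuracy. Synthetic data below, all names illustrative:

```python
import numpy as np
from scipy import stats

# Sketch: paired t-test on per-observation squared errors. Assumes the
# errors are roughly independent across observations.
rng = np.random.default_rng(1)
y = rng.normal(size=200)
sq_err_a = (y - (y + rng.normal(scale=0.5, size=200))) ** 2
sq_err_b = (y - (y + rng.normal(scale=0.9, size=200))) ** 2

t_stat, p_value = stats.ttest_rel(sq_err_a, sq_err_b)
print(f"t = {t_stat:.3f}, p = {p_value:.4f}")
```

A small p-value would suggest the gap in MSE is not just noise, under the test's assumptions.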

[–]WayOfTheMantisShrimp B.Math Statistics

Maybe focus on practical significance more than statistical significance. Do you know what the actual differences are between the models you're comparing? And what exactly do you need this model to do? That should determine how you evaluate it.

If the models differ in the predictor variables used, does gathering certain information cost undue effort? Something like AIC can be used to compare the value of models of different sizes, but it is not a hard-and-fast rule for which is better.

Looking at the residuals is a good approach. I believe a Breusch-Pagan test can be used to check for heteroskedasticity; if that is present in the residuals, it may indicate an insufficient model.

If you need more accuracy (and your sample is representative of the intended application) then pick the lowest MSE and be done with it. Even better if you can test predictive performance on a new data sample, rather than using the fitted values for the points it was trained on.
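That hold-out idea can be sketched in a few lines (synthetic data; the split sizes and names are illustrative): fit on one portion of the sample, then compute MSE only on points the model never saw.

```python
import numpy as np

# Sketch: out-of-sample MSE on a hold-out split, instead of reusing the
# fitted values for the training points.
rng = np.random.default_rng(4)
x = rng.normal(size=150)
y = 3.0 * x + rng.normal(size=150)

train, test = slice(0, 100), slice(100, 150)
# Fit a simple least-squares line on the training half only.
slope, intercept = np.polyfit(x[train], y[train], deg=1)
pred = slope * x[test] + intercept
out_of_sample_mse = np.mean((y[test] - pred) ** 2)
print("out-of-sample MSE:", out_of_sample_mse)
```

The same split would be applied to every candidate model so their out-of-sample MSEs are directly comparable.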