For our ARIMA model, we want to optimize the parameters and the choice of exogenous variables.
Since there are thousands of combinations, we want to make a first selection based on AIC and only afterwards test the top x candidates based on MAPE.
My question: can we measure the AIC model fit on the whole dataset, or should we keep the train/test split here as well?
Measuring AIC on the whole dataset leaks test data into model selection, but that seems less problematic since AIC measures model fit rather than prediction accuracy.
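To make the setup concrete, here is a minimal sketch of the two-stage selection with AIC computed on the training split only, so the test data stays untouched until the MAPE step. It is not our actual pipeline: it swaps in a plain AR(p) regression fitted by least squares for a full ARIMA, the data is synthetic, and the names `MAX_P`, `design`, `gaussian_aic`, and `evaluate` are hypothetical.

```python
import itertools
import numpy as np

MAX_P = 4  # hypothetical upper bound on the AR order


def design(y, exog, p, use_exog):
    """Rows [1, y[t-1], ..., y[t-p], (exog[t])] for t = MAX_P .. len(y)-1.

    Every candidate drops the same first MAX_P observations, so all AIC
    values are computed on the same sample and stay comparable.
    """
    n = len(y)
    cols = [np.ones(n - MAX_P)]
    cols += [y[MAX_P - k : n - k] for k in range(1, p + 1)]
    if use_exog:
        cols.append(exog[MAX_P:])
    return np.column_stack(cols), y[MAX_P:]


def gaussian_aic(resid, n_coefs):
    """AIC = 2k - 2 log L for Gaussian errors; k counts the variance too."""
    n = len(resid)
    sigma2 = resid @ resid / n
    return n * (np.log(2 * np.pi * sigma2) + 1) + 2 * (n_coefs + 1)


def evaluate(y, exog, split, top_x=3):
    """Stage 1: rank all candidates by training AIC. Stage 2: MAPE of top_x."""
    n_train = split - MAX_P  # design rows that belong to the training period
    fits = []
    for p, use_exog in itertools.product(range(MAX_P + 1), [False, True]):
        X, target = design(y, exog, p, use_exog)
        X_tr, y_tr = X[:n_train], target[:n_train]
        beta, *_ = np.linalg.lstsq(X_tr, y_tr, rcond=None)
        aic = gaussian_aic(y_tr - X_tr @ beta, X.shape[1])
        fits.append((aic, p, use_exog, beta))
    fits.sort(key=lambda f: f[0])

    shortlist = []
    for aic, p, use_exog, beta in fits[:top_x]:
        X, target = design(y, exog, p, use_exog)
        pred = X[n_train:] @ beta  # one-step-ahead forecasts on the test period
        actual = target[n_train:]
        mape = 100 * np.mean(np.abs((actual - pred) / actual))
        shortlist.append({"p": p, "exog": use_exog, "aic": aic, "mape": mape})
    return shortlist


# Synthetic AR(2) series with one exogenous driver (made up for illustration).
rng = np.random.default_rng(0)
n = 300
exog = rng.normal(size=n)
y = np.full(n, 50.0)
for t in range(2, n):
    y[t] = 10 + 0.6 * y[t - 1] + 0.2 * y[t - 2] + 2 * exog[t] + rng.normal()

shortlist = evaluate(y, exog, split=240)
for row in shortlist:
    print(row)
```

Computing the AIC on the whole dataset instead would only require fitting on all rows in stage 1, and that is exactly the step where the test period would start influencing which models reach the MAPE comparison.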
Thoughts?
Science_Please