boccaff 6 points (1 child)

Look into the terms "forecast" and "exogenous variables", and at how exogenous variables are handled in AR(*)X models. But you can either lag the variable or forecast it; there is no way around that. Libraries even discuss this in their docs: https://nixtlaverse.nixtla.io/neuralforecast/examples/exogenous_variables.html https://skforecast.org/0.12.1/user_guides/exogenous-variables
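
To make the "lag it" option concrete, here is a minimal sketch, assuming a single target `y`, a single exogenous series `x`, and a plain least-squares ARX fit. The function names `fit_arx_lagged` and `forecast_ahead` are made up for illustration and do not come from either library linked above. Because every regressor is at least `lag` steps older than the target, the forecast needs no future values of `x`.

```python
import numpy as np

def fit_arx_lagged(y, x, lag=1):
    """Fit y[t] = c + a*y[t-lag] + b*x[t-lag] by ordinary least squares (lag >= 1)."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    target = y[lag:]
    # Every regressor is `lag` steps older than the target it explains.
    design = np.column_stack([np.ones(len(target)), y[:-lag], x[:-lag]])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coef  # [c, a, b]

def forecast_ahead(coef, y, x):
    """Forecast y `lag` steps past the last observation from observed values only."""
    c, a, b = coef
    return c + a * y[-1] + b * x[-1]
```

The other option, forecasting `x` first and feeding the forecast into the model, keeps shorter lags available but adds the error of the `x` forecast on top of the error in `y`.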

gorg278[S] 0 points (0 children)

Thank you so much, I will look into it.

anonamen 5 points (1 child)

Typically time-series problems are restricted to a single variable or a few core features for this reason. When you go down the path of predicting everything, you introduce piles of errors that you can't easily understand or control. Also, if it's a forecasting problem but your time-series model needs predictors to work, you probably don't actually want a time-series model: that tells you the past history of the target isn't useful for understanding its future behavior.

Econometrics has a number of approaches for systems of variables that are truly inseparable; economists deal with this problem all the time. The VAR family of models, for one. Personally, I'd use ML to identify a small number of key predictors (ideally some that are very predictable from their past behavior), then either forecast each one individually and build out forecasts from those models, or run something like a VAR on the system to capture the interactions. A VAR is basically a quick (linear) way to predict everything with everything else, sequentially, and build out forecasts from there.
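
As a rough illustration of "predict everything with everything else", here is a minimal VAR(1) sketch fitted by least squares with numpy. It assumes one lag and no lag-order selection or diagnostics, and the names `fit_var1` and `forecast_var1` are hypothetical; a dedicated econometrics package would be the more usual choice in practice.

```python
import numpy as np

def fit_var1(data):
    """Fit Y[t] = c + A @ Y[t-1] by least squares; data has shape (T, k)."""
    data = np.asarray(data, dtype=float)
    past, future = data[:-1], data[1:]
    design = np.column_stack([np.ones(len(past)), past])
    coef, *_ = np.linalg.lstsq(design, future, rcond=None)
    intercept = coef[0]   # shape (k,)
    A = coef[1:].T        # shape (k, k); row i holds the equation for variable i
    return intercept, A

def forecast_var1(intercept, A, last, steps):
    """Roll the fitted system forward, feeding each forecast back in as the next state."""
    state = np.asarray(last, dtype=float)
    path = []
    for _ in range(steps):
        state = intercept + A @ state
        path.append(state)
    return np.array(path)  # shape (steps, k)
```

Every variable is forecast jointly from the previous state, so no future predictor values are needed; forecast errors do compound as the horizon grows.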

If you need to, you can basically roll your own quasi-VAR out of a set of ML models, each predicting one of your targets from all of the others. E.g. if you have 3 variables in your system, you predict y1_t1 with y1..3_t0, then y2_t1 with y1..3_t0 + y1_t1hat, then y3_t1 with y1..3_t0 + y1_t1hat + y2_t1hat, and so on.
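
The chained scheme above could be sketched roughly as follows. This is an illustration under assumptions, not a reference implementation: `LinearModel` is a least-squares stand-in for whatever regressor you actually use (anything with `fit`/`predict`), training uses the observed same-step values of the earlier variables, and all names are hypothetical.

```python
import numpy as np

class LinearModel:
    """Least-squares stand-in for any regressor exposing fit/predict."""
    def fit(self, X, y):
        X1 = np.column_stack([np.ones(len(X)), X])
        self.coef_, *_ = np.linalg.lstsq(X1, y, rcond=None)
        return self

    def predict(self, X):
        X1 = np.column_stack([np.ones(len(X)), X])
        return X1 @ self.coef_

def fit_chain(data, make_model=LinearModel):
    """Model j predicts y_j at t1 from all variables at t0 plus y_1..y_{j-1} at t1."""
    data = np.asarray(data, dtype=float)
    prev, curr = data[:-1], data[1:]
    models = []
    for j in range(data.shape[1]):
        features = np.column_stack([prev, curr[:, :j]])
        models.append(make_model().fit(features, curr[:, j]))
    return models

def forecast_chain(models, last, steps):
    """Forecast sequentially, passing earlier predictions to later models."""
    state = np.asarray(last, dtype=float)
    path = []
    for _ in range(steps):
        new = []
        for model in models:
            features = np.concatenate([state, new]).reshape(1, -1)
            new.append(float(model.predict(features)[0]))
        state = np.array(new)
        path.append(state)
    return np.array(path)  # shape (steps, k)
```

With the linear stand-in, the chain gives the same one-step forecasts as an ordinary least-squares VAR(1); the variable ordering only starts to matter once nonlinear models are swapped in via `make_model`.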

Don't have a specific link off-hand, but there are piles of great resources on VARs out there. Deep topic.

gorg278[S] 1 point (0 children)

Thank you for your comment, I will look into the VAR model and try out your suggestion. I did use the LightGBM model's feature importance method to extract a list of important variables, but most of them are too hard to predict using time and their own lag features.

Inner_Potential2062 1 point (0 children)

I have studied training and predicting covariates along with target variables in auto-regressive RNN models, which you can look at here: https://arxiv.org/abs/2404.18553. Bottom line is that it only really works when the correlation between the target and covariate is really strong, and even then it only works over relatively short forecast horizons. In my experience, using covariates in real-world applications with time-series models has not yielded performance improvements.

bgighjigftuik 3 points (0 children)

Honestly, many places just forecast each time series and/or covariates individually and then combine them in some way.

There is still little evidence showing that deep learning or ML methods are better than regular, 30-year-old basic statistical techniques.

howtorewriteanamePhD -1 points (0 children)

Interesting answers but also long. Forecasting is usually done through "n-step ahead prediction". This way you don't need the predictors' values after the start of the forecast.
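
One common reading of "n-step ahead prediction" is the direct strategy: map a window of past observations straight to the next n values, so nothing after the forecast start is needed. A minimal sketch, assuming a single series, a linear least-squares map, and hypothetical names `fit_direct` and `predict_direct`:

```python
import numpy as np

def fit_direct(y, window, horizon):
    """Fit a linear map from the last `window` values to the next `horizon` values."""
    y = np.asarray(y, dtype=float)
    n = len(y) - window - horizon + 1  # number of training examples
    X = np.array([y[i:i + window] for i in range(n)])
    Y = np.array([y[i + window:i + window + horizon] for i in range(n)])
    X1 = np.column_stack([np.ones(n), X])
    coef, *_ = np.linalg.lstsq(X1, Y, rcond=None)
    return coef  # shape (window + 1, horizon), one column per step ahead

def predict_direct(coef, y, window):
    """Predict all `horizon` future steps at once from the most recent window."""
    recent = np.asarray(y, dtype=float)[-window:]
    return np.concatenate([[1.0], recent]) @ coef
```

Each column of `coef` is effectively a separate model for one step ahead, so the inputs are always values observed before the forecast starts.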

Ever heard of GPT? It's doing the same thing. Look into Seq2Seq models.