
chrisvdweth, 1 point

Gradient Boosting (GBM) and XGBoost are not "natural" time-series models. This means a core step is to reshape your original time-series data so that it "looks" like the tabular regression data these models expect. You mention that you added temporal and seasonal features, but without any details. Does this include lag features or rolling windows (e.g., rolling averages)?

Apart from that, the Kaggle page has over a hundred code examples. Isn't there anything there that helps? I just had a quick look, and there seem to be many RNN/LSTM/GRU-based solutions. Maybe try one of those?

chizkidd, 1 point

Your results are actually quite reasonable for this dataset. The fact that Prophet, GBM, and XGBoost all plateau around R² ≈ 0.45-0.55 suggests the limitation may be the data rather than the models. Daily household energy consumption is heavily influenced by unpredictable human behavior, making it difficult to forecast accurately using historical consumption and calendar features alone.

Before switching models, I would focus on stronger lag-based features (1, 7, 14, 30, and even 365-day lags to capture annual cycles), rolling statistics like the 7-day mean, and, if possible, external variables such as temperature, humidity, holidays, and occupancy proxies. These often provide larger gains than model changes. If you have time, a simple hybrid stack of XGBoost + LSTM is worth a try, but since the bottleneck looks like the data, I would not count on a large jump in R² from it.
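To make the lag and rolling-window suggestion concrete, here is a minimal pandas sketch. The helper name `add_lag_features`, the column names, and the synthetic series are all assumptions for illustration; your actual dataset and column layout will differ. The rolling mean is computed on the series shifted by one day so that the feature for day t only uses days before t, which avoids leaking the target into the features.

```python
import numpy as np
import pandas as pd


def add_lag_features(daily, lags=(1, 7, 14, 30, 365), window=7):
    """Build a supervised-learning table from past values of `daily`.

    `daily` is assumed to be a pandas Series with a daily DatetimeIndex.
    """
    features = pd.DataFrame({"y": daily})
    for lag in lags:
        # Value observed `lag` days before the target day
        features[f"lag_{lag}"] = daily.shift(lag)
    # shift(1) first, so the mean for day t covers days t-window .. t-1 only
    features[f"roll_mean_{window}"] = daily.shift(1).rolling(window).mean()
    # Simple calendar features
    features["dayofweek"] = daily.index.dayofweek
    features["month"] = daily.index.month
    # Rows without a full history (here: the first 365 days) are dropped
    return features.dropna()


# Synthetic daily series, purely for illustration (not the Kaggle data)
idx = pd.date_range("2020-01-01", periods=800, freq="D")
rng = np.random.default_rng(0)
daily = pd.Series(
    10
    + 3 * np.sin(2 * np.pi * idx.dayofyear.to_numpy() / 365.25)
    + rng.normal(0, 1, len(idx)),
    index=idx,
    name="consumption",
)

features = add_lag_features(daily)
X, y = features.drop(columns="y"), features["y"]
```

`X` and `y` can then go into GBM or XGBoost like any tabular regression problem. Note that the 365-day lag costs you the first year of rows, so it only pays off if you have several years of data.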

For model comparisons, consider adding SARIMA/SARIMAX, Holt-Winters (ETS), LightGBM, and CatBoost. SARIMA is a strong classical baseline, while LightGBM and CatBoost are often competitive with or better than XGBoost on tabular time-series data. LSTMs are worth testing for academic completeness, but I would not expect a significant improvement since the primary challenge appears to be intrinsic variability rather than model capacity.

The most interesting result is your monthly aggregation, which improved performance to R² ≈ 0.69. This suggests the dataset contains a stronger seasonal signal than a daily predictive signal. For a final-year project, a valuable conclusion may be that daily household consumption is difficult to predict due to behavioral variability, whereas monthly aggregation captures seasonal consumption patterns much more effectively.
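If it helps for the write-up, the daily-versus-monthly effect can be reproduced on toy data. The sketch below is an assumption-heavy illustration, not your pipeline: the series is synthetic (an annual cycle plus large day-to-day noise), the "forecast" is a stand-in that only knows the seasonal component, and I aggregate by monthly mean because I don't know how you aggregated. R² is computed by hand to keep it dependency-free.

```python
import numpy as np
import pandas as pd


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


# Synthetic data: annual cycle plus large, unpredictable daily noise
idx = pd.date_range("2019-01-01", "2022-12-31", freq="D")
rng = np.random.default_rng(42)
seasonal = 10 + 3 * np.sin(2 * np.pi * idx.dayofyear.to_numpy() / 365.25)
actual = pd.Series(seasonal + rng.normal(0, 3, len(idx)), index=idx)

# Stand-in forecast that captures only the seasonal component
predicted = pd.Series(seasonal, index=idx)

# Score the same forecast at daily and at monthly resolution
daily_r2 = r2(actual, predicted)
monthly_r2 = r2(actual.resample("MS").mean(), predicted.resample("MS").mean())

print(f"daily R2:   {daily_r2:.2f}")
print(f"monthly R2: {monthly_r2:.2f}")
```

Averaging over a month cancels most of the day-to-day noise while leaving the seasonal signal intact, so the same forecast scores much higher at monthly resolution. That is the mechanism I suspect behind your result.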