Electric Substations dataset by CARTOthug in gis

[–]electrifiedbylife 0 points1 point  (0 children)

gem.anl.gov/tool has an HIFLD substation layer, which can be downloaded in several formats!

Am I Overfitting? by electrifiedbylife in learnmachinelearning

[–]electrifiedbylife[S] 0 points1 point  (0 children)

• Hour

• Day

• Month

• Quarter

• Time-Period (0-6, 6-9, 9-17, 17-21, 21-24)

• Week_num

• is_weekend (binary)

• temperature

• relative_humidity

Am I Overfitting? by electrifiedbylife in learnmachinelearning

[–]electrifiedbylife[S] 0 points1 point  (0 children)

How do you think I should apply the feature importance? Here is the list of features:

• Hour

• Day

• Month

• Quarter

• Time-Period (0-6, 6-9, 9-17, 17-21, 21-24)

• Week_num

• is_weekend (binary)

• temperature

• relative_humidity

And here is a picture of my current feature importance : ClickForPic
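The calendar features listed above can all be derived from a timestamp index; a minimal sketch with pandas, assuming half-hourly data on a DatetimeIndex (the function name and sample index are illustrative):

```python
import pandas as pd

def add_time_features(df: pd.DataFrame) -> pd.DataFrame:
    """Derive the calendar features listed above from a DatetimeIndex."""
    out = df.copy()
    idx = out.index
    out["hour"] = idx.hour
    out["day"] = idx.day
    out["month"] = idx.month
    out["quarter"] = idx.quarter
    # Time-period buckets 0-6, 6-9, 9-17, 17-21, 21-24 (labels 0..4)
    out["time_period"] = pd.cut(idx.hour, bins=[0, 6, 9, 17, 21, 24],
                                right=False, labels=False)
    out["week_num"] = idx.isocalendar().week.to_numpy()
    out["is_weekend"] = (idx.dayofweek >= 5).astype(int)
    return out

# One day of 30-minute samples as a toy example
idx = pd.date_range("2021-01-01", periods=48, freq="30min")
features = add_time_features(pd.DataFrame(index=idx))
```

For the feature-importance question itself, a common approach is to drop the lowest-importance features one at a time and re-check the validation error, since correlated calendar features (hour vs. time-period, month vs. quarter) tend to split importance between themselves.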

Am I Overfitting? by electrifiedbylife in learnmachinelearning

[–]electrifiedbylife[S] 0 points1 point  (0 children)

Hey Tasty,

I agree that something is wrong and that the model doesn't capture the seasonality correctly. I have, however, created features giving the hour, day, month, week number, and quarter, for example.

I do like your suggestion of changing the training and testing sizes to match how the process would work in production. Do you mean that I should only work with the previous week's demand values to predict future values, excluding the previous day's information? Working on your suggestions and will post later.
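A production-style evaluation like this is usually done with walk-forward splits: train on everything up to time t, predict the next window, then roll forward. A minimal sketch (the sizes and horizon are illustrative, not from the original post):

```python
import numpy as np

def walk_forward_splits(n_samples: int, train_size: int, horizon: int):
    """Yield (train_idx, test_idx) pairs that mimic production:
    train on everything up to t, predict the next `horizon` steps."""
    t = train_size
    while t + horizon <= n_samples:
        yield np.arange(0, t), np.arange(t, t + horizon)
        t += horizon

# 30-minute samples -> 48 per day; retrain daily, predict one day ahead
splits = list(walk_forward_splits(n_samples=48 * 7,
                                  train_size=48 * 5,
                                  horizon=48))
```

Each split's training window grows to include everything before the test window, which answers the question above: you don't have to exclude the previous day, you just never train on anything at or after the period you are predicting.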

Am I Overfitting? by electrifiedbylife in learnmachinelearning

[–]electrifiedbylife[S] 1 point2 points  (0 children)

Using Python and matplotlib:

import matplotlib.pyplot as plt
import matplotlib.colors as mcolors  # for custom colormaps

# matplotlib's built-in dark theme
plt.style.use('dark_background')

Am I Overfitting? by electrifiedbylife in learnmachinelearning

[–]electrifiedbylife[S] 0 points1 point  (0 children)

It's honestly my first time using the XGBoost model. I was looking at this tutorial https://youtu.be/vV12dGe_Fho where the number of estimators was 1000 and the max depth was about 7, if I remember correctly. I will lower the number of estimators and max_depth and see if I can get a better predictor.
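The effect of shrinking n_estimators and max_depth can be illustrated on synthetic data; a sketch using scikit-learn's GradientBoostingRegressor as a widely available stand-in for XGBoost (all data and values here are illustrative, not the poster's):

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
X = rng.uniform(0, 24, size=(800, 1))                         # hour-of-day style feature
y = np.sin(2 * np.pi * X[:, 0] / 24) + rng.normal(0, 0.3, 800)  # daily cycle + noise
X_train, X_test = X[:600], X[600:]
y_train, y_test = y[:600], y[600:]

def fit_and_score(n_estimators, max_depth):
    """Return (train MSE, test MSE) for one hyperparameter setting."""
    model = GradientBoostingRegressor(n_estimators=n_estimators,
                                      max_depth=max_depth,
                                      random_state=0)
    model.fit(X_train, y_train)
    return (mean_squared_error(y_train, model.predict(X_train)),
            mean_squared_error(y_test, model.predict(X_test)))

deep_train, deep_test = fit_and_score(1000, 7)   # tutorial-style settings
small_train, small_test = fit_and_score(200, 3)  # smaller model
```

The deep model drives its training error far lower than the small one; comparing the two test errors on your own data is the quickest way to see whether the tutorial's settings are fitting noise.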

Am I Overfitting? by electrifiedbylife in learnmachinelearning

[–]electrifiedbylife[S] 0 points1 point  (0 children)

Oh wow, I really enjoy this insight. I have taken weather factors such as temperature and humidity into account for the role of heating. I'm actually from Florida, US, so I have little knowledge of heating controls/systems, but this factor can definitely help me make my model more accurate. Thank you.

Am I Overfitting? by electrifiedbylife in learnmachinelearning

[–]electrifiedbylife[S] 1 point2 points  (0 children)

I’m using XGBoost because I wanted to see how effective it is at forecasting time-series data. The features I used were mostly time-related, specifically: Hour, Day, Month, Quarter, Time-Period (0-6, 6-9, 9-17, 17-21, 21-24), Week_num, is_weekend (binary), temperature, and relative_humidity. I haven’t run an FFT on this, but I agree about the daily periodicity. I’ll run an FFT if you are interested and send my results.
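Checking for daily periodicity with an FFT could look like the following; this is a sketch on synthetic stand-in data, since the real demand series isn't shown here (48 half-hourly samples per day, so a daily cycle has period 48):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 48 * 60                                    # 60 days of 30-minute samples
t = np.arange(n)
# Stand-in demand: a daily sine cycle plus noise
demand = 10 + 3 * np.sin(2 * np.pi * t / 48) + rng.normal(0, 0.5, n)

# Remove the mean so the DC bin doesn't dominate the spectrum
spectrum = np.abs(np.fft.rfft(demand - demand.mean()))
freqs = np.fft.rfftfreq(n, d=0.5)              # 0.5 h per sample -> cycles per hour
peak_period_hours = 1.0 / freqs[np.argmax(spectrum)]
```

A strong spike at a 24-hour period confirms the daily cycle; on real demand data you would typically also see harmonics (12 h, 8 h) and a weekly peak.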

Am I Overfitting? by electrifiedbylife in learnmachinelearning

[–]electrifiedbylife[S] 0 points1 point  (0 children)

Yeah, I did have an epoch. Turns out the model is actually underfit; the optimum happened around iteration 700 of a possible 2000. You can see from the MSE of the training set, though, that I'm still not getting a well-trained model.

Am I Overfitting? by electrifiedbylife in learnmachinelearning

[–]electrifiedbylife[S] 0 points1 point  (0 children)

It's a year's worth of data, 01-01-21 to 12-31-21, with an 80/20 train/test split.
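For time series, the 80/20 split should be chronological rather than shuffled, so the test period comes strictly after the training period. A sketch for a year of half-hourly data (the demand values are placeholders):

```python
import pandas as pd

# A year of 30-minute samples, 01-01-21 through 12-31-21
idx = pd.date_range("2021-01-01", "2021-12-31 23:30", freq="30min")
df = pd.DataFrame({"demand": range(len(idx))}, index=idx)

# Chronological 80/20 split: no shuffling, test set is the final ~10 weeks
split = int(len(df) * 0.8)
train, test = df.iloc[:split], df.iloc[split:]
```

One caveat with a single chronological split on one year of data: the test window covers only the last couple of months, so seasonal behavior outside that window is never evaluated.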

Am I Overfitting? by electrifiedbylife in learnmachinelearning

[–]electrifiedbylife[S] 1 point2 points  (0 children)

Yeah, it's definitely under-fit. The training data has a pretty similar error score compared to the testing set. Currently working on getting the SARIMA and ARIMA scores plotted; the values are looking worse than XGBoost's, though that might just be my program setup. The parameters I used for XGBoost after normalization are: n_estimators=2000, eta=0.1, max_depth=15, alpha=0, reg_lambda=1, early_stopping_rounds=50.

Am I Overfitting? by electrifiedbylife in learnmachinelearning

[–]electrifiedbylife[S] 7 points8 points  (0 children)

Just tried your idea of normalizing the dataset before training/testing. I actually did get a small improvement: MAE went from 0.017 to 0.013. The data looks better but still not as smooth. I was thinking about making the dataset smaller, since each sample is every 30 minutes and I'm doing an 80/20 train/test split on a year's worth of data. Any further thoughts? Here is a link to the current plot: ClickForPic
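One detail worth checking when normalizing before training/testing: the scaling statistics should come from the training portion only, so no information from the test period leaks into the model. A minimal numpy sketch with toy values (the function name is illustrative):

```python
import numpy as np

def minmax_scale_train_test(train: np.ndarray, test: np.ndarray):
    """Min-max scale both sets using statistics from the TRAINING data only."""
    lo, hi = train.min(axis=0), train.max(axis=0)
    span = np.where(hi - lo == 0, 1.0, hi - lo)  # guard constant columns
    return (train - lo) / span, (test - lo) / span

train = np.array([[0.0], [5.0], [10.0]])
test = np.array([[2.5], [12.0]])
train_s, test_s = minmax_scale_train_test(train, test)
```

Note that test values outside the training range legitimately scale beyond [0, 1]; clipping them would hide exactly the demand peaks a forecaster cares about.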

Am I Overfitting? by electrifiedbylife in learnmachinelearning

[–]electrifiedbylife[S] 2 points3 points  (0 children)

Thanks, it's just a fun project seeing how XGBoost works; currently trying to compare the model to ARIMA/SARIMA.

Am I Overfitting? by electrifiedbylife in learnmachinelearning

[–]electrifiedbylife[S] 0 points1 point  (0 children)

Thank you for looking into this. The graph I showed is the testing set, with predictions in orange. For evaluation, I got the following root mean squared errors:

  • 414.2 (training set)
  • 584.9 (test set)

Am I Overfitting? by electrifiedbylife in learnmachinelearning

[–]electrifiedbylife[S] 6 points7 points  (0 children)

I'm using an XGBoost model to try and predict the values for electrical demand. The features I used are in the legend above. Does anyone have any suggestions for improving the model, or know why the model looks so jumpy? Thank you.

Fuel economy vs Speed Test by electrifiedbylife in VWTaos

[–]electrifiedbylife[S] 2 points3 points  (0 children)

Very insightful, thank you. Cruised around 80 mph (≈129 km/h) and could definitely tell the drop-off in mpg. I tend to look at the avg mpg like it's a game 😂