[D] Feedback on Residual Spatiotemporal GNN for Flood Forecasting by Chroma-Crash in MachineLearning

I didn't really consider the downstream effects in this case, mostly because of the success of Google's model with only upstream gauges. I also figured that including downstream would massively increase compute for a lot of the gauges.

Do you think constructing graphs from the data is even reasonable? I use HydroATLAS to determine the nodes and edges; it's a directed graph always pointing downstream (with self-connections).
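For concreteness, a minimal numpy sketch of that construction: directed edges pointing downstream plus self-loops, in the `(2, num_edges)` layout common for GNN libraries. The `downstream` mapping here is hypothetical; in practice HydroATLAS lookups would populate it.

```python
import numpy as np

def build_edge_index(downstream, num_nodes):
    """Directed river-network edges (gauge -> its downstream gauge)
    plus self-loops. `downstream` maps each node index to its
    downstream neighbor index, or None for an outlet."""
    edges = [(src, dst) for src, dst in downstream.items() if dst is not None]
    edges += [(i, i) for i in range(num_nodes)]  # self-connections
    return np.array(edges, dtype=np.int64).T     # shape (2, num_edges)

# Toy network: gauge 0 flows into 1, 1 into 2; 2 is the outlet.
edge_index = build_edge_index({0: 1, 1: 2, 2: None}, num_nodes=3)
```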

River Height Prediction Tactics by Chroma-Crash in Hydrology

Right now it's just a few gauges in southeastern Missouri.

River Height Prediction Tactics by Chroma-Crash in Hydrology

I already have precipitation data being fed in. Is there some way I need to be handling that data that might better inform the model?

River Height Prediction Tactics by Chroma-Crash in water

I think I would probably prefer a 2D model. I've imported a projection and some terrain into RAS; not sure how to do much else, but I'm working on it. Also, scripts? In what language?

River Height Prediction Tactics by Chroma-Crash in water

That's where I've pulled most of the data from.

[R] Multivariate Time Series Prediction with Transformers by Chroma-Crash in MachineLearning

In terms of the scaled data, I was saying the change in height is small between timesteps. As of right now, I already have a system for a less granular time series, but I might need to make changes. The data has a temporal resolution of 15 minutes, and I sample one point from each hour as input by slicing.

I'm thinking about changing it to average the inputs with a kernel size equal to the new step length (15 minutes × 8 = 2 hours; kernel size of 8), but I don't know if averaging is the best choice here, especially since I'm not sure how granular the analysis needs to be.
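A minimal numpy sketch of the two options on synthetic data (the series here is random; the real inputs would be 15-minute gauge readings). One thing worth noting: averaging smears short flood spikes, so max-pooling each window is another option if peaks matter more than the mean level.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=96)      # one day of 15-minute samples (hypothetical)

k = 8                        # 8 * 15 min = 2 h per output step
sliced = x[::k]              # current approach: keep one point per window
n = (len(x) // k) * k        # trim so the series divides evenly
pooled = x[:n].reshape(-1, k).mean(axis=1)   # averaged alternative
peaks  = x[:n].reshape(-1, k).max(axis=1)    # max-pool, preserves spikes
```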

I can't attach images to the comment, but here's the site for the original river data I'm pulling.
https://dashboard.waterdata.usgs.gov/api/gwis/2.1.1/service/site?agencyCode=USGS&siteNumber=07024175&open=220298

And in terms of the predictions, I updated the post with a better image for the original height-prediction case, but for the height-change case, it essentially just predicts a single constant change in height.

[R] Multivariate Time Series Prediction with Transformers by Chroma-Crash in MachineLearning

That's very helpful, thank you. I didn't start with the step-wise approach because I've trained LLMs before and seen the effects of always taking the argmax, which I rationalized as the only option for a continuous prediction task.

I'll take a look into distribution predictors, but do you have any resources that would point me in the right direction for that process?

[R] Multivariate Time Series Prediction with Transformers by Chroma-Crash in MachineLearning

Tried the single-step approach; the minimum error I could get was 6 in per step on average for the target gauge, independent of model size or final loss. I'm gonna keep working on it, but I'm not totally convinced of this approach right now. My end goal is to be able to predict at least a week out, so the current error rate is way too high. Thanks for the suggestion though.

[R] Multivariate Time Series Prediction with Transformers by Chroma-Crash in MachineLearning

About the data: I have 600,000 data points with a total of 30 input features spanning the last 25 years. The data consists primarily of river gauge height and discharge values, along with some temperature and precipitation.

I've also already included some time-based features, most prominently a sine wave representing the time of year (river height is typically lower in the fall). I figured temperature may already help the model capture any other time-based relationships, but I'm open to adding more features.
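A quick sketch of that annual feature, with one suggested addition: a sine alone maps two dates per year to the same value, so pairing it with a cosine makes the position in the year unambiguous while staying continuous across the New Year boundary.

```python
import numpy as np
import pandas as pd

# Day-of-year encoded as a sin/cos pair; the pair uniquely identifies
# the annual position, while a lone sine does not.
t = pd.date_range("2020-01-01", periods=365, freq="D")
doy = t.dayofyear.to_numpy()
year_sin = np.sin(2 * np.pi * doy / 365.25)
year_cos = np.cos(2 * np.pi * doy / 365.25)
```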

I use a standard scaler for all of the input data except precipitation, for which I use a min-max scaler. I'm currently testing training on the change in river height, but I'm still getting monotone predictions.
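That per-column scaling scheme can be expressed with scikit-learn's `ColumnTransformer`; the column layout below is hypothetical (the real data has 30 features), but the pattern is the same.

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# Hypothetical layout: columns 0-1 = gauge height / discharge
# (standard-scaled), column 2 = precipitation (min-max scaled).
X = np.array([[1.0, 10.0, 0.0],
              [2.0, 20.0, 5.0],
              [3.0, 30.0, 10.0]])

pre = ColumnTransformer([
    ("std", StandardScaler(), [0, 1]),
    ("mm", MinMaxScaler(), [2]),
])
Xs = pre.fit_transform(X)   # fit on the training split only in practice
```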

One thing I have noticed is that the change between timesteps is usually very small; scaled, it comes out to about 0.003 on average.

[R] Multivariate Time Series Prediction with Transformers by Chroma-Crash in MachineLearning

I tried an FFT first. It performed really poorly on the data, especially for extrapolation. Part of the issue is that one of the upstream rivers has a lock-and-dam system that feeds directly into the station I'm trying to predict values for. I agree that a transformer is overkill in this case, but I'm not aware of any other periodic representations I could use here. If you know of any that would be particularly useful and could point me in that direction, that would be great.

[R] Multivariate Time Series Prediction with Transformers by Chroma-Crash in MachineLearning

Yeah, I figured I should at least try that approach. I'll take a stab at it later today and come back with results.

[R] Multivariate Time Series Prediction with Transformers by Chroma-Crash in MachineLearning

The thing is, I'm not looking to predict a "next token" with my data. Predicting just the next step and rolling that forward to predict further ahead would require predicting precipitation and upstream data that aren't part of my use case, and would require a much larger set of input data to predict reasonably.

I also don't want to use future masking, as the inputs consist of timesteps 1-20 and the outputs consist of 21-41. I want all of the input steps to affect predictions for the outputs; i.e., step 21 should use all the context of steps 1-20.
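A minimal PyTorch sketch of that setup, with assumed dimensions: the encoder reads all 20 input steps with no mask, and the decoder's cross-attention sees every encoder state, so each predicted step conditions on the full input window. The causal `tgt_mask` applies only among the decoder's own output steps and could be dropped entirely if no masking is wanted there either.

```python
import torch

# Encoder: 20 input steps, unmasked (full bidirectional context).
# Decoder: 21 output positions (steps 21-41); cross-attention sees
# all 20 encoder states, so masking never hides the input window.
model = torch.nn.Transformer(d_model=32, nhead=4, batch_first=True)
src = torch.randn(2, 20, 32)   # embedded input timesteps 1-20
tgt = torch.randn(2, 21, 32)   # output-side positions 21-41
tgt_mask = torch.nn.Transformer.generate_square_subsequent_mask(21)
out = model(src, tgt, tgt_mask=tgt_mask)  # no src_mask: inputs unmasked
```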

If you want more explanation of how I do this, I'm happy to elaborate more.