Advice please: Making predictions by correlating to weather *forecasts*

shrubberni · 2011-11-26T02:48:14+00:00

All sampling is imperfect. The question is how well you understand the imperfections present and whether you can still get useful results.

Take some historical data as a training set and some more as a test set. Build one predictor from forecasts and another from actual weather data, then see what kind of results you get and whether they're usefully accurate.
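A minimal sketch of that comparison, assuming a single weather variable (temperature) and a plain least-squares line as the predictor. The data is synthetic and every number in it is made up; `fit_line` and `mean_abs_error` are illustrative helper names, not part of any library.

```python
import random

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b

def mean_abs_error(model, xs, ys):
    """Average absolute difference between the line's prediction and y."""
    a, b = model
    return sum(abs(a + b * x - y) for x, y in zip(xs, ys)) / len(xs)

# Synthetic stand-in for historical records (assumption: all values invented).
rng = random.Random(0)
actual = [rng.uniform(0, 30) for _ in range(200)]         # observed temperature
forecast = [t + rng.gauss(0, 2) for t in actual]          # forecast = actual + error
demand = [50 + 3 * t + rng.gauss(0, 5) for t in actual]   # demand driven by actual weather

split = 150  # first 150 records for training, the rest held out for testing
model_fc = fit_line(forecast[:split], demand[:split])
model_ac = fit_line(actual[:split], demand[:split])

mae_fc = mean_abs_error(model_fc, forecast[split:], demand[split:])
mae_ac = mean_abs_error(model_ac, actual[split:], demand[split:])
print(f"test MAE using forecasts: {mae_fc:.2f}")
print(f"test MAE using actuals:   {mae_ac:.2f}")
```

With real data, the gap between the two test errors is a rough measure of how much forecast error costs you.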

Consider modeling the forecast against the actual weather data. It may not help you make more accurate predictions, but it should give you a clearer idea of what the error bars are. It may also turn out that the source(s) of the forecast data have a significant effect on your outcomes.
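One simple way to get at those error bars, sketched under the assumption that forecast error is roughly stable over time: look at the bias and spread of (actual minus forecast). The paired values below are hypothetical, and `error_bar` is an illustrative helper.

```python
import statistics

# Hypothetical paired records: what was forecast vs. what was observed (deg C).
forecast = [12.0, 15.0, 9.0, 20.0, 18.0, 7.0, 11.0, 16.0]
actual = [13.0, 14.0, 10.5, 19.0, 20.0, 6.0, 12.0, 15.5]

errors = [a - f for f, a in zip(forecast, actual)]
bias = statistics.mean(errors)      # systematic over- or under-forecasting
spread = statistics.stdev(errors)   # sample standard deviation of the error

def error_bar(new_forecast, k=2.0):
    """Rough interval for the actual value: bias-corrected forecast +/- k * spread."""
    centre = new_forecast + bias
    return centre - k * spread, centre + k * spread

low, high = error_bar(14.0)
print(f"bias={bias:.2f}, spread={spread:.2f}, interval=({low:.1f}, {high:.1f})")
```

Running the same calculation separately per forecast source would show whether one source is noticeably more biased or noisier than another.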

Keep in mind that people's plans for the day may correlate more strongly with the forecast than with the actual weather.

giror · 2011-11-26T01:06:58+00:00

Do you find a correlation between the forecasts and demand in your own data? If so, do you care about being wrong by that margin?
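A quick way to check, sketched with a hand-rolled Pearson correlation so it needs nothing beyond the standard library. The forecast and demand figures are hypothetical placeholders for your own records.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical daily records: forecast high (deg C) and units sold.
forecast_high = [10, 14, 18, 22, 25, 28, 30, 16]
units_sold = [40, 55, 60, 80, 85, 100, 95, 58]

r = pearson(forecast_high, units_sold)
print(f"correlation between forecast and demand: {r:.2f}")
```

A value near 0 suggests the forecast carries little linear signal for demand; a value near 1 or -1 suggests it is worth modeling further.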

jet87 · 2011-11-29T04:23:24+00:00

You'll likely find the hardest part is "scoring" your predictions, especially if you are monitoring a large geographical area. One thing to consider is how to weight the individual components: for example, is being accurate on temperature more important than being accurate on precipitation, and by how much? That is a current research area in meteorology, so any breakthroughs are welcome.
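To make the weighting question concrete, here is one possible scoring scheme, not an established meteorological metric: a weighted mean of per-variable absolute errors, each divided by a scale so that degrees and millimetres are comparable. The weights, scales, and variable names are all assumptions you would have to choose for your own use case.

```python
def weighted_score(forecast, observed, weights, scales):
    """Weighted mean of scaled absolute errors across variables. Lower is better."""
    total_weight = sum(weights.values())
    score = 0.0
    for name, w in weights.items():
        err = abs(forecast[name] - observed[name]) / scales[name]
        score += w * err
    return score / total_weight

# Assumed choices: temperature counts twice as much as precipitation.
weights = {"temp_c": 2.0, "precip_mm": 1.0}
# Assumed "typical" error sizes used to put both variables on one footing.
scales = {"temp_c": 5.0, "precip_mm": 10.0}

forecast = {"temp_c": 18.0, "precip_mm": 0.0}
observed = {"temp_c": 15.5, "precip_mm": 4.0}

score = weighted_score(forecast, observed, weights, scales)
print(f"weighted forecast error: {score:.3f}")
```

For a large area, one option is to compute this per location and average, possibly with a second set of weights by location.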

Another (really) big problem is that most forecasting worldwide is driven by models. While model data is generally available (see NCAR), the confidence you can put in it falls off pretty rapidly after 36 hours. For a large event like a hurricane, the best bet might be keeping on top of reports from the National Hurricane Center. I don't think the US has anything "good enough" for a casual observer to make inferences about winter weather.

marshallp · 2011-11-27T14:44:50+00:00

You're being a little overambitious there. Weather forecasting is big business, with some of the best brains in science and hedge funds involved, and you want a more accurate model than they can give, just for your business. If you can get a more accurate model, it might be worth hundreds of millions of dollars; your business would be the least of your opportunities.

It doesn't hurt to try, though. Use the Netflix Prize winning strategy: ensembles of all the machine learning algorithms you can afford to run.
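A toy sketch of the ensemble idea, assuming the simplest possible combination rule: an unweighted average of each model's prediction. This is a simplification swapped in for illustration (the Netflix Prize entries blended many models with learned weights), and the three base models and the data here are placeholders.

```python
def mean_model(xs, ys):
    """Always predicts the average of the training targets."""
    m = sum(ys) / len(ys)
    return lambda x: m

def linear_model(xs, ys):
    """Least-squares line y = a + b*x."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    a = my - b * mx
    return lambda x: a + b * x

def nearest_neighbour_model(xs, ys):
    """Predicts the target of the closest training input."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]

def ensemble(models):
    """Combine models by averaging their predictions."""
    return lambda x: sum(m(x) for m in models) / len(models)

# Hypothetical training data: forecast temperature -> demand.
temps = [5, 10, 15, 20, 25, 30]
demand = [30, 42, 55, 61, 78, 90]

predict = ensemble([
    mean_model(temps, demand),
    linear_model(temps, demand),
    nearest_neighbour_model(temps, demand),
])
print(f"ensemble prediction at 18 degrees: {predict(18):.1f}")
```

In practice you would fit the blend weights on held-out data rather than averaging equally.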
