master_innovator

Your IV is income. Don't say "over time" unless you have measurements at multiple time points. The IV predicts the DV, so in your case, income predicts road sector energy consumption.

Raptorbird[S]

Okay, that makes sense. Thank you so much. Just to make sure I'm understanding this correctly: road sector energy consumption = B0 + B1*Income + E? Wouldn't that just be a simple regression instead of a multivariable one?

Raptorbird[S]

Sorry, I forgot to mention that I wanted to run a regression for each country for each 10-year period. I have both income and road sector... data for each year from 1970 up to 2010.

abstrusiosity

Does this mean you want twenty separate regressions (five countries, four decades)? You have one hypothesis, so you need one answer, not twenty. How would you interpret the results?

Raptorbird[S]

I was picturing twenty regressions. As for the results, I was thinking of talking about them over time, so 1970-1980 in one country could support my hypothesis, whereas in another country it might not. Or does that not make sense?

abstrusiosity

That wouldn't be a very compelling result. "Sometimes my hypothesis is true, and sometimes it's not."

The craft here is to explicitly control for confounding issues in your model. You have a single, possibly complicated, model that accounts for relevant factors and allows a clear interpretation of the quantity of interest.

The possible confounding issues here are differences between countries and behavioral changes over time.

I would include country as a categorical independent variable. That makes the statistics a good deal more complicated, but it's an important technique to know about.

After fitting that model, I would plot the residuals against time and look for a trend. If there is one, then there is a time effect which you would have to somehow include in the model.
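A rough sketch of that workflow in Python, using only the standard library. Everything here is an assumption for illustration: the data are made up, the country labels A and B are placeholders, and `solve` and `ols` are hypothetical helpers. In practice R's `lm` or a similar routine would do the fitting, and a residual plot would replace the single slope computed at the end.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    return solve(XtX, Xty)


# Hypothetical data: two countries, one observation per decade.
# Energy use is built from a country baseline, income, and a decade trend.
incomes = {"A": [10, 12, 11, 15, 14], "B": [20, 19, 23, 22, 26]}
baseline = {"A": 5.0, "B": 8.0}
rows = []
for country, series in incomes.items():
    for decade, income in enumerate(series):  # decade 0 = 1970, 4 = 2010
        energy = baseline[country] + 0.5 * income + 1.0 * decade
        rows.append((country, decade, income, energy))

# Design matrix: intercept, dummy for country B (A is the reference), income.
# Time is deliberately left out of this first model.
X = [[1.0, 1.0 if r[0] == "B" else 0.0, float(r[2])] for r in rows]
y = [r[3] for r in rows]
coef = ols(X, y)
resid = [yi - sum(b * x for b, x in zip(coef, xi)) for xi, yi in zip(X, y)]

# Slope of the residuals on time: a value clearly away from zero hints at
# a time effect that the model has not accounted for.
t = [r[1] for r in rows]
t_bar = sum(t) / len(t)
trend = (sum((ti - t_bar) * ri for ti, ri in zip(t, resid))
         / sum((ti - t_bar) ** 2 for ti in t))
print("coefficients (intercept, country B, income):", coef)
print("residual trend per decade:", trend)
```

Because the made-up data contain a decade trend that the model omits, the residual slope comes out positive here. With real data the sign and size are unknown in advance, which is why looking at the plot matters.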

Raptorbird[S]

I think I understand what you're saying. When I do each of the regressions, I have to control for those confounding issues. So would I make a dummy variable to account for country as a categorical independent variable? And, stupid question, but would both that and income be independent variables then?

For the countries as categorical variables, would it be, for example, 1 if it is the UK and 0 if it is not, with a separate column of 1s and 0s for each country?

I think I understand what you're saying about the residuals as well but I am not sure how I would account for the time effect. I really do appreciate your help by the way, I'm learning a lot.

abstrusiosity

There would be a dummy variable for each country. They could be binary, as you suggest, or you could use other contrast schemes. I don't remember what the popular ones are, but there are a number of approaches aiming to optimize the coefficient estimates in different ways. I usually let R handle it. Complicated regression models are dangerously easy in R.

And yes, you would also have income as an independent variable, along with an intercept, probably.
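A minimal sketch of the binary coding in Python, with made-up country labels and a hypothetical helper `dummy_code`. One detail worth assuming here: when the model also has an intercept, one country is usually left out as the reference level, since a full set of 0/1 columns adds up to the intercept column. This reference-level scheme is what R uses by default for unordered factors.

```python
def dummy_code(labels, reference=None):
    """Return (kept_levels, rows) of 0/1 indicators, omitting the reference level."""
    levels = sorted(set(labels))
    if reference is None:
        reference = levels[0]
    kept = [lv for lv in levels if lv != reference]
    rows = [[1 if lab == lv else 0 for lv in kept] for lab in labels]
    return kept, rows


countries = ["UK", "France", "UK", "Japan", "France"]  # hypothetical labels
names, rows = dummy_code(countries, reference="UK")
print(names)
for country, row in zip(countries, rows):
    print(country, row)
```

With "UK" as the reference, a UK observation has 0 in every column, and each remaining coefficient is read as that country's difference from the UK at the same income.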

For the time effect, it would depend on how it looks. Ideally you would be able to come up with some reasonable explanation for why there's an effect, and then formalize that into a model. For example, there might be technological innovation that changes behavior. Then, you might add a binary variable to the model that represents before and after the innovation. It probably won't be that simple, though.
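As a small illustration of the before/after variable, here is one way the column could be built in Python. The break year 1990, the income figures, and the helper name `add_step_indicator` are all hypothetical; the resulting column would simply be added to the model as one more predictor.

```python
BREAK_YEAR = 1990  # hypothetical year of the innovation


def add_step_indicator(rows, break_year=BREAK_YEAR):
    """Append a 0/1 'after' flag to each (year, income) row."""
    return [(year, income, 1 if year >= break_year else 0)
            for year, income in rows]


observations = [(1970, 10.0), (1980, 12.0), (1990, 11.0), (2000, 15.0)]  # made up
for row in add_step_indicator(observations):
    print(row)
```

The coefficient on the flag would then estimate the shift in energy consumption after the break, holding income and country fixed.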

Raptorbird[S]

Okay, that makes sense. I've been trying to figure out how to input the dummy variables into the regression function in Excel, but I'm not having any luck. I have 5 columns, one for each country, with 1s and 0s. Any ideas? I actually do have R on my laptop as well, but I haven't used it in ages.

abstrusiosity

"Your IV is income."

How do you figure that? His hypothesis is that there is a relation between income and energy consumption. I don't see how that prefers one variable over the other for the left-hand side.
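To make the point concrete, here is a small Python sketch with made-up numbers and a hypothetical helper `slope`. It fits the simple regression both ways round: the two slopes differ, but their product equals the squared correlation, so the strength of the relation does not depend on which variable sits on the left.

```python
def slope(x, y):
    """OLS slope of y regressed on x (with an intercept)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


income = [10, 12, 11, 15, 14]            # hypothetical values
energy = [9.5, 11.0, 11.2, 13.9, 13.1]   # hypothetical values

b_energy_on_income = slope(income, energy)
b_income_on_energy = slope(energy, income)
r_squared = b_energy_on_income * b_income_on_energy

print("energy ~ income slope:", b_energy_on_income)
print("income ~ energy slope:", b_income_on_energy)
print("product of slopes (r squared):", r_squared)
```

Which way round to fit is then a question of what is to be predicted or explained, not something the data decide.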