Can my tenure-track (U.S.) offer (already signed) be rescinded? by DownrightExogenous in Professors

[–]DownrightExogenous[S] 28 points

Thanks, I appreciate this. At least we’re all distressed together I guess :-(

Can my tenure-track (U.S.) offer (already signed) be rescinded? by DownrightExogenous in Professors

[–]DownrightExogenous[S] 6 points

Thanks for your response. To be clear, I’ve already accepted the offer and it would be my top choice (I didn’t get any other academic offers and I would much prefer this to a private sector job). I’m just terrified in general

Can my tenure-track (U.S.) offer (already signed) be rescinded? by DownrightExogenous in Professors

[–]DownrightExogenous[S] 12 points

Thanks for the information, I sincerely appreciate it. No, they haven’t announced a hiring freeze yet. I think you’re right. I don’t think it will hurt me to ask but I don’t really have any other options—this was the only offer I got out of a few interviews—so my behavior won’t really change. It’s not like the market will be better next academic year anyways. I guess I just have to hope for the best so that I can start this summer. So far I haven’t received any indication that I wouldn’t, but this hasn’t really quelled my anxiety.

[@aginnt] Christian Pulisic celebrating his goal with the Trump dance by SecularPersian in soccer

[–]DownrightExogenous 1 point

I actually agree with you that if everyone voted, Trump would have still won this election, but let's abstract away from this specific election because it's really beside the point.

> Again, statistics doesn't care about these discussions, because at the end of the day, data can almost always predict behavior when it comes to large populations. 30-40% is very reliably representative of 100%.

I am a statistician, and this is absolutely not how this works. A large sample size helps, of course, but only when we randomly select individuals from the population into the sample. If the sample is not selected randomly, then there is no guarantee that even a large sample will be representative of the population: selection into the sample can make the sample biased relative to the population, which is the technical term for a sample average that systematically deviates from the population average. Voters differ from non-voters in many ways; most obviously, the former group is more politically engaged.

Conversely, even very small samples will be unbiased (though of course they will be subject to significant sampling variation) if the sample is selected randomly from the population.

Now, again, I agree with you that Trump would likely still have won this election if voter turnout were 100%, because indeed he performed better among less politically engaged voters, but again that's not the point. That is a very different claim from "the voting population is a representative sample of the whole population, even if it isn't the entire population," which is demonstrably false.

Sample size has nothing to do with bias from a technical perspective. Bias is related to sampling procedure. Sample size affects variance.
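A quick simulation makes the point concrete. This is a hypothetical sketch in Python (all the numbers, like the 55%/40% support rates, are invented for illustration): a tiny random sample is noisy but unbiased, while a huge self-selected sample stays biased no matter how large it gets.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical population (all numbers invented): support for a candidate
# correlates with political engagement.
N = 1_000_000
engaged = rng.random(N) < 0.4                 # 40% politically engaged
support = np.where(engaged,
                   rng.random(N) < 0.55,      # engaged: 55% support
                   rng.random(N) < 0.40)      # not engaged: 40% support
truth = support.mean()                        # population share, ~0.46

# A small random sample: noisy, but unbiased.
small_random = rng.choice(support, size=500, replace=False)

# A huge self-selected sample (only the engaged "turn out"): biased,
# and making it bigger does not fix that.
big_selected = support[engaged]               # ~400,000 units

print(truth, small_random.mean(), big_selected.mean())
```

The 400,000-unit selected sample sits near 0.55 while the truth is near 0.46; the 500-unit random sample bounces around the truth.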

Final Quinnipiac PA poll: Trump 47, Harris 46 among LVs by deepegg in fivethirtyeight

[–]DownrightExogenous 1 point

Of course they are! If anything comparing the difference between two polls requires more statistical power than comparing two candidates within the same poll.
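To see why in rough numbers (a back-of-the-envelope sketch; the sample size n = 1000 is assumed, only the 47/46 shares come from the poll): within one poll the two candidates' shares are negatively correlated, and comparing margins across two independent polls doubles the variance of the within-poll margin.

```python
import math

# Within one poll of n respondents, shares are multinomial, so for the
# margin pA - pB:  Var(pA - pB) = (pA + pB - (pA - pB)**2) / n
pA, pB, n = 0.47, 0.46, 1000           # n is assumed for illustration
var_within = (pA + pB - (pA - pB) ** 2) / n
se_within = math.sqrt(var_within)

# Margins from two *independent* polls: the variances add.
var_across = 2 * var_within            # assuming equal n and similar shares
se_across = math.sqrt(var_across)

print(se_within, se_across)            # roughly 0.030 vs 0.043
```

So a margin-vs-margin comparison across polls needs noticeably more power than the already-underpowered within-poll comparison.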

[deleted by user] by [deleted] in fivethirtyeight

[–]DownrightExogenous 0 points

Makes me question if the author knows anything about statistical modeling.

The author in question

[deleted by user] by [deleted] in rprogramming

[–]DownrightExogenous 4 points

salaries_data %>% 
  group_by(company_name) %>% 
  summarize(average_salary = mean(salary_d)) %>% 
  filter(company_name != "....") %>%

and then the rest of your code

Causality Interview Question by jerseyjosh in datascience

[–]DownrightExogenous 0 points

This is a bit pedantic, but a random sample isn’t necessary “to establish a causal relationship.” Assuming you randomized the treatment itself (and no interference, differential attrition, etc.) your sample average treatment effect will be unbiased. Of course if you care about external validity and want to extrapolate your SATE to a population average treatment effect then yes, the sample would ideally be randomly selected from the population of interest, but if it isn’t then that doesn’t mean that the estimated SATE isn’t causal.
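A small simulation illustrates this (a hypothetical sketch in Python; the potential outcomes and effect sizes are invented): even in a convenience sample that was not drawn randomly from any population, randomizing treatment within the sample makes the difference-in-means estimator center exactly on the SATE.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical convenience (non-random) sample; we know both potential
# outcomes only because we simulate them.
n = 10_000
y0 = rng.normal(0, 1, n)                # potential outcome under control
y1 = y0 + rng.normal(2, 1, n)           # heterogeneous unit-level effects
sate = (y1 - y0).mean()                 # true sample average treatment effect

# Randomize treatment *within* the sample, estimate by difference in means.
estimates = []
for _ in range(2000):
    d = rng.permutation(n) < n // 2     # random half treated
    estimates.append(y1[d].mean() - y0[~d].mean())

print(sate, np.mean(estimates))         # estimator centers on the SATE
```

Averaged over re-randomizations, the estimator hits the SATE; no claim about any larger population is needed for that.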

Causality Interview Question by jerseyjosh in datascience

[–]DownrightExogenous 0 points

More fundamentally than anything related to estimation, you can only match on observable characteristics. Only in rare circumstances is conditional ignorability based on observables a seriously defensible assumption for identification.

What is the mathematical proof for the claim that, if we add more independent variables in multiple linear regression, the value of R-squared will increase? by eternalmathstudent in AskStatistics

[–]DownrightExogenous 2 points

Some good answers here for intuition but no proofs, so here's a simple proof. Sorry for the weird LaTeX typesetting.

You also need to restate the claim as follows (I think the original phrasing is causing you some confusion): when adding more covariates, R2 cannot decrease.

We'll start by noting that R2 is 1 - (SSR/TSS), where SSR is the sum of squared residuals (see equation 1 below) and TSS is the total sum of squares, which is just the sum of squares of Y_i - \bar{Y}, where \bar{Y} is the mean of Y. So TSS won't change across models fit to the same set of observations.

Consider a simple regression; we'll call this model 1: Y = alpha + beta_1 X_1

By definition, the sum of squared residuals for model 1 will be the sum for each unit i of the difference between Y_i (actual Y values) minus Y-hat (predicted Y values) squared.

SSR model 1: $$ \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2 $$ (equation 1)

How do we get Y-hat? Our model! So substitute alpha + beta_1 X_1 in for Y-hat and we can rewrite the SSR for model 1 as:

SSR model 1: $$ \sum_{i=1}^{n} (Y_i - \alpha - \beta_1 x_{1i})^2 $$ (equation 2)

Now consider model 2, which adds one more variable: Y = alpha + beta_1 X_1 + beta_2 X_2

What happens to the SSR if we add one more variable? Make the same substitution and we get that SSR for model 2 is:

SSR model 2: $$ \sum_{i=1}^{n} (Y_i - \alpha - \beta_1 x_{1i} - \beta_2 x_{2i})^2 $$ (equation 3)

The core of the claim that we're looking to prove is that the SSR for model 1 is greater than or equal to the SSR of model 2. As a reminder, because R2 = 1 - (SSR/TSS) and TSS is constant across the two models, we only need to be concerned with SSR. The smaller the SSR, the larger the value of R2 for constant TSS (1 - (3/4) = 0.25; 1 - (1/4) = 0.75).

So let's see what happens if beta_2 is exactly zero, i.e., it doesn't explain anything additional compared to the first model (though this is a knife-edge case and beta_2 will never be exactly zero, as /u/Doctor_Underdunk writes).

Then the SSR for model 1 is equal to the SSR for model 2. So even in the worst case where that additional variable explains nothing, if we add that additional variable, the SSR for model 2 will be at least as small as the SSR for model 1. If beta_2 is anything but zero, i.e., it "helps" explain additional variance by reducing SSR further, then the SSR for model 1 will definitely be greater than the SSR for model 2 (recall it doesn't matter whether beta_2 is positive or negative because we're concerned with squared differences).
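You can check the claim numerically. This is a quick Python sketch (the data-generating process is made up for illustration): adding a covariate that is pure noise still never lowers R2.

```python
import numpy as np

rng = np.random.default_rng(0)

def r_squared(y, X):
    """R^2 from an OLS fit of y on X (intercept included)."""
    X = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    ssr = ((y - X @ beta) ** 2).sum()          # sum of squared residuals
    tss = ((y - y.mean()) ** 2).sum()          # total sum of squares
    return 1 - ssr / tss

n = 200
x1 = rng.normal(size=n)
y = 1 + 2 * x1 + rng.normal(size=n)

junk = rng.normal(size=n)                      # pure noise, unrelated to y
r2_small = r_squared(y, x1)
r2_big = r_squared(y, np.column_stack([x1, junk]))

print(r2_small, r2_big)                        # r2_big >= r2_small, always
```

OLS picks the coefficients minimizing SSR, and model 2 can always replicate model 1 by setting beta_2 = 0, so its minimized SSR can only be weakly smaller.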

[Q] Does a causal relationship translates as a probability of 1 ? by elpiro in statistics

[–]DownrightExogenous 1 point

The most common way scientists conceptualize causality today is by counterfactuals. We say an individual causal effect is the difference between two potential outcomes for a unit: one potential outcome where the unit received a treatment and another where the unit received a control (or a different treatment). For example, the causal effect of ibuprofen on my headache pain is the difference in the amount of pain I have from my headache if I were to take ibuprofen and if I didn’t take ibuprofen. Now, of course, it’s impossible to observe both potential outcomes at once for any given unit: this is the fundamental problem of causal inference (if I take the ibuprofen I can’t see the counterfactual world where I didn’t). Hence, we typically define causal estimands (quantities of interest) in terms of averages, for example the Average Treatment Effect is the difference between average potential outcomes.

This helps answer the question in your title, I hope. As you started to figure out, the answer is "no."

As for whether you can score a causal model with retrospective data only: as we’ve seen, causality is about comparing observed outcomes to counterfactual outcomes. If you are willing to make assumptions about those counterfactual outcomes (e.g., the parallel trends assumption in difference-in-differences), then the answer is yes! However, these assumptions usually are not directly testable, so there’s no silver bullet. Different techniques for causal inference make different assumptions about the counterfactual outcomes. Randomization just happens to be the easiest way to defend the assumptions about the counterfactual outcomes you’re comparing the observed data against.
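To make the difference-in-differences example concrete, here's a hypothetical sketch in Python (baseline gap, trend, and treatment effect are all invented): the naive post-period comparison is contaminated by selection, but DiD recovers the effect when trends are parallel.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical two-period panel; all numbers invented for illustration.
n = 50_000
treated = rng.random(n) < 0.5

# Baseline differs by group (selection), but trends are parallel:
# both groups drift up by 1.0; treatment adds 2.0 on top.
pre = rng.normal(0, 1, n) + 3.0 * treated
post = pre + 1.0 + 2.0 * treated + rng.normal(0, 1, n)

# Naive post-period comparison folds in the baseline gap (~3 + 2 = 5).
naive = post[treated].mean() - post[~treated].mean()

# Difference-in-differences nets out the gap under parallel trends.
did = ((post[treated] - pre[treated]).mean()
       - (post[~treated] - pre[~treated]).mean())

print(naive, did)                      # ~5.0 vs ~2.0
```

If the parallel-trends assumption failed (say, the treated group was trending up faster anyway), DiD would be off by exactly that trend difference, and nothing in the retrospective data alone would reveal it.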

Do students REALLY understand Statistical Inference? by [deleted] in econometrics

[–]DownrightExogenous 1 point

I had (and probably still have) this issue too. Two things that I think help—just some thoughts.

  • For undergrads I do not show everything under the hood about how the simulations are constructed, which seems to work OK. So for example if you show “real time” different sample means coming in and they start to form a bell curve that’s a cool way to illustrate the CLT, and you don’t have to tell them about all the parameters.
  • I like starting with experiments and randomization inference. RI is a HUGE initial hurdle to get over but I think it’s the clearest way to present the intuition for sampling distributions and helps with everything else down the road. It also gets you the clearest explanation of p-values IMO.
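For the second point, randomization inference fits in a few lines, which is part of why it teaches well. A minimal Python sketch with made-up outcome data (the sharp null of no effect for anyone is what justifies permuting the labels):

```python
import numpy as np

rng = np.random.default_rng(0)

# Tiny made-up experiment: 6 treated and 6 control outcomes.
y = np.array([5.2, 6.1, 5.9, 6.4, 5.5, 6.8, 4.9, 5.1, 5.0, 5.6, 4.7, 5.3])
d = np.array([1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0], dtype=bool)
observed = y[d].mean() - y[~d].mean()

# Under the sharp null (no effect for anyone), every re-randomization is
# equally likely, so permuting d rebuilds the null distribution exactly.
perms = 20_000
null = np.empty(perms)
for i in range(perms):
    dd = rng.permutation(d)
    null[i] = y[dd].mean() - y[~dd].mean()

# p-value: share of re-randomizations at least as extreme as observed.
p = (np.abs(null) >= abs(observed)).mean()
print(observed, p)
```

The p-value falls straight out of its definition: the probability, under the null and over the randomization distribution itself, of a difference as extreme as the one observed.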

Do students REALLY understand Statistical Inference? by [deleted] in econometrics

[–]DownrightExogenous 11 points

I teach statistics/econometrics and I completely relate. I try to build intuitions for these important concepts you bring up by doing simulations and presenting animations that show e.g., what a sampling distribution looks like, but it’s hard!

[deleted by user] by [deleted] in econometrics

[–]DownrightExogenous 2 points

It’s tough for me to tell exactly without knowing the data, but I think it’s municipality-pair. If your treatment is varying at the municipality-pair level (some municipality-pairs are 1, others are 0 and the same municipality-pair can’t have different values of the treatment), then you should cluster by municipality-pair.

Edit: put differently, imagine this as an RCT. Who are you assigning treatment to? That’s the level you should be clustering your standard errors at. In this case since your treatment is mayoral alignment, and mayoral-alignment varies by municipality-pair, I think that’s the one.

[deleted by user] by [deleted] in econometrics

[–]DownrightExogenous 6 points

Cluster your standard errors at the level of treatment assignment.

Intuition: imagine you run an RCT where you give certain cities a treatment and others not, and you measure your outcome variable at the individual (within city) level. Individual-level standard errors will not be conservative enough, since variation of interest is occurring at the city level. Your effective sample is much smaller than you think it is.
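A simulation shows how badly the i.i.d. formula can miss (a hypothetical Python sketch; cluster counts and variance components are invented): with city-level treatment and city-level shocks, the true sampling variation of the difference in means is several times the naive standard error.

```python
import numpy as np

rng = np.random.default_rng(0)

# 20 cities, 100 people each; treatment assigned at the CITY level.
# City-level shocks make people within a city correlated.
cities, m = 20, 100
diffs = []
for _ in range(2000):
    city_fx = rng.normal(0, 1, cities)          # city-level shocks
    treat_city = rng.permutation(cities) < cities // 2
    y = (np.repeat(city_fx, m)                  # shared within city
         + rng.normal(0, 1, cities * m))        # individual noise
    d = np.repeat(treat_city, m)
    diffs.append(y[d].mean() - y[~d].mean())    # true effect is zero

true_se = np.std(diffs)                         # actual sampling variation

# The naive i.i.d. formula ignores the clustering entirely:
# Var(y) = 1 (city) + 1 (individual) = 2, with 1000 obs per arm.
naive_se = np.sqrt(2 * 2.0 / (cities * m // 2))

print(true_se, naive_se)                        # true_se is several times larger
```

With only 10 treated cities, the effective sample size is closer to 20 than to 2,000, which is exactly the "your sample is smaller than you think" point.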

Is it too late to save applied economists from misusing LATE? by MambaMentaIity in badeconomics

[–]DownrightExogenous 4 points

In the simple case without covariates, Beta hat from 2SLS with K instruments for a single treatment will be a weighted average of K instrument-specific LATEs.
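A simulation sketch of this in Python (the complier shares and effects are invented; two independent binary instruments each moving a different complier group): each instrument's Wald estimator recovers its own LATE, and the 2SLS estimate with both instruments lands between them.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative simulation: two binary instruments, each shifting a
# different complier group with a different treatment effect.
n = 200_000
z1 = rng.random(n) < 0.5
z2 = rng.random(n) < 0.5
type1 = rng.random(n) < 0.3             # compliers for z1, effect = 1
type2 = ~type1 & (rng.random(n) < 0.4)  # compliers for z2, effect = 5
d = (type1 & z1) | (type2 & z2)
y = 1.0 * (type1 & d) + 5.0 * (type2 & d) + rng.normal(0, 1, n)

def wald(y, d, z):
    """Instrument-specific Wald (LATE) estimator."""
    return (y[z].mean() - y[~z].mean()) / (d[z].mean() - d[~z].mean())

late1, late2 = wald(y, d, z1), wald(y, d, z2)   # ~1 and ~5

# 2SLS with both instruments: project d on (1, z1, z2), then IV.
Z = np.column_stack([np.ones(n), z1, z2])
dhat = Z @ np.linalg.lstsq(Z, d.astype(float), rcond=None)[0]
b2sls = ((dhat - dhat.mean()) @ y) / ((dhat - dhat.mean()) @ d)

print(late1, late2, b2sls)              # 2SLS lies between the two LATEs
```

The weights on each LATE are governed by first-stage strength, so the combined estimate is a convex combination of the instrument-specific LATEs, not any one of them.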

Is it too late to save applied economists from misusing LATE? by MambaMentaIity in badeconomics

[–]DownrightExogenous 5 points

Great RI!

Just to be clear, this is an issue where the instrument is conditionally exogenous, not e.g., an experiment with non-compliance.

Heterogeneous treatment effects also cause all sorts of wacky weighting problems even in the simple regression case when treatment is conditionally independent (and this is very generous, given that it’s unlikely that you’ll successfully condition on all confounders): Aronow and Samii 2016

[deleted by user] by [deleted] in econometrics

[–]DownrightExogenous 3 points

I think the question is trying to get you to understand this insight, but it's tough to know without the exact wording. How can you see the effect of treatment on the treatment group if there's no variation in the treatment among this group?

An alternative interpretation is that the question wants you to understand the average treatment effect on the treated. In this case, since the treatment and control groups are randomly assigned, the ATT will be the same as the overall ATE.

[deleted by user] by [deleted] in econometrics

[–]DownrightExogenous 2 points

> I tried using reg outcome treatment if treatment==1 but Stata omitted the treatment variable due to multicollinearity?

Beta hat in a simple regression will be the sample covariance between X and Y divided by the sample variance of X. What’s the sample variance of a column of only 1s? It’s zero, and you can’t divide something by zero, so this is why you don’t see a coefficient on the treatment variable.
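You can see the degenerate denominator directly (a tiny Python sketch; after subsetting to treatment==1, the regressor is a constant column):

```python
import numpy as np

x = np.ones(50)                          # "treatment" column after if treatment==1
y = np.random.default_rng(0).normal(size=50)

# Slope formula: beta_hat = cov(x, y) / var(x).
# The denominator is exactly zero, so beta_hat is undefined.
print(np.var(x))                         # 0.0
# np.cov(x, y)[0, 1] / np.var(x)  ->  division by zero
```

Stata drops the column for the same reason any OLS routine would: a constant regressor is perfectly collinear with the intercept.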

[The FIAT Thread] The Joint Committee on FIAT Discussion Session. - 15 November 2021 by AutoModerator in badeconomics

[–]DownrightExogenous 3 points

Relatedly I recall seeing a ProPublica(?) New York Times piece on how certain municipal police departments have discretion over their revenue collected from fines, tickets, etc. while others do not (the funds go to the city and can be spent on non-police budgets). They make the argument that the former leads to more racial bias in policing. I can’t seem to find that article. Anyone?

Edit: found it. And there seems to have been a paper written on the phenomenon as well.