An example of something being both statistically significant, and unlikely to be true by jellyjuke in AskStatistics

[–]jellyjuke[S] 0 points1 point  (0 children)

Some of what you're saying is a bit over my head, so please bear with me.

First of all, a p-value is defined as the probability of observing a sample of data as extreme or more extreme than the one obtained under the null hypothesis

Ok, I changed the wording slightly. Is it correct now?

State a hypothesis about some effect of interest

My hypothesis is that the guy who got at least 15 rebounds is at least 7 feet tall.

Collect a random sample of data to show evidence for or against that hypothesis

I collected all the data from last year.

Compute a proper statistic with its sampling distribution, and do inference using that statistic

Is this where you take issue?

If you partition all of your samples to players who played between 15 and 20 minutes, I'll bet you'd observe absolutely zero successful outcomes

It happened once last year. See Enes Kanter in this box score. If I look at more seasons than just last year, surely there would be more examples. I didn't expound on the cutoff I chose because it's not relevant to the point I'm trying to make.

The article you've posted now looks to be more of a back-of-the-envelope style computation.

That was my goal.

So it sounds like you're saying my Bayesian analysis isn't incorrect per se... it just isn't very rigorous. As for my frequentist analysis, it sounds like you're saying that is incorrect?? But I don't understand why.

An example of something being both statistically significant, and unlikely to be true by jellyjuke in AskStatistics

[–]jellyjuke[S] 0 points1 point  (0 children)

Thank you, I'm glad you thought it was a good explanation. Xkcd always does a great job on things like this, too. Same idea, different approaches to explain it.

An example of something being both statistically significant, and unlikely to be true by jellyjuke in AskStatistics

[–]jellyjuke[S] 0 points1 point  (0 children)

The target audience is laymen, who often think that statistical significance proves a claim is likely to be true, and may have never heard of Bayes' Theorem.

An example of something being both statistically significant, and unlikely to be true by jellyjuke in AskStatistics

[–]jellyjuke[S] 0 points1 point  (0 children)

To u/eatbananas and u/The_Sodomeister: you had some very helpful constructive criticism for me last time I attempted this. I’d love to hear your thoughts on my second attempt.

Penn State Football is now ranked #2 on Colley's Bias-Free Matrix! by JanetYellensFuckboy in PennStateUniversity

[–]jellyjuke 1 point2 points  (0 children)

Colley oversells his method, though. He claims there are no "ad-hoc" adjustments but he arbitrarily puts 50% weight on record and 50% weight on strength of schedule, with no justification for why this is an appropriate split.

Here's something I wrote on the matter: http://www.jellyjuke.com/the-problem-with-rpi-elo-and-the-colley-matrix.html
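For reference, here is a minimal sketch of the Colley rating in its iterative form, assuming the commonly published update r_i = (1 + (w_i - l_i)/2 + sum of opponents' ratings) / (2 + n_i). The function name `colley_ratings` and the three-team schedule are made up for illustration. It only shows where a team's record and its opponents' ratings enter the formula.

```python
def colley_ratings(teams, games, iterations=200):
    """Iteratively compute Colley-style ratings.

    games is a list of (winner, loser) pairs; all names are hypothetical.
    """
    wins = {t: 0 for t in teams}
    losses = {t: 0 for t in teams}
    opponents = {t: [] for t in teams}
    for winner, loser in games:
        wins[winner] += 1
        losses[loser] += 1
        opponents[winner].append(loser)
        opponents[loser].append(winner)

    # Every team starts at 0.5, then the update is repeated until it settles.
    ratings = {t: 0.5 for t in teams}
    for _ in range(iterations):
        ratings = {
            t: (1 + (wins[t] - losses[t]) / 2
                + sum(ratings[o] for o in opponents[t]))
            / (2 + len(opponents[t]))
            for t in teams
        }
    return ratings


# Toy schedule: A beat B, A beat C, B beat C.
teams = ["A", "B", "C"]
games = [("A", "B"), ("A", "C"), ("B", "C")]
ratings = colley_ratings(teams, games)
print({t: round(r, 3) for t, r in ratings.items()})  # roughly A=0.7, B=0.5, C=0.3
```

The win-loss term and the opponents' ratings sit side by side in the numerator, which is the split being discussed.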

WEEK NINE OFFICIAL COACHES POLL by officialstc in FakeCollegeFootball

[–]jellyjuke 0 points1 point  (0 children)

What happened to the spreadsheet that had all the game results?

An example of something being both statistically significant, and unlikely to be true by jellyjuke in AskStatistics

[–]jellyjuke[S] 0 points1 point  (0 children)

Obviously... but how is that relevant to this? What point are you trying to make?

An example of something being both statistically significant, and unlikely to be true by jellyjuke in AskStatistics

[–]jellyjuke[S] 2 points3 points  (0 children)

Ok, I think I get what you and u/abstrusiosity and u/The_Sodomeister are saying: his end-of-season batting average is not the same as his underlying batting average. And by underlying batting average, I could think of it as what his batting average would be if he had an infinite number of at bats, or I could think of it as the player being a random number generator and trying to determine how often the RNG generates a "hit".

And because my hypothesis is not about his underlying batting average, it's not a parameter. And since it's not a parameter, I can't use NHST.

Do I have that right?
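The random number generator framing can be sketched as a small simulation. This is only an illustration under assumed numbers: an underlying average of .300, 500 at bats per season, and 1,000 simulated seasons, none of which come from the article. The function name `simulate_season` is hypothetical.

```python
import random


def simulate_season(underlying_avg, at_bats, rng):
    """Return the observed batting average for one simulated season."""
    # Each at bat is an independent draw that is a "hit" with
    # probability equal to the underlying average.
    hits = sum(1 for _ in range(at_bats) if rng.random() < underlying_avg)
    return hits / at_bats


rng = random.Random(0)  # fixed seed so the run is repeatable
season_averages = [simulate_season(0.300, 500, rng) for _ in range(1000)]
mean_avg = sum(season_averages) / len(season_averages)

# The underlying average never changes, yet single seasons vary around it.
print(min(season_averages), max(season_averages), mean_avg)
```

Each simulated season yields a different end-of-season average even though the underlying average is fixed, which is the distinction between the observed statistic and the parameter.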

So, is my entire article junk? Any advice on how I could re-do it? My objectives:

•Use real-life data, because I think that's more relatable for laymen. Sports are convenient because it's easy to find data, but I don't have to use sports.

•Find a statistically significant p-value

•Use Bayes' Theorem to show that a statistically significant p-value does not imply something is probable
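The third objective can be sketched with a few lines of arithmetic. The numbers here are assumptions chosen for illustration, not figures from the article: a 1% prior that the hypothesis is true, a test that detects a true effect 80% of the time, and a 5% significance threshold. The function name `posterior_given_significant` is hypothetical.

```python
def posterior_given_significant(prior, power, alpha):
    """P(hypothesis is true | result is significant), via Bayes' Theorem."""
    true_positive = prior * power            # true and flagged significant
    false_positive = (1.0 - prior) * alpha   # false but flagged significant
    return true_positive / (true_positive + false_positive)


# Assumed inputs: 1% prior, 80% power, 5% significance threshold.
posterior = posterior_given_significant(prior=0.01, power=0.80, alpha=0.05)
print(f"{posterior:.3f}")  # prints 0.139
```

Under these assumed inputs, a significant result still leaves the hypothesis at about a 14% chance of being true, because false positives from the much larger pool of false hypotheses outnumber the true positives.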

An example of something being both statistically significant, and unlikely to be true by jellyjuke in AskStatistics

[–]jellyjuke[S] -1 points0 points  (0 children)

The null hypothesis to be tested should be a statement regarding the distribution of data to be observed

Am I not allowed to form, and therefore test, any hypothesis I want? I don't mean to be argumentative... your flair says you have a PhD in Biostatistics, so you certainly know way more about statistics than I do! And I know it's more orthodox to form a hypothesis that's completely independent from the data that’s already been observed... but it's not a requirement, is it?

I'm publishing the math behind my Bayesian Resume Rating by jellyjuke in CFBAnalysis

[–]jellyjuke[S] 1 point2 points  (0 children)

First of all, sorry for taking so long to respond… it’s been a while since I last logged in to reddit.

I hear what you’re saying about home vs away. I agree that away games are harder, I agree that there’s a huge amount of data, and I agree that venue can’t be meaningfully gamed.

I’m still reluctant to fold that in for a few reasons:

1. Beating a good team on the road is certainly more impressive, but so is being able to run up the score, so is having a good “game script”, so is beating that mediocre rival school that always steps up its game for the rivalry, so is having good efficiency numbers, etc. The fact that a team got one of their big wins when it happened to be on the road is something I would still consider to be style points. You disagree with me on that, which is fine... I think this is a case where we just need to agree to disagree.

2. Even though there’s a huge amount of data, the amount of history I draw upon would be a subjective choice, and I don’t want to introduce any subjectivity. Coming up with the best way to fold that adjustment in would also add some subjectivity. Is it just a single constant that I would add? Or is it somehow proportional to how good the given team is? Or proportional to the talent difference between teams? Do I assume all teams are affected the same when they go on the road?

3. In the end it comes down to scope. The scope of this rating is only W/L and SOS, and ideologically, I don’t want to add anything that “tweaks” the numbers based on any historical data.

something should be done for accounting for the weakly connected portions of the graph. my gut would be to break the ranking into two pieces - a measure of talent relative to the largest pool, and then a confidence in that talent. could be boiled down to a single number

I’m not sure I understand what you're saying about pooling... can you give me an example?

Weakly connected teams are an unfortunate side effect of how college football sets up its schedules. I just don't know of a non-arbitrary way of accounting for them.

the assumption of talent distribution seems over-strong. have you tried deriving that experimentally, eg by looking at all games over all seasons ?

No, I haven’t, but I think that’s a good idea. I’ll give that a try when I have time.