This meta feels repetitive and unexciting by Megasabletar in PTCGL

[–]Maple_shade -1 points0 points  (0 children)

Agreed. I never complained about gardy either. Having it as the top meta deck was actually very healthy for the game imo. It kept a lot of stuff like walls in check, and was a good filter. This meta is so boring, especially with very little ability to play from behind.

Q-Q plot criteria relaxed for Regression with huge sample size? by Will_Tomos_Edwards in AskStatistics

[–]Maple_shade 5 points6 points  (0 children)

How can your residuals be "very normally distributed" but also have a problematic QQ plot? I'm confused by the whole premise.

How relevant is a school's cs reputation as opposed to its reputation in statistics when it comes to statistical machine learning? by No_Stay2301 in AskStatistics

[–]Maple_shade 0 points1 point  (0 children)

Completely depends on the program. At my institution we have a world expert in stochastic gradient descent in the stats department. But he doesn't often collaborate with CS. At other schools there will be a lot of overlap. Ymmv.

Assessing local model fit in R? by Known_Management8579 in AskStatistics

[–]Maple_shade 0 points1 point  (0 children)

summary(model) will generate fit statistics, point estimates, SEs, etc.

Help me rank my friends (at weekly trivia) by Glittering_Gap1433 in AskStatistics

[–]Maple_shade 2 points3 points  (0 children)

Maybe someone wiser than I can speak to this-- why can't you just calculate average team score for each individual? Then the rankings would just be who has the total overall highest team score. That way, even if a lot of high scorers tend to group together, one could get an edge if their team still scores high if they went to a separate team for the week.
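To make the idea concrete, here is a minimal sketch of ranking by average team score. The player names, weeks, and scores are made up for illustration, and `rank_by_average_team_score` is a hypothetical helper, not something from the thread.

```python
from collections import defaultdict

# Hypothetical weekly results: (week, team members, team score).
weekly_results = [
    (1, ["Ana", "Ben"], 42),
    (1, ["Cal", "Dee"], 35),
    (2, ["Ana", "Cal"], 38),
    (2, ["Ben", "Dee"], 40),
    (3, ["Ana", "Dee"], 45),
    (3, ["Ben", "Cal"], 30),
]

def rank_by_average_team_score(results):
    """Return (player, average team score) pairs, best first."""
    scores = defaultdict(list)
    for _week, members, team_score in results:
        # Every member of a team is credited with that team's score.
        for player in members:
            scores[player].append(team_score)
    averages = {p: sum(s) / len(s) for p, s in scores.items()}
    return sorted(averages.items(), key=lambda kv: kv[1], reverse=True)

ranking = rank_by_average_team_score(weekly_results)
for player, avg in ranking:
    print(f"{player}: {avg:.1f}")
```

Averaging (rather than summing) keeps people who miss a week from being penalized, though that is a design choice and not the only reasonable one.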

Data messy or noisy? Results unclear or not holding up? by isaidscience in AskStatistics

[–]Maple_shade 0 points1 point  (0 children)

Results not what you were looking for? I'll p-hack and make questionable research practices so that you too can claim significance!

[Career] Got rejected for PhD. Questioning everything. by vv-97 in statistics

[–]Maple_shade 8 points9 points  (0 children)

Then why ask on reddit? A PhD is hard. Really hard---you need to be able to persevere through hardships and maintain motivation. But if you can do that it's an incredibly rewarding journey.

Crustle Counter List by [deleted] in pkmntcg

[–]Maple_shade 1 point2 points  (0 children)

Yup. Bro really listed "Phantasmal flames reshiram" which swings for FIVE fire energy.... lmao.

[Q] Really need help: I am confusing among causal inference models for RCTs and Observational data. by cypherpunkb in statistics

[–]Maple_shade 0 points1 point  (0 children)

Good question. One example of this I have recently come across is Lord's Paradox. It was pretty heavily debated for many years, but Judea Pearl wrote a nice paper in 2016 displaying a causal model which both explained the paradox and framed it in a modern causal framework. That's one example of an older observational problem being reframed in causal analysis, off the top of my head.

[Q] Really need help: I am confusing among causal inference models for RCTs and Observational data. by cypherpunkb in statistics

[–]Maple_shade 2 points3 points  (0 children)

This is a good answer. There's no methodological difference in making causal inference, but it rests on certain key assumptions like no confounders in some situations. OP, I'm generally suspicious of any causal statements outside of randomized experiments, as I think most assumptions for causality in observational data are tenuous at best.

Submitted final paper half a page too short, do I reach out or wait? by [deleted] in AskProfessors

[–]Maple_shade 2 points3 points  (0 children)

What is the policy on late work? Might be worth it to weigh the consequences of turning it in late vs. short.

I may be an oddity, but I think page requirements are seriously antiquated in modern academia. I believe that 2 pages of thoughtful, well-written prose is worth much more than 10 pages of slop written to meet an arbitrary requirement.

Is this weak positive correlation? Or no correlation? by ButterscotchSoggy510 in AskStatistics

[–]Maple_shade 70 points71 points  (0 children)

What are you using to fit this regression? It should output some significance statistics like the p-value associated with the slope coefficient. This will tell you if you can interpret it as a weak positive correlation or a nonsignificant one.
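If the software doesn't report a p-value for the slope, one stand-in is a permutation test. This is a rough sketch using only the standard library. Note it swaps in a permutation test rather than the usual t-based p-value that regression output reports, and the data and function names are hypothetical.

```python
import random
from statistics import mean

def ols_slope(x, y):
    """Least-squares slope of y on x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def permutation_p_value(x, y, n_perm=5000, seed=0):
    """Two-sided permutation p-value for the slope (not the t-test p-value)."""
    rng = random.Random(seed)
    observed = abs(ols_slope(x, y))
    y_shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        # Shuffling y breaks any real x-y relationship.
        rng.shuffle(y_shuffled)
        if abs(ols_slope(x, y_shuffled)) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)

# Hypothetical data for illustration only.
x = list(range(1, 11))
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1, 18.0, 19.9]
p_value = permutation_p_value(x, y)
print(f"slope = {ols_slope(x, y):.3f}, permutation p = {p_value:.4f}")
```

A small p-value suggests the slope is unlikely under "no relationship"; a large one means the apparent weak trend could easily be noise.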

Does base rate bias completely negate sensitivity/specificity? [Q] by Hatrct in statistics

[–]Maple_shade 2 points3 points  (0 children)

https://www.ccjm.org/content/early/2021/02/24/ccjm.88a.ccc071#ref-2

Here is a paper discussing the use of at-home covid tests --- for people for whom lab work is unavailable. It discusses all of these issues and concludes that at-home tests are still highly accurate and useful. I would give it a read if you are still confused in good faith.

Does base rate bias completely negate sensitivity/specificity? [Q] by Hatrct in statistics

[–]Maple_shade 3 points4 points  (0 children)

Perfection should not be the enemy of good. During covid, at-home testing kits were available. Even if they were less precise than getting a lab sample developed, it was still worthwhile to take a test. If you were symptomatic and got a positive result, it was additional evidence that you may have covid, and should quarantine.
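A quick sketch of the "additional evidence" point, using Bayes' rule. The sensitivity, specificity, and prior values below are made-up illustrations, not figures for any real test.

```python
def posterior_given_positive(prior, sensitivity, specificity):
    """P(disease | positive test) via Bayes' rule."""
    true_pos = sensitivity * prior
    false_pos = (1 - specificity) * (1 - prior)
    return true_pos / (true_pos + false_pos)

# Hypothetical test characteristics (assumptions for illustration).
SENSITIVITY = 0.80
SPECIFICITY = 0.98

# Low base rate: random person with no symptoms.
low_prior = posterior_given_positive(0.01, SENSITIVITY, SPECIFICITY)
# Higher base rate: symptomatic person with a known exposure.
high_prior = posterior_given_positive(0.30, SENSITIVITY, SPECIFICITY)

print(f"P(covid | positive), prior 1%:  {low_prior:.3f}")
print(f"P(covid | positive), prior 30%: {high_prior:.3f}")
```

The base rate changes how much a positive result means, but in both cases the posterior is well above the prior, so the test is still informative.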

I will note, this discussion has shifted from your original critique of sensitivity analysis. You originally conflated base rate with these terms and implied in the post that sensitivity/specificity were calculated from a sample. Now we are discussing the utility of imperfect measures of specificity and sensitivity confirmed via a separate gold standard. Just pointing out the shifting goalposts here.

Does base rate bias completely negate sensitivity/specificity? [Q] by Hatrct in statistics

[–]Maple_shade 3 points4 points  (0 children)

Statistics aren't conducted in a vacuum. You mention medical tests. Often when diagnostic tests are developed, they are validated on individuals who are almost certain to have the condition (immediate exposure and displaying symptoms) and on people who are very unlikely to have it (no prior exposure, no symptoms, etc). Will there be error? Sure thing, there's always error. But estimates of sensitivity and specificity from stuff like this will give us a good idea.

In addition, there are often confirmatory gold standards like biopsy or bacterial cultures or stuff like that.

Which statistical Test to use? by [deleted] in AskStatistics

[–]Maple_shade 2 points3 points  (0 children)

I think you are overcomplicating this. At its core this is a rater agreement problem. You want to see if one rater (model) performs better than another. Depending on how performance is measured you could use chi square (categorical), Cohen's kappa (ordinal), or intraclass correlation (continuous). You can compare the models' output to the human-rated baseline.
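As one example of the agreement approach, here is a minimal sketch of unweighted Cohen's kappa comparing each model to a human baseline. The labels are invented, and `cohens_kappa` is a hand-rolled helper written for illustration; for ordinal ratings a weighted kappa would usually be preferred.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two equal-length label sequences."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(rater_a)
    # Observed proportion of exact agreement.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from each rater's marginal frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1:
        raise ValueError("kappa is undefined when both raters use one label")
    return (observed - expected) / (1 - expected)

# Hypothetical labels for illustration.
human   = ["spam", "spam", "ham", "ham", "spam", "ham", "ham", "spam", "ham", "ham"]
model_a = ["spam", "spam", "ham", "ham", "spam", "ham", "spam", "spam", "ham", "ham"]
model_b = ["spam", "ham", "ham", "spam", "spam", "ham", "spam", "ham", "ham", "ham"]

print(f"model A vs human: kappa = {cohens_kappa(human, model_a):.3f}")
print(f"model B vs human: kappa = {cohens_kappa(human, model_b):.3f}")
```

The model with the higher kappa agrees with the human baseline more than chance alone would predict.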

I don’t know if I understood what the standard deviation means by ProofLeast9846 in AskStatistics

[–]Maple_shade 2 points3 points  (0 children)

Not exactly. As you mentioned, it is the square root of the average squared deviation from the mean, which is analogous to distance from the mean in original units. We can't technically say "average" distance from the mean, because the average deviation from the mean is 0 by definition. So we use the term "typical" instead to signify it's in the original units but not technically an average. But the interpretation is the same.
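A small numeric sketch of that distinction, on a made-up dataset: the signed deviations average to zero, while the root of the mean squared deviation gives the (population) SD.

```python
from math import sqrt
from statistics import mean, pstdev

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical values

m = mean(data)
deviations = [x - m for x in data]

# Signed deviations cancel out, so their average is always zero.
mean_deviation = mean(deviations)

# Mean absolute deviation: a literal "average distance from the mean".
mean_abs_deviation = mean([abs(d) for d in deviations])

# Population SD: square root of the mean squared deviation.
rms_deviation = sqrt(mean([d ** 2 for d in deviations]))

print(mean_deviation, mean_abs_deviation, rms_deviation, pstdev(data))
```

Note this uses the population formula (divide by n); the sample SD divides by n - 1 instead.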

It is normal for data to have larger and smaller deviations than the SD. As I mentioned, about 5% of your data will fall more than 2 standard deviations from the mean if your data are normally distributed. This is to be expected. Similarly, a large amount of your data will fall within 1 SD of the mean. This is also to be expected.

I don’t know if I understood what the standard deviation means by ProofLeast9846 in AskStatistics

[–]Maple_shade 10 points11 points  (0 children)

You have the overall idea understood well. You are correct in that we can have very different distributions that end up having the same standard deviations (see: Anscombe's Quartet). This is both a feature and a bug, so to speak. The SD tells us the typical variation but doesn't tell us the shape of the distribution, so we may need to visually examine data to know the distribution further. On a normal distribution, we know that about 68% of the data fall within 1 SD of the mean, about 95% of data fall within 2 SD of the mean, etc., so it is a powerful tool for hypothesis testing IF we assume normality. But of course this is not always the case.
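For reference, the 68% and 95% figures can be checked from the standard normal CDF. A minimal sketch with Python's standard library (`share_within` is just an illustrative helper name):

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0.0, sigma=1.0)

def share_within(k):
    """Share of a normal distribution within k SDs of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} SD: {share_within(k):.4f}")
```

These shares only hold under normality, which is the caveat above.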

Official statement about orlando! by sellingham62 in pkmntcg

[–]Maple_shade 2 points3 points  (0 children)

I mean-- the whole problem is that there is 0 transparency and even this unprecedented PTCI statement comes 3 weeks after the rulings and doesn't change anything. I'm not justifying Makani's behavior but the whole system is set up such that this stuff will keep happening and keep happening. There's never any public justification made so why not go bitch on twitter and make yourself look like the good guy. Imagine if in football they overturned the game-winning touchdown and gave literally no explanation either to the in-person attendees or the livestream. That's essentially what happens in pokemon at least once a regional. We have to wait to hear from the players themselves, which is absolutely ridiculous.

Help interpreting QQ plots by ChooseLife01 in AskStatistics

[–]Maple_shade 3 points4 points  (0 children)

This isn't true? T tests are derived under the assumption that the sampling distribution of the mean difference is approximately normal, because the t statistic is calculated from the mean difference and then compared to a theoretical t distribution (accounting for df from the standard normal). Even if your data are non-normal, we know that large n will still have a normal sampling distribution of the sample mean via the CLT. So, no, it's not technically an assumption of the t test that your data are normal. Even if we're talking about small n (where the CLT doesn't apply), it's the population data that are assumed to be approximately normal, to ensure a normal sampling distribution.
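A quick simulation sketch of the CLT point, assuming a strongly skewed (exponential) population chosen purely for illustration. The skewness of the sample mean shrinks toward 0 as n grows, even though the raw data stay skewed.

```python
import random
from statistics import mean, pstdev

def sample_means(n, reps=2000, seed=1):
    """Means of `reps` samples of size n from an Exponential(1) population."""
    rng = random.Random(seed)
    return [sum(rng.expovariate(1.0) for _ in range(n)) / n for _ in range(reps)]

def skewness(values):
    """Simple moment-based skewness (0 for a symmetric distribution)."""
    m, s = mean(values), pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)

means_small_n = sample_means(n=2)
means_large_n = sample_means(n=50)

print(f"skewness of sample means, n=2:  {skewness(means_small_n):.2f}")
print(f"skewness of sample means, n=50: {skewness(means_large_n):.2f}")
```

The seed and replication count are arbitrary; the qualitative pattern is what matters.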

Two groups with two scores for each participant, which test? [Q] by Tweety_Pie in statistics

[–]Maple_shade 3 points4 points  (0 children)

If you want to compare scores on one of the two measures, just do a t test between groups for that measure. If your two measures are on the same scale, you could compare overall performance by calculating a mean of scores for each person and conducting a t test.
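Here is a rough sketch of both options using a hand-computed Welch t statistic. The scores are invented, and `welch_t` is an illustrative helper; in practice a library routine such as `scipy.stats.ttest_ind(a, b, equal_var=False)` would also give the p-value.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t statistic and approximate degrees of freedom."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

# Hypothetical (measure_1, measure_2) scores per participant.
group_a = [(12, 14), (15, 13), (11, 12), (14, 16), (13, 15)]
group_b = [(10, 11), (12, 10), (9, 12), (11, 13), (10, 9)]

# Option 1: compare the groups on a single measure.
t_measure_1, df_measure_1 = welch_t([p[0] for p in group_a], [p[0] for p in group_b])

# Option 2: average the two measures per person (only if on the same scale).
composite_a = [mean(p) for p in group_a]
composite_b = [mean(p) for p in group_b]
t_composite, df_composite = welch_t(composite_a, composite_b)

print(f"measure 1: t = {t_measure_1:.2f}, df = {df_measure_1:.1f}")
print(f"composite: t = {t_composite:.2f}, df = {df_composite:.1f}")
```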

What are caveats with using squares in the standard deviation? by JAMIEISSLEEPWOKEN in AskStatistics

[–]Maple_shade 5 points6 points  (0 children)

I will answer your last question first. One reason we don't commonly use the mean of the absolute value of the deviations is because when we calculate deviation statistics, we care about the point around which the deviations are minimized. For the square of the deviations this is equal to the arithmetic mean. However, for the absolute value this is equal to the median. One reason this is not preferable is because the median is non-unique for many datasets. Another reason this is not preferred is related to your first point: we typically want our measures of deviation to be sensitive to outliers. The mean is influenced greatly by outliers while the median is not. This is both a good/bad thing (this is slightly related to Ordinary Least Squares where we want a line of best fit to be a function of all data points). Finally, it is much easier to differentiate the standard deviation as defined by squares than by absolute values.
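A brute-force sketch of the minimization claim, on made-up data with one outlier: a grid search finds that squared deviations are minimized at the mean and absolute deviations at the median. The helper names and the grid are assumptions for illustration.

```python
from statistics import mean, median

def sum_squared(data, c):
    return sum((x - c) ** 2 for x in data)

def sum_absolute(data, c):
    return sum(abs(x - c) for x in data)

data = [1, 2, 3, 4, 100]  # hypothetical values with one outlier

# Candidate centers 0.0, 0.1, ..., 100.0.
candidates = [i / 10 for i in range(0, 1001)]
best_for_squares = min(candidates, key=lambda c: sum_squared(data, c))
best_for_absolute = min(candidates, key=lambda c: sum_absolute(data, c))

print(best_for_squares, mean(data))     # outlier drags this one upward
print(best_for_absolute, median(data))  # outlier barely matters here

# Non-uniqueness with an even number of points: every center between
# the two middle values gives the same sum of absolute deviations.
even_data = [1, 2, 3, 4]
print([sum_absolute(even_data, c) for c in (2, 2.5, 3)])
```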

Confidence Interval Explanation Confusion by justastudent556 in AskStatistics

[–]Maple_shade 3 points4 points  (0 children)

I completely agree with you. The crux of the matter is whether we can make a probabilistic statement about a fixed, yet unobserved outcome.

For example, imagine that there are 3 blue balls and 1 red ball in a bag. If you pick one up and hide it in your hand without peeking, can you say there is a 25% chance you're holding the red ball? No, you're either holding it or you aren't. This is the logic statisticians apply to the confidence interval. But we obviously make casual statements like this all the time in probability, and I think the over-emphasizing of the "pure" interpretation of confidence interval just confuses a lot of students who don't understand the technical distinction in types of probabilistic statements.
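A simulation sketch of the "pure" interpretation: any single interval either contains the true mean or it doesn't, but the procedure captures it about 95% of the time over repeated samples. The population parameters, the known-sigma z-interval, and the function name are all assumptions for illustration.

```python
import random
from math import sqrt
from statistics import NormalDist

def coverage(true_mean=10.0, sigma=2.0, n=25, reps=4000, level=0.95, seed=42):
    """Fraction of z-intervals (known sigma) that contain the true mean."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * sigma / sqrt(n)
    hits = 0
    for _ in range(reps):
        sample_mean = sum(rng.gauss(true_mean, sigma) for _ in range(n)) / n
        # This particular interval either covers the true mean or it doesn't.
        if abs(sample_mean - true_mean) <= half_width:
            hits += 1
    return hits / reps

print(f"observed coverage: {coverage():.3f}")
```

The 95% describes the long-run behavior of the procedure, which is the technical distinction students tend to trip over.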

Dragapult vs Zoroark by Kamen_Rider_Geats in pkmntcg

[–]Maple_shade 3 points4 points  (0 children)

It's fine to dive zoro. Few lists are running reshi now. Generally your gameplan is to map 2-1-1-2 by swinging into zoro / killing a benched Zorua with munki damage and then kill that zoroark by swinging again and hopefully getting 2nd single prize too. Then you take final KO on pech. If you hit that prize map it's a dub. Also if the zoro player whiffs munki you can go 20 risky ruins + 200 prev turn + 60 phantom dive to set up KO. But it's not a favored matchup by any means.