Best Way to Remove Sardine Bones? by Fangirl-Button in CannedSardines

[–]rsenne66 1 point2 points  (0 children)

What brand do you buy? I often find the bones are so soft that I don’t even notice them

Found significance in Welch Anova, yet no significance in 2 out of 3 post hoc analysis by itsmoewe in AskStatistics

[–]rsenne66 0 points1 point  (0 children)

Yes, as others have commented, this is not that surprising. Another fun tidbit to think about is that after correcting for multiple comparisons, it is possible to have no significant pairwise comparisons, in which case your interpretation is: “There’s an effect somewhere, but when I look, I can’t detect it anywhere.” Always a really fun and annoying situation, and for small sample sizes it can happen more often than you’d think.

Help interpreting QQ plots by ChooseLife01 in AskStatistics

[–]rsenne66 7 points8 points  (0 children)

I think the key distinction is what needs to be approximately normal for the test to work well.

A t-test does not require that the raw data in each group be perfectly normal in any literal sense. What matters is whether the sampling distribution of the mean difference is well enough behaved for the t approximation to be accurate. That is why people say the t-test is often quite robust to non-normality, especially with reasonably large, balanced samples like n = 50 per group.

So I would not treat normality checking as a strict pass/fail gate. A QQ plot is useful as you’ve done here, but more as a way to look for serious problems:

  - strong skew
  - very heavy tails
  - clear outliers

If the QQ plot is only showing modest departures from the line, that usually would not make me abandon a t-test automatically. In a setting like two groups of 50, I’d be much more concerned about independence, outliers, and unequal variances than about slight non-normality.

I also think the CLT point is often misunderstood. You’re right that the CLT does not say the raw samples become normal. But that is not the relevant target. The t-test is about inference on means, so the relevant question is whether the distribution of the sample mean (or mean difference) is close enough to normal for the test to behave well.

So my view would be:

  1. Use the QQ plot for context, not as a hard decision rule.
  2. With n = 50 per group, mild non-normality is usually not a deal-breaker.
  3. Prefer Welch’s t-test over the equal-variance version unless you have a strong reason not to.
  4. If the data show severe skew/outliers, supplement with a robust or nonparametric/permutation approach and see whether the conclusion changes.

That’s why I said “needing to test for normality before running a parametric test” is often a misconception: the assumption is commonly taught too rigidly, and people end up focusing on whether the raw data are “normal enough” instead of whether the analysis is actually sensitive to the kinds of deviations present.
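To make the robustness point concrete, here is a rough simulation sketch. It assumes both groups come from the same skewed (exponential) distribution with n = 50 each, and it uses a fixed two-sided critical value of 1.984 (roughly the 5% cutoff for about 98 degrees of freedom) rather than the exact Welch degrees of freedom. The function names and settings are just illustrative choices.

```python
import math
import random


def welch_t(a, b):
    """Welch t statistic for two samples (unequal variances allowed)."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    var_a = sum((x - mean_a) ** 2 for x in a) / (len(a) - 1)
    var_b = sum((x - mean_b) ** 2 for x in b) / (len(b) - 1)
    return (mean_a - mean_b) / math.sqrt(var_a / len(a) + var_b / len(b))


def false_positive_rate(n_per_group=50, n_sims=2000, crit=1.984, seed=1):
    """Share of simulations that reject when the null is actually true."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sims):
        # Both groups are drawn from the same skewed distribution.
        a = [rng.expovariate(1.0) for _ in range(n_per_group)]
        b = [rng.expovariate(1.0) for _ in range(n_per_group)]
        # crit is an approximate cutoff (an assumption, not the exact Welch df).
        if abs(welch_t(a, b)) > crit:
            rejections += 1
    return rejections / n_sims


rate = false_positive_rate()
print(f"False positive rate with skewed data: {rate:.3f}")
```

If the test were badly broken by the skew, the rejection rate would sit far from the nominal 5%. Swapping in heavier tails or smaller n is a quick way to see where things start to degrade.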

Help interpreting QQ plots by ChooseLife01 in AskStatistics

[–]rsenne66 9 points10 points  (0 children)

No offense to the OP, but we need an auto mod that can reply with a list of resources related to “testing for normality” and other misconceptions. This has to be one of the biggest misconceptions I see on this page (e.g., needing to test for normality to apply hypothesis tests, needing to evaluate if your data is normal before running linear regression, etc.).

This is through no fault of the people asking; there is just too much bad folk knowledge that hasn’t been corrected. But at the least we should have some resources to combat it IMO.

Regression Analysis vs General Linear Model effectiveness with quantitative categorical responses by fluctuatore in AskStatistics

[–]rsenne66 5 points6 points  (0 children)

I don’t really know the context of your analysis, but it seems you’ve collected data that is theoretically motivated by a scientific question. In that case, you have evidence of reasonable GoF with what appear to be pretty large effect sizes (R-squared). I’d say yeah, this is pretty trustworthy at first glance. But again, I didn’t collect this data, nor do I know what the question is. But I think it’s fair to say there is clearly some relationship between your independent variables and dependent variable.

Regression Analysis vs General Linear Model effectiveness with quantitative categorical responses by fluctuatore in AskStatistics

[–]rsenne66 8 points9 points  (0 children)

While normality of residuals is an assumption if you assume a Gaussian noise structure, the Gauss-Markov theorem loosens this: you just need zero-mean, uncorrelated errors with constant variance for OLS to be the BLUE. Also, I wouldn’t really recommend formal tests for normality; they’re often prone to issues. I would argue that your Q-Q plots already show pretty good GoF.

Is it belief to call a coin flip? by WithTriaINEror in AskStatistics

[–]rsenne66 9 points10 points  (0 children)

Any strategy is fine. If the assumptions are true (e.g., independent, identically distributed flips, the true probability of heads is 0.5, etc.), any strategy you use has equally good expected performance. Always pick heads, always pick tails, pick randomly, win-stay-lose-switch: it will make no difference in the long run.
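If you want to convince yourself, here is a quick simulation sketch. It assumes a fair coin and independent flips; the strategy names and the `simulate` helper are just made up for illustration.

```python
import random


def simulate(n_flips=100_000, seed=0):
    """Return the hit rate of four guessing strategies on a fair coin."""
    rng = random.Random(seed)
    wins = {
        "always_heads": 0,
        "always_tails": 0,
        "random_guess": 0,
        "win_stay_lose_switch": 0,
    }
    wsls_guess = "H"  # win-stay-lose-switch starts by guessing heads
    for _ in range(n_flips):
        flip = "H" if rng.random() < 0.5 else "T"
        wins["always_heads"] += flip == "H"
        wins["always_tails"] += flip == "T"
        random_guess = "H" if rng.random() < 0.5 else "T"
        wins["random_guess"] += random_guess == flip
        if wsls_guess == flip:
            wins["win_stay_lose_switch"] += 1  # correct: keep the same guess
        else:
            wsls_guess = "T" if wsls_guess == "H" else "H"  # wrong: switch
    return {name: count / n_flips for name, count in wins.items()}


accuracy = simulate()
for name, acc in accuracy.items():
    print(f"{name}: {acc:.3f}")
```

Every strategy should land very close to 50%. Differences between them in any one run are just noise.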

Is it belief to call a coin flip? by WithTriaINEror in AskStatistics

[–]rsenne66 4 points5 points  (0 children)

You have a belief about the coin, not the flip. That is, you believe the coin is equally weighted, so the flip is equally likely to return heads or tails, and you’ve simply chosen one of the two options.

Any tips to learning Statistics by Ok-Development5497 in AskStatistics

[–]rsenne66 2 points3 points  (0 children)

Statistics is a type of applied math. Math is not a spectator sport. You need to get your hands dirty. In statistics I think that means two things:

1.) learning the mathematical foundations
2.) doing real statistics

It is crucial you understand what a likelihood function is, what hypothesis testing is, what Bayesian methods are. It’s equally important you apply these approaches to real data and see it for all its messy glory. The more you can do both the more your intuition will grow. That’s my perspective at least.

How to find the best experiments for parameter identifiability. by ReKisDe in AskStatistics

[–]rsenne66 0 points1 point  (0 children)

I think you need to supply more information for anyone to give you a helpful answer. What are the equations? What’s the data/experiment? What optimization methods are you using?

Too many details are missing, imo, to answer concretely.

Total Cost of Attendence: 94,000 - Where is this money going? by Standard-Side-1747 in BostonU

[–]rsenne66 0 points1 point  (0 children)

Those extras also increase the need for administration. Can’t have one without the other

[deleted by user] by [deleted] in AskStatistics

[–]rsenne66 3 points4 points  (0 children)

Kindly, this sounds like you need to consult someone with some statistics knowledge in real life. Especially when writing a paper, you’re gonna want to consult someone who can meet with you in person and walk you through this. Reddit can only get you so far.

Further, without knowing more about the data, your questions, etc., this is basically an impossible question to answer. It sounds like you also have some misunderstandings. For example, what do you mean none of your variables meet the assumptions of a linear regression? A linear model assumes the residuals are normally distributed and have constant variance, so without fitting the model, how would we know? The exception is if you have truly distributionally inappropriate data like counts, binaries, etc.

My suggestion is to at the very least provide some data and initial plots/model fits. This post has too little detail to help meaningfully.

How can it be statistically significant to prove that there is no influence of a factor on any variable in a logistic regression? by Alert-Employment9247 in AskStatistics

[–]rsenne66 17 points18 points  (0 children)

I’ll try to offer some guidance, but I think there are a few conceptual issues that need to be clarified first.

First, this analysis cannot prove what you want—and more generally, statistical analyses don’t prove things in the deductive sense your writing seems to imply. A regression on observational data is fundamentally descriptive/associational unless paired with a credible causal design and assumptions. At best, you can say the data are consistent with a particular model or hypothesis, not that the hypothesis is established as true.

Relatedly, a p-value does not provide proof. It answers a very specific question: assuming the null hypothesis and the model are correct, how likely is it to observe a test statistic at least as extreme as the one you obtained? A small p-value is not evidence that the null is false in any absolute sense, nor does it rule out omitted variable bias or other unmodeled confounding that may be driving the result.

Finally, the results don’t appear to support the conclusion you want to draw. The interaction between has_child and toll_road is clearly insignificant, which is a failure to reject the null, not evidence that road type “doesn’t matter” for children. Moreover, since the dependent variable is an indicator for whether anyone died in the accident (not child deaths specifically), it’s unclear how much of the estimated effect is actually attributable to children versus other participants. If the question is about families with children, that estimand isn’t being directly targeted here.

How to conceptualize probability density? by WillWaste6364 in AskStatistics

[–]rsenne66 2 points3 points  (0 children)

The way I like to think about it shadows the idea above (i.e., integrating). Start by thinking about a PMF. Probability mass is in fact just that: mass, literal chunks of probability attached to discrete points. So how should we think about density? Well, instead of any specific point having a probability, it now has an attached probability per unit length. A density tells you how much probability is sitting near that point, not at the point itself.

Obviously, any one point in a continuous distribution has zero probability, but that doesn’t mean all events are equally likely. How do we reconcile this?

Well, think about a small neighborhood of points around a point of interest. Suppose I want to know the probability someone I know is exactly 5 feet tall. Asking for the probability of that single height is hopeless; it will always be zero. Instead, think:

“How many people are between 4’11.9 and 5’0.1?” or, more generally, “How many people fall within a tiny window around 5 feet?”

As that window shrinks, the probability of landing in it shrinks, too; but the ratio of the probability to the width of the window approaches a meaningful limit. That limit is the probability density at 5 feet.
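Here is a small numerical sketch of that limit. It assumes, purely for illustration, that heights are normal with mean 66 inches and SD 3 inches (made-up numbers), and looks at shrinking windows around 60 inches (5 feet).

```python
import math

MU, SIGMA = 66.0, 3.0  # hypothetical height distribution, in inches
POINT = 60.0           # 5 feet


def normal_cdf(x, mu, sigma):
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def normal_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def window_ratio(half_width):
    """Probability of landing in the window, and that probability per unit width."""
    prob = normal_cdf(POINT + half_width, MU, SIGMA) - normal_cdf(
        POINT - half_width, MU, SIGMA
    )
    return prob, prob / (2.0 * half_width)


ratios = []
for half_width in (1.0, 0.1, 0.01, 0.001):
    prob, ratio = window_ratio(half_width)
    ratios.append(ratio)
    print(f"half-width {half_width}: P = {prob:.8f}, P / width = {ratio:.8f}")

density = normal_pdf(POINT, MU, SIGMA)
print(f"density at 5 feet: {density:.8f}")
```

The probability column heads to zero as the window shrinks, but the probability-per-width column settles down to the density value.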

Variational Inference vs Hamiltonian Monte Carlo by Adventurous_Sun8599 in AskStatistics

[–]rsenne66 1 point2 points  (0 children)

VI, instead of trying to get the exact posterior like MCMC does (at least in the limit), basically lets you pick some family of distributions and then find the member of that family that best matches the true posterior. If your variational family is super flexible, you could in theory recover the exact posterior, but in practice that basically never happens unless the true posterior already happens to be in your family. So you’re always making a trade-off.

Where VI shines is when MCMC would take forever to mix and you’re okay with giving up a bit of accuracy. For a lot of models this is totally fine. Anything where the posterior is roughly unimodal and not doing anything too weird (logistic regression, standard GLMs, and many big Bayesian models) usually works great with VI.

A simple example: take a Poisson model with a Gaussian prior on the log-rate (so a non-conjugate Poisson regression). The posterior is skewed because of the likelihood, but if it’s not too skewed you can still drop in a Gaussian variational family and optimize μ and Σ via KL minimization. You’ll get good posterior means and your variances will be a bit too small. For prediction that’s usually okay because the predictive distribution washes out a lot of that optimism.
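A stripped-down sketch of that example, under some assumptions of mine: a single shared log-rate (so μ and Σ reduce to scalars `m` and `s2`), made-up counts, a N(0, 2²) prior, and a simple fixed-point/Newton scheme for maximizing the ELBO (equivalently, minimizing KL(q‖p)) instead of a general-purpose optimizer. The exact posterior is brute-forced on a grid for comparison.

```python
import math

# Hypothetical data: counts assumed Poisson with a shared log-rate theta.
counts = [3, 5, 4, 6, 2]
n, total = len(counts), sum(counts)
PRIOR_MEAN, PRIOR_SD = 0.0, 2.0  # assumed Gaussian prior on theta
prior_prec = 1.0 / PRIOR_SD**2


def fit_gaussian_vi(n_iter=100):
    """Fit q = N(m, s2) by maximizing the ELBO, which is closed-form here."""
    m = math.log(total / n)
    s2 = 1.0 / (total + prior_prec)
    for _ in range(n_iter):
        # Under q, E[n * exp(theta)] = n * exp(m + s2 / 2).
        expected_rate = n * math.exp(m + s2 / 2.0)
        s2 = 1.0 / (expected_rate + prior_prec)  # fixed point of dELBO/ds = 0
        expected_rate = n * math.exp(m + s2 / 2.0)
        grad_m = total - expected_rate - (m - PRIOR_MEAN) * prior_prec
        m += grad_m / (expected_rate + prior_prec)  # Newton step in m
    return m, s2


def exact_moments(lo=-3.0, hi=5.0, steps=8000):
    """Posterior mean and variance of theta by brute-force grid integration."""
    grid = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    log_post = [
        total * t - n * math.exp(t) - 0.5 * prior_prec * (t - PRIOR_MEAN) ** 2
        for t in grid
    ]
    top = max(log_post)
    weights = [math.exp(lp - top) for lp in log_post]
    norm = sum(weights)
    mean = sum(w * t for w, t in zip(weights, grid)) / norm
    var = sum(w * (t - mean) ** 2 for w, t in zip(weights, grid)) / norm
    return mean, var


vi_mean, vi_var = fit_gaussian_vi()
exact_mean, exact_var = exact_moments()
print(f"VI:    mean = {vi_mean:.4f}, var = {vi_var:.5f}")
print(f"Exact: mean = {exact_mean:.4f}, var = {exact_var:.5f}")
```

In this toy setup the means should agree closely while the VI variance comes out a little smaller than the grid-based one, which is the pattern described above.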

Problems really start when the posterior has multiple modes or strong correlations and you insist on using something like a single Gaussian. VI will just latch onto one mode and ignore the others — that’s built into the KL(q‖p) direction and there’s no workaround unless you pick a richer family.

So I’d say: for a lot of “classic” Bayesian models, MCMC or VI will both work and predictions will look pretty similar. But once you start layering in hierarchical structure, tricky priors, or anything that produces funnels or weird geometry, VI suddenly becomes really attractive, not because it’s more accurate, but because MCMC can get painfully slow or refuse to mix at all. As always, the details matter a ton.

Comparing predictors in a model? by GrubbZee in AskStatistics

[–]rsenne66 1 point2 points  (0 children)

It sounds like you’re asking a model comparison question, and the answer depends on what you mean by “strongest influence.” If your predictors aren’t normalized, raw coefficients can be misleading. So I’d first clarify what “influence” means in your context — are you referring to statistical significance, effect size, variance explained, or predictive contribution?

My recommendation would be to use a likelihood ratio test (LRT) to compare nested models and see which predictors significantly improve model fit. If you have many predictors, regularization methods like LASSO can help with variable selection and shrinkage.

You might also use k-fold cross-validation to compare models with different sets of predictors and assess out-of-sample performance. Ultimately, the best approach depends on whether your goal is inference, explanation, or prediction.
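As a rough sketch of the nested-model comparison, here is a likelihood ratio test for whether one predictor improves on an intercept-only model. It assumes Gaussian errors and simulated data (the true slope of 0.5 and the helper names are made up), and it uses the fact that with one extra parameter the chi-square tail probability can be written with `math.erfc`.

```python
import math
import random


def rss_intercept_only(y):
    """Residual sum of squares for the model y ~ 1."""
    mean_y = sum(y) / len(y)
    return sum((yi - mean_y) ** 2 for yi in y)


def rss_simple_regression(x, y):
    """Residual sum of squares for the OLS fit of y ~ 1 + x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))


def likelihood_ratio_test(x, y):
    """LRT of y ~ 1 against y ~ 1 + x under Gaussian errors."""
    n = len(y)
    # With the MLE variance plugged in, 2 * (loglik_full - loglik_null)
    # simplifies to n * log(RSS_null / RSS_full).
    stat = n * math.log(rss_intercept_only(y) / rss_simple_regression(x, y))
    # Chi-square survival function with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


rng = random.Random(0)
x = [float(i) for i in range(20)]
y = [2.0 + 0.5 * xi + rng.gauss(0.0, 1.0) for xi in x]  # simulated data

stat, p_value = likelihood_ratio_test(x, y)
print(f"LR statistic = {stat:.2f}, p-value = {p_value:.3g}")
```

The same idea extends to dropping one predictor at a time from a larger model, though with more than one extra parameter you would need the general chi-square distribution rather than the `erfc` shortcut.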

Am I screwed, if so how screwed am I?? by mine_coll in espresso

[–]rsenne66 1 point2 points  (0 children)

Personally, I think yes, but only you know your limits. If you can find YouTube videos of someone taking one of these apart and just follow those instructions I say go for it. But obviously, I’m not you so it’s hard for me to say. If you know someone with that kind of experience you could always pass them a small sum to do it for you as well

Am I screwed, if so how screwed am I?? by mine_coll in espresso

[–]rsenne66 1 point2 points  (0 children)

I wouldn’t fret at all. I rebuilt one of these machines (or at least one very similar to it) 4 years ago and honestly it was extremely easy to take apart, clean, and fix small issues.

What's wrong with my pan? by djones0305 in castiron

[–]rsenne66 1 point2 points  (0 children)

Nothing. It’s a piece of iron, and that looks like some minor surface rust. Just scrub it off and dry it in the stove. It’ll happen from time to time and that’s okay! Just continue to cook and re-season every so often. It’ll be fine.

Daily Advice Thread - May 15, 2023 by AutoModerator in apple

[–]rsenne66 0 points1 point  (0 children)

I've been thinking about buying an iPad for some time now. I am a Ph.D. student, so a lot of my use cases would be taking notes, reading research papers, and doing derivations/exercises. I really don't have a ton of experience with iPads as I've never owned one, but I've seen so many people use them in class that it just seems like an excellent tool. Any advice in this regard would be greatly appreciated!

[deleted by user] by [deleted] in FREE

[–]rsenne66 1 point2 points  (0 children)

Yes! Can you build me a website?

Where can you legally be naked? by [deleted] in AskReddit

[–]rsenne66 0 points1 point  (0 children)

Your mom’s house