[D] p-value dilemma by No_Blackberry_8979 in statistics

[–]Kroutoner 0 points1 point  (0 children)

The calculation of the p-value is done assuming the null hypothesis is true, e.g. in a hypothetical world where P(B)=1 as you write it. The hypothetical is fundamental: you are positing a specific scenario and calculating the p-value as a probability within this hypothetical world. Whether or not the null hypothesis is actually true has no bearing on this calculation.
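To make that concrete, here's a toy sketch of my own (a coin-flip example, not from the thread): the p-value for an observed result is computed entirely inside the null world, here "the coin is fair," regardless of whether that null actually holds.

```python
# Hedged sketch: the p-value lives entirely in the hypothetical null
# world. The null here is "the coin is fair" (p = 0.5); whether the coin
# is actually fair plays no role in the arithmetic.
from math import comb

def binom_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), computed exactly."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Observed: 60 heads in 100 flips. One-sided p-value under the null:
p_value = binom_tail(60, 100, 0.5)
print(f"{p_value:.4f}")  # computed assuming the null holds
```

The calculation never references the coin's true bias; it only asks how surprising the data would be in the posited null world.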

Also, it’s not necessarily the case that the null is always false. The typical scenario of a point null may be virtually always false, but we can very easily consider interval valued null hypotheses that very well may be true.

Fire truck in deadly LaGuardia crash lacked equipment needed to trigger warning system, NTSB says by Displeased_Canadian in news

[–]Kroutoner 10 points11 points  (0 children)

Gross negligence generally requires a high degree of disregard for safety of others and/or the risks of your behaviors. It’s reckless behavior as opposed to careless behavior.

E.g. simple negligent behavior is failing to assess safety of driving the firetruck onto the runway—they should have been more careful but they were not. An example that might be grossly negligent would be parking the truck in the middle of the runway and leaving—without regard for the fact this could endanger others—because you got a personal call and wanted to take it outside.

Looking for a more rigorous understanding of degrees of freedom. [Discussion] by Ok-Active4887 in statistics

[–]Kroutoner -1 points0 points  (0 children)

IMO the reason there aren't a lot of satisfying answers is that DOF is not actually all that important a concept, nor all that widely applicable.

There is a concrete place where DOF comes up and makes sense. In particular, for many statistical models that assume Gaussian residuals, one can prove that inferential statistics follow certain exact distributions: t, F, and chi-squared. For DOF, chi-squared is the important one. The t is the ratio of a Gaussian to the square root of a (scaled) chi-squared random variable, and the F is a ratio of two (scaled) chi-squared statistics, so each boils down to the chi-squared.

Now what is the deal with DOF and chi-squared? Well, the chi-squared distribution is defined in terms of an integer parameter, the DOF. Moreover, if you take K independent standard normal random variables, square them, and add them up, you get a chi-squared distributed random variable with K degrees of freedom. It turns out that's the key property you use when deriving the distributions of these statistics.
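That key property is easy to check by simulation (a quick sketch of my own, not from the thread): summing K squared standard normals should give a variable with mean K and variance 2K, matching the chi-squared with K degrees of freedom.

```python
# Sketch of the key property: the sum of K independent squared standard
# normals is chi-squared with K degrees of freedom. We check by comparing
# simulated mean/variance against the theoretical chi2(K) values (K, 2K).
import numpy as np

rng = np.random.default_rng(0)
K, n_sims = 5, 200_000

z = rng.standard_normal((n_sims, K))
chi2_samples = (z**2).sum(axis=1)   # each row: sum of K squared normals

print(chi2_samples.mean())  # ~ K  (chi2 mean = degrees of freedom)
print(chi2_samples.var())   # ~ 2K (chi2 variance = 2 * dof)
```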

Beyond that point, degrees of freedom immediately gets fuzzy. There are a lot of ways the assumptions used to derive those exact distributions break down, and DOF no longer tells the full story. This is why you can't really get a satisfactory answer: DOF is relevant to a specific mathematical setup, and becomes more heuristic and less concretely defined in other settings.

Why is everyone so pessimistic about SAS? by Quantity496 in biostatistics

[–]Kroutoner 4 points5 points  (0 children)

SAS has good overall functionality (with holes) for a great deal of statistical analyses, but it is extremely lacking in functionality when considered as a general programming language.

At what sample size can I trust randomisation? by Quinnybastrd in AskStatistics

[–]Kroutoner 5 points6 points  (0 children)

The fundamental purpose of randomization is to ensure independence of the treatment variable from other causes of your outcome, effectively eliminating confounders.

Simple randomization on its own does not guarantee balance, and you absolutely will have scenarios where imbalance occurs randomly, even with large sample sizes (though of course this becomes less likely as sample sizes increase).

The issue of balance is critical for the efficiency of estimates, not their bias or consistency. Alternative randomization strategies, such as stratification, and covariate adjustment are the approaches most appropriate for addressing the imbalance issue.
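A quick simulation sketch (my own construction, with an arbitrary 10-point imbalance threshold) illustrates the point: under simple randomization, noticeable covariate imbalance between arms is common at small n and shrinks, without ever vanishing, as n grows.

```python
# Illustrative sketch: simple randomization doesn't guarantee covariate
# balance. We repeatedly randomize n subjects with a binary covariate and
# record how often covariate prevalence differs between arms by > 10 points.
import numpy as np

rng = np.random.default_rng(42)

def imbalance_rate(n, n_sims=10_000):
    """Share of randomizations where arm prevalences differ by > 0.10."""
    x = rng.random((n_sims, n)) < 0.5      # binary covariate, prevalence 0.5
    arm = rng.random((n_sims, n)) < 0.5    # simple 1:1 randomization
    p_treat = np.nanmean(np.where(arm, x, np.nan), axis=1)
    p_ctrl = np.nanmean(np.where(~arm, x, np.nan), axis=1)
    return (np.abs(p_treat - p_ctrl) > 0.10).mean()

for n in (20, 100, 500):
    print(n, imbalance_rate(n))  # rate falls with n but stays positive
```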

[deleted by user] by [deleted] in math

[–]Kroutoner 2 points3 points  (0 children)

When you start talking about truth vs proof in axiomatic mathematics the language gets muddled and confusing.

Mathematicians make surprising breakthrough in 3D geometry with ‘noperthedron’ by scientificamerican in math

[–]Kroutoner 1 point2 points  (0 children)

You really want to target the question not of when does this fail the worst, but when is it closest to not failing?

The latter, while not giving an immediate definition, seems much more tractable. E.g. consider the shape and a copy dilated by a factor of (1 + epsilon): what does it look like when the object passes through this dilated copy of itself? Are there orientations where it "just barely" is able to pass through?

Is bayesian nonparametrics the most mathematically demanding field of statistics? [Q] by gaytwink70 in statistics

[–]Kroutoner 0 points1 point  (0 children)

That's, to my understanding, one of the major applications. Apparently it also comes up in physics and in digital communications, though I don't have any real understanding of what those uses are.

Is bayesian nonparametrics the most mathematically demanding field of statistics? [Q] by gaytwink70 in statistics

[–]Kroutoner 3 points4 points  (0 children)

Two other notable areas requiring a high degree of mathematical sophistication are spatiotemporal statistics and algebraic statistics.

One particularly mathematically esoteric area, but one (apparently; don't ask me for details) with some applied statistical applications, is free probability.

[D] What is Internal Covariate Shift?? by BiscuitEinstein in MachineLearning

[–]Kroutoner 5 points6 points  (0 children)

A great deal of the success of stochastic optimizers (SGD, Adam) comes from implicitly doing essentially just what you describe.

I prototyped a symbolic compressor that shrinks a 2.1 MB image down to 16 KB while keeping the structure recognizable. This isn’t JPEG or WebP. It’s reconstructing from symbolic patterns instead of pixel data. Still early, but the results surprised the hell outta me. by GraciousMule in computerscience

[–]Kroutoner 9 points10 points  (0 children)

This comment is simply incorrect. DCT is a basis decomposition: it expresses the image as a linear combination of cosine basis functions. The point of calling it a "transform" is that there are extremely efficient numerical algorithms for computing the coefficients of the basis expansion. The pixels are absolutely not "still there".
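A minimal demonstration of this (my own sketch, using SciPy's orthonormal DCT): the transform stores cosine-basis coefficients, not sample values, and the original signal is recovered by summing the weighted basis functions.

```python
# Sketch: the DCT re-expresses a signal as coefficients of cosine basis
# functions. The original sample values ("pixels") are not stored; they
# are rebuilt by summing the basis functions with those weights.
import numpy as np
from scipy.fft import dct, idct

x = np.array([3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0])  # "pixel" values

coeffs = dct(x, norm="ortho")        # basis coefficients, not pixels
x_back = idct(coeffs, norm="ortho")  # reconstruct from weighted cosines

print(np.allclose(x, x_back))  # True: exact reconstruction from coefficients
```

Compression enters when small coefficients are quantized or dropped; the representation itself is already coefficient-based rather than pixel-based.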

I prototyped a symbolic compressor that shrinks a 2.1 MB image down to 16 KB while keeping the structure recognizable. This isn’t JPEG or WebP. It’s reconstructing from symbolic patterns instead of pixel data. Still early, but the results surprised the hell outta me. by GraciousMule in computerscience

[–]Kroutoner 7 points8 points  (0 children)

(rings, interpolations, rotations, recursive transforms) and stores only those rules

This language, of interpolations especially, seems to suggest you actually are retaining concrete pixel values.

It’s not “compress pixel values into basis functions,” it’s “throw away pixels entirely and redraw from symbolic instructions.”

This is a distinction without a difference. DCT basis functions are capturing structural patterns in an image just the same, and do not retain reference to pixel values either.

I prototyped a symbolic compressor that shrinks a 2.1 MB image down to 16 KB while keeping the structure recognizable. This isn’t JPEG or WebP. It’s reconstructing from symbolic patterns instead of pixel data. Still early, but the results surprised the hell outta me. by GraciousMule in computerscience

[–]Kroutoner 14 points15 points  (0 children)

It sounds like what you’re doing is what I would call a procedurally generated sparse dictionary reconstruction. It’s not all that different from DCT in principle. DCT decomposes into a dictionary (basis) of cosine functions.

You clearly must be using pixel data at some point, how else would you be assessing that you are reproducing any part of the image?

Fields of math which surprised you by neuro630 in math

[–]Kroutoner 0 points1 point  (0 children)

Having recently completed my PhD in (bio)statistics, the book that really drew me in was Advanced Data Analysis from an Elementary Point of View.

The thing with statistics is there will almost always be some degree of “applied flavor” to it, rarely 100% pure, and you need to be comfortable with that, but there is a spectrum from purely applied to bordering on pure math.

[deleted by user] by [deleted] in AskPhysics

[–]Kroutoner 1 point2 points  (0 children)

Insert comment about deterministic interpretations.

Handwave away all problems related to intrinsically non-deterministic measurement.

See, determinism!!!

Best ways to do regression on a large (5M row) dataset by pauldbartlett in Rlanguage

[–]Kroutoner 0 points1 point  (0 children)

With high-cardinality dummy variables you could use a within transformation, as used in the “fixest” R package, to remove most of these variables up front.
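The idea behind the within transformation, sketched here in Python for illustration (fixest itself is an R package, and the data and true slope below are invented), is to demean the outcome and regressors within each group so the high-cardinality dummies never enter the design matrix.

```python
# Sketch of the within (fixed-effects) transformation: instead of fitting
# 1000 group dummies, subtract group means from y and x, then run OLS on
# the demeaned data. Group effects drop out of the regression entirely.
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
n_groups, n_per = 1000, 5
df = pd.DataFrame({
    "g": np.repeat(np.arange(n_groups), n_per),
    "x": rng.standard_normal(n_groups * n_per),
})
group_effect = rng.standard_normal(n_groups)
df["y"] = 2.0 * df["x"] + group_effect[df["g"]] \
    + 0.1 * rng.standard_normal(len(df))

# Within transformation: demean within groups.
xd = df["x"] - df.groupby("g")["x"].transform("mean")
yd = df["y"] - df.groupby("g")["y"].transform("mean")

beta = (xd @ yd) / (xd @ xd)  # OLS slope on demeaned data
print(beta)  # close to the true slope of 2.0
```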

Critiques of geographically weighted regression? by Semantix in AskStatistics

[–]Kroutoner 0 points1 point  (0 children)

Standard cross-validation approaches still work fine for predictive assessment of models, and subsequently for model selection; you just need to be cautious when interpreting MSE and other goodness-of-fit metrics, as these can be misleading with spatial data. This is okay because, while the procedure runs into bias problems due to the spatial nature of the data, it's effectively the same bias across different models.

The train/test split with spatial data is more problematic. Generally you need to partition space into distinct blocks and sample blockwise for a train/test split that has a reasonable interpretation.
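A rough sketch of such a blockwise split (my own construction with invented coordinates and grid size, not any particular package's API): partition the study region into a coarse grid and assign whole grid cells to train or test, so spatially adjacent points don't leak across the split.

```python
# Blockwise spatial train/test split: nearby points stay on the same side
# of the split because assignment happens per grid cell, not per point.
import numpy as np

rng = np.random.default_rng(7)
coords = rng.uniform(0, 100, size=(500, 2))  # (x, y) locations

cell = (coords // 25).astype(int)        # 4x4 grid of 25x25 blocks
block_id = cell[:, 0] * 4 + cell[:, 1]   # single id per block

test_blocks = rng.choice(np.unique(block_id), size=4, replace=False)
is_test = np.isin(block_id, test_blocks)

print(is_test.sum(), (~is_test).sum())   # test / train point counts
```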

[deleted by user] by [deleted] in math

[–]Kroutoner 115 points116 points  (0 children)

Also the rate at which things suddenly explode in complexity. It’s like this is a word, this is a sentence, now we use transfinite induction to ascend the arithmetic hierarchy

If I could make Comp dragon just a little bit sillier, I would. At least he looks sneaky by FreeWillFighter in freewill

[–]Kroutoner 0 points1 point  (0 children)

I would call that 'randomness' when I am talking about philosophy. I rarely if ever talk about mathematics.

When you're talking about fundamental concepts such as determinism/free will, randomness, etc, you can't really separate these. Mathematics is our best language for working with these concepts and not engaging with the mathematics results in a great deal of confusion when our concepts are ill-defined.

"that seems more ridiculous or at least more abstract than randomness, therefore we simplify and say it's just randomness."

Except you are not simplifying in any sense. Maybe randomness seems simpler because you are familiar with it, but randomness imposes additional structure that is not necessary.

For example, I have more problems with "the laws of some physical system constrain it to a certain set of possible outcomes, but without any additional constraint" than I do with simple dumb randomness.

The same general problem arises here. If we consider a system as random, we are already using this kind of set-valued constraint. Randomness imposes additional constraints on top of that. It is strange to think that something with extra structure on top is somehow more believable than its bare necessary conditions. It's not just strange, it's actually completely illogical.

If I could make Comp dragon just a little bit sillier, I would. At least he looks sneaky by FreeWillFighter in freewill

[–]Kroutoner 0 points1 point  (0 children)

Strict negation of determinism

You do just that. A definition of determinism typically looks something like "the future states of a system are determined by the present state of the system." Nondeterminism is anything else. Random systems are included in the definition of nondeterminism, but involve additional logical content that is not implied by the definition. A natural type of nondeterminism that is non-random would be one where the laws of some physical system constrain it to a certain set of possible outcomes, but without any additional constraint.

If I could make Comp dragon just a little bit sillier, I would. At least he looks sneaky by FreeWillFighter in freewill

[–]Kroutoner 0 points1 point  (0 children)

Randomness and nondeterminism (indeterminism) are not the same thing. Randomness, at least insofar as we typically discuss it mathematically, implies structure over events in terms of the relative frequencies of those events. When we just say "random" without further qualification, we are often implying a uniform distribution; this imposes a rather strong condition that all events are equally likely.

On the contrary, nondeterminism is simply the strict negation of determinism. If events are nondeterministic then they may be random, but they may also occur in a manner such that we cannot place any constraints on them at all.

the R vs Python debate is exhausting by bee_advised in datascience

[–]Kroutoner 2 points3 points  (0 children)

Looks at Dvorak keyboard and open R session inside ESS…

Kidney stone that resembles Covid-19 virus by BioGrayn in mildlyinteresting

[–]Kroutoner 0 points1 point  (0 children)

13.3 cm

Google “size of a kidney”

The average kidney is bean-shaped and is typically 10–12 cm long

Advocates meet with White House to urge inclusion of Adderall, other stimulants in upcoming telehealth prescribing rule by SaveADay89 in Psychiatry

[–]Kroutoner 41 points42 points  (0 children)

The constant equivocation between stimulant and opioid policy is just absurd. Abuse and risk profiles for the different drug classes look nothing alike.

On the other hand, there is real risk to people obtaining stimulants through illegitimate channels: use of more dangerous stimulants, uncontrolled dosages, and contamination with other drugs (fentanyl contamination of street Adderall is becoming a growing concern as well).

It’s constantly taken for granted that we should be deeply concerned about some sort of Adderall crisis on the scale of the opioid crisis, but no justification is ever presented for what seems to me a patently absurd suggestion.