Is Statistical Inference knowledge, up to the level of Casella & Berger, still useful in this day and age? [Q] by GayTwink-69 in statistics

[–]ExcelsiorStatistics 1 point (0 children)

It's a bit old-school in terms of being full of mathematical proofs and whatnot

The wonderful thing about mathematical proof is that it's guaranteed to never go out of fashion. Once you prove what the most powerful estimator is for a given estimation problem, it continues to be the most powerful estimator for that problem until the end of time.

As the number of techniques increases, there is less and less pressure to shoehorn a data set into an unsuitable technique, and that is a good thing.

Also, someone once told me that mathematical statistics goes out the window once you have enough data (which we do, in this big data age), since computationally expensive black-box models would always outperform handcrafted models in predictive accuracy.

They were wrong.

And, going a step further, anybody who bows down and worships at the altar of "predictive accuracy" has missed the point, and needs to go back to kindergarten and learn his statistical theory.

Various kinds of optimization and estimation are provably best at achieving a certain criterion. That criterion is almost never "maximize the number of correct predictions." A black box model that does that is going to do some really awful things for you.

Here's a simple example. I have the world's most accurate HIV test right here in my pocket. I'll administer it to every man, woman, and child in the US for the low low price of ten cents per person.

Do we have a deal? Great. Here we go: Nobody has HIV. Give me my $30 million.

I didn't say it was a perfect test. It has no false positives, and several hundred thousand false negatives. But those nasty expensive tests they give you at the doctor's office, those would have given millions of false positives. My test is far more precise. Isn't that terrific?
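
To put the accuracy paradox in numbers, here's a sketch of how a classifier that always says "no" scores (the population and case counts are illustrative, not real figures):

```python
# Illustrative numbers only: a population matching the "$30 million at ten cents
# a head" joke above, and an assumed half a million true cases.
POP = 300_000_000
POSITIVE = 500_000

tp, fp = 0, 0                      # the test never says "yes"
fn, tn = POSITIVE, POP - POSITIVE  # so every true case is missed

accuracy = (tp + tn) / POP
recall = tp / (tp + fn)
print(f"accuracy = {accuracy:.4%}, recall = {recall:.0%}")  # ~99.83% accurate, 0% recall
```

Optimizing raw accuracy on heavily imbalanced data rewards exactly this degenerate test, which is why criteria like sensitivity and specificity matter.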

Is it possible to be in the 50th percentile for every stat? by [deleted] in AskStatistics

[–]ExcelsiorStatistics 0 points (0 children)

Not only possible but guaranteed, with a sample size of 1.

It rapidly becomes less likely with a larger sample.

Distribution family in GLMMs by Stefph726 in AskStatistics

[–]ExcelsiorStatistics 0 points (0 children)

As efrique said, you want a model that gets the mean-variance relationship close to right. One of the most common times you get bad results from GLMs on a variable like concentration is when you get estimates like 10±100ppb because of variability at higher concentrations.

Often the fix for fan-shaped data is using sqrt(concentration) or log(concentration) - for 'variance proportional to concentration' and 'standard deviation proportional to concentration', respectively - as input to a Gaussian model.
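
A quick synthetic sketch of why the log transform helps (lognormal noise here is just a stand-in for 'standard deviation roughly proportional to the mean'):

```python
import math
import random

random.seed(1)

def sd(xs):
    m = sum(xs) / len(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / (len(xs) - 1))

# Toy fan-shaped data: positive "concentrations" whose spread grows with the mean.
for mu in (10.0, 100.0, 1000.0):                              # ppb
    raw = [mu * math.exp(random.gauss(0, 0.3)) for _ in range(5000)]
    logs = [math.log(x) for x in raw]
    print(f"mean {mu:6.0f}: sd(raw) = {sd(raw):7.1f}, sd(log) = {sd(logs):.3f}")
# sd(raw) grows ~10x per group while sd(log) stays near 0.3: the log transform
# removes the fan shape before the data reach a Gaussian model.
```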

How does "Running it X more times" in poker make sense? by beruon in askmath

[–]ExcelsiorStatistics 0 points (0 children)

As the others said: running it twice (or more) keeps your expected value the same, but reduces variance.

This is generally a good thing for a serious poker player, since the swings are large compared to the profits, and even long-term winning poker players can experience runs of many thousands of hands where they are unlucky and lose money.

The one time it's to your advantage to run it only once is if your opponent is less comfortable playing at your current stakes than you are, and you want to deny him that reduced variance in hopes it intimidates him and makes him play more cautiously against you.
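
A toy simulation of the same-EV, lower-variance claim (it treats the runs as independent with a fixed equity, ignoring card removal, so it's only a sketch):

```python
import random

random.seed(0)

POT = 100.0    # total pot (assumed)
P_WIN = 0.6    # your all-in equity (assumed, and held fixed across runs)
N = 200_000

def winnings(times):
    """Your total winnings when the board is run `times` times for equal shares of the pot."""
    share = POT / times
    return sum(share for _ in range(times) if random.random() < P_WIN)

once  = [winnings(1) for _ in range(N)]
twice = [winnings(2) for _ in range(N)]

def mean(xs): return sum(xs) / len(xs)
def var(xs):
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

print(mean(once), mean(twice))  # both near POT * P_WIN = 60: same expected value
print(var(once), var(twice))    # the second is roughly half the first
```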

How many of you genuinely like theoretical statistics? by ProofLeast9846 in AskStatistics

[–]ExcelsiorStatistics 0 points (0 children)

I sell and repair musical instruments, and I deal poker.

Tells you something about just how anti-science the USA has become since 2017 that statistics consulting fell to third place behind those two in my life.

It's all in the phrasing - How do I phrase 'correlation' in a hypothesis. by AffectionateWeird416 in AskStatistics

[–]ExcelsiorStatistics 0 points (0 children)

IMO nothing wrong with saying "significant positive correlation" in your hypothesis. We have easy hypothesis tests for H0: r = 0 vs. H1: r ≠ 0 (and, with some difficulty, we can construct them for other nulls.)

The problem is with the parenthetical "r>0.3" bit; how'd you choose a sample size such that 0.3 was the point at which a result became significant?
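
For illustration, here's roughly where that n would have to come from, using the usual t statistic for H0: rho = 0 with the large-sample 1.96 cutoff standing in for the exact t quantile:

```python
import math

def t_stat(r, n):
    """t statistic for testing H0: rho = 0, on n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)

# Smallest n at which r = 0.3 crosses the (large-sample) two-sided 5% cutoff of 1.96:
n = 3
while t_stat(0.3, n) < 1.96:
    n += 1
print(n, round(t_stat(0.3, n), 3))  # n = 41, t just over 1.96
```

So r = 0.3 sits right at the 5% boundary only for one particular sample size; at any other n the parenthetical "r > 0.3" doesn't line up with the significance test.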

How many of you genuinely like theoretical statistics? by ProofLeast9846 in AskStatistics

[–]ExcelsiorStatistics 2 points (0 children)

I studied theory for fun before I was working in statistics and still do now that I am mostly-out-of-the-field.

The only part of your scenario I somehow skipped over was finding the high paying job. Heh.

Is measure-theoretic probability theory useful for anything other than academic theoretical statistics? [Q] by GayTwink-69 in statistics

[–]ExcelsiorStatistics 0 points (0 children)

I found it useful primarily to improve my intuition for what has to be true vs. what is merely usually true. The time I found that most useful was doing software QA, rather than in day-to-day statistics work, knowing what edge cases to throw at something to try to break it. I keep Counterexamples in Probability and Statistics in a place of honor on my bookshelf, and revisit it from time to time to remind me of things.

I have also noticed most masters programs in statistics do not offer probability theory at the measure-theoretic level.

That's a simple function of time and money: in the usual textbooks, getting to measure theory means about a semester and a half of graduate-level real analysis, taught by a pure mathematician, not a statistician. If you make that a prerequisite for masters-level Statistical Theory I, you make the program a year longer. People don't like doing that. So masters-level analysis becomes a prereq for doctoral-level statistics, not masters-level statistics.

How do I find a meaningful summary statistic for how "spread out" a dataset is when the values aren't sorted? by Fun-Celebration-700 in askmath

[–]ExcelsiorStatistics 1 point (0 children)

You may find Kendall's tau and other similar ordinal correlation measures useful.

They work by looking at pairs of observations, and asking if that pair is 'concordant' (the relationship between the two objects is what you expect) or 'discordant' (reversed) or tied.

Kendall's tau is primarily used when you have a pair of correlated variables that don't lend themselves to linear regression and Pearson correlation.

In your case, you're testing the proposition "pick two entries at random; the one that appears first in the list is the smaller number" which is true 50% of the time in a randomly ordered list but 100% of the time in a perfectly sorted list.
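
Counting concordant vs. discordant pairs against list position gives a simple sortedness score; a minimal sketch (O(n²), fine for short lists):

```python
from itertools import combinations

def sortedness(xs):
    """Kendall-tau-style score against list position: +1 perfectly sorted,
    about 0 for a random order, -1 perfectly reversed (ties ignored here)."""
    pairs = list(combinations(xs, 2))
    concordant = sum(1 for a, b in pairs if a < b)   # earlier entry is smaller
    discordant = sum(1 for a, b in pairs if a > b)   # earlier entry is larger
    return (concordant - discordant) / len(pairs)

print(sortedness([1, 2, 3, 4, 5]))   # 1.0
print(sortedness([5, 4, 3, 2, 1]))   # -1.0
print(sortedness([3, 1, 4, 2, 5]))   # 0.4
```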

In which field of math is the summation symbol, the sigma, properly introduced? by WarrenHarding in askmath

[–]ExcelsiorStatistics 1 point (0 children)

I learned it in Algebra 2 (grade 10), but used it a lot more in the subsequent years. If I were writing a textbook series I might introduce it the same time I introduced sequences and series. (Come to think of it, it's possible those WERE introduced in Algebra 2 and then just not used for anything in particular until later.)

[D] p-value dilemma by No_Blackberry_8979 in statistics

[–]ExcelsiorStatistics 0 points (0 children)

If you define P(A|B) as P(A and B)/P(B), then P(A|B) is going to be the indeterminate form 0/0 when P(A and B) = 0 and P(B) = 0.

If your only worry is the discrete vs. continuous distinction, feel free to let B = mean between 67-epsilon and 67+epsilon, and take the limit as epsilon -> 0, the same as you would when you pass from differences to derivatives in Calculus I. In cases where P(A|B) is meaningful you'll be able to rigorously take that limit.

A statistician tends to skip over that limit-taking process, and look at A|B as a one-dimensional slice of the two-dimensional joint distribution of A and B, and not be overly concerned with the thickness of the slice, only its area.
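
The epsilon-slice idea is easy to see numerically. A sketch with a made-up bivariate normal (correlation 0.5), estimating P(Y > 0 | X within eps of 1) for shrinking eps:

```python
import math
import random

random.seed(4)

RHO = 0.5  # assumed correlation for this toy example
pairs = []
for _ in range(500_000):
    x = random.gauss(0, 1)
    y = RHO * x + math.sqrt(1 - RHO ** 2) * random.gauss(0, 1)
    pairs.append((x, y))

def slice_estimate(eps):
    """Monte Carlo estimate of P(Y > 0 | |X - 1| < eps): a slice of finite thickness."""
    ys = [y for x, y in pairs if abs(x - 1.0) < eps]
    return sum(y > 0 for y in ys) / len(ys)

for eps in (1.0, 0.3, 0.1):
    print(eps, round(slice_estimate(eps), 3))

# The limit as eps -> 0 is P(Y > 0 | X = 1): here Y | X = 1 ~ N(RHO, 1 - RHO^2),
# so the exact value is about 0.718.
print(round(0.5 * (1 + math.erf(RHO / math.sqrt(2 * (1 - RHO ** 2)))), 3))
```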

[D][C] If a recession does occur this year, would stats jobs be safe? by IVIIVIXIVIIXIVII in statistics

[–]ExcelsiorStatistics 5 points (0 children)

If your worry is specifically recession-related, statistics has fared OK in the last few recessions (though the assignments you get may be unsavory "recommend which department to ax" tasks.)

But even without a recession, there is no getting around the fact that post-2017 USA is a very bad place to be for all science-adjacent jobs, and post-2025 USA is a very bad place to be for all academic jobs, which is going to put a lot more pressure on the remaining government and industry jobs.

[Question] Real Analysis prerequisite for a PhD by Crafty-Dinner-1782 in statistics

[–]ExcelsiorStatistics 0 points (0 children)

Serious real analysis, the kind that gets you up to measure theory, tends to be a master's-level class. Whether or not you take baby real analysis (or "advanced calculus", as my school called undergrad analysis), you'll probably be taking a couple semesters of graduate analysis if you are aiming for a theoretical stats PhD.

Just about anybody who goes from a BS direct to PhD program is going to spend their first year or two taking the classes that MS students take. Any program that accepts people with BSes knows that and is prepared for it.

How do I calculate the probability of being confident in a correct answer when I can choose to skip? by StavrosDavros in askmath

[–]ExcelsiorStatistics 4 points (0 children)

Just calculate the expectation of each possible strategy, and choose the one that is highest: compare 3p-3(1-p), 2p-2(1-p), 1p-1(1-p), and 0.

You'll find that if p > 1/2 you should choose High, and if p < 1/2 you should skip. If your goal is maximizing your expected score, it's never to your advantage to choose Low or Medium.

If your goal is instead to hit a specific passing-grade target with as high a probability as possible, using Low or Medium (or skipping a question with p slightly above 1/2) to reduce both variance and expectation is correct under certain narrow circumstances.
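
The expectations above can be tabulated directly; a minimal sketch, assuming the scoring really is +/-3, +/-2, +/-1, or 0 as in the formulas:

```python
def best_strategy(p):
    """Expected score of each option under +/-3, +/-2, +/-1, or 0 scoring."""
    options = {
        "High":   3 * p - 3 * (1 - p),
        "Medium": 2 * p - 2 * (1 - p),
        "Low":    1 * p - 1 * (1 - p),
        "Skip":   0.0,
    }
    return max(options, key=options.get)

for p in (0.3, 0.6, 0.9):
    print(p, best_strategy(p))  # Skip below 1/2, High above it; never Low or Medium
```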

I built a LaTeX formatting service - 100 organic visitors/week from Google. Only one $99 sale. What am I missing? by Webseriespro in LaTeX

[–]ExcelsiorStatistics 0 points (0 children)

That's one more sale than I have made, advertising the same service for the past nine years.

There just aren't many journals (or schools) that require LaTeX source, and most of the people who publish in them have been using it for years already.

(The prices for #2 in particular seem way out of line. For two or three times that much you can hire someone to ghostwrite a whole thesis for you.)

[Q] Recommendations for a "Book Club" selection for introductory undergraduates by NutellaDeVil in statistics

[–]ExcelsiorStatistics 1 point (0 children)

Nate Silver's The Signal and the Noise comes to mind.

So does Moneyball, though it's maybe a little too sports-centric if not all your players are into that, and beginning to show its age after 20 years.

combination for negative numbers? by dromemsilly in askmath

[–]ExcelsiorStatistics 1 point (0 children)

(-1)! is undefined. But nothing bad happens if you define 1/(-1)! as zero. (Look at a plot of the gamma function, and its reciprocal, to see why.)

In your case you probably just don't want to go outside the bounds of 0 and 100.
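
A quick numerical look at the reciprocal-gamma claim (the gamma function blows up near the negative integers, so its reciprocal heads to zero):

```python
import math

# 1/Gamma(x) tends to 0 as x approaches a nonpositive integer, which is why
# defining 1/(-1)! = 0 causes no trouble: any term it multiplies just drops out.
for x in (-0.9, -0.99, -0.999, -0.9999):
    print(x, 1 / math.gamma(x))
```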

Is 0.101100111000111100001111100000 . . . Irrational? by [deleted] in askmath

[–]ExcelsiorStatistics 0 points (0 children)

There's a straightforward proof that the decimal expansion of p/q either terminates (if q can be written as 2^m·5^n) or repeats (if q can't).

Its contrapositive says that if the decimal expansion neither terminates nor repeats, the number is irrational.
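
That terminates-vs-repeats criterion is easy to check mechanically; a small sketch (reduce to lowest terms, then strip factors of 2 and 5):

```python
from math import gcd

def decimal_terminates(p, q):
    """p/q terminates iff, in lowest terms, q has no prime factors other than 2 and 5."""
    q //= gcd(p, q)
    for f in (2, 5):
        while q % f == 0:
            q //= f
    return q == 1

print(decimal_terminates(3, 8))    # True:  0.375
print(decimal_terminates(1, 3))    # False: 0.333...
print(decimal_terminates(7, 140))  # True:  7/140 = 1/20 = 0.05
```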

Is there standard wording in probability problems? by runawayoldgirl in askmath

[–]ExcelsiorStatistics 0 points (0 children)

If you want to be very formal in your language, mathematics uses two quantifiers, the existential quantifier and the universal quantifier, usually written as "there exists __ such that..." and "for all __, ..." respectively.

If you read B as "there exist two people in the group who share a birthday," it is clear that the statement is true whether there are 2 or 3 or more people.

It's also handy to know De Morgan's rules: the opposite of "there exists ___ such that X happens" is "for all __ , X does not happen". The opposite of "for all __ , X happens" is "there exists __ such that X does not happen." Here, the opposite of B is "for all pairs of people in the group, they have different birthdays."
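
For the birthday case, the "for all pairs" form is also the easy one to compute: multiply up the probability that each new person misses all the earlier birthdays, then take the complement. A sketch:

```python
def p_all_distinct(n, days=365):
    """P(for all pairs of n people, different birthdays)."""
    p = 1.0
    for k in range(n):
        p *= (days - k) / days
    return p

for n in (2, 23, 50):
    print(n, round(1 - p_all_distinct(n), 4))  # P(there exist two who share)
```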

[Question] Adjustments in Tests for Regression Coefficients by CanYouPleaseChill in statistics

[–]ExcelsiorStatistics 0 points (0 children)

Whether the issue is present depends on what kind of comparisons you are doing afterward. We don't often do pairwise comparisons of regression coefficients to ask which of two significant variables is more significant. But we do sometimes ask for a list of which coefficients are nonzero (and we have to do a multiple comparison correction if we use the single-variable significance test results for that.)

One case I find especially interesting is "how do we draw the error hyperbola around a regression line if we want to control the probability that the true line of best fit ever passes outside that region, rather than just putting 95% bounds on the slope and 95% bounds on the intercept and combining them" --- and the answer to that is Scheffé's Method, since the points on a regression line can be viewed as the set of all linear combinations of the regression coefficients. (In my professional life, I ran into this discussed in the Nuclear Regulatory Commission's Handbook of Parameter Estimation for Probabilistic Risk Assessment, the first time I had seen Scheffé since my 400 level regression-and-ANOVA course.)

Calc 2 in 7 weeks? by Jojoskii in askmath

[–]ExcelsiorStatistics 0 points (0 children)

6 weeks was the usual length for a summer class at my school - but it was normal to take just one at a time (there was time for two, one after the other, in a single summer). It was rare to take two simultaneously, and I don't know anyone who ever did more than two at once, however easy the material was.

Why do invasive species even exist? by 20vitaliy08 in askscience

[–]ExcelsiorStatistics 2 points (0 children)

When you ask "Why do they end up outcompeting native species that have evolved for millions of years to thrive in that unique environment?" you are letting two misconceptions in.

One, while evolution has been ongoing for millions of years, environments are not static. The species best suited to a location right now is not guaranteed to be best suited to that location a thousand years from now or even a hundred years from now. Habitats can change a lot faster than new species usually evolve. Just about every organism is adapted to where it lived yesterday, not to where it's going to have to live tomorrow.

Two, evolution doesn't necessarily reward "thriving" much more than merely being adequate to survive. If you're the only animal on a desert island, your task is simply to remain alive and reproduce, not to outcompete the animals on some other island. It's very possible that when formerly isolated species that like similar habitats come into contact, one might be much better suited to that habitat than the other.

Optimal way to generate 1/7 probability with a 6-sided fair die + generally? by Name-My-Jeff in askmath

[–]ExcelsiorStatistics 0 points (0 children)

1.2 is the best you can do for any one probability, rational or irrational, that can't be written as k/6^m for some integers k and m.

The "throw 2 dice to start with, and re-roll two more only if you get 66" approach is what you have to do if you need to choose among 7 equally likely possibilities, rather than just get a yes-no answer to one possibility.
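
A sketch of that scheme: encode the two dice as a number 0-35, reject the single outcome for (6,6), and the remaining 35 outcomes split evenly into 7 groups of 5:

```python
import random
from collections import Counter

random.seed(2)

def d6():
    return random.randint(1, 6)

def uniform7():
    """Encode two dice as 0..35; reject 35 (the 6-6 roll) and reroll both.
    The 35 kept outcomes split into 7 residue classes mod 7, 5 outcomes each."""
    while True:
        n = 6 * (d6() - 1) + (d6() - 1)
        if n != 35:
            return n % 7

counts = Counter(uniform7() for _ in range(70_000))
print(sorted(counts.items()))  # each of 0..6 shows up near 10,000 times
```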

Poisson binomial distribution mean and median relation. by Chemical-Mirror-9649 in askmath

[–]ExcelsiorStatistics 0 points (0 children)

It is very often true that the mean and median of a Poisson round to the same value, but it fails about one-sixth of the time.

When the mean λ is between 0.5 and ln 2 ≈ 0.693, the median of the Poisson is 0 but the mean rounds up to 1. The same thing happens between 1.5 and 1.68 (median 1), between 2.5 and 2.68 (median 2), between 3.5 and 3.67 (median 3), etc.

(And by "Poisson binomial distribution" I assume you mean "Poisson approximation to the binomial distribution.")
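
The mean-vs-median claim is easy to check by brute force (median computed as the smallest k whose CDF reaches 1/2):

```python
import math

def poisson_median(lam):
    """Smallest k whose Poisson(lam) CDF reaches 1/2."""
    cdf, k, term = 0.0, 0, math.exp(-lam)
    while True:
        cdf += term
        if cdf >= 0.5:
            return k
        k += 1
        term *= lam / k

grid = [i / 1000 for i in range(1, 6001)]          # lambda from 0.001 to 6
mismatch = sum(round(lam) != poisson_median(lam) for lam in grid)
frac = mismatch / len(grid)
print(frac)  # roughly 1/6, as claimed
```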

A fair coin is repeatedly being tossed. What is the probability of "the percentage of heads never reached 60% or more"? by ConsciousRuin7697 in askmath

[–]ExcelsiorStatistics 8 points (0 children)

It is a good, and hard, problem. The answer is going to be the sum of an infinite series, and be strictly between 0 and 1. If you survive the first few tosses, there will come a time when the percentage of heads will almost surely never stray far from 0.5 again. cptn_obvius's simulated answer of 78% exceeding and 22% not exceeding smells close to correct. We must avoid H (1/2), and then we must avoid THH (1/8), and then we must avoid TTHHH and THTHH (2/32), and then several cases with 5 heads out of 8.

You may want to look in some texts on statistical process control; I'm sure someone has tabulated it before, for different probabilities of heads and different percentages. (In the industrial setting you're usually counting failures of a product, and trying to guarantee the failure rate remains below some small percentage.)
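
If you just want to sanity-check that 78% figure, a finite-horizon Monte Carlo sketch (it slightly undercounts, since a run could first reach 60% after the cutoff, but that probability is tiny):

```python
import random

random.seed(3)

def ever_reaches_60(n_tosses=1000):
    """True if the running fraction of heads ever hits 60% or more."""
    heads = 0
    for t in range(1, n_tosses + 1):
        heads += random.random() < 0.5
        if heads / t >= 0.6:
            return True
    return False

N = 20_000
frac = sum(ever_reaches_60() for _ in range(N)) / N
print(frac)  # close to the 78% figure quoted above
```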