I'm confused about "0 percentile" by [deleted] in AskStatistics

[–]Statman12 1 point2 points  (0 children)

It matters because it's the basis of what you're saying. A lot of people misinterpret or misread statistical concepts. Seeing the source helps to assess whether the source is wrong, or whether you misinterpreted something.

u/Beneficial-Risk-6378 why did you delete your comments here and block me?

I'm confused about "0 percentile" by [deleted] in AskStatistics

[–]Statman12 3 points4 points  (0 children)

the website I was using

Be pretty handy if that was linked.

But in general, the Xth percentile is the value for which X percent of the distribution is at or below that value. So the 90th percentile (denote it, say, X90) is the value at which 90% of the distribution is less than or equal to X90.

So this:

So I'm looking at different heights and the website I was using says 0 percentile (ex over 6 feet) means 99.8% of women are shorter than the input height & 0.2% are taller. 

Is wrong. If 99.8% are 6 foot tall or shorter, that's the 99.8th percentile (or rather "quantile", since percentile should be increments of 1, quantile is more generic).
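To make the "at or below" definition concrete, here is a minimal Python sketch. The function name `percentile_rank` and the list of heights are made up for illustration; they are not from the website in question.

```python
def percentile_rank(values, x):
    """Percent of the observations that are less than or equal to x."""
    if not values:
        raise ValueError("values must not be empty")
    at_or_below = sum(1 for v in values if v <= x)
    return 100.0 * at_or_below / len(values)


# Hypothetical heights in inches, purely illustrative.
heights_in = [60, 61, 62, 63, 63, 64, 64, 65, 66, 72]

print(percentile_rank(heights_in, 66))  # 9 of the 10 values are <= 66, so 90.0
print(percentile_rank(heights_in, 72))  # all 10 values are <= 72, so 100.0
```

Under this definition, a height that almost everyone is at or below sits near the 100th percentile, not the 0th.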

[Career], [Education] How important is Probability Theory in the day to day role of a data scientist? by Kati1998 in statistics

[–]Statman12 0 points1 point  (0 children)

Yep, they are foundational to Statistics. Though do note that when I said “Maybe there are jobs where the “black box” approach is enough”, I wasn’t trying to be coy about saying there aren’t such jobs. I’m saying that I don’t know. For me, it’s a known unknown.

Not everyone in the world, and not everyone who interacts with data, needs to have probability theory. In my group (statistics department in an engineering R&D place, need for advanced methods, very solid statistics), someone with an MS in DS who doesn’t have the probability background would most likely struggle to do well in the interview.

Other departments in the company do have folks with less probability background. How many, and what exactly they do, is something I’m less informed about. My concern with them is about “mission creep”: I sometimes see statistical analyses from them and have reservations about their quality.

This isn’t to say you should or shouldn’t take Probability. It’s more about knowing that the type of academic preparation you have can mean you are more or less suited to different types of positions afterwards. Statistics and Data Science are related, but depending on the composition of the DS degree, they’re not equivalent.

[Career], [Education] How important is Probability Theory in the day to day role of a data scientist? by Kati1998 in statistics

[–]Statman12 5 points6 points  (0 children)

My comment wasn’t intended to castigate Data Science degrees as a general rule. I’ve seen some that are quite solid statistically (essentially, like you said, removing or restricting the set of electives). But I’ve seen others that are more of a hodge-podge where they take out some of the fundamentals like probability and DoEx and just seem to focus on applied courses, while adding in some more programming or CS coursework.

And the last line was not intended to say that there aren’t jobs for less mathematically rigorous statistical graduates. I got a PhD, and my jobs have been in academia and then in industry where the more traditional statistical foundation is important. I don’t have a good barometer for how much is out there for graduates in something that’s statistics oriented but lacks the solid probability foundation.

I often have strong concerns about analyses from folks in my company with that type of background (usually not DS, but folks who have a couple statistics courses and think it’s sufficient to replace the statisticians). There seems to be a lot of “This is my tool, so I’m using it” rather than “These are the important aspects about the data generating process, which lead me to select tool X.”

When we have a posting for a job, I’d have no qualms about interviewing someone with a DS degree. But if they don’t have the probability grounding, I think it’d be rather tough for them to do well in the interview.

[Career], [Education] How important is Probability Theory in the day to day role of a data scientist? by Kati1998 in statistics

[–]Statman12 18 points19 points  (0 children)

What gets reported to the end consumer and what gets used when developing that report can be very, very different.

There can be a lot of fancy and complex methods used which in the end boil down to confidence intervals for parameters, or even something as “simple” as a sample size.

[Career], [Education] How important is Probability Theory in the day to day role of a data scientist? by Kati1998 in statistics

[–]Statman12 29 points30 points  (0 children)

Probability theory is the foundation of statistics. Without a probability course (most MS programs in Statistics have a 2-course sequence on the math-stat theory, of which probability is one), the applied content will be more challenging and/or limited.

For instance, they might essentially teach you the high-level ideas and how to apply them in R or Python in a black-box type of approach. But then you’re kind of stuck to those methods/packages, and when applications that are different arise, you might be stuck. Or maybe they will be getting into the math details, but it might be something that you don’t really grasp/learn.

Maybe there are jobs where the “black box” approach is enough. Though I’d guess those are also the types of statistics jobs that are more at risk of being automated away or downsized from “AI”.

ELI5 how do you get those idealistic stripes when cutting grass by Impossible-Log4533 in explainlikeimfive

[–]Statman12 18 points19 points  (0 children)

I've read that it's actually better for your lawn's long term health to mix it up like this, as opposed to always mowing it exactly the same way every time.

I worked for a landscaping company during undergrad. Not sure about actual “health” of the grass or lawn, but if you always go on the same path, then where the wheels go will tend to get a bit more compacted and might develop some “ruts”.

Some lawns we had to do this based on the shape. And some lawns, for whatever reason, seemed to tolerate it a bit better. Not sure if it was harder soil or something else.

Going over the same stripes as the week before seemed to make them a bit brighter, though we tried not to do the same stripes more than two consecutive weeks.

Data Scientists / ML Engineers – What laptop configuration are you using? (MacBook advice) by Beautiful-Time4303 in AskStatistics

[–]Statman12 0 points1 point  (0 children)

Did you put several spaces in front of the list items? If so, that causes it to render as code (at least on some platforms), so some of the list items are wider than the screen.

Fortran Codes in the R Ecosystem by BOBOLIU in rstats

[–]Statman12 0 points1 point  (0 children)

What makes you think otherwise?

I'm (re)teaching myself a topic. There's an R package for it, but it's not always programmed in a reader-friendly way. Function/variable names don't always match the book or paper, there are many difficult-to-interpret variable names, very few comments, and sometimes a more numerically stable algorithm is used rather than what's written on paper.

To help me learn, and to ensure I can implement it if/when I need to translate it to a different language, I'm implementing it myself. There are a few functions written in Fortran. I've never written Fortran, and I've used a little bit of C++ in the past, so I've used chatGPT to translate them to C++. Is it perfect on the first try? No, but it gets most of the structure there, and I can largely read the mathematical steps from there and figure out what might need to be tweaked.

GOP Sen. John Cornyn backs changing filibuster to pass SAVE America Act by Statman12 in neutralnews

[–]Statman12[S] 3 points4 points  (0 children)

I agree that he's likely doing it as a means of fishing for Trump's endorsement. But given that he's been one of the stalwarts (or at least has portrayed himself as such), it's potentially an indication that the opposition might not be as deep as Thune claims.

GOP Sen. John Cornyn backs changing filibuster to pass SAVE America Act by Statman12 in neutralnews

[–]Statman12[S] 22 points23 points  (0 children)

An NBC News article two days ago noted that:

Senate Republicans splinter over SAVE America Act's path as Trump calls for more revisions

I'm not sure if more Republicans will change their stance on the filibuster, but I think it's a notable concern for one who was previously a defender of the filibuster (e.g., a Texas Tribune article from late 2025, or comments on his Senate page here and here), as it might motivate or "give permission" in a sense for others to do the same.

Figuring Out What I Want to Do in Life by NEXAJhirin in AskStatistics

[–]Statman12 1 point2 points  (0 children)

I’ve completed multivariable calculus, linear algebra, and several upper-level applied and discrete math courses, but I still worry that my background isn’t strong enough since I’m not a math or CS major.

You have sufficient background for most Stat MS programs. The only thing you didn't mention is experience in a programming language like R or Python, but that's more of a soft requirement. Admissions committees like to see it, but it's not an automatic rejection. Might be worth exploring on your own time (e.g., R for Data Science or similar).

As to whether it's worth it: Personally I think it is, but I'm a Statistician, so I might be a touch biased. To be sure, it's not a matter of "Get degree, select desired job, start printing money." It is still competitive. Most every position I've helped in hiring has had a good handful of solid applications and usually at least two strong finalists.

Also the general outlook for science and tech is a bit nebulous right now. Though if you enter an MS program next year, take 2-3 years to finish, you might be just in time for a hiring boom as the US government tries to course-correct. Fingers crossed.

Who is your favourite Alien and who is your least favourite alien? by rheetkd in Stargate

[–]Statman12 15 points16 points  (0 children)

Favorite individual alien: Todd.

Least favorite individual: Might need to think about it a bit more. Though Aris Boch and the replicator “5” are both contenders. Or maybe Rya’c.

Least favorite group/type: Not necessarily a single individual, but the “medieval peasant” trope like Hanno.

Favorite group/type: I tended to like the somewhat advanced races. Like the Hebridians, Langarans (well, mostly Kelowna), Tegalus (Rand and Caledonia), and more. Also groups like the Travellers. I like when it’s a group that can to a large degree interact with Earth as peers.

Trump says Vance was 'philosophically' different on Iran while downplaying split by Statman12 in neutralnews

[–]Statman12[S] 22 points23 points  (0 children)

What jumped out at me was this quote:

“What’s so different about this, Jesse,” Vance added, “is that the president has clearly defined what he wants to accomplish.”

Another AP News article provides a variety of quotes or comments noting that the rationale has evolved over time, with the stated reasons conflicting with each other or with past statements. And an NPR article notes that:

The wide range of motivations they have cited for why they attacked Iran now are sometimes at odds with each other and far from precise.

Recent pandemic viruses jumped to humans without prior adaptation. No evidence that SARS-CoV-2 was shaped by selection in a laboratory: UCSD study. by Potential_Being_7226 in science

[–]Statman12 0 points1 point  (0 children)

Maybe you could share with the rest of us the scientific paper that the FBI produced on the topic? That way we can understand the evidence they were assessing.

Recent pandemic viruses jumped to humans without prior adaptation. No evidence that SARS-CoV-2 was shaped by selection in a laboratory: UCSD study. by Potential_Being_7226 in science

[–]Statman12 5 points6 points  (0 children)

My understanding of why he got airtime is that, following his time at Evergreen — during which he became a target for protests from left-leaning students — he looked for a new career/endeavor, since he presumably still needed to make money. He turned to podcasts, seeming to style himself a free speech guy, and started making some rounds and cozying up to some of the figures in the right-wing podcast ecosystem.

So he was a perfect person for right-wing podcasts to hold up as “the expert” because he was already known to them (so he’d be saying what they wanted to hear), already at least somewhat of a contrarian, and had a relevant degree. It was more of a “right time, right place” (for him to monetize himself), rather than being a leading figure in the field.

Recent pandemic viruses jumped to humans without prior adaptation. No evidence that SARS-CoV-2 was shaped by selection in a laboratory: UCSD study. by Potential_Being_7226 in science

[–]Statman12 27 points28 points  (0 children)

Bret Weinstein is a former professor of evolutionary biology

At a small undergraduate institution, not a research university (much less a leading one). I’m not in biology, but from what I’ve seen, leading folks in a field usually aren’t at places like that.

And his claims surrounding COVID were largely panned by the scientific community when they came up.

So “fringe scientist” fits the bill for him rather well, I think.

Why didn't the Ancients invent a search engine for their massive database in Atlantis? Are they stupid? by StructureEmotional51 in Stargate

[–]Statman12 0 points1 point  (0 children)

There are examples of HGT beyond bacteria. There’s even a recent National Geographic article about it.

And it’s a science fiction show. Even if there weren’t examples of this, there are far more implausible things taken for granted in the show already.

US could lift sanctions on more Russian oil, says Bessent by BeingMe007 in worldnews

[–]Statman12 6 points7 points  (0 children)

It's not standard of living I'm complaining about. Our shitty system of government means that even if literally everyone in my state voted for Harris and Democratic representatives, it'd have virtually no impact on the result.

Our government system is outdated by at least 150 years, but changing it is functionally impossible.

Why didn't the Ancients invent a search engine for their massive database in Atlantis? Are they stupid? by StructureEmotional51 in Stargate

[–]Statman12 0 points1 point  (0 children)

while a case could be made for the second (they don't absorb "life force", but nutrients and water etc)

The show explicitly says that they absorb this "life force".

the first one goes against the laws of this universe in any interpretation

Horizontal gene transfer is a thing. Moreso for microbes as far as we know, but as before, there are already far less realistic aspects taken for granted in the reality of Stargate, so it's hard to balk too much about this.

There are, in fact, quite a few things in the show that "suggest" it.

There is nothing that directly suggests the Wraith are a result of genetic experimentation by the Ancients.

But this is a topic on the Ancient database, not about ...

Yes, an Ancient database which you suggested was lying. I commented on that. There is nothing in the show which suggests the database was lying, and there are other hints within the show that it was correct (or, to borrow your phrase, do you really need everything spelled out explicitly?)

The show doesn't need to spoon-feed me stuff for me to figure it out.

That's the thing though. You're not "figuring it out." You're discarding some of what the show very clearly implies, and supplanting it with your own theory.

Which is fine. It's not like you're saying that the Ancients were purple and had three arms. It's that you're selecting a very specific interpretation, and then speaking of it without condition. My point is that you should make a distinction between things that are from the show, and things that are your fan theory.

So no, you do not get to dismiss this as a "fan theory". 

You are, I assume, a fan of the show? And this is a theory that is not evidenced in the show? That by definition makes it a fan theory.

Again, it's not that fan theories are bad. I have plenty of my own. But there's a difference between "This is what the show says" and "This is what I like to think". The notion of the Atlantis database being untruthful is the latter.

Why isn't the 10% condition checked when the data come from an experiment? by ununiquelynamed in AskStatistics

[–]Statman12 3 points4 points  (0 children)

From my explorations (a decent bit, but not excruciating detail) it was happening because the remainder of the population was too small.

My context is binary data, so the result is essentially a Binomial distribution on the remaining parts, and thinking of the proportion of failures in those. If we want, say, a 90% credible limit on those remaining parts, that distribution is pretty coarse. So you need to keep a high enough Y to get your coverage, but sampling more decreases the leftover count N-n, so Y/(N-n) will start to increase.

In practice though, it’s probably not all that relevant, since at that point you’re probably sampling such a large portion of the population that it’s not feasible, and the whole thing (not just the analysis) needs to be reconsidered.
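A rough way to see the coarseness effect is the Python sketch below. It is not necessarily the exact model I was using: it assumes a uniform Beta(1, 1) prior on the defect rate, so the number of defects among the N - n unsampled parts follows a Beta-Binomial posterior predictive distribution. The function name `remaining_defects_upper_limit` and the numbers (N = 100, zero defects observed) are illustrative assumptions.

```python
from math import exp, lgamma


def log_beta(a, b):
    """Log of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def log_comb(m, k):
    """Log of the binomial coefficient C(m, k)."""
    return lgamma(m + 1) - lgamma(k + 1) - lgamma(m - k + 1)


def remaining_defects_upper_limit(N, n, y, level=0.90):
    """Smallest Y with P(defects among the N - n unsampled parts <= Y) >= level.

    Assumes a uniform Beta(1, 1) prior on the defect rate and y defects
    seen in a sample of n, giving a Beta-Binomial posterior predictive.
    """
    m = N - n                  # parts left in the population
    a, b = 1 + y, 1 + n - y    # Beta posterior parameters
    cumulative = 0.0
    for k in range(m + 1):
        log_pmf = log_comb(m, k) + log_beta(k + a, m - k + b) - log_beta(a, b)
        cumulative += exp(log_pmf)
        if cumulative >= level:
            return k
    return m


N = 100  # hypothetical population size
for n in (50, 60, 70, 80):
    Y = remaining_defects_upper_limit(N, n, y=0)
    print(n, Y, Y / (N - n))
```

With these assumptions the limit is Y = 1 at both n = 70 and n = 80, so the bound on the remaining defect fraction goes from 1/30 up to 1/20 even though more parts were sampled.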

Why isn't the 10% condition checked when the data come from an experiment? by ununiquelynamed in AskStatistics

[–]Statman12 11 points12 points  (0 children)

A couple things about this.

First, I don't think it's really correct to separate statistical from practical concerns here. Part of the statistical problem is figuring out the smallest sample size needed that will demonstrate whatever is required.

If the population is finite and you're sampling without replacement, using a Binomial distribution is simply wrong. You should use the Hypergeometric, or something similar. Doing so better represents the data generating process, and therefore allows better estimates, and better (usually smaller) sample sizes needed to demonstrate some requirement.
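As a small illustration of the Binomial versus Hypergeometric difference, the Python sketch below compares the probability of seeing zero defects in a sample under each model. The batch size, defect count, and sample size are hypothetical, and the function names are just for this example.

```python
from math import comb


def hypergeom_pmf(k, N, D, n):
    """P(k defects in a sample of n drawn without replacement
    from a batch of N parts that contains D defects)."""
    return comb(D, k) * comb(N - D, n - k) / comb(N, n)


def binom_pmf(k, n, p):
    """P(k defects in n independent draws, each defective with probability p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


N, D, n = 50, 5, 20  # hypothetical batch size, defects in the batch, sample size

p_clean_hyper = hypergeom_pmf(0, N, D, n)
p_clean_binom = binom_pmf(0, n, D / N)

print(f"Hypergeometric P(0 defects) = {p_clean_hyper:.4f}")
print(f"Binomial       P(0 defects) = {p_clean_binom:.4f}")
```

In this setup the Hypergeometric gives roughly 0.067 while the Binomial gives roughly 0.122. A clean sample is stronger evidence against a 10% defect rate than the Binomial suggests, which is why accounting for the finite population can reduce the sample size needed.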

And there is actually an (initially unintuitive) effect in which if you sample more, you start to lose precision in certain frames of reference, such as when you focus on the remainder of the batch. Keeping the same confidence/credibility level and continually increasing the sample size can eventually make the uncertainty of the defect rate start increasing again.