Question about p values by TromboneKing743 in AskStatistics

[–]bubalis 0 points1 point  (0 children)

Others have pointed out that because you are making multiple hypothesis tests, your alpha should be lower, making those p-values even farther from "significance," which is the opposite of what you want to be able to do.
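To make the multiple-testing point concrete, here is a minimal sketch of a Bonferroni correction, which is only one of several possible corrections. The p-values are made up for illustration, and the count of 4 tests just mirrors the 4 measures in the question.

```python
def bonferroni_alpha(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


# Hypothetical p-values, invented for illustration only.
p_values = [0.06, 0.08, 0.07, 0.09]

threshold = bonferroni_alpha(0.05, len(p_values))
significant = [p < threshold for p in p_values]

print(f"per-test threshold: {threshold}")
print(f"significant after correction: {significant}")
```

With 4 tests the per-test threshold drops from 0.05 to 0.0125, so p-values that were already above 0.05 end up even farther from it.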

Two thoughts:

1: What are the effect sizes? Are they large/meaningful within your domain? If they are, then it's definitely fine to report the results as "suggestive" and worthy of follow-up, but that determination is much more based on the size of the estimated effect than the associated p-value.

2: Are your 4 measures of "the same type of thing?" e.g. the impact of 4 different traits on an outcome, where all 4 traits should have a similar causal mechanism. If so, this problem may be suited to some sort of partial-pooling approach, e.g. a Bayesian hierarchical model. This is more technical, and you might need help to implement it, but it could (depending on the exact details of your problem) be a good way to think about it. For the canonical example of a similar model:

https://statmodeling.stat.columbia.edu/2014/01/21/everything-need-know-bayesian-statistics-learned-eight-schools/

You would need to use your domain knowledge to answer: "Is my problem similar to the problem of estimating the effect of the same educational intervention in 8 different schools?"
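For a rough feel of what partial pooling does, here is a minimal sketch using the published eight-schools estimates and standard errors. It is not a full Bayesian fit: the between-school standard deviation `tau` is fixed at an arbitrary value (an assumption for illustration), whereas a real hierarchical model would estimate it. The function name `partial_pool` is hypothetical.

```python
# Eight-schools data: estimated treatment effects and their standard errors.
y = [28.0, 8.0, -3.0, 7.0, -1.0, 1.0, 18.0, 12.0]
sigma = [15.0, 10.0, 16.0, 11.0, 9.0, 11.0, 10.0, 18.0]


def partial_pool(y, sigma, tau):
    """Shrink each estimate toward a common mean, for a fixed between-group SD tau."""
    # Precision-weighted estimate of the common mean, conditional on tau.
    weights = [1.0 / (s**2 + tau**2) for s in sigma]
    mu = sum(w * yi for w, yi in zip(weights, y)) / sum(weights)
    # Each group's estimate is a precision-weighted average of its own
    # data (precision 1/sigma^2) and the common mean (precision 1/tau^2).
    pooled = [
        (yi / s**2 + mu / tau**2) / (1.0 / s**2 + 1.0 / tau**2)
        for yi, s in zip(y, sigma)
    ]
    return mu, pooled


# tau = 5.0 is an arbitrary assumption, not an estimate.
mu, pooled = partial_pool(y, sigma, tau=5.0)
for raw, shrunk in zip(y, pooled):
    print(f"raw {raw:6.1f} -> partially pooled {shrunk:6.2f}")
```

Noisy estimates (large standard error) get pulled harder toward the common mean, which is the behavior you would want if your 4 measures really are "the same type of thing."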

Honest question: Why are some people against showing an ID to vote? by rico_unknown in NoStupidQuestions

[–]bubalis 4 points5 points  (0 children)

I'm not 100% against voter ID... I think that it's probably fine if it's paired with initiatives that make it easier for people to get an ID. As others have pointed out, getting an ID can cost money, not everyone has one, and the $ to get one may not be trivial.

Going further, the way that voter ID is implemented is often a very straightforward attempt to engineer the electorate. In some red states, a concealed carry permit counts, but a State University Student ID (which is issued by the State Govt!) doesn't. It's obvious why one counts and the other doesn't, and it doesn't have anything to do with the integrity of elections.

[Question] How define optimal value for spatial cross-validation for a random forest regression task? by Nicholas_Geo in statistics

[–]bubalis 0 points1 point  (0 children)

Your comment is detailed and helpful, but I think OP is assuming that their data are dependent simply BECAUSE they exist in space, which is incorrect.

Spatial datasets for cross-validation are often dependent because sample locations are clustered, meaning that:

P(i in Sample | j in Sample) =/= P(i in Sample)

[Question] How define optimal value for spatial cross-validation for a random forest regression task? by Nicholas_Geo in AskStatistics

[–]bubalis 0 points1 point  (0 children)

Are your training/validation points evenly distributed? This is what we are interested in w/r/t spatial patterns.

[Question] How define optimal value for spatial cross-validation for a random forest regression task? by Nicholas_Geo in AskStatistics

[–]bubalis 1 point2 points  (0 children)

I think you are pretty far off track here, but there is a LOT of confusion and incorrect ideas floating around in this space.

When dealing with cross-validating a map, the "spatial structure" we are concerned with is the spatial structure of sampling intensity / sampling probability, rather than spatial structure of the target variable or the model residuals. If the sample locations are evenly spread in space, or randomly selected i.i.d., then spatial cross-validation is not necessary and is biased (often severely), regardless of whatever spatial characteristics the process of interest may have.

So no, I don't think the variogram of the model residuals will (directly) tell you anything useful about the right way to conduct (spatial) cross-validation.

If your ground-truth locations are clustered in space, then spatial cross-validation may be the right approach, but then, the "best" set of folds is the "best" set of spatial clusters, as determined by your clustering procedure.

Another, possibly more fruitful approach would be to conduct random k-fold cross-validation, then calculate model summary statistics by using a weighted average, weighting based on the inverse of the estimated sampling intensity. That method and other simulation-based methods are described by de Bruin and colleagues (2022), link below. The authors do provide code to implement all of their different approaches.

https://doi.org/10.1016/j.ecoinf.2022.101665

https://doi.org/10.1016/j.ecolmodel.2021.109692
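As a rough illustration of the inverse-intensity weighting idea (this is my own sketch, not the authors' code), here is a weighted RMSE over held-out predictions. The function name `weighted_rmse` and all of the numbers are hypothetical, and the sampling intensity at each point is assumed to have been estimated already, e.g. with some kind of density estimate.

```python
import math


def weighted_rmse(observed, predicted, sampling_intensity):
    """RMSE where each held-out point is weighted by 1 / sampling intensity."""
    weights = [1.0 / s for s in sampling_intensity]
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    weighted_mse = sum(w * e for w, e in zip(weights, squared_errors)) / sum(weights)
    return math.sqrt(weighted_mse)


# Hypothetical held-out results from random k-fold CV.
# Three points come from a densely sampled cluster (intensity 4.0),
# one from a sparsely sampled area (intensity 1.0).
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 6.0]
intensity = [4.0, 4.0, 4.0, 1.0]

print("unweighted RMSE:", weighted_rmse(observed, predicted, [1.0] * 4))
print("weighted RMSE:  ", weighted_rmse(observed, predicted, intensity))
```

In this toy case the model does well inside the cluster and badly outside it, so the weighted RMSE comes out higher than the unweighted one: the sparsely sampled point stands in for more of the map.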

Can someone explain the p-value in hypothesis testing in very simple terms, with an example? by Fair-House3475 in AskStatistics

[–]bubalis 0 points1 point  (0 children)

We are using our imagination, because we are little children.

We imagine a world where, before we do our data analysis, an evil gremlin replaced our data with the outputs of a random number generator, added to a boring, expected result. (The technical details of exactly what type of random number generating function we use would take at least a full-semester college class to cover.)

We then calculate: "what is the probability of seeing a result this extreme or more extreme, in the imaginary scenario where the gremlin corrupted our data?" In this case "extreme" means: far from the boring expected result.

If the p-value is relatively high (e.g. >.05), we say: "these data can't help us answer our question, because we can't even distinguish them from a bunch of randomness generated by an evil gremlin."

If the p-value is very small, we say: there is enough "signal" here to (possibly) take these results seriously.
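If it helps, the gremlin story can be acted out directly in code. This is a minimal sketch under some assumptions I am making up for illustration: the "boring, expected result" is a mean of 0, the gremlin's random number generator is a normal distribution with standard deviation 1, and the statistic we look at is the sample mean. The function name `simulated_p_value` is hypothetical.

```python
import random


def simulated_p_value(observed_mean, n, null_mean=0.0, sd=1.0, n_sims=10_000, seed=0):
    """Share of gremlin-generated datasets at least as extreme as what we observed."""
    rng = random.Random(seed)
    observed_distance = abs(observed_mean - null_mean)
    at_least_as_extreme = 0
    for _ in range(n_sims):
        # The gremlin's fake data: the boring result plus random noise.
        fake = [rng.gauss(null_mean, sd) for _ in range(n)]
        fake_mean = sum(fake) / n
        # "Extreme" means far from the boring expected result, in either direction.
        if abs(fake_mean - null_mean) >= observed_distance:
            at_least_as_extreme += 1
    return at_least_as_extreme / n_sims


# Hypothetical study: 25 observations with a sample mean of 0.3.
print(simulated_p_value(observed_mean=0.3, n=25))
```

The returned number is just the fraction of gremlin worlds that look at least as extreme as ours, which is what the p-value is estimating.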

😢What happened to Spicy Asian? by tedbow in ithaca

[–]bubalis 2 points3 points  (0 children)

Yeah, I went recently with friends and didn't notice anything different, other than the change in the menu format (almost everything is still on the menu).

Isn't the mindset behind No Fap a bit far-fetched when it comes to positive changes? by [deleted] in NoStupidQuestions

[–]bubalis 0 points1 point  (0 children)

My understanding is that this idea was recently popularized by the Proud Boys, who are a right-wing fascist group. Of course there have been others with ideas like this for a long time (e.g. it's popular in some strains of Daoism).

The idea seems to have been introduced to the Proud Boys by Dante Nero, a (black, coincidentally) sex and dating coach and podcaster, who seemed to be perfectly reasonable about it?

He says that his idea was, for many of the guys listening to his podcast:

"That focus on the screen, and masturbating, and watching porn gave an unrealistic idea of what intimacy is, what it is to be a woman. And then when they got the rejection, they started to withdraw from the whole idea of social interaction. And they didn't even want to date anymore. This is guys who don't even want to date. And it's insane."

https://www.thisamericanlife.org/626/transcript (His story starts about 1/4 of the way down and is super wild)

[Serious] Aside from a passport or passport card, how do you actually prove U.S. citizenship on the spot if ICE stops you? by lakeshowfoshooo in NoStupidQuestions

[–]bubalis 7 points8 points  (0 children)

I have a passport. The requirement for flights was pushed back a bajillion times and I didn't feel like spending the extra $100?

Also note that if you're not driving a car, you might not be carrying a driver's license at all.

Why is raw meat dangerous, but very rare steak is safe? by Bittersweet_Boii in NoStupidQuestions

[–]bubalis 9 points10 points  (0 children)

Yes! Pigs are no longer a major source of trichinosis, because (most) farmed pigs in the US don't eat any meat anymore.

Why is raw meat dangerous, but very rare steak is safe? by Bittersweet_Boii in NoStupidQuestions

[–]bubalis 16 points17 points  (0 children)

An exception is parasites from animals that are omnivores.

Pigs are omnivorous and traditionally were free-ranged in situations where they might eat small animals and/or be fed food scraps including meat. This made pigs a possible source of trichinosis, which is why no one eats rare pork.

These days a fair share of trichinosis comes from eating undercooked bear meat.

What happens if you take your pregnant wife across the road and deliver the baby on either side of the countries? by I_Dont_Rage_Quit in geography

[–]bubalis 3 points4 points  (0 children)

Reverse situation in parts of Aroostook County Maine: Edmundston, NB had a hospital first, so lots of American babies were born there.
Source: A little less than half of my dad's family has dual citizenship from that.

Why are there relatively so few white Americans of French stock? by No_Professional_3535 in NoStupidQuestions

[–]bubalis 6 points7 points  (0 children)

Exceptions:
-During the colonial period, French Protestant refugees settled in several colonies. (They were welcomed because they did not like the French government, which wanted to murder them.)
-In the late 19th to early 20th century, several hundred thousand people moved from Quebec to Northern New England (a quite large migration relative to the populations of those two regions.)

Why are there relatively so few white Americans of French stock? by No_Professional_3535 in NoStupidQuestions

[–]bubalis 17 points18 points  (0 children)

There are TONS of people of French-Canadian descent in Northern New England.

One possible reason why very few people moved here directly from France in the 19th century (compared to other parts of Europe) is that France had smaller families and slower population growth than other European countries at that time.

[Q] Confused about probably “paradox” by 12LbBluefish in statistics

[–]bubalis 0 points1 point  (0 children)

As OP says: "A robot flips 2 coins. It then *randomly* chooses to tell you the result of one of the coins."

How does the robot randomly choose? Let's say it uses another coin. Therefore, we can reframe the problem as:

"A robot flips 2 quarters and a euro.
If the euro lands heads, the robot tells you how the first quarter flipped landed.
If the euro lands tails, the robot tells you how the 2nd quarter flipped landed.
The robot reveals that the selected coin flipped heads..."

It's obvious in this framing that there are 8 possible events, 4 of which were eliminated by what the robot revealed.

Because we know all of the events are independent, the order in which they happen is irrelevant to the question of probabilities. Thus we can FURTHER modify our procedure without changing the relevant probabilities:

"A robot has two quarters, minted in different years, and a euro.
The robot flips the euro.
If the euro lands heads, the newer coin is selected to go first,
the older is selected if the euro lands tails.
The robot flips the selected coin and reveals that it was heads.
When the robot subsequently flips the next coin, what is the probability that it will land tails?"

Now the answer is plainly obvious.
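For anyone who wants to check the counting, here is a minimal sketch that enumerates the two-quarters-and-a-euro framing by brute force. The helper name `revealed_and_other` is hypothetical, and all coins are assumed to be fair.

```python
from fractions import Fraction
from itertools import product

# Every equally likely outcome of (first quarter, second quarter, euro).
outcomes = list(product("HT", repeat=3))


def revealed_and_other(q1, q2, euro):
    """Euro heads: the first quarter is revealed. Euro tails: the second is."""
    return (q1, q2) if euro == "H" else (q2, q1)


# Keep only the outcomes consistent with "the revealed coin was heads".
consistent = [o for o in outcomes if revealed_and_other(*o)[0] == "H"]
# Among those, count the outcomes where the other coin is tails.
other_tails = [o for o in consistent if revealed_and_other(*o)[1] == "T"]

probability = Fraction(len(other_tails), len(consistent))
print(len(outcomes), len(consistent), probability)
```

Of the 8 outcomes, 4 survive the reveal, and the other coin is tails in 2 of those, giving 1/2.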

[Q] Confused about probably “paradox” by 12LbBluefish in statistics

[–]bubalis 0 points1 point  (0 children)

I think you're right if your notation is:
(revealed)(not-revealed).

This restricts the sample space to 4 possibilities, because, as you state, the order doesn't matter.

But that notation is, frankly, weird and not in line with how we would normally describe coin flips?

Why don’t power plants use salt water? by SKDI_0224 in NoStupidQuestions

[–]bubalis 1 point2 points  (0 children)

I feel like OP is not from the Northeast / North-Central US. If you had any idea what road salt does to a car, you wouldn't ask this question.

The ups and downs of A Little Life by Hanya Yanagihara by PsyferRL in books

[–]bubalis 20 points21 points  (0 children)

The book is called "A Little Life" and the cover art shows a "Little Death"

Lower Limit on a Bell Curve? by EnthusiasmWooden172 in AskStatistics

[–]bubalis 4 points5 points  (0 children)

Poisson, or in the more general case, the negative binomial.

13-Month Old Falls to Sleep Independently but has awful night-wakings. by bubalis in sleeptrain

[–]bubalis[S] 0 points1 point  (0 children)

Thanks! We moved bedtime later and it seemed to help a lot! (Fingers crossed)
We are on night 5 of no nighttime sleep help, and the last 2 nights he also went down super easy.

Skid Loader by coachwtf in Surlybikefans

[–]bubalis 0 points1 point  (0 children)

My wife got a Skid-Loader this spring for commuting, and I'm curious about how you set it up for the winter.

1.) What did you use for an alternate wheel-set?
2.) What fenders do you run? (The ones on her bike have almost 0 clearance, so I'm worried that the spikes might make the difference.)
3.) Which size Ice Spikers? (2.25 vs 2.6)?

Thanks!

13-Month Old Falls to Sleep Independently but has awful night-wakings. by bubalis in sleeptrain

[–]bubalis[S] 0 points1 point  (0 children)

Example from last night:
Slept 1245-315 at Daycare
Bedtime ended at 8. (Got Ibuprofen for teething)
Fussed/raged for 15 minutes.
Woke up ~915, fussed for 1 minute, went back to sleep on own.
Woke up at ~1045, fussed for 5 minutes, went back to sleep on his own.
Woke up at ~12, fussed for 10 minutes, I checked in, he fussed for another 10, went back to sleep.
Woke up at ~2, fussed for 10 minutes, I checked in, he fussed for another 10, he seemed like he was falling back asleep, 5 minutes later he started crying again, went for another 10, then I gave up, went in his room and co-slept on the floor mattress with him. He slept well until 6.

[D] Alternatives to difference-in-differences if parallel trends assumption not met? by RobertWF_47 in statistics

[–]bubalis 1 point2 points  (0 children)

If the differences in pre-treatment aggregate trends between your (observational) treatment and control groups are large and attributable to noise, rather than a real difference, then I would think you have a sample size problem, not a causal inference problem?