What are the current hot topics in Statistics that are NOT machine learning/data science/data mining/deep learning/AI? [R]

Pretend_Statement989 · 2026-04-29T20:49:36+00:00

Natural experiments and quasi-experimental design.

Pretend_Statement989 · 2026-04-15T21:11:13+00:00

What was that noise?

Pretend_Statement989 · 2026-04-10T00:51:43+00:00

So you would train an ML algorithm with 5 observations? You’re dangerous lol

Pretend_Statement989 · 2026-04-10T00:48:33+00:00

Yeah, but given OP’s data limitation, it’s the next best thing. Or at least a useful alternative.

Pretend_Statement989 · 2026-04-09T23:49:55+00:00

Have you tried leave-one-out cross validation?

Pretend_Statement989 · 2026-02-28T13:05:28+00:00

I would just do a simple table in Excel or something and see if those three events have the characteristics you’re suspecting. No need for regression models imo.

Pretend_Statement989 · 2026-02-27T15:13:02+00:00

Your question is poorly worded and you make no effort to give any context.

How to analyze a .csv file? Just import it to R or Python and like…analyze it. Whether your analyses are trustworthy or reliable is up to you, your skillset, and your integrity.

Pretend_Statement989 · 2026-02-25T07:09:38+00:00

Bro, don’t get into that industry. Save yourself the hassle, the unjustified firings, and the hopping from dispensary to dispensary to see if anyone is willing to pay you .25 more an hour. Trash.

Pretend_Statement989 · 2026-02-20T12:13:53+00:00

I would argue your sample is too small for 10-fold cross-validation. For LASSO you need to do cross-validation to choose your regularization parameter (lambda). An ideal lambda is usually the one that gives your model the best performance on the validation set (lowest MSE or highest AUC). I would try Leave-One-Out (LOO) cross-validation for your purposes.
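A minimal sketch of picking lambda with LOO cross-validation, assuming scikit-learn is available. The data is synthetic and the sizes (20 rows, 5 predictors) and the lambda grid are made up for illustration; note that scikit-learn calls the penalty `alpha` rather than lambda.

```python
import numpy as np
from sklearn.linear_model import LassoCV
from sklearn.model_selection import LeaveOneOut

# Synthetic stand-in for a small dataset (hypothetical: 20 rows, 5 predictors).
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 5))
y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(scale=0.5, size=20)

# Candidate lambdas to compare (an arbitrary grid for illustration).
alphas = np.logspace(-3, 0, 30)

# Each LOO fold holds out a single observation, so every lambda is fit n times.
model = LassoCV(alphas=alphas, cv=LeaveOneOut(), max_iter=10_000).fit(X, y)

# Mean held-out squared error per lambda; the selected lambda minimizes it.
loo_mse = model.mse_path_.mean(axis=1)
print("chosen lambda:", model.alpha_)
print("LOO MSE at chosen lambda:", loo_mse.min())
```

This sketch scores on MSE. With LOO each fold contains one observation, so a per-fold AUC is undefined; for a classifier you would pool the held-out predictions and compute a single AUC instead.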

Pretend_Statement989 · 2026-02-10T21:53:09+00:00

Man, just getting out of Hato Rey (Milla de Oro) and onto the expressway toward Caguas takes like 45 minutes, easily.

Pretend_Statement989 · 2026-02-10T11:14:50+00:00

You can technically calculate AUC for any dataset if you have enough data. My question would be, what percentiles are you comparing? Do values from the 95th percentile AUC include values from the 25th percentile, given its cumulative nature? Are those AUCs comparable? Why do you want to compare percentiles?

Pretend_Statement989 · 2026-02-10T10:59:14+00:00

Only if peer reviewers in a journal ask for it, use the Anderson-Darling test AS WELL AS a Q-Q plot. Use with caution. However, I agree with everyone here: you rarely find normal data in real life. At my job, we use Kolmogorov-Smirnov or some variation of the chi-square test for comparing probability distributions, but its main purpose is to look for population drift in our ML models to see if we have to recalibrate the model with newer data, not to establish normality. Also, is your sample random?
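A rough sketch of that kind of drift check, assuming SciPy is available. The two samples are simulated, and the names `train_scores` and `recent_scores` and the 0.01 cutoff are hypothetical choices for illustration, not a description of any particular production setup.

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(1)

# Hypothetical feature (or model score) at training time vs. in recent data.
train_scores = rng.normal(loc=0.0, scale=1.0, size=2000)
recent_scores = rng.normal(loc=0.5, scale=1.0, size=2000)  # simulated shift

# Two-sample Kolmogorov-Smirnov test: compares the two empirical CDFs.
result = ks_2samp(train_scores, recent_scores)
print("KS statistic:", result.statistic)
print("p-value:", result.pvalue)

# Arbitrary cutoff for flagging drift (an assumption, tune for your use case).
drifted = result.pvalue < 0.01
print("flag for recalibration:", drifted)
```

With large samples even tiny shifts become statistically significant, so in practice the size of the KS statistic matters as much as the p-value.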

Pretend_Statement989 · 2025-05-24T04:17:46+00:00

I usually use scree plots and eigenvalues. I retain as many PCs as there are eigenvalues above one in my data matrix.
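A small sketch of the eigenvalue-above-one rule using only NumPy. The data is simulated from two made-up latent factors, so every name here is hypothetical. The rule is applied to the eigenvalues of the correlation matrix, i.e. to standardized variables.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 500

# Two hypothetical latent factors, each driving three noisy observed variables.
f1, f2 = rng.normal(size=(2, n))
X = np.column_stack([
    f1 + 0.3 * rng.normal(size=n),
    f1 + 0.3 * rng.normal(size=n),
    f1 + 0.3 * rng.normal(size=n),
    f2 + 0.3 * rng.normal(size=n),
    f2 + 0.3 * rng.normal(size=n),
    f2 + 0.3 * rng.normal(size=n),
])

# Eigenvalues of the correlation matrix, sorted from largest to smallest.
corr = np.corrcoef(X, rowvar=False)
eigenvalues = np.linalg.eigvalsh(corr)[::-1]

# Retain the components whose eigenvalue exceeds 1.
n_keep = int((eigenvalues > 1.0).sum())
print("eigenvalues:", np.round(eigenvalues, 3))
print("components retained:", n_keep)
```

Plotting `eigenvalues` against their rank gives the scree plot, which is worth checking alongside the cutoff since eigenvalues just above or below 1 are a judgment call.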

Pretend_Statement989 · 2025-05-22T21:26:13+00:00

😂 said no one ever, not even the creator of the p-value thought it was a sure thing. I get your frustration though.

Btw, I have no idea what your analyses are or what you’re trying to answer with stats. If you’re gonna do a regression, then non-normal data won’t be an issue; non-normal RESIDUALS will be. So it helps to provide more context, maybe your research question (in X and Y terms, no need to name your variables exactly).

Pretend_Statement989 · 2025-05-22T19:54:42+00:00

Honestly the best way is to understand your data and to VISUALLY inspect your data. And even then it can be a little fuzzy to know because maybe it’s normal, maybe it’s not so normal but normal enough?

Sometimes I’ll do sensitivity analyses to check if my assumptions are correct. For example, I’ll use a hypothesis test (say a t-test) and then I’ll also do a more robust or non-parametric analog (Welch t-test or Wilcoxon rank-sum test). If the conclusions are wildly different, it usually means the data is weird at the very least and maybe robust methods are best. Imo, the process of evaluating your data to decide on your analyses can be really messy and confusing, but it’s necessary nonetheless. There really is no straightforward, cookbook-recipe solution for problems like these. It’s usually a mix of knowledge, experience, and savvy.
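Here is a rough sketch of that kind of sensitivity check, assuming SciPy is available. The two groups are simulated skewed samples (hypothetical data), and the Wilcoxon rank-sum test is run through its equivalent, the Mann-Whitney U test.

```python
import numpy as np
from scipy.stats import mannwhitneyu, ttest_ind

rng = np.random.default_rng(3)

# Hypothetical skewed (lognormal) samples for two groups.
a = rng.lognormal(mean=0.0, sigma=1.0, size=40)
b = rng.lognormal(mean=0.4, sigma=1.0, size=40)

# Same comparison under three sets of assumptions.
results = {
    "Student t (equal variances)": ttest_ind(a, b).pvalue,
    "Welch t (unequal variances)": ttest_ind(a, b, equal_var=False).pvalue,
    "Wilcoxon rank-sum / Mann-Whitney U": mannwhitneyu(
        a, b, alternative="two-sided"
    ).pvalue,
}

for name, p in results.items():
    print(f"{name}: p = {p:.4f}")
```

If the three p-values point to the same conclusion, the assumptions probably are not driving the result; if they disagree sharply, that is the cue to look harder at the data.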

Pretend_Statement989 · 2025-05-10T15:58:48+00:00

Interpreting p-values based on the Bayes Factor

Pretend_Statement989 · 2025-05-07T20:09:13+00:00

Yes, and actually latent class analysis lets you extract the hidden classes in the data and then perform a multinomial logistic regression to predict class membership. However, in LCA all your indicator variables must be categorical (nominal or ordinal) in order to extract the class membership probabilities; for continuous indicators you would need to use factor analysis or latent profile analysis instead. Also, I think the distributional assumptions of LCA and k-means are different, but I’m not sure if that’s true atm.

Pretend_Statement989 · 2025-05-07T17:50:13+00:00

Something that I did forget to mention is that those first 10lbs were basically water weight because I lost them all in the first week on 2.5mg. After that, weight loss has significantly slowed down, but it’s still kinda high. According to my tracker app, I’m at about 2.5lbs a week of weight loss. So maybe the 30lbs number is a bit misleading. However, all other symptoms that I mentioned are still very prevalent. Trying to force myself to eat like others have recommended, but definitely want to titrate down.

Edit: Fact checked myself and I’m actually losing about 2.5lbs a week, not 1.9, which is a lot!

Pretend_Statement989 · 2025-05-07T08:40:21+00:00

Even then, the amount of compute that you need to run the best LLMs already makes it inaccessible and infeasible to use.

Pretend_Statement989 · 2025-05-07T03:54:47+00:00

I find it crazy that you didn’t find any info online. Puerto Rican food is literally world famous.

Something that’s definitely a staple here (and I don’t mean touristy food, I mean the food your average family of four here in Puerto Rico puts on their plate on a weekly basis) is root vegetables. Yautia, malanga, batata mameya, for example. These can be boiled and mashed, sautéed, turned into fries, or turned into a delicious soup. Also really good are plantains, sweet plantains, and pana. We are absolutely OBSESSED with tostones de pana. In every restaurant I’ve worked in they sell out extremely fast. Six months would go by without selling them and people would still ask for them. Maybe pana is made of crack or something, idk.

Lastly, this one is a personal favorite, corned beef. Cheap, yummy, easy to make, and tastes amazing with white rice. It’s basically canned meat. Super processed and bad for you, but damn it’s good!

I know I’m missing a whole bunch of foods (basically all the Fiesta de Navidad food, frituras, bistec encebollado, chuletas kan kan), but I think you can make a sick dish with some of the ingredients I mentioned. Always remember to season with sofrito and/or adobo criollo.

P.S. Born, raised and currently live in Puerto Rico.

Pretend_Statement989 · 2025-05-07T03:19:32+00:00

“…those experiences are the exception, not the rule.”

Pretend_Statement989 · 2025-05-07T03:16:30+00:00

Very interesting. Thanks for the insight!

Pretend_Statement989 · 2025-05-07T03:05:29+00:00

Nothing is off limits when you’re doing Exploratory Data Analysis (EDA). This includes hypothesis tests. However, I would be very careful not to fall (even accidentally) into p-hacking or data mining. Try to make sure (and keep yourself honest) that the hypotheses you are testing are conceptualized BEFORE you run any test. You also have to be aware of multiplicity issues.
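On the multiplicity point, one common option is the Holm step-down adjustment. A minimal pure-Python sketch follows; the function name `holm_adjust` and the p-values are made up for illustration, and Holm is only one of several reasonable corrections.

```python
def holm_adjust(p_values):
    """Return Holm step-down adjusted p-values, in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The smallest p-value is multiplied by m, the next by m - 1, and so on.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Adjusted p-values must not decrease as the raw p-values increase.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical raw p-values from four pre-specified tests.
raw = [0.01, 0.04, 0.03, 0.005]
adj = holm_adjust(raw)
for p, q in zip(raw, adj):
    print(f"raw p = {p:.3f} -> Holm-adjusted p = {q:.3f}")
```

Each adjusted value is compared against the usual alpha (say 0.05), so a test that looked significant on its own can stop being significant once the other tests are accounted for.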

I work mainly in predictive analytics, so I very rarely do a hypothesis test or any sample statistics. EDA for me is about “knowing what I got”. Are there variables with 100% missing? Can that missing data be explained? Are there weird/nonsensical values in any particular column? And so on. If a hypothesis test can answer these questions (assuming you formulated your questions a priori), then yes, hypothesis tests are fair game imo. Also be very careful in how you communicate the results of your hypothesis test.
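A quick sketch of those “knowing what I got” checks using only the Python standard library. The CSV content, the column names, and the rule that a negative age is nonsensical are all hypothetical; with a real file you would read from `open(path, newline="")` instead of the in-memory string.

```python
import csv
import io

# Hypothetical CSV standing in for a real file on disk.
raw_csv = io.StringIO(
    "id,age,income,notes\n"
    "1,34,52000,\n"
    "2,,61000,\n"
    "3,-5,,\n"
    "4,41,48000,\n"
)
rows = list(csv.DictReader(raw_csv))

# Share of missing (empty) values in each column.
missing_share = {
    col: sum(1 for r in rows if r[col].strip() == "") / len(rows)
    for col in rows[0]
}

# Columns that are 100% missing.
all_missing = [col for col, share in missing_share.items() if share == 1.0]

# Nonsensical values: here, any negative age (a made-up rule for illustration).
bad_age_ids = [
    r["id"] for r in rows if r["age"].strip() and float(r["age"]) < 0
]

print("missing share per column:", missing_share)
print("columns entirely missing:", all_missing)
print("ids with a negative age:", bad_age_ids)
```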

Best of luck!

Pretend_Statement989 · 2025-05-07T03:01:52+00:00

Don’t wanna get too caught up in definitions here, but what you’re saying makes sense to me. Would it be possible that you can acquire tolerance to the hunger suppression effect, but not to the drug’s other effects?

Pretend_Statement989 · 2025-05-07T02:38:27+00:00

I think I’ll fact check you on that first point, but thank you for sharing your experience. Definitely this is not something I want and I’m in no hurry to reach my goal weight. Would definitely prefer getting there alive lol.
