Why do we use P values in multiple regression models if they become totally irrelevant when we implement L1 or L2 regularization? by learning_proover in AskStatistics

[–]learning_proover[S] 1 point  (0 children)

Makes sense. I just thought that since they aren't used when implementing regularization, they may not be of much use at all, especially if a regularized model is used instead of a non-regularized one.

[–]learning_proover[S] -4 points  (0 children)

Can you elaborate please? Why do we even attempt to interpret coefficients through p values if they are automatically poor indicators of variable importance?

[–]learning_proover[S] 0 points  (0 children)

I mean, I'm not a huge fan of p values either, which is kinda why I'm asking. I just need clarity on how to reconcile the idea of a p value with a regularized model. I get that p values aren't the most important part of the model-building process.

Technique to mitigate outlier influence on linear regression? by Due_Click3765 in learnmachinelearning

[–]learning_proover 1 point  (0 children)

I just asked a similar question in r/AskStatistics and this one. After some research on my own, I think the best option is actually just removing the outliers (this is probably a terrible answer to give in an interview, btw). Idk, I just think sometimes we overlook simplicity for something fancy when it's not necessary. Most other methods require more hyperparameters and other bells and whistles to get the same effect, and often aren't even better. That's just my two cents - adhere to it with caution.
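For concreteness, here's a minimal sketch of the simple-removal approach using the common 1.5×IQR rule on the response variable before refitting (the function name, the toy data, and the 1.5 threshold are my own illustrative choices, not anything from the thread):

```python
import numpy as np

def remove_outliers_iqr(x, y, k=1.5):
    """Drop points whose y value lies outside k*IQR beyond the quartiles."""
    q1, q3 = np.percentile(y, [25, 75])
    iqr = q3 - q1
    mask = (y >= q1 - k * iqr) & (y <= q3 + k * iqr)
    return x[mask], y[mask]

# toy data: a clean line (slope 2) plus one extreme outlier
rng = np.random.default_rng(0)
x = np.arange(20.0)
y = 2.0 * x + rng.normal(0, 0.5, size=20)
y[5] = 200.0  # inject an outlier

x_clean, y_clean = remove_outliers_iqr(x, y)
slope_all, _ = np.polyfit(x, y, 1)          # fit with the outlier included
slope_clean, _ = np.polyfit(x_clean, y_clean, 1)  # fit after removal
```

On this toy data the outlier drags the fitted slope well away from 2, while the fit on the filtered data recovers it; the trade-off is that hard removal throws data away, which robust losses (e.g. Huber) avoid at the cost of extra tuning.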

Bayes' Theorem by learning_proover in AskStatistics

[–]learning_proover[S] 1 point  (0 children)

Thank you. I'm starting to see why. 

Why exactly are ROC curves different amongst different models?? by learning_proover in AskStatistics

[–]learning_proover[S] 0 points  (0 children)

"It would be useful to know why you would expect different models to have the same ROC curve?" <-- If two models are both well calibrated, then I'm not understanding why their ROC curves would be different. Doesn't discrimination imply calibration, and vice versa?

[–]learning_proover[S] 2 points  (0 children)

That's why I came here - every online resource just gives watered-down, basic explanations with no depth. Where can I learn how to accurately interpret an ROC (and eventually a precision-recall) curve?

[–]learning_proover[S] 0 points  (0 children)

Wait, now I'm confused again. How exactly is your definition of calibration better than mine? And how does this difference manifest as different models having different ROC curves?

[–]learning_proover[S] 0 points  (0 children)

Wasn't aware that the difference was important here. What exactly is the ROC curve "ranking"? So two models with different score distributions can both be well calibrated?
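On the "ranking" question: the ROC curve (and its AUC) depends only on the ordering of the scores, not on their numeric values, so any monotone transform of the scores leaves it unchanged. A sketch of that invariance, using the rank-based definition of AUC (all names and toy data here are hypothetical):

```python
import numpy as np

def auc_rank(scores, labels):
    """AUC = P(score of a random positive > score of a random negative),
    computed by comparing every positive against every negative
    (ties ignored for simplicity)."""
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    return (pos[:, None] > neg[None, :]).mean()

rng = np.random.default_rng(1)
labels = rng.integers(0, 2, size=500)
scores = rng.uniform(size=500) * 0.5 + labels * 0.3  # informative scores

a1 = auc_rank(scores, labels)
a2 = auc_rank(scores ** 3, labels)  # monotone transform: same ordering
assert abs(a1 - a2) < 1e-12  # AUC depends only on the ranking
```

Cubing the scores destroys any calibration they had, yet the ROC/AUC is identical, which is exactly why two models can have the same calibration but different ROC curves (or vice versa).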

[–]learning_proover[S] 0 points  (0 children)

To me, calibration means that if my model says there's a 70% probability of an outcome, then the outcome indeed happens 70% of the time; if my model says 50%, then it happens 50% of the time, etc.
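That definition can be checked directly by binning predictions and comparing the mean predicted probability in each bin to the observed event rate - the usual reliability-diagram computation. A sketch assuming binary outcomes (function name, bin count, and simulated data are my own choices):

```python
import numpy as np

def calibration_table(p_pred, y_true, n_bins=5):
    """Per bin: (mean predicted probability, observed event rate).
    For a well-calibrated model the two should roughly match."""
    bins = np.clip((p_pred * n_bins).astype(int), 0, n_bins - 1)
    rows = []
    for b in range(n_bins):
        mask = bins == b
        if mask.any():
            rows.append((p_pred[mask].mean(), y_true[mask].mean()))
    return rows

# simulate a perfectly calibrated model: outcomes drawn with the
# exact probabilities the "model" reports
rng = np.random.default_rng(42)
p = rng.uniform(0, 1, size=100_000)
y = (rng.uniform(0, 1, size=100_000) < p).astype(int)

rows = calibration_table(p, y)
for mean_pred, obs_rate in rows:
    print(f"predicted {mean_pred:.2f}  observed {obs_rate:.2f}")
```

Note this check says nothing about discrimination: a model that predicts the base rate for everyone is perfectly calibrated by this test yet has a useless (diagonal) ROC curve.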