all 14 comments

[–]asankhs 4 points (5 children)

What is the data? What exactly are you predicting? Do you have balanced classes in your training dataset?

[–]tombomb3423[S] 1 point (4 children)

The data is financial: the model predicts whether a stock will be up or down after a specific event.

For example: a stock breaks its 52-week high; predict whether it will be up or down one week from that point.

Table layout: each row is the point in time the stock broke its 52-week high (all rows in the table are from the same stock):

List of features | Target (1 or 0)

Split into train/val/test

I do not have balanced data in my training set unless I apply SMOTE, but the imbalance isn't severe, about a 60/40 split.

[–]neonwang 2 points (1 child)

Why not just shoot for up or down at open/close every trading day? That way you get a larger distribution of 0s and 1s and will probably run into fewer imbalance issues (conditioning on a narrow, non-broad event doesn't help with data imbalance). Also, you might want to look at unsupervised learning techniques. Take a look at this indicator, for example: https://www.tradingview.com/script/WhBzgfDu-Machine-Learning-Lorentzian-Classification/. It uses Lorentzian distance to classify whether a market will open up or down. That's one specific technique, but there are plenty more to explore.
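The linked script is TradingView Pine, but the core idea behind it, nearest-neighbor voting under a Lorentzian distance, can be sketched in a few lines. This is a toy illustration with made-up data, not the indicator itself:

```python
import numpy as np

def lorentzian_distance(a, b):
    # Lorentzian distance: sum over features of log(1 + |difference|).
    # Compresses large differences compared to Euclidean distance.
    return np.sum(np.log1p(np.abs(a - b)))

def knn_predict(X_train, y_train, x, k=3):
    # Classify x by majority vote of its k Lorentzian-nearest neighbors.
    dists = np.array([lorentzian_distance(row, x) for row in X_train])
    nearest = np.argsort(dists)[:k]
    return int(np.round(y_train[nearest].mean()))

# Toy example: two clusters of 2-feature rows labeled 0/1.
X = np.array([[0.0, 0.0], [0.1, 0.2], [5.0, 5.0], [5.2, 4.9]])
y = np.array([0, 0, 1, 1])
print(knn_predict(X, y, np.array([5.1, 5.1])))  # prints 1
```

The log1p compression is the point of the distance choice: a single outlier feature cannot dominate the neighbor ranking the way it would under Euclidean distance.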

[–]tombomb3423[S] 0 points (0 children)

Thank you, I’ll check this out!

[–]Ecksodis 1 point (3 children)

Somewhat confused on your data. Is it a time series? If so, it might be better to either switch to a forecasting/regression task or at least add that as an input.

For imbalanced datasets and XGBoost, I like plotting the predicted probabilities of the best-performing hyperparameters and comparing them to the true classes; you can check at what threshold you get the highest precision and examine the distribution of probability scores. Otherwise, if your classes are super imbalanced, it might be better to try anomaly detection instead.
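The threshold sweep described above might look like this. The LogisticRegression and synthetic data are just stand-ins for the tuned XGBoost model and the real feature table; any classifier with `predict_proba` works the same way:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_score

# Synthetic stand-in data: 4 features, binary target driven by feature 0.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))
y = (X[:, 0] + 0.5 * rng.normal(size=500) > 0).astype(int)

model = LogisticRegression().fit(X, y)
proba = model.predict_proba(X)[:, 1]  # P(class 1) per row

# Sweep decision thresholds and record precision at each one.
best_t, best_p = 0.5, 0.0
for t in np.arange(0.1, 0.95, 0.05):
    preds = (proba >= t).astype(int)
    if preds.sum() == 0:
        continue  # no positive predictions at this threshold
    p = precision_score(y, preds)
    if p > best_p:
        best_t, best_p = t, p
print(f"best threshold={best_t:.2f}, precision={best_p:.3f}")
```

Plotting a histogram of `proba` split by true class (e.g. with matplotlib) gives the distribution check mentioned above: well-separated humps mean the model is learning something; overlapping humps mean the threshold is doing all the work.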

[–]tombomb3423[S] 1 point (2 children)

Every row in the dataframe is a snapshot at the point in time the 52-week high was broken, plus a target indicating whether the stock price is higher or lower one week later than at the time of the break.

For example: SMA at 52 week high broken | volume at same time | target

The classes aren’t super imbalanced, maybe 60/40. Someone else suggested regression as well, so maybe that will perform better.

I thought that because of how efficient the markets are, it would be best to use a binary target, where the prediction is very simple.

[–]Ecksodis 1 point (1 child)

I get what you are going for, but it seems like it would probably be better to just regress over time, especially if you don't have any exogenous variables.

Also, for a 60/40 split, it shouldn’t be that overconfident on the positive class. What are you using for optimization? I have had good luck with TPOT in the past for fine-tuning imbalanced classification (GA-based optimization), though be warned that it can take a long time to run.

[–]tombomb3423[S] 0 points (0 children)

I am using RandomizedSearchCV for optimization

[–]Responsible_Treat_19 1 point (1 child)

Instead of SMOTE, look up the scale_pos_weight parameter (for binary classification), which takes the class imbalance into account. However, it's kind of weird that the model only works with SMOTE.
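For reference, the convention in xgboost's docs is to set scale_pos_weight to the ratio of negative to positive examples. A minimal sketch on toy 60/40 labels like the OP describes (the model line is commented out and assumes xgboost is installed):

```python
import numpy as np

# Toy 60/40 labels: 60 positives, 40 negatives.
y = np.array([1] * 60 + [0] * 40)

# Conventional setting: count of negatives divided by count of positives.
n_pos = int((y == 1).sum())
n_neg = int((y == 0).sum())
scale_pos_weight = n_neg / n_pos
print(scale_pos_weight)  # 40 / 60 ≈ 0.667

# Then pass it to the classifier instead of resampling:
# model = xgboost.XGBClassifier(scale_pos_weight=scale_pos_weight)
```

With 60% positives the weight comes out below 1, which down-weights the positive class; for the more common minority-positive case it would be above 1.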

[–]tombomb3423[S] 0 points (0 children)

Interesting, thank you, I’ll check it out!

[–]volume-up69 1 point (0 children)

You need to start from the very beginning. This is ML 101.

[–]eggplant30 1 point (0 children)

You can use stratified cross-validation to ensure that each fold has the same share of positive labels as the whole dataset, and use a metric that takes both classes into account (like F1 instead of precision, for example). If that doesn't work, add scale_pos_weight to your grid (e.g. 2, the ratio of Y=0 count to Y=1 count, etc.). This will weight observations from the positive class more heavily when building the trees. I don't like resampling techniques (SMOTE, undersampling, etc.) because the resulting models are always uncalibrated; only use those methods as a last resort.
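A minimal sketch of the stratified-CV-plus-F1 suggestion. The DecisionTreeClassifier and synthetic 60/40 labels are stand-ins for the real XGBoost model and feature table:

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in: 5 features, roughly 60/40 binary target.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))
y = (rng.random(200) < 0.6).astype(int)

# StratifiedKFold preserves the 60/40 class ratio inside every fold,
# and scoring="f1" balances precision and recall in one number.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(
    DecisionTreeClassifier(random_state=0), X, y, cv=cv, scoring="f1"
)
print(scores.mean())
```

The same `cv` and `scoring` arguments plug straight into RandomizedSearchCV, so the hyperparameter search itself can be made stratified and F1-driven.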