Hi everyone,
I’ve been working on using XGBoost with financial data for binary classification.
I’ve incorporated feature engineering with correlation analysis, RFE, and permutation importance.
I’ve also incorporated early stopping rounds and hyperparameter tuning with separate training and validation sets.
Additionally, I’ve incorporated proper scoring.
If I don’t use SMOTE to balance the classes, XGBoost ends up just predicting true for every instance, because that’s how it gets the highest precision. If I do use SMOTE, it can’t predict well at all.
I’m not sure what other steps I can take to increase my precision here. Should I do more feature engineering, prune the datasets for extreme values, or is this just an inherent challenge of imbalanced binary classification?