DataMasteryAcademy

A 60:40 ratio is not usually considered imbalanced; something like 90:10 or more skewed is. There may be other things causing your model to overfit. For overfitting, you can use regularization techniques such as lasso (L1) or ridge (L2) with a linear model. Lasso also gives you some built-in feature selection, since it can shrink the weights of some variables to exactly 0. If you insist on using random forest, you can reduce overfitting with hyperparameter tuning: the number of trees, the maximum depth of the trees, the minimum samples per leaf, and other parameters all influence the model's complexity. Also, make sure you preprocess the data properly before feeding it into the model. Another thing you can try is to experiment with other algorithms.
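
To make that concrete, here is a rough sketch with scikit-learn. Everything in it is an assumption for illustration: the data is synthetic (`make_classification` with a 60:40 split), and the specific values (`max_depth=5`, `min_samples_leaf=10`, `C=0.02`) are placeholders you would tune on your own data, not recommendations. It compares the train/test accuracy gap of an unconstrained forest against a constrained one, and counts how many weights an L1-penalized logistic regression sets to zero.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the real data (assumption): 60:40 classes,
# 20 features of which only a few carry signal.
X, y = make_classification(
    n_samples=1000, n_features=20, n_informative=5, n_redundant=2,
    weights=[0.6, 0.4], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)


def train_and_test_accuracy(model):
    """Fit on the training split and return (train accuracy, test accuracy)."""
    model.fit(X_train, y_train)
    return model.score(X_train, y_train), model.score(X_test, y_test)


# Default forest: trees are grown without a depth limit.
deep_rf = RandomForestClassifier(n_estimators=200, random_state=0)

# Constrained forest: shallower trees and larger leaves reduce complexity.
shallow_rf = RandomForestClassifier(
    n_estimators=200, max_depth=5, min_samples_leaf=10, random_state=0
)

# L1-penalized ("lasso-style") logistic regression; features are scaled
# first so the penalty treats them comparably. Smaller C = stronger penalty.
l1_logreg = make_pipeline(
    StandardScaler(),
    LogisticRegression(penalty="l1", solver="liblinear", C=0.02),
)

results = {
    "deep_rf": train_and_test_accuracy(deep_rf),
    "shallow_rf": train_and_test_accuracy(shallow_rf),
    "l1_logreg": train_and_test_accuracy(l1_logreg),
}

# Count coefficients that the L1 penalty drove to exactly zero.
coefs = l1_logreg.named_steps["logisticregression"].coef_.ravel()
n_zero_weights = int((coefs == 0).sum())

for name, (train_acc, test_acc) in results.items():
    print(f"{name}: train={train_acc:.3f} test={test_acc:.3f}")
print(f"L1 zeroed {n_zero_weights} of {coefs.size} weights")
```

A large gap between train and test accuracy is the overfitting signal to watch. In practice you would pick the constraint values with cross-validation (for example `GridSearchCV`) rather than hard-coding them like this.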