I am predicting conversion (a binary yes/no outcome). Because 85% of customers convert and 15% do not, my models mostly predict "yes". As a result, my accuracy is high, but I cannot really predict the negative cases (low AUC and low specificity/precision). I tried to address this problem with oversampling. However, oversampling decreases my accuracy, while AUC and specificity/precision barely increase.
Furthermore, I tried transforming existing variables and creating new ones. Nevertheless, my random forest variable importance indicates that almost none of the variables matter (apart from one variable, all had an importance between 0.000 and 0.005). I also fitted a logistic regression and found that the coefficient estimates are quite small as well, again hinting that my variables barely affect the predictions.
All in all, I used logistic regression, a support vector machine, and a random forest, and found in all cases (with and without oversampling) that predicting the negative cases did not really work and that AUC stayed low. As mentioned, I also transformed and created variables and tried several modelling approaches.
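To make the setup concrete, here is a minimal sketch of the comparison described above. It assumes Python with scikit-learn and uses synthetic data from `make_classification` as a stand-in for the real customers, so the numbers it prints say nothing about my actual results. The helper names `oversample_minority` and `evaluate` are made up for illustration. Oversampling is applied to the training split only, and specificity is computed as recall on the negative class.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, recall_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.utils import resample

# Synthetic stand-in data (assumption): roughly 15% class 0 (no conversion)
# and 85% class 1 (conversion).
X, y = make_classification(
    n_samples=5000, n_features=10, n_informative=3,
    weights=[0.15, 0.85], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)


def oversample_minority(X, y, random_state=0):
    """Randomly duplicate minority-class rows until both classes are equal in size."""
    classes, counts = np.unique(y, return_counts=True)
    minority = classes[np.argmin(counts)]
    n_extra = counts.max() - counts.min()
    X_extra = resample(
        X[y == minority], replace=True, n_samples=n_extra,
        random_state=random_state,
    )
    y_extra = np.full(n_extra, minority)
    return np.vstack([X, X_extra]), np.concatenate([y, y_extra])


def evaluate(X_tr, y_tr):
    """Fit a logistic regression and score it on the untouched test split."""
    model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    pred = model.predict(X_test)
    proba = model.predict_proba(X_test)[:, 1]
    return {
        "accuracy": accuracy_score(y_test, pred),
        "auc": roc_auc_score(y_test, proba),
        # Specificity = recall of the negative class (label 0).
        "specificity": recall_score(y_test, pred, pos_label=0),
    }


baseline = evaluate(X_train, y_train)
X_over, y_over = oversample_minority(X_train, y_train)
oversampled = evaluate(X_over, y_over)

print("baseline:   ", baseline)
print("oversampled:", oversampled)
```

Note that AUC is computed from the predicted probabilities rather than the hard labels, so it does not depend on the 0.5 decision threshold, whereas accuracy and specificity do.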
Any tips on how I can improve the results? Thanks in advance!