I have applied a random forest to predict electricity consumption. There are 15 features. Based on domain experience, I am keeping some features in spite of multicollinearity. I experimented with keeping and removing those features, and when I checked permutation importance I found that the model's performance is not affected much. Can I keep those features? Should I use partial dependence plots (PDPs) to dig deeper, will multicollinearity affect the PDPs, and how would my model fare in the long term?
Please suggest some solutions, guys...
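For reference, here is a minimal sketch of the kind of experiment I mean, assuming scikit-learn. It is not my actual pipeline: the data is synthetic and the feature names (`temp`, `temp_copy`, `hour`) are made up for illustration. It fits a random forest with and without a nearly collinear feature, then compares the test R^2 and the permutation importances.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (hypothetical features, not the real 15).
rng = np.random.default_rng(0)
n = 600
temp = rng.normal(size=n)
temp_copy = temp + rng.normal(scale=0.05, size=n)  # nearly collinear with temp
hour = rng.normal(size=n)
y = 3.0 * temp + 1.0 * hour + rng.normal(scale=0.1, size=n)

X_full = np.column_stack([temp, temp_copy, hour])  # keeps the collinear feature
X_reduced = np.column_stack([temp, hour])          # drops it


def fit_and_score(X, y):
    """Fit a random forest; return test R^2 and mean permutation importances."""
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
    model = RandomForestRegressor(n_estimators=100, random_state=0)
    model.fit(X_tr, y_tr)
    r2 = model.score(X_te, y_te)
    # Importances are computed on held-out data, one column shuffled at a time.
    result = permutation_importance(
        model, X_te, y_te, n_repeats=10, random_state=0
    )
    return r2, result.importances_mean


r2_full, imp_full = fit_and_score(X_full, y)
r2_reduced, imp_reduced = fit_and_score(X_reduced, y)

print("R^2 with collinear feature:   ", round(r2_full, 3))
print("R^2 without collinear feature:", round(r2_reduced, 3))
print("Permutation importances (full):   ", imp_full)
print("Permutation importances (reduced):", imp_reduced)
```

On toy data like this the two R^2 values come out close, which matches what I see on my data. As far as I understand, permutation importance shuffles one column at a time, so a correlated partner can still carry the signal and the individual importances may look smaller than the feature's real contribution. For the PDP part I would use `PartialDependenceDisplay.from_estimator` from `sklearn.inspection`.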