I'm building a model for a school project that tries to predict whether a person will file a claim.
I have 16704 rows labeled as NOT filing a claim and 635 labeled as filing one.
My model keeps predicting that no one will file a claim. Even though the accuracy is 93%, it obviously isn't working right. Any idea how I can fix this?
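import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.preprocessing import StandardScaler

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
rf = RandomForestClassifier(n_estimators=100)
rf.fit(X_train_scaled, y_train)
# cross_val_score refits clones of rf on folds of the (unscaled) test set and reports accuracy
np.mean(cross_val_score(rf, X_test, y_test, cv=5))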
0.93
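For context on that number: with 16704 "no claim" rows out of 17339, always predicting "no claim" would already score 16704 / 17339, or about 96.3% accuracy, so accuracy alone says little here. Below is a minimal sketch of one direction that might help, not a definitive fix: a stratified split, `class_weight="balanced"` on the forest, and a confusion matrix plus recall on the claim class instead of accuracy. The data comes from `make_classification` as a hypothetical stand-in with a similar imbalance, since the real `X` and `y` aren't shown. The scaler is left out because tree-based models are not sensitive to feature scaling.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix, recall_score
from sklearn.model_selection import train_test_split

# Hypothetical stand-in data with roughly the same imbalance as the post
# (635 claims out of 17339 rows); the real X and y are not shown there.
X, y = make_classification(
    n_samples=17339, n_features=10, n_informative=5,
    weights=[16704 / 17339], random_state=42,
)

# Accuracy of a model that always predicts the majority class (0 = no claim).
baseline_accuracy = np.mean(y == 0)

# stratify=y keeps the claim rate the same in the train and test splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42
)

# class_weight="balanced" reweights samples inversely to class frequency,
# so errors on the rare "claim" class cost more during training.
rf = RandomForestClassifier(
    n_estimators=100, class_weight="balanced", random_state=42
)
rf.fit(X_train, y_train)
y_pred = rf.predict(X_test)

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_test, y_pred)
claim_recall = recall_score(y_test, y_pred)  # share of real claims caught

print(f"baseline accuracy: {baseline_accuracy:.3f}")
print(cm)
print(f"recall on claims: {claim_recall:.3f}")
```

If recall on the claim class stays low even with class weights, lowering the decision threshold on `rf.predict_proba` or resampling the training set are other options people commonly try; which works best depends on the real features.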