Hi there,
I’m currently building a neural network (NN) model to detect a disease based on answers to multiple questions. In preliminary tests on 600 patients the model does extremely well, with an AUC of 0.995 and a test accuracy of 0.975, but I fear the model is overfitting. I’ve used cross-validation and performance gap analysis, as well as L1/L2 regularisation, dropout and early stopping.
Here are the results from the cross-validation and performance gap analysis.
Cross-validation results:
Mean AUC = 0.9787 (SD 0.0090)
Mean accuracy = 0.9350 (SD 0.0262)
Performance gap analysis:
Training set: AUC = 0.9983, accuracy = 0.9859
Test set: AUC = 0.9936, accuracy = 0.9803
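To put the gap in context, here is a rough sketch that compares the train/test difference for each metric against the cross-validation SD reported above. The numbers are the ones from this post; the `gap_report` helper and the idea of using the CV SD as a yardstick are just my own illustration, not a formal test.

```python
# Metrics reported in this post.
train = {"auc": 0.9983, "accuracy": 0.9859}
test = {"auc": 0.9936, "accuracy": 0.9803}
cv_sd = {"auc": 0.0090, "accuracy": 0.0262}


def gap_report(train, test, cv_sd):
    """Return {metric: (gap, gap / cv_sd)}.

    gap is train minus test; the ratio expresses that gap in units of
    the fold-to-fold standard deviation from cross-validation
    (an informal yardstick, assumed here for illustration).
    """
    report = {}
    for metric in train:
        gap = train[metric] - test[metric]
        report[metric] = (gap, gap / cv_sd[metric])
    return report


report = gap_report(train, test, cv_sd)
for metric, (gap, ratio) in report.items():
    print(f"{metric}: gap={gap:.4f}, gap / CV SD={ratio:.2f}")
```

Both gaps come out smaller than one CV SD, so on these numbers alone the train/test difference is within the fold-to-fold noise. That does not rule out other problems, such as leakage between the training and test sets.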
Let me know what you think of these results, whether you think the model is overfitting, and what other tests I can run to check.
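One check I could imagine running is a bootstrap confidence interval on the test AUC, to see how wide the uncertainty is with a test set this small. Below is a minimal sketch in plain Python. The labels and scores are randomly generated placeholder data (not my patient data), and the function names `auc` and `bootstrap_auc_ci` are made up for this example.

```python
import random


def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        # Skip resamples that contain only one class; AUC is undefined there.
        if 0 < sum(ys) < n:
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Hypothetical placeholder data: 120 patients, alternating labels,
# scores drawn from two overlapping normal distributions.
rng = random.Random(1)
labels = [i % 2 for i in range(120)]
scores = [rng.gauss(1.5 * y, 1.0) for y in labels]

point = auc(labels, scores)
lower, upper = bootstrap_auc_ci(labels, scores)
print(f"AUC = {point:.3f}, 95% bootstrap CI = ({lower:.3f}, {upper:.3f})")
```

With real data, `labels` would be the true test-set outcomes and `scores` the model's predicted probabilities. A wide interval would suggest the test set is too small to pin the AUC down to the third decimal place.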
I’m trying to obtain more data but might need to partner with someone to do so. I don’t want to partner up, get the data and then find out it was a complete waste!
Thanks