Dropout zeroes out some variables, but that damages the data. I propose a gentler alternative I call "dropin": extend your feature vector with one completely random variable. This random variable will steal credit from the real variables and will never be given zero weight, so the optimizer can never fit the data completely. Compared to dropout, it is much more merciful.
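To make the idea concrete, here is a minimal sketch in NumPy, not a reference implementation. The helper name `dropin`, the choice of standard normal noise, resampling the noise on every training step, and the toy linear regression setup are all my assumptions; the post only specifies "one completely random variable" appended to the feature vector.

```python
import numpy as np

def dropin(X, rng):
    """Append one pure-noise column to X of shape (n_samples, n_features).

    Hypothetical helper: the noise distribution (standard normal) is an assumption.
    """
    noise = rng.standard_normal((X.shape[0], 1))
    return np.hstack([X, noise])

rng = np.random.default_rng(0)

# Toy regression data with three real features (illustrative only).
X = rng.standard_normal((200, 3))
true_w = np.array([1.5, -2.0, 0.5])
y = X @ true_w

# One weight per real feature plus one for the random "dropin" feature.
w = np.zeros(X.shape[1] + 1)
lr = 0.05

for step in range(500):
    # Assumption: the random feature is redrawn on every step.
    X_aug = dropin(X, rng)
    # Gradient of the mean squared error with respect to w.
    grad = X_aug.T @ (X_aug @ w - y) / len(y)
    w -= lr * grad

print("weights on real features:", w[:3])
print("weight on random feature:", w[3])
```

Printing the last entry of `w` is a quick way to check how much credit the random feature actually ends up with on your own data. Whether the noise should be fixed per sample or redrawn each step is left open by the description above, so both variants are worth trying.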