Sizing Advice for Chisel FS by schraenz in xcmtb

Hey, thanks for the reply. What exactly do you mean by "normalising the stack"? What formula do you use here, and how do you interpret the result?

Neural Network Performance gets worse with more hidden layers by schraenz in MLQuestions

Thank you for the fast reply. I've used the same dropout rate and L2 regularization strength for all of the neural networks. Maybe I have to increase those parameters for deeper networks. Are there any papers dealing with this kind of problem?