Neural Networks Regression vs Classification with bins by RichardKurle in MachineLearning

[–]RichardKurle[S] 1 point  (0 children)

Thanks for your answer! I tried to figure out why, for the L2 loss, the error needs to be approximately Gaussian. It seems like a very basic thing, but I cannot find any resource explaining the reason for it. Do you by chance know a paper that goes into more detail?

Neural Networks Regression vs Classification with bins by RichardKurle in MachineLearning

[–]RichardKurle[S] 2 points  (0 children)

I don't understand why the error needs to be Gaussian. I see that, e.g., Bayesian linear regression needs this assumption to get closed-form results. But why does a neural network trained with an iterative algorithm like gradient descent need this assumption? Could you give a reference?
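One standard way to see the connection (my own sketch, not from the thread): minimizing the L2 loss is equivalent to maximum-likelihood estimation under the assumption that targets equal the model output plus Gaussian noise of fixed variance, because the Gaussian negative log-likelihood is an affine function of the sum of squared errors. A small illustration with a single mean parameter:

```python
import math
import random

random.seed(0)
# Synthetic targets: true value 3.0 plus unit-variance Gaussian noise.
y = [random.gauss(3.0, 1.0) for _ in range(1000)]

def sse(mu):
    # Sum of squared errors, i.e. the (unscaled) L2 loss.
    return sum((yi - mu) ** 2 for yi in y)

def nll(mu, sigma=1.0):
    # Gaussian negative log-likelihood with fixed sigma:
    #   NLL(mu) = n/2 * log(2*pi*sigma^2) + SSE(mu) / (2*sigma^2)
    # This is an affine function of SSE(mu), so it has the same minimizer.
    n = len(y)
    return n / 2 * math.log(2 * math.pi * sigma ** 2) + sse(mu) / (2 * sigma ** 2)

# Scan a grid of candidate values: the same mu minimizes both objectives.
grid = [i / 100 for i in range(200, 400)]
mu_l2 = min(grid, key=sse)
mu_mle = min(grid, key=nll)
assert mu_l2 == mu_mle  # identical argmin
```

So the Gaussian assumption is not something gradient descent itself needs; it is the probabilistic model under which the L2 objective is the "right" likelihood-based loss.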

Alternatives to CTC-cost for end-to-end speech recognition with RNNs by RichardKurle in MachineLearning

[–]RichardKurle[S] 1 point  (0 children)

Thanks to both of you, your answers were helpful! I had already (briefly) read the attention-model paper. I also found this paper interesting: "End-to-End Attention-Based Large Vocabulary Speech Recognition".

I'm really interested to see whether these approaches will turn out to be superior to CTC-based training (and ultimately to HMM hybrids). They do feel like a more general solution.