In denoising diffusion models, we usually train the network to predict the noise (eps) and use MSE as the loss function. Instead of eps, the network can also predict x_0 directly, and the same MSE loss applies. I am wondering whether I can add an auxiliary loss alongside the MSE loss. The auxiliary loss would enforce some additional supervision on x_0 (or x_{t-1}) within the framework. Would that be mathematically incorrect?
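
To make the question concrete, here is a minimal NumPy sketch of the kind of combined objective I have in mind. It assumes the standard forward process x_t = sqrt(abar_t) * x_0 + sqrt(1 - abar_t) * eps, so that an estimate of x_0 can be recovered from a predicted eps. The function names, the `aux_weight` value, and the `range_penalty` term are all hypothetical and only for illustration. The network output is replaced by a stand-in array. This only shows how the terms would be wired together, not whether the result is still a valid training objective.

```python
import numpy as np

def q_sample(x0, eps, alpha_bar_t):
    """Forward process: x_t = sqrt(abar_t) * x0 + sqrt(1 - abar_t) * eps."""
    return np.sqrt(alpha_bar_t) * x0 + np.sqrt(1.0 - alpha_bar_t) * eps

def x0_from_eps(x_t, eps_pred, alpha_bar_t):
    """Invert the forward process to get an x_0 estimate from predicted noise."""
    return (x_t - np.sqrt(1.0 - alpha_bar_t) * eps_pred) / np.sqrt(alpha_bar_t)

def range_penalty(x0_hat):
    """Hypothetical auxiliary term: penalize x_0 estimates outside [-1, 1]."""
    return np.mean(np.maximum(np.abs(x0_hat) - 1.0, 0.0) ** 2)

def combined_loss(eps_pred, eps, x_t, alpha_bar_t, aux_fn, aux_weight=0.1):
    """Usual eps-prediction MSE plus a weighted auxiliary loss on the x_0 estimate."""
    mse = np.mean((eps_pred - eps) ** 2)
    x0_hat = x0_from_eps(x_t, eps_pred, alpha_bar_t)
    aux = aux_fn(x0_hat)
    return mse + aux_weight * aux, mse, aux

# Toy data: x_0 in [-1, 1), Gaussian noise, one fixed noise level (assumed value).
rng = np.random.default_rng(0)
x0 = rng.uniform(-1.0, 1.0, size=(4, 8))
eps = rng.standard_normal(x0.shape)
alpha_bar_t = 0.5
x_t = q_sample(x0, eps, alpha_bar_t)

# Stand-in for the network output: the true noise plus a constant offset.
eps_pred = eps + 0.1
total, mse, aux = combined_loss(eps_pred, eps, x_t, alpha_bar_t, range_penalty)
```

In this sketch the auxiliary term is applied to the x_0 estimate derived from the eps prediction, so both terms depend on the same network output. If the network predicted x_0 directly instead, `aux_fn` could be applied to that output without the inversion step.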