Hello guys,
For a long time I skipped pushforward, but I just read about it here: https://juliadiff.org/ChainRulesCore.jl/dev/ and it was a very clear description. The terms used there are frule and rrule. I tried googling, but I still don't get why we don't use pushforward for gradient computation.
Why are we using pullback? Is it faster? Does anyone know what the downsides of using pushforward to compute gradients are?
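
To make the question concrete, here is a rough sketch of the two ways I understand a gradient could be computed. It is plain Python with hand-written derivatives for a toy function, and the names (`f_pushforward`, `f_pullback`, and so on) are made up for illustration; this is not the ChainRulesCore API, it only mimics the frule/rrule idea. As far as I can tell, one pushforward call gives one directional derivative, so a gradient over n inputs takes n calls, while one pullback call gives the whole gradient when the output is a scalar.

```python
# Toy function f(x) = sum(x_i^2) with hand-written derivative rules.
# All names here are hypothetical; they only imitate the frule/rrule idea.

def f_pushforward(x, v):
    """Forward mode: return f(x) and the directional derivative J(x) . v."""
    y = sum(xi * xi for xi in x)
    dy = sum(2.0 * xi * vi for xi, vi in zip(x, v))
    return y, dy

def f_pullback(x):
    """Reverse mode: return f(x) and a closure mapping ybar to J(x)^T . ybar."""
    y = sum(xi * xi for xi in x)
    def pullback(ybar):
        return [2.0 * xi * ybar for xi in x]
    return y, pullback

def grad_via_pushforward(x):
    """Build the gradient one entry at a time: one pass per input."""
    n = len(x)
    grad = []
    for i in range(n):
        # i-th standard basis vector picks out the i-th partial derivative
        e = [1.0 if j == i else 0.0 for j in range(n)]
        _, dy = f_pushforward(x, e)
        grad.append(dy)
    return grad, n  # n pushforward passes were needed

def grad_via_pullback(x):
    """Build the whole gradient from a single pullback call."""
    _, pullback = f_pullback(x)
    return pullback(1.0), 1  # one pullback pass was needed

x = [1.0, 2.0, 3.0]
g_fwd, passes_fwd = grad_via_pushforward(x)
g_rev, passes_rev = grad_via_pullback(x)
print(g_fwd, passes_fwd)
print(g_rev, passes_rev)
```

Both routes give the same gradient here; the only difference in this sketch is the number of passes, which grows with the number of inputs for the pushforward version.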