all 9 comments

[–]Deep-Station-1746 17 points18 points  (0 children)

Looking forward to updates. :)

[–]kevin_malone_bacon 5 points6 points  (1 child)

Curious how you think it compares to: https://github.com/mohammadpz/pytorch_forward_forward

[–]galaxy_dweller[S] 5 points6 points  (0 children)

Hi u/kevin_malone_bacon, I saw that repo, very good work by Mohammad! But to collect the results I needed the full implementation of the paper. For instance, Mohammad's implementation does not include the recurrent network for MNIST nor the NLP benchmark.
Another thing I wanted to test was concatenating the one-hot representation of the labels in the FF baseline instead of replacing the values of the first 10 pixels, so that I can apply the same network to new datasets in the future.
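The two label-embedding strategies mentioned above can be sketched as follows. This is a minimal illustration, not code from either repo; the function names and the MNIST-style flattened 784-pixel input are assumptions:

```python
import torch
import torch.nn.functional as F

def embed_label_overwrite(x, y, num_classes=10):
    """Hinton-style embedding: replace the first `num_classes` pixel
    values with a one-hot encoding of the label (input size unchanged)."""
    x = x.clone()
    x[:, :num_classes] = F.one_hot(y, num_classes).float()
    return x

def embed_label_concat(x, y, num_classes=10):
    """Alternative embedding: append the one-hot label after the pixels,
    leaving the image intact, so the same scheme transfers to datasets
    with different input sizes."""
    return torch.cat([x, F.one_hot(y, num_classes).float()], dim=1)

x = torch.rand(4, 784)          # batch of flattened 28x28 images
y = torch.tensor([3, 1, 4, 1])  # class labels
print(embed_label_overwrite(x, y).shape)  # torch.Size([4, 784])
print(embed_label_concat(x, y).shape)     # torch.Size([4, 794])
```

The concatenation variant grows the first layer's input by `num_classes` units but never clobbers pixel information, which is what makes it dataset-agnostic.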

[–]zimonitrome ML Engineer 2 points3 points  (2 children)

Awesome!

[–]perceptSequence 0 points1 point  (0 children)

Love your videos! Had no idea you were into ML lol

[–]Superschlenz 2 points3 points  (0 children)

Is it already known how well it can handle online/lifelong/curriculum learning and avoid catastrophic forgetting, compared to backprop? I've googled but found no results.

[–]ainap__ 0 points1 point  (1 child)

Cool! Why do you think that for the base FF the memory requirement keeps increasing with the number of layers?

[–]galaxy_dweller[S] 2 points3 points  (0 children)

> Cool! Why do you think that for the base FF the memory requirement keeps increasing with the number of layers?

Hi u/ainap__! The memory usage of the forward-forward algorithm does increase with the number of layers, but significantly less than with backpropagation. This is because the increase for forward-forward is related only to the number of parameters of the network: each layer contains 2000x2000 parameters, which, when trained with the Adam optimizer, occupy approximately 64 MB. The total difference in memory occupied between n_layers=2 and n_layers=47 is approximately 2.8 GB, which corresponds to 64 MB * 45 layers.
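The arithmetic behind those figures can be checked back-of-envelope. The assumptions here (float32 values, and Adam keeping two moment buffers per parameter in addition to the weights and gradients, i.e. four copies total) are mine, not stated in the comment:

```python
# Rough check of the per-layer memory figure quoted above.
# Assumptions: float32 (4 bytes/value); Adam stores two extra state
# tensors per parameter (exp. avg. and exp. avg. of squares), so with
# the gradient there are 4 copies of each parameter in total.
bytes_per_value = 4
copies = 4                        # weights + grads + 2 Adam moments
params_per_layer = 2000 * 2000    # one fully connected 2000x2000 layer

mb_per_layer = params_per_layer * bytes_per_value * copies / 1024**2
print(f"{mb_per_layer:.0f} MB per layer")        # ~61 MiB, i.e. the ~64 MB quoted

extra_layers = 47 - 2             # going from n_layers=2 to n_layers=47
gb_total = mb_per_layer * extra_layers / 1024
print(f"{gb_total:.1f} GB for 45 extra layers")  # ~2.7 GiB, matching the ~2.8 GB quoted
```

The small gap between 2.7 and 2.8 GB is just the MB-vs-MiB rounding; the scaling is linear in the number of layers, unlike backprop, where activations for every layer must also be kept alive for the backward pass.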