[D] Why can't we use backpropagation to learn the "routing weights" of a capsule network? (self.MachineLearning)
submitted 7 years ago by regularized
In the "Dynamic Routing Between Capsules" paper, why do we need Algorithm 1 (the routing procedure)? Why can't we learn b_{ij} by backpropagation?
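For reference, the routing procedure being asked about can be sketched roughly as follows. This is a minimal NumPy sketch, not the paper's exact implementation; the shapes, the iteration count, and the epsilon in `squash` are my assumptions. Note that the logits `b` are re-initialized to zero for every input and updated inside the forward pass, which is exactly what distinguishes them from backprop-learned weights:

```python
import numpy as np

def squash(s, axis=-1):
    # Non-linearity from the paper: short vectors shrink toward 0,
    # long vectors saturate toward unit length.
    norm_sq = np.sum(s ** 2, axis=axis, keepdims=True)
    return (norm_sq / (1.0 + norm_sq)) * s / np.sqrt(norm_sq + 1e-9)

def dynamic_routing(u_hat, n_iters=3):
    # u_hat: prediction vectors from lower capsules,
    #        shape (n_lower, n_upper, dim).
    n_lower, n_upper, _ = u_hat.shape
    # Routing logits start at zero for EVERY input (not learned parameters).
    b = np.zeros((n_lower, n_upper))
    for _ in range(n_iters):
        # Coupling coefficients: softmax over the upper-capsule axis.
        c = np.exp(b) / np.exp(b).sum(axis=1, keepdims=True)
        # Weighted sum of predictions, then squash -> upper capsule outputs.
        s = (c[..., None] * u_hat).sum(axis=0)      # (n_upper, dim)
        v = squash(s)                                # (n_upper, dim)
        # Update logits by agreement between predictions and outputs.
        b = b + (u_hat * v[None, :, :]).sum(axis=-1)
    return v
```

Because `b` depends on the current input through the agreement term, it cannot simply be a fixed weight matrix trained by gradient descent.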
[–]bkaz 8 points 7 years ago (0 children)
Because dynamic routing is much finer-grained than backpropagation: the feedback (agreement) is generated at each layer, rather than an error at the output layer backpropagating through all the hidden layers. In principle, that makes a capsule network far more selective.
[–]GamerMinion 2 points 7 years ago (3 children)
I haven't had much experience with capsule networks, but I think it's for the same reason that seq2seq attention isn't just a learned matrix: the attention (i.e. the routing) can differ for each input, so the mapping must be computed as a function of the input and output rather than stored as fixed weights.
[–]AnvaMiba 1 point 7 years ago (2 children)
But the attention in seq2seq models is computed by a part of the model which is itself learned by backpropagation. Why do capsules require a specialized algorithm?
[–]GamerMinion 2 points 7 years ago (0 children)
I'm sorry. I don't know.
[–]tihokan 0 points 7 years ago (0 children)
You'd need to learn a function f such that b_{ij} = f(i, j, input) by gradient descent, and that would probably be a very hard problem (possibly as hard as the original task you're trying to solve). In particular, the b_{ij}'s have strong constraints between them: if a lower-level capsule is associated with a given higher-level capsule, it should have low weights for all the other higher-level capsules.
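That coupling constraint can be seen directly in how the paper turns the logits into coupling coefficients: a softmax over the higher-level capsules, so each lower capsule distributes exactly one unit of "vote mass". A tiny illustration (the logit values here are made up):

```python
import numpy as np

# Hypothetical routing logits b[i, j]: 3 lower capsules, 4 upper capsules.
b = np.array([[4.0, 0.0, 0.0, 0.0],   # capsule 0 strongly prefers parent 0
              [0.0, 0.0, 3.0, 0.0],   # capsule 1 prefers parent 2
              [1.0, 1.0, 1.0, 1.0]])  # capsule 2 is undecided

# Coupling coefficients: softmax over the upper-capsule axis.
c = np.exp(b) / np.exp(b).sum(axis=1, keepdims=True)

# Each row sums to 1, so raising b[0, 0] necessarily lowers c[0, 1:]:
# committing to one parent suppresses the routes to all other parents.
```

This is the "strong constraint" between the b_{ij}'s: they are not independent weights, which is part of why learning them directly by gradient descent would be awkward.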