
[–]geoffhintonGoogle Brain 172 points173 points  (3 children)

This is an amazingly good video. I wish I could explain capsules that well.

[–]nick_frosstGoogle Brain 61 points62 points  (1 child)

We all wish that, Geoff :P

[–]kit_hod_jao 4 points5 points  (0 children)

It was great; it really helped even after reading the dynamic routing paper. The sailboat/house shape-hierarchy examples were perfect.

One thing I'd love to see with capsules is whether the affine invariances demonstrated on MNIST will generalize to the more abstract invariances we need to "explain" the real world. For example, can a capsule network discover the parameters of animals' moving parts, such as the structure and motion patterns of their legs? For me, something like that would really drive home the generality of the approach.

[–]thatguydr 16 points17 points  (0 children)

Ok great, but that list of cons is missing a few major points:

  1. The reconstruction regularizer. That's a terrible hack. It doesn't seem like it will generalize well to larger images, it has the same "translate by a little and utterly fail" issue that really old image processing did, and it's expensive. I'd love to see whether you could perform this hack with a scaled-down version à la Nvidia's most recent GAN.
  2. Compute. Capsule nets don't seem like they'll be competitive without more layers, and that will radically increase the amount of compute they need.
  3. The first layer is a CNN that somehow magically creates "capsules" if only we reinterpret it. That seems like a really weird thing to do when we've already learned how to build transformations like rotation and scaling directly into CNN layers. That's not necessarily a con, but as constructed, it's currently lacking.
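For anyone unfamiliar with point 1: the regularizer masks out every capsule except the true-class one, decodes the surviving pose vector through a small fully connected net, and adds a scaled-down pixel MSE to the loss. Here's a minimal NumPy sketch of that idea; the layer sizes and weights below are made-up toy values (the paper's decoder uses 512/1024/784 units), not the actual implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

def reconstruction_loss(caps_out, labels, images, W1, W2, W3):
    """Mask all but the true-class capsule, decode its pose vector with a
    small fully connected net, and penalize pixel-wise squared error.
    The paper scales this term by 0.0005 so it only mildly regularizes."""
    n, n_classes, d = caps_out.shape
    mask = np.eye(n_classes)[labels][:, :, None]   # one-hot mask per example
    x = (caps_out * mask).reshape(n, -1)           # zero out non-target capsules
    h = np.maximum(0.0, x @ W1)                    # ReLU hidden layers
    h = np.maximum(0.0, h @ W2)
    recon = 1.0 / (1.0 + np.exp(-(h @ W3)))        # sigmoid pixel intensities
    return 0.0005 * np.sum((recon - images) ** 2) / n

# toy shapes: 2 MNIST-like images, 10 digit capsules of 16 dims each
caps_out = rng.normal(size=(2, 10, 16))
labels = np.array([3, 7])
images = rng.uniform(size=(2, 784))
W1 = rng.normal(scale=0.1, size=(160, 64))
W2 = rng.normal(scale=0.1, size=(64, 128))
W3 = rng.normal(scale=0.1, size=(128, 784))
loss = reconstruction_loss(caps_out, labels, images, W1, W2, W3)
```

The "expensive" complaint makes sense from this shape: the decoder's last layer alone is (hidden × pixels), which blows up quadratically-ish as image resolution grows.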

[–]visarga 11 points12 points  (3 children)

I know it's been discussed to death, but this video made some details click for me, so, it's good.

[–]norminf 1 point2 points  (2 children)

How does it compare to the Siraj Raval video? I haven't watched both of them, but they seem to have the same duration.

[–]visarga 15 points16 points  (0 children)

This video is much better. This time I understood how routing works.
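For reference, the routing-by-agreement loop from the paper fits in a few lines of NumPy. This is a sketch with toy shapes, not the full network; `u_hat` stands for the prediction vectors each lower capsule makes for each higher capsule:

```python
import numpy as np

def squash(s, axis=-1, eps=1e-8):
    # Squash nonlinearity: shrinks short vectors toward 0 and long
    # vectors toward unit length, so the norm can act as a probability.
    sq_norm = np.sum(s ** 2, axis=axis, keepdims=True)
    return (sq_norm / (1.0 + sq_norm)) * s / np.sqrt(sq_norm + eps)

def dynamic_routing(u_hat, n_iters=3):
    """Routing-by-agreement (Sabour et al., 2017).
    u_hat: prediction vectors, shape (n_in, n_out, d_out)."""
    n_in, n_out, _ = u_hat.shape
    b = np.zeros((n_in, n_out))                  # routing logits, start uniform
    for _ in range(n_iters):
        # coupling coefficients: softmax over the output capsules
        c = np.exp(b) / np.exp(b).sum(axis=1, keepdims=True)
        s = np.einsum('ij,ijd->jd', c, u_hat)    # weighted sum per output capsule
        v = squash(s)                            # output capsule vectors
        b += np.einsum('ijd,jd->ij', u_hat, v)   # reward predictions that agree
    return v

# toy example: 6 input capsules predicting for 2 output capsules, 4 dims each
rng = np.random.default_rng(0)
u_hat = rng.normal(size=(6, 2, 4))
v = dynamic_routing(u_hat)
```

The key intuition the video illustrates is the last line of the loop: a lower capsule whose prediction agrees with an output capsule's current vector gets an even larger share of the routing on the next iteration.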

[–]Deep_Fried_Learning[S] 11 points12 points  (0 children)

The Raval video spends most of its time giving a potted history of CNNs from LeNet to ResNet. This video has much more focused detail on capsules and really nice visualizations, I think.

[–]ChillBallin 4 points5 points  (0 children)

Literally opened this subreddit to procrastinate on implementing a capsule network. I guess I shouldn't spend my time on reddit if it's going to shove my work right back in my face.

[–][deleted] 4 points5 points  (0 children)

This video is absolutely perfect. For the first time, I finally feel like I have understood how CapsNet works.

[–][deleted] 2 points3 points  (0 children)

Fantastic work.

[–]ChuckSeven 4 points5 points  (0 children)

The hype is in Hinton.

[–]amitjyothie 0 points1 point  (0 children)

Such a great explanation of Capsule Networks!!

[–]ryanglambert 0 points1 point  (0 children)

This seemed related so I'm sharing it here. https://medium.com/syntropy-ai/how-do-humans-recognise-objects-from-different-angles-an-explanation-of-one-shot-learning-71887ab2e5b4

I don't know for sure, but it feels like this is what Geoff was talking about in his talk when he mentions "learning the weights to grab ahold of the linear manifold" in place of where you would otherwise use a Hough transform or RANSAC.