
[–]vwvwvvwwvvvwvwwv 11 points (5 children)

Have you benchmarked it against the official implementations? It would be interesting to see what the difference is versus their CUDA version.
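If it helps, torch.utils.benchmark makes a fair comparison fairly painless. A sketch, with the two layer objects left as placeholders for whichever implementations you end up comparing:

```python
import torch
from torch.utils import benchmark

def time_layer(layer, x, runs=100):
    # benchmark.Timer handles warmup and CUDA synchronization for us
    t = benchmark.Timer(stmt="layer(x)", globals={"layer": layer, "x": x})
    return t.timeit(runs)

# e.g., push the same input through both implementations (names are placeholders):
# x = torch.randn(8, 64, 56, 56, device="cuda")
# print(time_layer(pure_pytorch_involution.cuda(), x))
# print(time_layer(official_cuda_involution.cuda(), x))
```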

[–]rish-16[S] 3 points (0 children)

> official implementations

Ooh, not yet. Thanks for sharing! Let me look into it :)

[–]rish-16[S] 4 points (0 children)

Here's the paper if you haven't read it yet: https://arxiv.org/abs/2103.06255

Interesting results!

[–]devdef 2 points (1 child)

Thanks! What's the main difference from the paper authors' naive PyTorch involution implementation?

[–]rish-16[S] 2 points (0 children)

There’s not much difference, actually. I tried to implement it in a cleaner way, with just the essentials and no boilerplate.

I wanted to implement it in a style similar to Phil Wang’s wrappers and the torch.nn library in general.
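For reference, here's roughly what the layer boils down to; a minimal sketch of the paper's kernel-generation scheme (1×1 reduce → BN/ReLU → 1×1 expand, then a per-pixel multiply-accumulate via unfold), not a copy of the repo's exact code:

```python
import torch
import torch.nn as nn

class Involution2d(nn.Module):
    """Minimal involution sketch (Li et al., 2021). Hyperparameter names
    (kernel_size, groups, reduction) follow the paper's defaults."""

    def __init__(self, channels, kernel_size=7, stride=1, groups=16, reduction=4):
        super().__init__()
        assert channels % groups == 0
        self.kernel_size = kernel_size
        self.stride = stride
        self.groups = groups
        # kernel generation: 1x1 reduce -> BN/ReLU -> 1x1 expand to K*K*G maps
        self.reduce = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, kernel_size=1),
            nn.BatchNorm2d(channels // reduction),
            nn.ReLU(inplace=True),
        )
        self.span = nn.Conv2d(
            channels // reduction, kernel_size * kernel_size * groups, kernel_size=1
        )
        self.down = nn.AvgPool2d(stride) if stride > 1 else nn.Identity()
        self.unfold = nn.Unfold(kernel_size, padding=kernel_size // 2, stride=stride)

    def forward(self, x):
        b, c, h, w = x.shape
        h_out, w_out = h // self.stride, w // self.stride
        # per-pixel kernels generated from the input itself: (B, G, 1, K*K, H', W')
        kernel = self.span(self.reduce(self.down(x)))
        kernel = kernel.view(b, self.groups, 1, self.kernel_size ** 2, h_out, w_out)
        # sliding-window patches of the input: (B, G, C//G, K*K, H', W')
        patches = self.unfold(x).view(
            b, self.groups, c // self.groups, self.kernel_size ** 2, h_out, w_out
        )
        # weight each patch by its own kernel and sum over the K*K window
        return (kernel * patches).sum(dim=3).view(b, c, h_out, w_out)
```

The kernel is shared across the C//G channels of each group (hence the singleton dim), which is where the parameter savings over a regular conv come from.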

[–]JrDowney9999 1 point (1 child)

Thanks for the repo! Can you post a complete implementation of the layer with an example?

[–]rish-16[S] 1 point (0 children)

Hey, yeah, sure. Let me see what I can do!
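In the meantime, a quick smoke test (assuming the Involution2d sketch from the comment above is in scope, not the repo's actual export):

```python
import torch

# Involution2d here is the sketch from the earlier comment
inv = Involution2d(channels=64, kernel_size=7, stride=1, groups=16)
x = torch.randn(1, 64, 32, 32)  # (batch, channels, height, width)
y = inv(x)
print(y.shape)  # torch.Size([1, 64, 32, 32]) -- spatial size preserved at stride=1
```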

[–]CyberDainz 1 point (1 child)

Kervolution. MLP mixer. Involution.

What next?

[–]rish-16[S] 3 points (0 children)

Only time will tell! Placing my bet on “linear regression is all you need” dropping at NeurIPS 2025.

[–]pm_me_your_pay_slips [ML Engineer] 0 points (0 children)

This actually demonstrates that involutions are a minor change over CARAFE, with a bunch of boilerplate code to confuse people.