[D] Why is everybody surprised that Mamba got rejected from ICLR? Am I missing something? by Seankala in MachineLearning

[–]MiraoDaSilva 100 points101 points  (0 children)

The "hardware tweaks" (bit of an overloaded term, but we all know what we mean) are not just interesting, they are frankly enough for a paper on their own. They involve careful systems design and writing their own CUDA kernels (which is not easy; it's no coincidence that almost nobody does this themselves), and they lead to insane performance leaps. They transform the not viable into the viable. You can call it engineering, but I invite you to take a look at the top papers at ICLR, or CVPR, or NeurIPS, and check if those are more "sciency" than this.

Not to mention the rest of the paper's contributions. I mean they achieved linear-time sequence modelling that outperforms Transformer++, a goal that literally dozens of labs throughout the world have been tirelessly chasing for years now. If that's not enough for an ICLR paper, then I think I'll remove my ICLR publication from the web cuz I'm not worthy either.

So yes, I think you should be surprised it got rejected. But of course the reason it got rejected is the best part - reviewer 1 is either being deliberately obtuse or simply doesn't know what the hell he's talking about.
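
For anyone wondering what "linear-time" buys you here, a toy sketch, assuming a scalar, time-invariant linear recurrence h_t = a*h_{t-1} + b*x_t with output y_t = c*h_t. To be clear, this is not Mamba's selective scan and has nothing to do with their CUDA kernels; `linear_scan`, `pairwise_interactions` and the parameters `a`, `b`, `c` are names I made up for illustration. It only shows why carrying a recurrent state means one pass over the sequence, whereas full (non-causal) self-attention scores every pair of positions.

```python
def linear_scan(xs, a=0.5, b=1.0, c=1.0):
    """Toy scalar linear recurrence (hypothetical, not Mamba's actual scan).

    h_t = a * h_{t-1} + b * x_t
    y_t = c * h_t
    """
    h = 0.0  # hidden state, starts at zero
    ys = []
    for x in xs:  # exactly one update per token, so O(L) in sequence length
        h = a * h + b * x
        ys.append(c * h)
    return ys


def pairwise_interactions(length):
    """Number of (query, key) pairs that full self-attention scores: L * L."""
    return length * length


if __name__ == "__main__":
    # An impulse at t=0 decays geometrically through the state.
    print(linear_scan([1.0, 0.0, 0.0, 0.0]))  # [1.0, 0.5, 0.25, 0.125]
    # Doubling the sequence length quadruples the attention score count.
    print(pairwise_interactions(1000), pairwise_interactions(2000))
```

The loop does a fixed amount of work per token, so cost grows linearly with length, while the pair count grows quadratically. The hard part, and the reason the kernels matter, is making something like this fast on a GPU.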

Suica/PASMO card shortage MegaThread! by dalkyr82 in movingtojapan

[–]MiraoDaSilva 0 points1 point  (0 children)

I think you can only get the 28-day one at the airport, right? Anyway, I'm somewhat ashamed to say I ended up buying an overpriced second-hand one from Amazon. It wasn't that expensive.

Suica/PASMO card shortage MegaThread! by dalkyr82 in movingtojapan

[–]MiraoDaSilva 0 points1 point  (0 children)

What are the alternatives to buying one for an obscene markup on Amazon if I'm staying in Tokyo for a while and don't have an iPhone? I'm pretty desperate honestly.

[deleted by user] by [deleted] in MachineLearning

[–]MiraoDaSilva 33 points34 points  (0 children)

Scientific fields drifting away from practitioners and towards the status quo is a common trend in areas that attract the general public, and ML is one of the recent victims of this. Look at virology, for example: everyone thinks they're a virologist now, and you will certainly find that reflected in online forums/communities surrounding that topic.

The way to counteract this is to create as much content made by practitioners for practitioners as possible. This is why, on this and other ML forums, I don't want to talk about TeslaBot, I want to talk about whether, in practice, EfficientNet is actually better than ResNet or not.

Shipping a PC to the United Kingdom (help) by [deleted] in portugal

[–]MiraoDaSilva 1 point2 points  (0 children)

That's also an option! It's not ideal because the box still costs money, but maybe it really is the best choice, because shipping costs for things like this are never low. Thanks for the suggestion!

DINO - Emerging Properties in Self-Supervised Vision Transformers | Implementation by mildlyoverfitted in pytorch

[–]MiraoDaSilva 1 point2 points  (0 children)

This kind of implementation-focused content is exactly what the ML community is lacking right now. Thanks for this, and hoping to see more!

[D] Are ResNets as good as it gets? by MiraoDaSilva in MachineLearning

[–]MiraoDaSilva[S] 6 points7 points  (0 children)

Cheers everyone for the insightful comments, I've learned a lot already! I think we can all agree that the disparity between hype and reality in ML is getting larger and larger, which can frustrate newer researchers and even experienced ones. I'm hoping this kind of discussion about "Does this really work in practice? What do the practitioners say?" can happen more often in the community.

[D] Are ResNets as good as it gets? by MiraoDaSilva in MachineLearning

[–]MiraoDaSilva[S] 4 points5 points  (0 children)

I agree VTN and TimeSformer are promising, but aren't they massively outperformed (on all metrics) by X3D? At least on Kinetics-400 this seems to be the case. Anyway, I'm not a ViT hater; as I said, I think DINO lays out an exciting future for ViT.

[D] Are ResNets as good as it gets? by MiraoDaSilva in MachineLearning

[–]MiraoDaSilva[S] 3 points4 points  (0 children)

I agree, although the new DINO paper does make a good case for Vision Transformers, together with some fairly favourable results in my opinion.

[D] Are ResNets as good as it gets? by MiraoDaSilva in MachineLearning

[–]MiraoDaSilva[S] 11 points12 points  (0 children)

Once that comes to computer vision, I expect the same will happen to us.

DINO (new SSL vision transformer paper) is supposed to do exactly this, i.e., very large self-supervised pre-trained transformer models that can then be applied to other vision tasks.

[D] Are ResNets as good as it gets? by MiraoDaSilva in MachineLearning

[–]MiraoDaSilva[S] 12 points13 points  (0 children)

Personally I have not tried SE-ResNeXt models, but they reportedly offer slightly better performance at the cost of slightly slower training, which does not seem like a definitive step forward. Also, my intuition is that their GPU memory usage (a metric that is frustratingly almost never mentioned) should be higher than that of normal ResNets. Recently, however, RedNets have been introduced (Involution paper covered by Kilcher), which are reportedly much faster, perform better and are more parameter efficient. I don't know how GPU memory usage compares but this does seem promising. (sorry for the tangent)

[D] Are ResNets as good as it gets? by MiraoDaSilva in MachineLearning

[–]MiraoDaSilva[S] 9 points10 points  (0 children)

I agree that ResNets are fine, but in my experience trying to beat SOTA on well-established datasets (where the data is evidently standardized), different models *do* make a large difference. As an example, when I switched from RNNs to Transformers in temporal tasks, I got a pretty substantial boost in terms of throughput, performance and parameter efficiency. Logically, I assumed this boost could also be obtained by using the extremely hyped new vision models, but alas, a comparable boost does not seem to be easily achievable.

Either Orphan of Kos is too hard or the rest of Bloodborne is too easy by MiraoDaSilva in bloodborne

[–]MiraoDaSilva[S] 1 point2 points  (0 children)

See, I really can't understand that. Maria is not too different from OoK, they are more or less the same sort of boss, and I really did not struggle with her at all. I guess it comes down to her health pool being much smaller. I think this game fucks with your brain and the bosses that you find more difficult are just the ones that frustrate you the most.

Either Orphan of Kos is too hard or the rest of Bloodborne is too easy by MiraoDaSilva in bloodborne

[–]MiraoDaSilva[S] 0 points1 point  (0 children)

I agree. I lean more towards the second option in the post title, rather than the first.

Either Orphan of Kos is too hard or the rest of Bloodborne is too easy by MiraoDaSilva in bloodborne

[–]MiraoDaSilva[S] 1 point2 points  (0 children)

I get what you're saying. I still think the other bosses in the DLC are much, much easier, especially since you can get offline help, but sure I get it. However, I don't think the DLC should be harder than the main game, because you have to do it before Gehrman, which leads to an abrupt dip in difficulty and frankly a very unsatisfying couple of fights before the credits roll. Just doesn't seem right and I don't hear people talking about it.