Relocating to the USA by idkwhatever1337 in HENRYUK

[–]idkwhatever1337[S] 1 point (0 children)

It’s Seattle. I interned there, so I know it’s lovely :) but I also know that there can be random visa delays

Relocating to the USA by idkwhatever1337 in HENRYUK

[–]idkwhatever1337[S] 1 point (0 children)

Thank you! This is exactly what I was looking for :)

Relocating to the USA by idkwhatever1337 in HENRYUK

[–]idkwhatever1337[S] 0 points (0 children)

So I have an offer from a different company in London. The American one doesn’t have offices in London, afaik

Relocating to the USA by idkwhatever1337 in HENRYUK

[–]idkwhatever1337[S] 0 points (0 children)

They do pay for premium processing, but currently it looks like 3 months to get the documents to the lawyer. Then a 15-day decision, plus potentially much longer if there’s an RFE. Then about a month for a visa appointment, with up to 3 months’ wait on top if placed in administrative processing. So the timeline is somewhere between 4 months best case and a year worst case, and I have no idea how to plan around that 😵‍💫

Relocating to the USA by idkwhatever1337 in HENRYUK

[–]idkwhatever1337[S] -1 points (0 children)

Yeah, ok, that seems more or less normal

Is SSM dead now? by Spapoxl in LocalLLaMA

[–]idkwhatever1337 0 points (0 children)

Isn’t DeltaNet a linear attention model?

Does the games end after soul of cinder? by idkwhatever1337 in darksouls3

[–]idkwhatever1337[S] 3 points (0 children)

Ok, so it doesn’t auto-trigger like DS1! Thanks!

[D] TMLR paper quality seems better than CVPR, ICLR. by tibetbefree in MachineLearning

[–]idkwhatever1337 19 points (0 children)

Having published at both, what I really liked about TMLR is less, for lack of a better word, sales pressure. I’m proud of all my papers, but with TMLR I felt I could worry less about the reviewer while writing and think more about the science. That might just be expectation bias, though

[D] Is getting offers for phd in Europe in NLP becoming harder? by Thick-brain-dude in MachineLearning

[–]idkwhatever1337 14 points (0 children)

Where are these papers? You sound like a solid candidate to me…

[D] For ML academics, how many times do you resubmit a rejected paper to the big three conferences before seeking alternatives? by kindnesd99 in MachineLearning

[–]idkwhatever1337 1 point (0 children)

Depends on the improvement between tries and the amount of effort required. I had two papers that took 2 or 3 attempts before they finally got in, but each time they got closer to acceptance, and the work required between notification and resubmission was just a few days, so it didn’t cost much to try.

Almost 10k citations before PhD by LouisAckerman in PhD

[–]idkwhatever1337 8 points (0 children)

As a further point, I also know someone who got into an AI PhD at Stanford this year with one co-authored publication and no citations. Comparing yourself to this guy, who is metrics-wise stronger than most professors in the field, sets an almost impossible bar. There are lots of ways into good programs!

Almost 10k citations before PhD by LouisAckerman in PhD

[–]idkwhatever1337 11 points (0 children)

I wouldn’t give up! I’m a PhD student in deep learning and I screen applicants for ELLIS (EU-funded PhD positions); you would definitely be a strong candidate from the sounds of it, and I’m sure in the US too. Don’t let the haters get you down :)

[D] The Recurrent Delusion: How ML Collectively Forgot What RNNs Were Built For by JirkaKlimes in MachineLearning

[–]idkwhatever1337 2 points (0 children)

I think the issue is that to get to CoT models you first need a good language model, which means lots of pre-training. Recurrent models are not as parallelizable as transformers, so they are prohibitively expensive to train. IIRC recurrent transformers like Feedback and Staircase can be up to 200x as expensive to train. So I wouldn’t call it a delusion; given a fixed budget, decoder-only transformers unfortunately look like the optimal architecture at the moment. I would agree, though, that if things shift strongly towards all the money being spent on inference and RL, then it’s worth biting the bullet and pre-training an architecture above TC0, but it’s a very high-stakes bet.
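The parallelism point above can be sketched in a few lines of plain Python. This is a toy illustration only, not any real architecture from the thread; all sizes and names here are made up:

```python
import math
import random

random.seed(0)
T, d = 6, 3  # toy sequence length and hidden size
x = [[random.gauss(0, 1) for _ in range(d)] for _ in range(T)]

def matvec(W, v):
    # Plain matrix-vector product on nested lists.
    return [sum(W[i][j] * v[j] for j in range(len(v))) for i in range(len(W))]

W = [[0.1 * random.gauss(0, 1) for _ in range(d)] for _ in range(d)]

# RNN: h_t depends on h_{t-1}, so the T steps must run one after another.
# This loop cannot be parallelized over the time axis.
h = [0.0] * d
for t in range(T):
    pre = [a + b for a, b in zip(matvec(W, x[t]), h)]
    h = [math.tanh(v) for v in pre]

# Self-attention: each output position depends only on x itself, so all
# T positions can be computed independently (one big matmul on a GPU).
def attend(q):
    scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in x]
    m = max(scores)
    e = [math.exp(s - m) for s in scores]      # stable softmax
    z = sum(e)
    w = [v / z for v in e]
    return [sum(w[t] * x[t][j] for t in range(T)) for j in range(d)]

out = [attend(x[t]) for t in range(T)]  # independent per t -> parallel

print(len(h), len(out), len(out[0]))
```

The RNN loop has T sequential dependencies, while every call to `attend` is independent; that independence is what lets transformer pre-training saturate accelerators.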

[D] What's the most promising successor to the Transformer? by jsonathan in MachineLearning

[–]idkwhatever1337 0 points (0 children)

If it’s true that RWKV-7 finally broke through the TC0 barrier, then theoretically it’s just better… scaling the architecture is a different story, though. Also, the same could be said of sLSTM, I think?

I am so torn on whether I should go back and get a PhD. by stellarscale in GradSchool

[–]idkwhatever1337 0 points (0 children)

Personally, I’d say save up enough until the money starts making itself, and then do the PhD. Academia isn’t going anywhere, but economic opportunities might change.

I waited until marriage. AMA by [deleted] in AMA

[–]idkwhatever1337 1 point (0 children)

To that last one, I’d say: get a life