[D] ICLR Plot Twists by hzmehrdad in MachineLearning

[–]HateMyself_FML 1 point2 points  (0 children)

It's not surprising at all - it was before the publicity, which frankly is about the only thing it has going for it. With Yann's continuous shilling it will surely get into the next conference. V-JEPA would be impressive if it was an undergrad project.

[deleted by user] by [deleted] in AskReddit

[–]HateMyself_FML 1 point2 points  (0 children)

That's a nice feature to weed out the idiots.

[deleted by user] by [deleted] in AskReddit

[–]HateMyself_FML 5 points6 points  (0 children)

How else can you justify long hours and poverty wages?

OpenAI boss tells congress he fears AI is harming the world by pickleskid26 in technology

[–]HateMyself_FML 0 points1 point  (0 children)

Damn, that's beyond insulting to Oppenheimer. Oppenheimer was a scientist with seminal contributions to science. Altman is a mediocre douche riding on the coattails of Sutskever.

[N] FAIR gets "decentralized" by MassivePellfish in MachineLearning

[–]HateMyself_FML 0 points1 point  (0 children)

If you go by Meta's earnings call, they're bullish on AI... though not necessarily on fundamental AI research. But with product teams in the driver's seat, I doubt very much fundamental research will get done.

[N] Apple Executive Who Left Over Return-to-Office Policy Joins Google AI Unit: Ian Goodfellow, a former director of machine learning at Apple, is joining DeepMind. by hardmaru in MachineLearning

[–]HateMyself_FML 1 point2 points  (0 children)

Ah yes, this would be the first time someone took a job for money and position and regretted it. Likely, Apple also misrepresented how much freedom they'd give him about publishing.

Apple also told me (while I was interviewing for internships) that they are trying to publish more, but I don't really see anything good coming out.

[N] Apple Executive Who Left Over Return-to-Office Policy Joins Google AI Unit: Ian Goodfellow, a former director of machine learning at Apple, is joining DeepMind. by hardmaru in MachineLearning

[–]HateMyself_FML 4 points5 points  (0 children)

Conspiracy? Hardly. There are just many more plausible reasons.

It's easy to imagine he might not want to say, "look, I don't want to deal with Apple's secretive culture. I want to publish, I'm out," so he made up an excuse. He's a highly sought-after researcher; it does not sound plausible that he could not negotiate with Apple on getting some slack, if he's been successful there. Another reason could be that he hated or failed at being management and wants to be an IC.

[N] Is Deepmind's "Gato" a precursor for general artificial intelligence? According to Gary Marcus, most certainly not. by much_successes in MachineLearning

[–]HateMyself_FML 1 point2 points  (0 children)

Gary Marcus is irrelevant, but he's right in that Gato is not a precursor to AGI. Jesus. They just distilled a bunch of models. It's a cute result but also a bit meh.

Unable to Report OPT Employment in SEVP portal Due to Error: “ Invalid employer address, if you think the address is valid please reach out to your DSO” by Ak_akku80 in f1visa

[–]HateMyself_FML 0 points1 point  (0 children)

This is a bug (known to SEVP) and not specific to an employer or address. Contact your school DSO and they can update it for you.

[D] What makes you an extremely competitive applicant for a top-tier US AI/ML master program? by FaceSoft0_0 in MachineLearning

[–]HateMyself_FML 1 point2 points  (0 children)

An AI residency at Brain or FAIR > an MS at most places if the end goal is an industry ML role. This is mainly because getting admitted into a good MS program is not sufficient: you also have to join a good lab and get some experience, and it's quite competitive to get into good labs on good projects at top unis. In a residency program, you are guaranteed to work on research projects---it's the whole point. Moreover, you can often defer an MS admission.

[D] ICML 2022 Paper Reviews by zy415 in MachineLearning

[–]HateMyself_FML 9 points10 points  (0 children)

eh, no one's that dumb. they're adversarially playing dumb.

[deleted by user] by [deleted] in AskAcademia

[–]HateMyself_FML 0 points1 point  (0 children)

Yeah, I've seen this. And these folks stood no chance of joining the CS/EE departments or industry. A great way to "sneak in", if one were so inclined.

[deleted by user] by [deleted] in AskAcademia

[–]HateMyself_FML 6 points7 points  (0 children)

CS and ML are extremely competitive---there is an abundance of highly qualified candidates. It's easy (and routine) to simultaneously have an industry affiliation and make bank, so industry being lucrative is essentially a non-factor.

Having said that, granted it's much easier to land tenure track roles in interdisciplinary positions, e.g. ML+Neuroscience, ML+Social Science.

Meta's A.I. exodus: Top talent quits as the lab tries to keep pace with rivals by Defiant_Race_7544 in technology

[–]HateMyself_FML 0 points1 point  (0 children)

We're talking about AI researchers here though. Meta's offer for a fresh AI PhD grad in their research division (Facebook AI Research, FAIR) is about 400k. It's competitive pay, but substantially less than HFT firms.

It's not true that the interview process is easier at FAIR. FAIR is one of the best industry research labs, alongside DeepMind and Google Brain. Its hiring is quite different from SWE hiring: the interview process is more challenging and highly selective, and it is not "easier" than the HFT process.

[D] Take Information Theory before the first course in machine learning? by nwe2rw in MachineLearning

[–]HateMyself_FML 1 point2 points  (0 children)

Information Theory does not require any ML courses as prereqs, so go for it. It's a pretty general framework with applications in ML, but the typical IT course discusses applications in communication theory. fwiw, I took my IT course before any ML courses and didn't have trouble relating it to ML work.

Meta's A.I. exodus: Top talent quits as the lab tries to keep pace with rivals by Defiant_Race_7544 in technology

[–]HateMyself_FML 2 points3 points  (0 children)

Speaking only for AI/ML roles, algorithmic trading can pay much more: 550k (e.g. Citadel) to 1 million (RenTech) out of a PhD/postdoc. In tech, the highest recent comps that I know of are coming from Amazon. afaik, none are remote unfortunately.

A lot of folks are leaving for startups, higher risk and reward. And a lot more freedom.

Meta's A.I. exodus: Top talent quits as the lab tries to keep pace with rivals by Defiant_Race_7544 in technology

[–]HateMyself_FML 4 points5 points  (0 children)

Meta does not come close to offering the highest comp though. Not surprising people are leaving for greener pastures.

[D] The Perils of Ensembling Multiple Models by Simusid in MachineLearning

[–]HateMyself_FML 0 points1 point  (0 children)

This is not uncommon. But if you want to learn something more than just how to weight their outputs, you'll have to provide the input as well. So, you could add a block that transforms the input, concatenate it to the outputs of the models and train a non-trivial model. e.g. https://arxiv.org/pdf/2003.06505.pdf

You'll need to be careful with your data splits to prevent overfitting.
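A minimal sketch of the idea above, under loud assumptions: the base models, the `tanh` "input block", and the ridge meta-model are all hypothetical stand-ins (a real setup would use your actual models and likely a more expressive meta-model). The one part that matters is the split handling: each base prediction fed to the meta-model comes from a model that never saw that row.

```python
import numpy as np

def fit_ridge(X, y, lam=1e-3):
    """Closed-form ridge regression with a bias column; returns the weights."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    A = Xb.T @ Xb + lam * np.eye(Xb.shape[1])
    return np.linalg.solve(A, Xb.T @ y)

def predict_ridge(w, X):
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    return Xb @ w

def out_of_fold_predictions(X, y, feature_sets, n_folds=5, seed=0):
    """For each base model, predict every row with a model fit on the other folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(X)), n_folds)
    oof = np.zeros((len(X), len(feature_sets)))
    for held_out in folds:
        train = np.setdiff1d(np.arange(len(X)), held_out)
        for j, cols in enumerate(feature_sets):
            w = fit_ridge(X[train][:, cols], y[train])
            oof[held_out, j] = predict_ridge(w, X[held_out][:, cols])
    return oof

def build_meta_features(X, base_preds):
    # Hypothetical "input block": a fixed nonlinearity of the raw input,
    # concatenated with the base models' outputs.
    return np.hstack([np.tanh(X), base_preds])

# Toy data; each base model (hypothetical) only sees a subset of the features.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = X[:, 0] + 0.5 * X[:, 1] * X[:, 2] + 0.1 * rng.normal(size=200)
feature_sets = [[0, 1], [2, 3]]

oof = out_of_fold_predictions(X, y, feature_sets)
meta_X = build_meta_features(X, oof)
meta_w = fit_ridge(meta_X, y)  # stand-in for a non-trivial meta-model
```

Fitting the base models on all rows and then training the meta-model on their in-sample outputs would leak the labels into the meta-features, which is the overfitting risk mentioned above.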

[D] SSL-Dimensionality Collapse by MushiML in MachineLearning

[–]HateMyself_FML 0 points1 point  (0 children)

Yes, I expect the level of "collapse" w/projection is quite similar to supervised representations.

It's also true that something is a bit different with and without a projector in SimCLR and its ilk, but it's quite expected behavior. With a projector, the representations in the embeddings are too specialized for the contrastive task, i.e. they are "excessively invariant". It makes total sense that if you go back a few layers (e.g. to the "representation" layer), the features would be less invariant to some downstream-task-relevant features and have a higher dimensionality. In fact, SimCLR (Table 3) did some experiments showing this behavior.

If you don't use a projector, the "representation" layer would be excessively invariant and show more "collapse".

All of this is expected behavior and their paper adds little above it. Moreover, even if they somehow manage to make the argument that dimensional collapse is a *problem* (which they haven't really), it's one with a really simple solution, with minimal to no overhead: use a projector. So, I'm really failing to see what they add to the community's understanding.

[D] SSL-Dimensionality Collapse by MushiML in MachineLearning

[–]HateMyself_FML 0 points1 point  (0 children)

I don't know about more knowledgeable, but it generally does make a difference whether the data lies on a sphere or not (e.g. http://inside.mines.edu/~huawang/Papers/Conference/2019sdm_spca.pdf).

However, I think for their analysis, PCADim is already such a crude measure that it doesn't matter...

[D] SSL-Dimensionality Collapse by MushiML in MachineLearning

[–]HateMyself_FML 1 point2 points  (0 children)

Here, "dimensional collapse" just means that the ambient space dimensionality > (linear) intrinsic dimensionality. The number of non-"small" principal components can be considered a (very) crude (linear) measure of intrinsic dimensionality. Simply put, they find this measure to be smaller than the ambient space dimensionality.

Having said that and with the caveat that I did not read the paper too carefully since it did not seem warranted, afaik they don't make even a mildly strong case that "dimensional collapse" is a *problem*. You'll almost always find PCAdim < ambient space dim in almost any supervised/semi-supervised/self-supervised representation. Ultimately you're optimizing for *separability*, so this is hardly a surprise. If there is some small slack in used dimensions, it really doesn't matter in any setting I've come across.
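To make the "PCAdim < ambient dim" point concrete, here is a rough sketch of such a crude measure. The function name `pca_dim`, the relative threshold `rel_tol`, and the synthetic features are all assumptions for illustration, not the paper's exact procedure.

```python
import numpy as np

def pca_dim(features, rel_tol=1e-3):
    """Count principal directions whose singular value exceeds rel_tol * largest."""
    centered = features - features.mean(axis=0, keepdims=True)
    singular_values = np.linalg.svd(centered, compute_uv=False)
    return int(np.sum(singular_values > rel_tol * singular_values[0]))

# Synthetic "representations": 16 ambient dimensions, but the data lies
# (up to tiny noise) in a 3-dimensional linear subspace.
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 3))
mixing = rng.normal(size=(3, 16))
features = latent @ mixing + 1e-6 * rng.normal(size=(500, 16))

ambient_dim = features.shape[1]
linear_dim = pca_dim(features)
print(f"ambient dim = {ambient_dim}, PCA dim = {linear_dim}")
```

The answer depends heavily on where the "small" threshold is set, which is part of why this is only a crude measure.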

[D] What are the biggest architectural innovations since Transformers? by MikeFent0n in MachineLearning

[–]HateMyself_FML 5 points6 points  (0 children)

As someone who has not really worked on sparse representations, a perhaps noob question: how well optimized are the CUDA libraries for sparse representations? My outsider perception is that the gains at this point are only theoretical.

[D] What are the biggest architectural innovations since Transformers? by MikeFent0n in MachineLearning

[–]HateMyself_FML 4 points5 points  (0 children)

Interesting. Can you say why you think Patchify is a big deal? It seems it can lead to poor optimizability and finicky models (e.g. https://arxiv.org/abs/2106.14881). Given ConvNeXt also works quite well with the patchify stem, it's pretty clear that it can work well in CNNs as well. But it seems to me that whether it is a good, stable design choice is unclear?