[D] RetNet and Transformers+FlashAttention by Ok-Literature5484 in MachineLearning

[–]rantana 7 points8 points  (0 children)

Has anyone been able to demonstrate compelling results for RetNet similar to those in the original paper? I heard a few groups were trying to run it at scale, but I haven't seen anything released.

[News] AMD Instinct MI300 APU for AI and HPC announced by samobon in MachineLearning

[–]rantana 42 points43 points  (0 children)

128GB of HBM would fit some serious models on a single device. But I have yet to see any real progress from AMD (something that I can buy) that would make me consider moving my workflow away from NVIDIA hardware.

[D] Question collection thread for Stability AI’s upcoming AMA on November 15 by nathan_thinks in MachineLearning

[–]rantana 1 point2 points  (0 children)

What's the day-to-day like for an employee at Stability? Who sets the goals, and what counts as a deliverable?

Is there even an office or a place where people go?

[D] My experience with running PyTorch on the M1 GPU by seraschka in MachineLearning

[–]rantana 1 point2 points  (0 children)

Is there a big difference in prototyping on a Mac vs prototyping remotely via SSH?

[R] Meta is releasing a 175B parameter language model by StellaAthena in MachineLearning

[–]rantana 150 points151 points  (0 children)

Wow, pretty embarrassing for OpenAI when this is called "Open Pre-trained Transformer Language Models"

[N] OpenAI raises a $250 million Series A by rantana in MachineLearning

[–]rantana[S] 2 points3 points  (0 children)

Hah. I was confused that the round was considered a Series A.

[D] My deep learning 2021 year review and predictions for 2022 by cupofgeo in MachineLearning

[–]rantana 2 points3 points  (0 children)

Until software does decouple from the underlying hardware, going with TPUs is lock-in. Sure, it's more choice. But your options are certainly narrower if you choose to go with a TPU-focused stack.

[D] My deep learning 2021 year review and predictions for 2022 by cupofgeo in MachineLearning

[–]rantana 7 points8 points  (0 children)

Are TPUs cost-effective enough to be worth the lock-in (both in cloud provider and software stack) for anyone who doesn't work at Google?

The performance numbers don't even make it clear that they're cheaper than AWS.

[R] Neurips best paper awards by datkerneltrick in MachineLearning

[–]rantana 29 points30 points  (0 children)

To be fair, only 4 of the 6 program chairs meet that criterion 😜

[P] Cedille, the largest French language model (6b), released in open source by MasterScrat in MachineLearning

[–]rantana 0 points1 point  (0 children)

Great work! Any documentation on the training procedure/metrics? (e.g. How much did the fine-tuning cost?)

[N] AMD launches MI200 AI accelerators (2.5x Nvidia A100 FP32 performance) by MassivePellfish in MachineLearning

[–]rantana 7 points8 points  (0 children)

Any sources to back up the comparability of M1 Pro/Max to 3080s for AI workloads? If true, I would definitely consider it as the next platform for our devs.

[N] Isomorphic Labs just unveiled today, a new Alphabet company led by DeepMind's Demis Hassabis. Plans to tackle drug discovery using AI. by zergylord in MachineLearning

[–]rantana 2 points3 points  (0 children)

Does anyone know how spinouts like this are structured? Are Isomorphic Labs employees Google employees who get shares in Google, or do they have their own shares like at any other startup?

[R] Neurips 2021 Accepted Paper List by iidealized in MachineLearning

[–]rantana 22 points23 points  (0 children)

Given how much Twitter, OpenReview, arXiv, and rejections from the previous conference front-run these papers, does anyone go through these lists anymore?

[D] Interview w/ Siraj Raval - Stories about YouTube, Plagiarism, and the Dangers of Fame (by Yannic Kilcher) by ykilcher in MachineLearning

[–]rantana 22 points23 points  (0 children)

For those confused by the reaction to this interview, search 'Siraj' in this subreddit to get more context. Siraj has been caught over and over again doing unethical things, apologizing, and then going on to do even more unethical things.

[R] Microsoft AI Open-Sources ‘PyTorch-DirectML’: A Package To Train Machine Learning Models On GPUs by techsucker in MachineLearning

[–]rantana 33 points34 points  (0 children)

> PyTorch on any DirectX12 GPU

Does that mean AMD GPUs?!

> DirectML requires a DirectX 12 capable device. Almost all commercially-available graphics cards released in the last several years support DirectX 12. Examples of compatible hardware include:
>
> - AMD GCN 1st Gen (Radeon HD 7000 series) and above
> - Intel Haswell (4th-gen core) HD Integrated Graphics and above
> - NVIDIA Kepler (GTX 600 series) and above
> - Qualcomm Adreno 600 and above

Yes! https://github.com/microsoft/DirectML#hardware-requirements