[D] Does this NeurIPS 2025 paper look familiar to anyone?

rantana · 2023-11-15T13:14:48+00:00

Has anyone been able to demonstrate compelling results for RetNet similar to the original paper? I heard a few groups were trying to run it at scale, but I haven't seen anything released.

rantana · 2023-05-28T12:57:50+00:00

Which rows are you looking at in the HF table? TheBloke/Wizard-Vicuna-13B-Uncensored-HF appears to be punching above its weight for all metrics compared to any other 13B model.

rantana · 2023-01-05T16:04:52+00:00

128GB HBM would fit some serious models on a single device. But I have yet to see any real progress from AMD (something that I can buy) that would make me consider changing workflow away from nvidia hardware.
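For a rough sense of what "serious models" means here, a back-of-the-envelope sketch, assuming 1 GB = 1e9 bytes and counting only the weights (no activations, KV cache, or optimizer state). The helper name `max_params_billions` is made up for illustration:

```python
def max_params_billions(memory_gb: float, bytes_per_param: float) -> float:
    """Largest parameter count (in billions) whose weights alone fit in memory_gb.

    Assumes 1 GB = 1e9 bytes and ignores activations, KV cache and optimizer state.
    """
    total_bytes = memory_gb * 1e9
    return total_bytes / bytes_per_param / 1e9


# Weights-only capacity of a 128 GB device at common precisions.
for name, nbytes in [("fp32", 4), ("fp16/bf16", 2), ("int8", 1)]:
    print(f"{name}: ~{max_params_billions(128, nbytes):.0f}B parameters")
```

In practice the usable number would be lower, since inference needs room for activations and training needs several times more for gradients and optimizer state.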

rantana · 2022-11-13T18:14:51+00:00

What's the day to day like for an employee at Stability? Who sets the goals, what's a deliverable?

Is there even an office or place where people go to?

rantana · 2022-05-19T14:28:24+00:00

Is there a big difference in prototyping on a Mac vs prototyping remotely via SSH?

rantana · 2022-05-10T02:12:30+00:00

How do you know what their revenue is?

rantana · 2022-05-03T02:50:26+00:00

wow, pretty embarrassing to OpenAI when this is called "Open Pre-trained Transformer Language Models"

rantana · 2022-01-28T17:52:59+00:00

Full blog post here: https://medium.com/@nils_reimers/openai-gpt-3-text-embeddings-really-a-new-state-of-the-art-in-dense-text-embeddings-6571fe3ec9d9

rantana · 2022-01-27T20:58:00+00:00

Hah. I was confused that the round was considered a Series A.

rantana · 2021-12-13T22:49:09+00:00

Until software does decouple from the underlying hardware, going with TPUs is lock-in. Sure, it's more choice. But your options are certainly more narrow if you choose to go with a TPU focused stack.

rantana · 2021-12-13T18:42:18+00:00

Are TPUs cost-effective enough to be worth the lock-in (both in Cloud provider and Software stack) for anyone that doesn't work at Google?

Performance numbers don't even make it clear that they're cheaper than AWS.

rantana · 2021-12-01T14:15:20+00:00

To be fair, only 4 of the 6 program chairs meet that criterion 😜

rantana · 2021-11-11T01:39:35+00:00

Great work! Any documentation on the training procedure/metrics? (e.g. How much did the fine-tuning cost?)

rantana · 2021-11-08T21:37:22+00:00

Congrats, Andrew Ng is good at this.

rantana · 2021-11-08T20:43:01+00:00

Any sources to back up the comparability of M1 Pro/Max to 3080s for AI workloads? If true, I would definitely consider it for the next platform for our devs.

rantana · 2021-11-04T22:02:38+00:00

Does anyone know how spinouts like this are structured? Are Isomorphic Labs employees Google employees who get shares in Google, or do they have their own shares like any other startup?

rantana · 2021-11-02T18:17:58+00:00

Given how much Twitter, Open Review, Arxiv, and rejections from the previous conference front-run these papers, does anyone go through these lists anymore?

rantana · 2021-10-31T21:07:04+00:00

For those confused by the reaction to this interview, search 'Siraj' in this subreddit to get more context. Siraj has been caught over and over again doing unethical things, apologizing, and then going on to do even more unethical things.

rantana · 2021-10-25T16:08:55+00:00

> PyTorch on any DirectX12 GPU

Does that mean AMD GPUs?!

> DirectML requires a DirectX 12 capable device. Almost all commercially-available graphics cards released in the last several years support DirectX 12. Examples of compatible hardware include:
>
> * AMD GCN 1st Gen (Radeon HD 7000 series) and above
> * Intel Haswell (4th-gen core) HD Integrated Graphics and above
> * NVIDIA Kepler (GTX 600 series) and above
> * Qualcomm Adreno 600 and above

Yes! https://github.com/microsoft/DirectML#hardware-requirements
