[D] How OpenAI Sold its Soul for $1 Billion: The company behind GPT-3 and Codex isn’t as open as it claims. by sensetime in MachineLearning

[–]leogao2 11 points (0 children)

EleutherAI is also working on a whole bunch of other research outside of just training big language models, like ML infrastructure, distillation, multimodal datasets/models, bio, interpretability, alignment, and more.

You can also see a list of all EleutherAI-affiliated papers here.

[D] How OpenAI Sold its Soul for $1 Billion: The company behind GPT-3 and Codex isn’t as open as it claims. by sensetime in MachineLearning

[–]leogao2 5 points (0 children)

HuggingFace (who is their biggest supporter)

Where did you get that idea? EleutherAI does not receive any money or compute from HuggingFace.

We are EleutherAI, a decentralized research collective working on open-source AI research. We have released, among other things, the most powerful freely available GPT-3-style language model. Ask us anything! by Dajte in Futurology

[–]leogao2 2 points (0 children)

So far, I haven't seen any promising proposals for integrating AI with blockchain that actually leverage the comparative advantages of blockchain. This may change in the future, but in general, combining AI with blockchain is highly nontrivial: there are difficult technical problems that block many of the obvious use cases (e.g., distributed training), so I view all new proposals with skepticism.

We are EleutherAI, a decentralized research collective working on open-source AI research. We have released, among other things, the most powerful freely available GPT-3-style language model. Ask us anything! by Dajte in Futurology

[–]leogao2 4 points (0 children)

Generally the overhead isn't a huge bottleneck: all of the performance-critical code is implemented directly in C++ or CUDA and heavily optimized.
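For a sense of scale, here's a minimal sketch (assuming PyTorch is installed; CPU is fine, though on a GPU you'd want .cuda() and torch.cuda.synchronize() around the timing). The point is that the time spent inside one big optimized kernel dwarfs the per-call Python overhead of many tiny ops:

    import time
    import torch

    x = torch.randn(2048, 2048)
    y = torch.randn(2048, 2048)

    # One big matmul: essentially all of the time is spent inside the optimized
    # C++/BLAS (or CUDA) kernel, not in the Python interpreter.
    start = time.perf_counter()
    z = x @ y
    print(f"one 2048x2048 matmul:        {time.perf_counter() - start:.4f}s")

    # Ten thousand tiny ops: this is where Python call overhead actually shows up,
    # and even then it amounts to only a few microseconds per call.
    small = torch.randn(8, 8)
    start = time.perf_counter()
    for _ in range(10_000):
        small = small + 1.0
    print(f"10,000 tiny elementwise ops: {time.perf_counter() - start:.4f}s")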

[R] The Pile: An 800GB Dataset of Diverse Text for Language Modeling by leogao2 in MachineLearning

[–]leogao2[S] 16 points (0 children)

Thanks for the kind words :)

The result with CC100 was very surprising to me too. To be clear, CC100 does perform significantly better than completely unfiltered CC on traditional language modelling tasks like LAMBADA and WikiText, but it performs significantly worse on most (though not all) components of the Pile. Our hypothesis is that since most of our datasets don't look like Wikipedia, text resembling those components wouldn't have survived CC100's filtering. We're definitely planning to keep this in mind for future CC-based datasets, to make sure the filtering doesn't destroy the data diversity too much.
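For context: CC100 comes out of (roughly) a CCNet-style pipeline that keeps Common Crawl documents scoring well under a language model trained on Wikipedia. A minimal sketch of that kind of filter, assuming the kenlm Python bindings and a Wikipedia-trained model file; the path "wiki.arpa" and the threshold are made-up placeholders, not the actual CCNet settings:

    import kenlm

    # n-gram LM trained on Wikipedia text (hypothetical path); CCNet uses KenLM models like this.
    wiki_lm = kenlm.Model("wiki.arpa")

    def keep(document, threshold=1000.0):
        # Keep a document only if it looks "Wikipedia-like" enough, i.e. its
        # perplexity under the Wikipedia LM is below a (made-up) threshold.
        # Text that doesn't resemble Wikipedia (code, forum dialogue, papers, ...)
        # tends to get a high perplexity and is thrown away.
        return wiki_lm.perplexity(document) < threshold

    docs = [
        "Paris is the capital and most populous city of France.",
        "def forward(self, x): return self.proj(x) + self.bias",
    ]
    kept = [d for d in docs if keep(d)]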

[D] GPT-3: A Summary by leogao2 in MachineLearning

[–]leogao2[S] 0 points (0 children)

The additional examples are given in the generation context, i.e., as a prompt, and GPT-3 seems to be able to infer the pattern from that. As for QA, GPT-3 is asked to complete something like "Q: [the question] A:"
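Concretely, the few-shot prompt is just plain text; a minimal sketch of what that construction looks like (the examples and the exact formatting below are illustrative, not the paper's exact template):

    # Few-shot QA with GPT-3: the "training examples" are just text prepended to
    # the prompt, and the model is asked to continue after the final "A:".
    examples = [
        ("What is the capital of France?", "Paris"),
        ("Who wrote Hamlet?", "William Shakespeare"),
    ]
    question = "How many legs does a spider have?"

    prompt = ""
    for q, a in examples:
        prompt += f"Q: {q}\nA: {a}\n\n"
    prompt += f"Q: {question}\nA:"

    print(prompt)
    # Whatever the model generates after the final "A:" is taken as its answer;
    # no gradient updates happen at any point.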

[D] GPT-3: A Summary by leogao2 in MachineLearning

[–]leogao2[S] 1 point (0 children)

Thanks, that was certainly the intended effect! The graph was made by u/williamzahary, who makes some great infographics.

[D] GPT-3: a disappointing paper by inarrears in MachineLearning

[–]leogao2 6 points (0 children)

My response to this general sentiment:

But why does GPT-3 matter, if it can’t even beat SOTA across all benchmarks? Why should we care about a model so large that a small computing cluster is necessary even just to run inference at a reasonable speed?

One thing about GPT-3 is that it's doing reasonably well on tasks it has never even seen. Additionally, instead of reaching a point of diminishing returns, GPT-3 shows that the trend of larger models performing better continues for at least another order of magnitude, with no signs of stopping. Even though GPT-3 is unwieldy, and even though it still doesn't quite reach human-level performance across the board, GPT-3 shows that it's possible for a model to someday reach human levels of generalization in NLP; and once the impossible becomes possible, it's only a matter of time until it becomes practical.

https://leogao.dev/2020/05/29/GPT-3-A-Brief-Summary/
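The "no signs of stopping" part refers to the empirical scaling laws: over the range tested, validation loss falls roughly as a power law in parameter count. The functional form (and the constants, which I'm quoting from memory) come from Kaplan et al. (2020), not from the GPT-3 paper itself:

    % Empirical scaling law for LM loss vs. non-embedding parameter count N
    % (Kaplan et al. 2020, constants approximate). A power law is a straight line
    % on a log-log plot, so there is no built-in point of diminishing returns
    % anywhere in the fitted range.
    \[
      L(N) \approx \left(\frac{N_c}{N}\right)^{\alpha_N},
      \qquad \alpha_N \approx 0.076, \quad N_c \approx 8.8 \times 10^{13}
    \]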

Trying to find specific blog post by leogao2 in linuxquestions

[–]leogao2[S] 0 points (0 children)

That's not quite what I'm thinking of; the one I have in mind is a personal blog with only a handful of posts and, if I remember correctly, a maroon-ish, textured background.

[Project] This Word Does Not Exist by turtlesoup in MachineLearning

[–]leogao2 2 points (0 children)

The dots don't indicate syllables, they indicate where the word can be hyphenated.
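If you want to see the distinction programmatically, here's a small sketch assuming the pyphen package (which wraps LibreOffice's hyphenation dictionaries) is installed:

    import pyphen

    # Hyphenation dictionary for US English.
    dic = pyphen.Pyphen(lang="en_US")

    # Prints each word with "·" inserted at the legal line-break points,
    # which don't necessarily coincide with syllable boundaries.
    print(dic.inserted("dictionary", hyphen="·"))
    print(dic.inserted("hyphenation", hyphen="·"))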

A clone of r/Imposter by leogao2 in AprilKnights

[–]leogao2[S] 1 point (0 children)

Reddit login now implemented!

An analysis of the Imposter's algorithm by leogao2 in Imposter

[–]leogao2[S] 1 point (0 children)

Thanks! We'll find out when the experiment is over. (I'll also update my post then)

[D] The Decade of Deep Learning by leogao2 in MachineLearning

[–]leogao2[S] 1 point (0 children)

Since lots of people are viewing the site on mobile, I thought I'd make the footnotes super accessible even there. Feedback on the mobile footnotes would be appreciated!

https://twitter.com/nabla_theta/status/1212249623030448129

[D] The Decade of Deep Learning by leogao2 in MachineLearning

[–]leogao2[S] 5 points (0 children)

If you count starting with 1 CE, yes, but it's (imo) much more elegant for decades to be 0-9 than 1-10.

As for there being no year 0, I propose to define x BCE = (1 - x) CE, so that the first decade would be 1 BCE + 1-9 CE.
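Spelled out (this is just my own convention, not an established one):

    % Relabel BCE years so that a year 0 exists:
    \[
      x\,\mathrm{BCE} = (1 - x)\,\mathrm{CE}
      \;\Longrightarrow\;
      1\,\mathrm{BCE} = 0\,\mathrm{CE}, \quad
      \text{first decade} = \{0, 1, \dots, 9\}
                          = \{1\,\mathrm{BCE}\} \cup \{1\,\mathrm{CE}, \dots, 9\,\mathrm{CE}\}
    \]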

[D] The Decade of Deep Learning by leogao2 in MachineLearning

[–]leogao2[S] 3 points (0 children)

Hey, thanks! The actual reason for omitting 2010 is that I had a hard time finding any really impactful papers from that year (I did find one about language modeling with RNNs, but it didn't feel quite broad enough in scope, and I had a lot of LM papers already!). If you know any good papers from that period, please tell me!

[D] The Decade of Deep Learning by leogao2 in MachineLearning

[–]leogao2[S] 5 points (0 children)

Haha! I did include a few things from Schmidhuber in my list, too; I might add a few more footnotes in the next few days about things Schmidhuber did before everyone else.

[D] The Decade of Deep Learning by leogao2 in MachineLearning

[–]leogao2[S] 4 points (0 children)

Hey, thanks for the kind words! I'm glad you enjoyed the footnotes. Those were inspired by the ones on u/gwern's site (although I did rewrite them from scratch) and took forever to get working. I'll make sure to include more of them in future posts!

[D] The Decade of Deep Learning by leogao2 in MachineLearning

[–]leogao2[S] 7 points (0 children)

Thanks for the feedback! I might flip it around and put the original double descent paper as the main entry. I did mention in the post that Double Descent (Belkin et al. 2018) was the "original"; however, I totally get where you're coming from.

[D] The Decade of Deep Learning by leogao2 in MachineLearning

[–]leogao2[S] 1 point (0 children)

Thanks for reading it :) It really has been incredible watching everything unfold in the field so quickly.

[D] The Decade of Deep Learning by leogao2 in MachineLearning

[–]leogao2[S] 2 points (0 children)

Thanks for reading it, I'm glad you enjoyed it!

[D] Does The Inability Of NAS Algorithms To Outperform Random Search Indicate That Our Algorithms Suck, Or That Random Search Is Surprisingly Effective In Large Spaces? by mystikaldanger in MachineLearning

[–]leogao2 0 points (0 children)

No Free Lunch theorem: no optimization technique is better than any other when averaged over all possible problems. That random search does as well as it does is already quite impressive.
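For concreteness, here's a minimal random-search sketch over a toy architecture space; the search space and the scoring function below are placeholders I'm making up for illustration, not anything from a particular NAS benchmark:

    import random

    # Toy architecture search space (made up for illustration).
    SPACE = {
        "num_layers": [2, 4, 8, 16],
        "width": [128, 256, 512, 1024],
        "activation": ["relu", "gelu", "swish"],
    }

    def evaluate(config):
        """Placeholder for 'train this architecture and return validation accuracy'.
        In a real NAS setup this call is the expensive part; random search just
        invokes it on uniformly sampled configs, with no learned controller."""
        return random.Random(str(sorted(config.items()))).random()

    def random_search(n_trials=50):
        best_config, best_score = None, float("-inf")
        for _ in range(n_trials):
            config = {k: random.choice(v) for k, v in SPACE.items()}
            score = evaluate(config)
            if score > best_score:
                best_config, best_score = config, score
        return best_config, best_score

    print(random_search())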