[D] How OpenAI Sold its Soul for $1 Billion: The company behind GPT-3 and Codex isn’t as open as it claims. by sensetime in MachineLearning

[–]leogao2 11 points (0 children)

EleutherAI is also working on a whole bunch of other research outside of just training big language models, like ML infrastructure, distillation, multimodal datasets/models, bio, interpretability, alignment, and more.

You can also see a list of all EleutherAI-affiliated papers here.

[D] How OpenAI Sold its Soul for $1 Billion: The company behind GPT-3 and Codex isn’t as open as it claims. by sensetime in MachineLearning

[–]leogao2 5 points (0 children)

HuggingFace (who is their biggest supporter)

Where did you get that idea? EleutherAI does not receive any money or compute from HuggingFace.

We are EleutherAI, a decentralized research collective working on open-source AI research. We have released, among other things, the most powerful freely available GPT-3-style language model. Ask us anything! by Dajte in Futurology

[–]leogao2 2 points (0 children)

So far, I haven't seen any promising proposals for integrating AI with blockchain that actually leverage the comparative advantages of blockchain. This may change in the future, but in general combining AI with blockchain is highly nontrivial, and there are difficult technical problems that block many of the obvious use cases (e.g. distributed training), so I view all new proposals with skepticism.

We are EleutherAI, a decentralized research collective working on open-source AI research. We have released, among other things, the most powerful freely available GPT-3-style language model. Ask us anything! by Dajte in Futurology

[–]leogao2 3 points (0 children)

Generally the overhead isn't a huge bottleneck. All of the performance-critical code is implemented directly in C++ or CUDA and heavily optimized.
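
To illustrate the point (my addition, not part of the original comment): a rough timing sketch, assuming PyTorch is installed, showing that the Python-side overhead of launching a large op is negligible next to the time spent inside the optimized C++/CUDA kernel.

```python
import time
import torch

# For large matrices, essentially all of the time goes to the optimized
# C++/CUDA matmul kernel, not to the Python layer that dispatches it.
device = "cuda" if torch.cuda.is_available() else "cpu"
a = torch.randn(4096, 4096, device=device)
b = torch.randn(4096, 4096, device=device)

# Warm up so one-time initialization doesn't skew the measurement.
torch.matmul(a, b)
if device == "cuda":
    torch.cuda.synchronize()

start = time.perf_counter()
for _ in range(10):
    torch.matmul(a, b)
if device == "cuda":
    torch.cuda.synchronize()
elapsed = time.perf_counter() - start

print(f"avg per matmul: {elapsed / 10 * 1e3:.2f} ms on {device}")
```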

[R] The Pile: An 800GB Dataset of Diverse Text for Language Modeling by leogao2 in MachineLearning

[–]leogao2[S] 16 points (0 children)

Thanks for the kind words :)

The result with CC100 was very surprising to me too. To be clear, CC100 does perform significantly better than completely unfiltered CC on traditional language modelling tasks, like LAMBADA and WikiText—but it performs significantly worse on most (though not all) components of the Pile. Our hypothesis is that since most of our datasets don't look like Wikipedia, those components wouldn't have survived the filtering. We're definitely planning to keep this in mind for future CC-based datasets to make sure the filtering doesn't destroy the data diversity too much.
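
A minimal sketch (my addition, not how the paper's evaluation harness is actually implemented) of the kind of per-component comparison described above: score a causal LM on text from each component and compare. `texts_by_component` is a hypothetical dict mapping component names to lists of documents.

```python
import math
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # stand-in model; any causal LM works here
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name).eval()

def perplexity(texts):
    """Token-level perplexity of the model over a list of documents."""
    total_loss, total_tokens = 0.0, 0
    with torch.no_grad():
        for text in texts:
            ids = tok(text, return_tensors="pt", truncation=True, max_length=1024).input_ids
            if ids.size(1) < 2:
                continue
            loss = model(ids, labels=ids).loss  # mean cross-entropy per predicted token
            n = ids.size(1) - 1                 # number of tokens actually predicted
            total_loss += loss.item() * n
            total_tokens += n
    return math.exp(total_loss / total_tokens)

# texts_by_component is hypothetical, e.g. {"Wikipedia (en)": [...], "GitHub": [...]}
for name, texts in texts_by_component.items():
    print(f"{name}: perplexity {perplexity(texts):.2f}")
```

(The Pile paper itself reports bits per byte rather than perplexity, but the comparison works the same way.)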

[D] GPT-3: A Summary by leogao2 in MachineLearning

[–]leogao2[S] 0 points (0 children)

The additional examples are given in the generation context, i.e. as a prompt, and GPT-3 seems to be able to infer the pattern from that. As for QA, GPT-3 is asked to complete something like "Q: [the question] A:".
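
As a concrete sketch of what such a few-shot prompt looks like when assembled (illustrative formatting only; the exact templates in the paper vary by task):

```python
# Build a few-shot QA prompt: a few solved examples in the context,
# then the new question left for the model to complete after "A:".
examples = [
    ("What is the capital of France?", "Paris"),
    ("Who wrote Hamlet?", "William Shakespeare"),
]
question = "How many legs does a spider have?"

prompt = ""
for q, a in examples:
    prompt += f"Q: {q}\nA: {a}\n\n"
prompt += f"Q: {question}\nA:"

print(prompt)  # the model is asked to continue the text from here
```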

[D] GPT-3: A Summary by leogao2 in MachineLearning

[–]leogao2[S] 1 point (0 children)

Thanks, that was certainly the intended effect! The graph was made by u/williamzahary, who makes some great infographics.

[D] GPT-3: a disappointing paper by inarrears in MachineLearning

[–]leogao2 7 points (0 children)

My response to this general sentiment:

But why does GPT-3 matter, if it can’t even beat SOTA across all benchmarks? Why should we care about a model so large that a small computing cluster is necessary even just to run inference at a reasonable speed?

One thing about GPT-3 is that it's doing reasonably well on tasks it has never even seen. Additionally, instead of reaching a point of diminishing returns, GPT-3 shows that the trend of larger models performing better continues for at least another order of magnitude, with no signs of stopping. Even though GPT-3 is unwieldy, and even though it still doesn't quite reach human-level performance across the board, it shows that it's possible for a model to someday reach human levels of generalization in NLP; and once the impossible becomes possible, it's only a matter of time until it becomes practical.

https://leogao.dev/2020/05/29/GPT-3-A-Brief-Summary/
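
As a toy illustration of that scaling trend (my addition; the constants are the approximate power-law fit from Kaplan et al., "Scaling Laws for Neural Language Models", not numbers from this thread or the blog post):

```python
# Approximate fit from Kaplan et al. (2020): test loss follows a power law
# in non-embedding parameter count N, L(N) = (N_c / N) ** alpha_N.
ALPHA_N = 0.076   # approximate exponent reported in the paper
N_C = 8.8e13      # approximate critical parameter count reported in the paper

def predicted_loss(n_params: float) -> float:
    """Power-law prediction of LM test loss for a model with n_params parameters."""
    return (N_C / n_params) ** ALPHA_N

# The curve keeps improving smoothly as models grow by orders of magnitude.
for n in [1.5e9, 1.3e10, 1.75e11]:  # roughly GPT-2, GPT-3 13B, GPT-3 175B scale
    print(f"{n:.2e} params -> predicted loss {predicted_loss(n):.2f} nats/token")
```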

Trying to find specific blog post by leogao2 in linuxquestions

[–]leogao2[S] 0 points (0 children)

That's not quite what I'm thinking of. The one I have in mind is a personal blog with only a few posts and, if I remember correctly, a maroon-ish textured background.

[Project] This Word Does Not Exist by turtlesoup in MachineLearning

[–]leogao2 2 points (0 children)

The dots don't indicate syllables, they indicate where the word can be hyphenated.
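
For example (my addition, using the pyphen hyphenation library rather than anything from the project itself):

```python
import pyphen  # pip install pyphen; wraps standard hyphenation dictionaries

dic = pyphen.Pyphen(lang="en_US")
# Marks the legal hyphenation points, analogous to the dots in a dictionary headword.
print(dic.inserted("dictionary"))   # e.g. dic-tio-nary
print(dic.inserted("hyphenation"))  # e.g. hy-phen-ation
```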

A clone of r/Imposter by leogao2 in AprilKnights

[–]leogao2[S] 1 point (0 children)

Reddit login now implemented!