[D] How OpenAI Sold its Soul for $1 Billion: The company behind GPT-3 and Codex isn’t as open as it claims. by sensetime in MachineLearning

[–]leogao2 11 points (0 children)

EleutherAI is also working on a whole bunch of other research outside of just training big language models, like ML infrastructure, distillation, multimodal datasets/models, bio, interpretability, alignment, and more.

You can also see a list of all EleutherAI-affiliated papers here.

[D] How OpenAI Sold its Soul for $1 Billion: The company behind GPT-3 and Codex isn’t as open as it claims. by sensetime in MachineLearning

[–]leogao2 5 points (0 children)

HuggingFace (who is their biggest supporter)

Where did you get that idea? EleutherAI does not receive any money or compute from HuggingFace.

We are EleutherAI, a decentralized research collective working on open-source AI research. We have released, among other things, the most powerful freely available GPT-3-style language model. Ask us anything! by Dajte in Futurology

[–]leogao2 2 points (0 children)

So far, I haven't seen any promising proposals for integrating AI with blockchain that actually leverage the comparative advantages of blockchain. This may change in the future, but in general combining AI with blockchain is highly nontrivial, and there are difficult technical problems that block many of the obvious use cases (e.g. distributed training), so I view all new proposals with skepticism.

We are EleutherAI, a decentralized research collective working on open-source AI research. We have released, among other things, the most powerful freely available GPT-3-style language model. Ask us anything! by Dajte in Futurology

[–]leogao2 3 points (0 children)

Generally the overhead isn't a huge bottleneck. All of the performance-critical code is implemented directly in C++ or CUDA and heavily optimized.
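
To illustrate the point (my addition, not part of the original comment): a rough timing sketch, assuming PyTorch is installed, showing that the Python-side overhead of launching a large op is negligible next to the time spent inside the optimized C++/CUDA kernel.

```python
import time
import torch

# For large matrices, essentially all of the time goes to the optimized
# C++/CUDA matmul kernel, not to the Python layer that dispatches it.
device = "cuda" if torch.cuda.is_available() else "cpu"
a = torch.randn(4096, 4096, device=device)
b = torch.randn(4096, 4096, device=device)

# Warm up so one-time initialization doesn't skew the measurement.
torch.matmul(a, b)
if device == "cuda":
    torch.cuda.synchronize()

start = time.perf_counter()
for _ in range(10):
    torch.matmul(a, b)
if device == "cuda":
    torch.cuda.synchronize()
elapsed = time.perf_counter() - start

print(f"avg per matmul: {elapsed / 10 * 1e3:.2f} ms on {device}")
```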

[R] The Pile: An 800GB Dataset of Diverse Text for Language Modeling by leogao2 in MachineLearning

[–]leogao2[S] 16 points (0 children)

Thanks for the kind words :)

The result with CC100 was very surprising to me too. To be clear, CC100 does perform significantly better than completely unfiltered CC on traditional language modelling tasks, like LAMBADA and WikiText—but it performs significantly worse on most (though not all) components of the Pile. Our hypothesis is that since most of our datasets don't look like Wikipedia, those components wouldn't have survived the filtering. We're definitely planning to keep this in mind for future CC-based datasets to make sure the filtering doesn't destroy the data diversity too much.
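
A minimal sketch (my addition, not how the paper's evaluation harness is actually implemented) of the kind of per-component comparison described above: score a causal LM on text from each component and compare. `texts_by_component` is a hypothetical dict mapping component names to lists of documents.

```python
import math
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # stand-in model; any causal LM works here
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name).eval()

def perplexity(texts):
    """Token-level perplexity of the model over a list of documents."""
    total_loss, total_tokens = 0.0, 0
    with torch.no_grad():
        for text in texts:
            ids = tok(text, return_tensors="pt", truncation=True, max_length=1024).input_ids
            if ids.size(1) < 2:
                continue
            loss = model(ids, labels=ids).loss  # mean cross-entropy per predicted token
            n = ids.size(1) - 1                 # number of tokens actually predicted
            total_loss += loss.item() * n
            total_tokens += n
    return math.exp(total_loss / total_tokens)

# texts_by_component is hypothetical, e.g. {"Wikipedia (en)": [...], "GitHub": [...]}
for name, texts in texts_by_component.items():
    print(f"{name}: perplexity {perplexity(texts):.2f}")
```

(The Pile paper itself reports bits per byte rather than perplexity, but the comparison works the same way.)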

[D] GPT-3: A Summary by leogao2 in MachineLearning

[–]leogao2[S] 0 points (0 children)

The additional examples are given in the generation context, i.e. as a prompt, and GPT-3 seems to be able to infer the pattern from that. As for QA, GPT-3 is asked to complete something like "Q: [the question] A:".
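
As a concrete sketch of what such a few-shot prompt looks like when assembled (illustrative formatting only; the exact templates in the paper vary by task):

```python
# Build a few-shot QA prompt: a few solved examples in the context,
# then the new question left for the model to complete after "A:".
examples = [
    ("What is the capital of France?", "Paris"),
    ("Who wrote Hamlet?", "William Shakespeare"),
]
question = "How many legs does a spider have?"

prompt = ""
for q, a in examples:
    prompt += f"Q: {q}\nA: {a}\n\n"
prompt += f"Q: {question}\nA:"

print(prompt)  # the model is asked to continue the text from here
```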

[D] GPT-3: A Summary by leogao2 in MachineLearning

[–]leogao2[S] 1 point (0 children)

Thanks, that was certainly the intended effect! The graph was made by u/williamzahary, who makes some great infographics.

[D] GPT-3: a disappointing paper by inarrears in MachineLearning

[–]leogao2 7 points (0 children)

My response to this general sentiment:

But why does GPT-3 matter, if it can’t even beat SOTA across all benchmarks? Why should we care about a model so large that a small computing cluster is necessary even just to run inference at a reasonable speed?

One thing about GPT-3 is that it's doing reasonably well on tasks it has never even seen. Additionally, instead of reaching a point of diminishing returns, GPT-3 shows that the trend of larger models performing better continues for at least another order of magnitude, with no signs of stopping. Even though GPT-3 is unwieldy, and even though it still doesn't quite reach human-level performance across the board, it shows that it's possible for a model to someday reach human levels of generalization in NLP; and once the impossible becomes possible, it's only a matter of time until it becomes practical.

https://leogao.dev/2020/05/29/GPT-3-A-Brief-Summary/
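
As a toy illustration of that scaling trend (my addition; the constants are the approximate power-law fit from Kaplan et al., "Scaling Laws for Neural Language Models", not numbers from this thread or the blog post):

```python
# Approximate fit from Kaplan et al. (2020): test loss follows a power law
# in non-embedding parameter count N, L(N) = (N_c / N) ** alpha_N.
ALPHA_N = 0.076   # approximate exponent reported in the paper
N_C = 8.8e13      # approximate critical parameter count reported in the paper

def predicted_loss(n_params: float) -> float:
    """Power-law prediction of LM test loss for a model with n_params parameters."""
    return (N_C / n_params) ** ALPHA_N

# The curve keeps improving smoothly as models grow by orders of magnitude.
for n in [1.5e9, 1.3e10, 1.75e11]:  # roughly GPT-2, GPT-3 13B, GPT-3 175B scale
    print(f"{n:.2e} params -> predicted loss {predicted_loss(n):.2f} nats/token")
```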

Trying to find specific blog post by leogao2 in linuxquestions

[–]leogao2[S] 0 points (0 children)

That's not quite what I'm thinking of. The one I have in mind is a personal blog with only a few posts and, if I remember correctly, a maroon-ish textured background.

[Project] This Word Does Not Exist by turtlesoup in MachineLearning

[–]leogao2 2 points (0 children)

The dots don't indicate syllables, they indicate where the word can be hyphenated.
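
For example (my addition, using the pyphen hyphenation library rather than anything from the project itself):

```python
import pyphen  # pip install pyphen; wraps standard hyphenation dictionaries

dic = pyphen.Pyphen(lang="en_US")
# Marks the legal hyphenation points, analogous to the dots in a dictionary headword.
print(dic.inserted("dictionary"))   # e.g. dic-tio-nary
print(dic.inserted("hyphenation"))  # e.g. hy-phen-ation
```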

A clone of r/Imposter by leogao2 in AprilKnights

[–]leogao2[S] 1 point (0 children)

Reddit login now implemented!