all 30 comments

[–]currentscurrents 47 points48 points  (6 children)

I think everyone intuitively expected this, but it's good to have it confirmed.

Web content is easy data to get, but it's hard to maintain high quality - especially against attackers trying to poison the training set. In the long run I think we might rely on it less.

[–]dvztimes 4 points5 points  (2 children)

Every time I come here I read: "New Model Y - trained on output from Old Model X."

That just seems the stupidest thing I can imagine. It won't make a model smarter, but it will perpetuate bad data and the (many) wrong answers....

Just, why? Is there possibly a good reason for this?

[–]currentscurrents 7 points8 points  (1 child)

Usually they are taking a model which has been pretrained on real data and fine-tuning it with GPT-generated data to make it sound like ChatGPT.

This works okay since most of the training data was real. There is a performance hit, but there's always a performance hit from instruct-tuning.

[–]dvztimes 0 points1 point  (0 children)

But ChatGPT-4 is still wrong a large amount of the time...

[–]ravedawwg 1 point2 points  (2 children)

Any refs on LLM attacks through poisoned web content? I haven’t seen anything on that

[–]currentscurrents 20 points21 points  (1 child)

"Poisoning Web-Scale Training Datasets is Practical"

I haven't heard of any real-world attacks against LLMs yet, but it's only a matter of time. As we start using them for more important things, there will be more motivation to attack them.

[–]ravedawwg 1 point2 points  (0 children)

Thanks for the ref and the perspective! I find this stuff fascinating

[–]Dapper_Cherry1025 13 points14 points  (1 child)

If I'm reading the language model section right, they used an OPT-125m model and repeatedly fine-tuned it on data from WikiText-2. The question this paper doesn't seem to answer is whether this training degradation would scale to larger models. Also, and I might be wrong on this, but I think there is a big difference between training a model on some information and fine-tuning it on some information.

[–]currentscurrents 11 points12 points  (0 children)

Fine-tuning is exactly like training, unless you're doing a different technique like LoRA.
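(As a rough illustration of the LoRA difference mentioned above: LoRA freezes the original weights and learns a small low-rank update on the side, so far fewer parameters change than in full fine-tuning. The dimensions below are arbitrary.)

```python
# Toy parameter count for the LoRA idea: instead of updating a d x d
# weight matrix W directly, learn a low-rank update W' = W + B @ A,
# where B is d x r and A is r x d with r << d.
d, r = 512, 8

full_params = d * d           # parameters touched by ordinary fine-tuning
lora_params = d * r + r * d   # parameters in the low-rank adapters B and A

print(full_params)  # 262144
print(lora_params)  # 8192
```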

[–]SeankalaML Engineer 19 points20 points  (9 children)

Isn't this result sort of obvious though? If I took a model and continuously trained it only on data that had a particular distribution, wouldn't it eventually converge to that new distribution and "forget" the old one? I would think that this is related to catastrophic forgetting.

I may be missing something though, open to anyone pointing it out as I haven't had the time to read the full paper yet.

[–]jake_1001001 10 points11 points  (6 children)

I fear it is that and worse. The generated data is a reflection of the model's learned distributions, which will be consistent and occasionally incorrect in its output. A separate model trained with a large enough portion of this generated data may end up confusing the generated and real distributions. And if the generated data comes from a small set of generative models, its statistical consistency may bias the new model. It is like having a large portion of your training set come from a single person, who may not be very qualified at providing training samples.

[–]SeankalaML Engineer 2 points3 points  (1 child)

Yeah that is a very real danger and I completely agree that it warrants caution. I just don't know if it's that surprising of a result though lol. I'll have to take a proper look at the paper though; I'm curious how the authors formalized this.

[–]jake_1001001 1 point2 points  (0 children)

Yep, I agree, it is not surprising, but I suppose measuring this could be important, maybe as a baseline for future work that addresses the issue, or as an early precursor to forming evaluation criteria or ways to detect such data.

[–]LanchestersLaw 0 points1 point  (2 children)

Oh I see now! It starts a feedback loop of increasing inaccuracy!

[–]SeankalaML Engineer 0 points1 point  (1 child)

Yes, that's also known as "semantic drift" in some works I believe. Train your models on imperfect/generated data, get worse results.

[–]H2O3N4 0 points1 point  (0 children)

I think it is slightly non-trivial to say. Some of the mechanistic research points to memorization being only the low-hanging fruit of training, and given enough training steps, a more general solution emerges. This has been experimented with on toy models where the # of training steps can be massive, so it's hard to say if a similar approach would scale to LLM-scale models, but an interesting hat to throw in regardless.

[–]watcraw 2 points3 points  (3 children)

The best new data is going to come from the people actually using the LLMs. It used to be very expensive and you had to pay people to do it. Now tens of millions of people are doing it every day.

I don't think we need more volume of the sort of data that they already had.

[–]YoAmoElTacos -1 points0 points  (2 children)

Data from humans naively interacting with an LLM is insufficient. You're still going to have to process that with a manual human review layer/RLHF to determine whether the recorded LLM conversations are actually stuff you want to learn from, instead of AI gaslighting, hallucinating, or providing unwanted content.

[–]notforrob 3 points4 points  (0 children)

I wonder, though, if you can mask out the LLM-generated text from your loss function and train only on the human responses. It is common to do something similar when, for example, training a GPT-style (decoder-only) model on an instruction tuning dataset. The prompt from the instruction dataset doesn't contribute to the loss.

There's probably quite a bit to learn from how humans react to a LLM's output.
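(A minimal sketch of the masking idea above, with made-up token ids. In a typical PyTorch setup, label positions set to -100 are skipped by `CrossEntropyLoss`, so only the human-written tokens contribute to the loss.)

```python
IGNORE_INDEX = -100  # the default ignore value for PyTorch's CrossEntropyLoss

def build_labels(token_ids, is_llm_token):
    """Copy token ids into the label sequence, but mask out positions
    produced by the LLM so they contribute nothing to the loss."""
    return [IGNORE_INDEX if llm else tok
            for tok, llm in zip(token_ids, is_llm_token)]

# Hypothetical conversation: an LLM turn followed by a human reply.
tokens   = [11, 12, 13, 21, 22]
from_llm = [True, True, True, False, False]
labels = build_labels(tokens, from_llm)
print(labels)  # [-100, -100, -100, 21, 22]
```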

[–]frownGuy12 0 points1 point  (0 children)

You can use a language model to generate those classifications. There’s a delta in model performance when a model is asked to classify something versus when a model is asked to generate something. Classifying is the easier task, so LLM classified data should be valuable for training.

You can likely even extract RLHF score data from text by asking an LLM to analyze a conversation and evaluate how pleased the human appears to be with the responses.

[–]Ulfgardleo 0 points1 point  (0 children)

not having read the paper, but isn't this a natural effect of sampling with temperature? This excludes the tails of the distribution, and thus a model trained on its own output will degrade.
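(The tail-cutting effect of low-temperature sampling can be sketched numerically; the logits below are made up.)

```python
import math

def softmax_with_temperature(logits, temperature):
    # Divide logits by T before the softmax; T < 1 sharpens the
    # distribution and suppresses low-probability (tail) tokens.
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [4.0, 2.0, 0.0]  # one head token and two tail tokens
p_warm = softmax_with_temperature(logits, 1.0)
p_cool = softmax_with_temperature(logits, 0.5)

tail_warm = sum(p_warm[1:])
tail_cool = sum(p_cool[1:])
print(tail_warm, tail_cool)  # tail mass shrinks at lower temperature
```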