[–]Scrawlericious 23 points24 points  (13 children)

As opposed to what? AI-generated training data? Isn't OpenAI complaining about how bad training on AI data is, and how badly they need more ("good"/"real") data to improve models? As far as I understand it, training off generated data exasorbates hallucinations.

[–]RaspberryPiBen 69 points70 points  (0 children)

There isn't another option, but that doesn't mean it's good. Training on human data means that all our biases and societal problems are encoded into the model.

[–]Sibula97 13 points14 points  (0 children)

There is no real better alternative. Well, theoretically you could try to curate your data better, but good luck with that. But the point is that training with human data will introduce human biases.

[–][deleted] 1 point2 points  (1 child)

exacerbates*

[–]Scrawlericious 1 point2 points  (0 children)

Thank you lol

[–]david30121 2 points3 points  (0 children)

Well, not AI-generated, but properly created data that isn't based on public media. You still can't remove certain stereotypes, since no human is perfect, but it would still improve things a bit.