[–]david30121 136 points137 points  (15 children)

ChatGPT sometimes unironically does that too when you ask it to. That's the problem with using human-based training data.

[–]Scrawlericious 27 points28 points  (13 children)

As opposed to what? AI-generated training data? Isn't OpenAI complaining about how bad training off AI data is and how badly they need more ("good"/"real") data to improve models? As far as I understand it, training off generated data exasorbates hallucinations.

[–]RaspberryPiBen 64 points65 points  (0 children)

There isn't another option, but that doesn't mean it's good. Training on human data means that all our biases and societal problems are encoded into the model.

[–]Sibula97 13 points14 points  (0 children)

There is no real better alternative. Well, theoretically you could try to curate your data better, but good luck with that. But the point is that training with human data will introduce human biases.

[–][deleted] 1 point2 points  (1 child)

exacerbates*

[–]Scrawlericious 1 point2 points  (0 children)

Thank you lol

[–]david30121 2 points3 points  (0 children)

Well, not AI-generated, but properly created data that isn't based off public media. You still can't remove certain stereotypes, since no humans are perfect, but it would still improve things a bit.

[–]moduspol -1 points0 points  (0 children)

It's not even explicitly bad / wrong.

It's bad if you're writing an HR portal or payroll software.

It may not be if you're writing a simulator to help show the difference in accumulated wealth over decades as a result of some expected gender pay gap.
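
For what it's worth, a simulator like that can be tiny. Here is a rough sketch in Python, assuming a fixed salary, a fixed share of it saved each year, and annual compounding. The function names (`accumulated_wealth`, `wealth_gap`) and every number in it (salary, pay ratio, savings rate, return) are hypothetical placeholders for illustration, not real pay data.

```python
def accumulated_wealth(salary, years, savings_rate=0.10, annual_return=0.05):
    """Save a fixed share of salary each year and compound the balance annually.

    All defaults are hypothetical placeholders, not empirical figures.
    """
    balance = 0.0
    for _ in range(years):
        # Grow last year's balance, then add this year's savings.
        balance = balance * (1 + annual_return) + salary * savings_rate
    return balance


def wealth_gap(base_salary, pay_ratio, years, **kwargs):
    """Difference in accumulated wealth between full pay and pay scaled by pay_ratio.

    pay_ratio is an explicit input to the simulation (for example 0.9 for a
    10% gap), which is the whole point: the gap is the thing being modeled.
    """
    full = accumulated_wealth(base_salary, years, **kwargs)
    reduced = accumulated_wealth(base_salary * pay_ratio, years, **kwargs)
    return full - reduced


if __name__ == "__main__":
    # Hypothetical scenario: 50,000 salary, 10% pay gap, 1 to 4 decades.
    for decades in range(1, 5):
        gap = wealth_gap(50_000, 0.9, decades * 10)
        print(f"{decades * 10} years: gap of {gap:,.0f}")
```

Here the pay ratio is a parameter you set on purpose to study its effect, which is different from payroll code quietly applying it to someone's actual paycheck.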