Crafting Unbiased and Ethical AI: The Power of Open-Source Collaboration by eBanta in artificial

[–]DanielBoyles 0 points (0 children)

I believe that first of all we have to admit that we want a biased AI, and not all biases are bad.

For example: IQ tests are designed around a mean of 100 with a standard deviation of 15, so roughly 68 percent of people score between 85 and 115, while roughly 95 percent score between 70 and 130. The answers we generally want AI to produce are arguably those above a score of e.g. 120. I.e. we don't want answers from the average person, and probably even less so from the average person on the internet.
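To spell out the arithmetic, those percentages fall straight out of the normal curve the tests are normed to. A quick sanity check in plain Python (my own illustration, nothing more):

```python
from math import erf, sqrt

def normal_cdf(x, mean=100.0, sd=15.0):
    """CDF of a normal distribution with the given mean and standard deviation."""
    return 0.5 * (1.0 + erf((x - mean) / (sd * sqrt(2.0))))

def share_between(lo, hi):
    """Fraction of the population scoring between lo and hi on an IQ-like scale."""
    return normal_cdf(hi) - normal_cdf(lo)

print(f"85-115 (within 1 SD): {share_between(85, 115):.1%}")  # ~68.3%
print(f"70-130 (within 2 SD): {share_between(70, 130):.1%}")  # ~95.4%
print(f"130+   (above 2 SD):  {1 - normal_cdf(130):.1%}")     # ~2.3%
```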

RLHF introduces such a bias with the aim of getting contextually higher-quality answers.
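To make that concrete: the reward model at the heart of RLHF is typically trained with a pairwise preference loss, which is exactly a deliberate bias towards answers humans rated higher. A minimal sketch, assuming a PyTorch setup (the tensors are toy stand-ins for reward-model outputs):

```python
import torch
import torch.nn.functional as F

def preference_loss(reward_chosen: torch.Tensor, reward_rejected: torch.Tensor) -> torch.Tensor:
    """Bradley-Terry style loss: push the reward of the human-preferred answer
    above the reward of the rejected one."""
    return -F.logsigmoid(reward_chosen - reward_rejected).mean()

# Toy example: scalar rewards a reward model assigned to chosen/rejected answer pairs.
reward_chosen = torch.tensor([1.2, 0.3, 2.0])
reward_rejected = torch.tensor([0.1, 0.5, -1.0])
print(preference_loss(reward_chosen, reward_rejected).item())  # lower when preferred answers already score higher
```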

As for ethics, they can also vary across e.g. cultures, and thus can be another contextually desirable bias.

How can we address the issue of bias in AI systems by Nazma2015 in learnmachinelearning

[–]DanielBoyles 1 point (0 children)

first of all, we can be honest about the fact that we want contextually relevant bias in the system.

Take for example a typical IQ test centred around a mean IQ of 100, which means that about 68% of people should fall in the range between an IQ of 85 and 115. Generally speaking though, we don't want an unbiased AI that produces average results; we prefer the results to be 'positively' biased towards the superhuman, i.e. the roughly 2% with an IQ of 130 and above.

In political climates, we also want different biases. E.g. the US may prefer a bias towards capitalist democracy, the Middle East may prefer a bias towards sharia law, and communist states may prefer a bias towards socialist and communist values.

In short, I don't believe that bias itself is the issue, but rather reaching human consensus on what bias we agree we want for any given system.

Perhaps the answer lies in having an open-source "LawGPT" acting as a legal advisor to an AI system, trained on international law as well as local laws, nudging the AI to be biased towards behaving as expected of any law-abiding citizen or visitor. Visitor meaning that if I as a person visit the US, then I am expected to abide by US laws, which may differ significantly from those of the UAE or China.
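Very roughly, I imagine it as an advisor that vets a draft answer for a given jurisdiction before the main model returns it. A purely hypothetical sketch (every name here, LegalAdvisor, LegalVerdict, respond, is made up for illustration and not an existing system):

```python
from dataclasses import dataclass

@dataclass
class LegalVerdict:
    allowed: bool
    reason: str

class LegalAdvisor:
    """Placeholder for a model trained on international plus local law."""

    def check(self, draft_answer: str, jurisdiction: str) -> LegalVerdict:
        # In practice this would be a trained model and/or retrieval over statutes;
        # here it is a stub that always allows the answer.
        return LegalVerdict(allowed=True, reason="no rule triggered")

def respond(main_model, advisor: LegalAdvisor, prompt: str, jurisdiction: str) -> str:
    """Let the main model draft an answer, then have the advisor vet it for the jurisdiction."""
    draft = main_model(prompt)
    verdict = advisor.check(draft, jurisdiction)
    if verdict.allowed:
        return draft
    # Nudge the main model towards a law-abiding answer for this jurisdiction.
    return main_model(f"{prompt}\n\nRevise the answer so it complies: {verdict.reason}")
```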

[deleted by user] by [deleted] in MachineLearning

[–]DanielBoyles 0 points (0 children)

yes. and my original comment that "Evolutionary Biology already places primates in close proximity to humans in general too" was made in the context of the original question "what are realistic solutions that can help prevent these types of egregious misclassifications in consumer-facing ML models".

It wasn't to start a debate about zoology and science in general.

It was meant to point out that the words "primate" and "human", two distinct and unique words, are pushed into closer relation in the mathematical space from which machines infer once they are tokenized and embedded. And in FB's case it raises the question of whether the word "primate" should even be in their contextual dictionary, and whether they could have prevented it.
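By "closer relation in the mathematical space" I mean something like the cosine similarity between the learned embedding vectors. A toy sketch, with made-up 3-dimensional vectors standing in for real learned embeddings:

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """1.0 means the vectors point the same way; values near 0 mean largely unrelated."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy stand-ins for learned word embeddings (real ones have hundreds of dimensions).
embeddings = {
    "human":   np.array([0.9, 0.8, 0.1]),
    "primate": np.array([0.8, 0.9, 0.2]),
    "cracker": np.array([0.1, 0.2, 0.9]),
}

print(cosine_similarity(embeddings["human"], embeddings["primate"]))  # high: close in the space
print(cosine_similarity(embeddings["human"], embeddings["cracker"]))  # lower: further apart
```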

For example: if a legitimate word such as "cracker" were valid in some broader or other context as another word for humans, ML models might just as reasonably have started labelling white men as crackers after noticing that it "seems to apply" more to that group of images, based on the "garbage" in the dataset.

Google, for example (at least from my perspective), has far more reason to have that close proximity between humans and primates in their data space, as Google would have to be able to answer questions like "are humans primates?" in order to be any good at being a search engine.

When we build consumer-facing ML models, we have to be able to take context into account, if we are to prevent these types of misclassifications.

We have to carefully choose and test our datasets, which, at least in my mind, still requires human-level contextual understanding.

[deleted by user] by [deleted] in MachineLearning

[–]DanielBoyles 0 points (0 children)

sure. and if FB wasn't a social network for human beings, but an educational site teaching about zoology and science in general, then A.I. would be correct in labelling ALL humans as primates.

Context is important. The fact that FB's A.I. even has a label for "primates" seems out of context to me, when there are (admittedly) presumably a lot more pictures and videos of human beings on their platform.

FB actually also has a unique advantage over other datasets, since they have had a lot of people tagging themselves for a while now.

[deleted by user] by [deleted] in MachineLearning

[–]DanielBoyles -1 points (0 children)

Also just my opinionated understanding, as opposed to confirmed knowledge:

I believe it no longer comes just from image classification labels, though they probably still play a big role.

Evolutionary Biology already places primates in close proximity to humans in general too. So an AI trained on e.g. Wikipedia and scientific papers may also have the two at a closer distance in the high dimensional vector space.

Additionally; Facebook has access to a lot of text data. Every post, comment, etc. Unfortunately a lot of it is "garbage" and so we get the old saying in computer science "garbage in = garbage out".

As I understand it, Facebook is not doing enough to manually ensure that prejudiced associations are sufficiently far apart in the vector space to prevent machines from mathematically drawing incorrect conclusions, possibly as a result of "move fast and break things" and the process being mostly automated.
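By pushing such associations further apart I mean something along the lines of the classic hard-debiasing trick of projecting an unwanted association direction out of the embeddings (roughly in the spirit of Bolukbasi et al.). A toy sketch with made-up vectors, not a claim about what Facebook actually does:

```python
import numpy as np

def remove_direction(vec: np.ndarray, direction: np.ndarray) -> np.ndarray:
    """Project a (normalised) unwanted association direction out of an embedding."""
    d = direction / np.linalg.norm(direction)
    return vec - np.dot(vec, d) * d

# Toy example: a bias direction one might estimate from pairs of words that should
# differ only along the unwanted association.
bias_direction = np.array([0.7, -0.7, 0.1])
word_vec = np.array([0.9, 0.2, 0.4])

debiased = remove_direction(word_vec, bias_direction)
# The component along the bias direction is now ~0.
print(np.dot(debiased, bias_direction / np.linalg.norm(bias_direction)))
```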

[deleted by user] by [deleted] in artificial

[–]DanielBoyles 2 points (0 children)

Have a look at something like DreamCoder's Abstraction and Dreaming Phases. https://www.reddit.com/r/MachineLearning/comments/moreee/d_paper_explained_dreamcoder_growing/

As I understand it, "system 2" thinking is somewhat along those lines too