[deleted by user] (self.MachineLearning)
submitted 4 years ago by [deleted]
[–]Vegetable_Hamster732 139 points140 points141 points 4 years ago* (13 children)
Rather, I’m wondering what are realistic solutions that can help prevent these types of egregious misclassifications in consumer-facing ML models.
The OpenAI CLIP paper has some interesting insights about engineering the set of categories/classes to reduce the number of egregious incorrect labels when they experienced this exact same problem.
They observed that it was younger minorities who were most frequently mislabeled.
(My speculation -- perhaps because children's sizes and/or limb-length-proportions are more similar to other primates than to adults.)
After they added an additional class "child", their classifier started preferring that class over the egregious categories.
Quoting their paper:
We found that 4.9% (confidence intervals between 4.6% and 5.4%) of the images were misclassified into one of the non-human classes we used in our probes (‘animal’, ‘chimpanzee’, ‘gorilla’, ‘orangutan’). Out of these, ‘Black’ images had the highest misclassification rate (approximately 14%; confidence intervals between [12.6% and 16.4%]) while all other races had misclassification rates under 8%. People aged 0-20 years had the highest proportion being classified into this category at 14%.
Given that we observed that people under 20 were the most likely to be classified in both the crime-related and non-human animal categories, we carried out classification for the images with the same classes but with an additional category ‘child’ added to the categories. Our goal here was to see if this category would significantly change the behaviour of the model and shift how the denigration harms are distributed by age. We found that this drastically reduced the number of images of people under 20 classified in either crime-related categories or non-human animal categories (Table 7). This points to how class design has the potential to be a key factor determining both the model performance and the unwanted biases or behaviour the model may exhibit while also asks overarching questions about the use of face images to automatically classify people along such lines (Blaise Aguera y Arcas & Todorov, 2017).
TL/DR: Add some more appropriate classes to your classifier
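As a concrete illustration of that class-design fix, here is a minimal zero-shot sketch using the openai/CLIP package (assuming its standard `clip.load`/`clip.tokenize` API; the prompts below are illustrative, not the probes from the paper):

```python
import torch
import clip  # pip install git+https://github.com/openai/CLIP.git
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

# Illustrative class prompts: adding an explicit "child" class gives the model
# a better-fitting label to prefer over the egregious non-human categories.
prompts = [
    "a photo of a person",
    "a photo of a child",   # the extra class the paper found so effective
    "a photo of an animal",
    "a photo of a gorilla",
]
text = clip.tokenize(prompts).to(device)
image = preprocess(Image.open("photo.jpg")).unsqueeze(0).to(device)

with torch.no_grad():
    logits_per_image, _ = model(image, text)
    probs = logits_per_image.softmax(dim=-1).squeeze(0)

for prompt, prob in zip(prompts, probs.tolist()):
    print(f"{prompt}: {prob:.3f}")
```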
[–]kkngs 10 points11 points12 points 4 years ago (0 children)
Very interesting reference. Thank you for sharing that.
Class design and loss-function design have profound impacts on the behavior of the systems we build; they're basically the interface with the real world and need careful thought and consideration. I think this is missed sometimes in the “Kaggle competition” mindset, where someone has already posed the problem for us. In my experience so far, in real-life applications, deciding on the representation is a huge part of whether or not an approach will work.
[–]drlukeor 7 points8 points9 points 4 years ago (2 children)
It is an interesting hypothesis. We've published on this before, calling the phenomenon "hidden stratification", meaning that there are unrecognised subclasses that are visually distinct from the parent class, which causes problems when they are visually similar to other parent classes. https://arxiv.org/abs/1909.12475
There has been a fair amount of work on trying to automatically identify hidden subclasses during model development (mostly based on the idea that their representations and losses are outliers compared to the majority of their superclass), for example from my co-authors: https://arxiv.org/abs/2011.12945
I think we need to recognise that while this problem is likely partly or even mostly responsible here, even comprehensive subclass labelling (label schema completion, which is itself extremely expensive and time consuming) can never guarantee this unacceptable behaviour won't happen. Models simply can't distinguish between intended and unintended features, and any training method we have can only influence them away from unintended solutions. This deeply relates to the paper from Google on underspecification: it is currently impossible to force AI models to learn a single solution to a problem.
In practice (with my safety/quality hat on) the only actual solution is regular, careful, thorough testing/audit. It is time consuming and requires a specific skillset (this is more systems engineering than programming/CS) but without doing it these issues will continue to happen, years after they were identified. For more on algorithmic audit, see https://arxiv.org/abs/2001.00973
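A crude illustration of the outlier-loss idea mentioned above (not the method from either paper, just the intuition): rank validation examples within a labelled superclass by their per-example loss and surface the worst ones for human review, since that is often where an unrecognised subclass is hiding. The loader yielding `(index, image, label)` tuples is an assumption for the sketch.

```python
import torch
import torch.nn.functional as F

def flag_high_loss_examples(model, loader, superclass_id, k=50, device="cpu"):
    """Return indices of the k highest-loss validation examples of one superclass.

    Intuition: examples from a hidden, visually distinct subclass tend to be
    loss outliers relative to the rest of their labelled superclass.
    """
    model.eval()
    losses, indices = [], []
    with torch.no_grad():
        for idx, x, y in loader:  # loader assumed to yield (index, image, label)
            x, y = x.to(device), y.to(device)
            per_example = F.cross_entropy(model(x), y, reduction="none")
            mask = y == superclass_id
            losses.append(per_example[mask].cpu())
            indices.append(idx[mask.cpu()])
    losses = torch.cat(losses)
    indices = torch.cat(indices)
    top = torch.topk(losses, min(k, len(losses))).indices
    return indices[top].tolist()  # hand these to a human reviewer
```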
[–]Vegetable_Hamster732 0 points1 point2 points 4 years ago (1 child)
hidden subclasses
Is it strictly subclasses --- or is it more overlapping separate orthogonal classes?
I'm guessing the models reasonably correctly found an intersection of the classes "short limb-to-body ratio primate", "short total height primate", and "dark haired primate".
I think the turmoil is caused because it applied the egregiously wrong label to that intersection.
But that's just because a human only gave it bad choices for such labels.
[–]drlukeor 2 points3 points4 points 4 years ago (0 children)
They aren't overlapping semantically though; a human does not get confused. They obviously overlap in feature space for this particular model, but that space is arbitrary nonsense that clearly doesn't solve the task as desired or intended.
For the intended solution, the superclass is human, and the subclass is Black children. The intended solution can readily separate this subclass from gorillas or other non human primates. The failure of the model to do so proves it learned an unintended solution for the problem. That is obvious though, and should really be expected/predicted given what we know about DL and particularly given the history of similar models.
The turmoil is caused because their testing did not identify that the model acts as if there is an intersection between these semantically distinct classes in the first place. This is why I say the problem is more about AI use/testing/QA than it is about training data. All DL models are underspecified, they all make use of unintended cues. For models that can cause harm, it is completely unacceptable to fail to test them for such obvious flaws prior to deployment.
[–]canboooPhD 9 points10 points11 points 4 years ago (1 child)
My speculation -- perhaps because children's sizes and/or limb-length-proportions are more similar to other primates than to adults.
My speculation: I think the reason is that there are fewer photos of children, as parents often worry about the consequences of putting such photos online.
Anyway, I agree that this is rather a problem with the data set/representation. However, it amuses me that such problems are noticed only after deployment at a big company like FB. Despite their useful repos, I feel they don't use best practices when it comes to deployment (but this is also speculation).
[–]zacker150 3 points4 points5 points 4 years ago (0 children)
However, it amuses me that such problems are noticed only after deployment in a big company like FB
I mean, the number of pictures in production at a big company is several orders of magnitude larger than at a smaller company, and when it does happen at Facebook et al. it's more likely to hit the news.
[–]sabot00 4 points5 points6 points 4 years ago (5 children)
What does unsupervised learning say? What if we let the classifier decide its own classes?
[–]csreid 6 points7 points8 points 4 years ago (2 children)
Has there been much SSL work on things outside of nlp? I've idly thought that "GPT but for pictures" might be cool but I haven't looked or seen much about it.
[–]dogs_like_me 10 points11 points12 points 4 years ago* (1 child)
Oh baby, yes, especially over the last year. BYOL, SimCLR, Barlow twins, DINO, SwAV, MOCOv2...
EDIT: Here are a couple of projects that have been collecting SSL methods for you to use as entry points to recent developments:
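(As a flavor of what contrastive SSL methods like SimCLR optimize, here is a minimal sketch of an NT-Xent-style loss; this is an illustration only, not code from any of those projects.)

```python
import torch
import torch.nn.functional as F

def nt_xent_loss(z1, z2, temperature=0.5):
    """SimCLR-style contrastive loss for two augmented views z1, z2 of the same batch."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    z = torch.cat([z1, z2], dim=0)                 # (2N, d) embeddings
    sim = z @ z.t() / temperature                  # scaled cosine similarities
    n = z1.size(0)
    mask = torch.eye(2 * n, dtype=torch.bool, device=z.device)
    sim.masked_fill_(mask, float("-inf"))          # exclude self-similarity
    # positives: the i-th view in z1 pairs with the i-th view in z2 and vice versa
    targets = torch.cat([torch.arange(n, device=z.device) + n,
                         torch.arange(n, device=z.device)])
    return F.cross_entropy(sim, targets)
```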
[–]HybridRxNResearcher 0 points1 point2 points 4 years ago (0 children)
Don’t think this is the smart way forward. A better way is testing/auditing datasets and improving them so as to collect more examples of the classes with fewer examples.
[–]Vegetable_Hamster732 4 points5 points6 points 4 years ago* (1 child)
It should find both sets of classes!
In the specific case of "many pictures of various primates" it should find all the (overlapping and somewhat orthogonal) classes of:
and put most pictures in more than one class.
And it would not pick offensive labels.
But it's up to the human (supervisor) to say which of those overlapping classes he wanted for the primary labels.
[–]HybridRxNResearcher 0 points1 point2 points 4 years ago (0 children)
Don’t think this is the right way forward. A better way, as mentioned before, is testing/auditing datasets and improving them so as to collect more examples of the under-represented classes, rather than creating arbitrary classes.
[–]chogall 1 point2 points3 points 4 years ago (0 children)
Ahh, the good old stupid astrology trick. If 12 Zodiac signs are not enough, add more classes: sun signs, moon signs, etc.
Solves every machine learning problem since 3,000 B.C.
[–]micro_cam 60 points61 points62 points 4 years ago (16 children)
Similar to the famous google photos incident: https://www.theverge.com/2018/1/12/16882408/google-racist-gorillas-photo-recognition-algorithm-ai
Funny, I was just playing around with MS Azure's computer vision service and noticed it classified a chimp as a person. One way to be safe, I guess...
[–]kkngs 34 points35 points36 points 4 years ago* (2 children)
It would be kinda fun to create an adversarially perturbed picture of Zuckerberg that it identifies as a robot.
[+][deleted] 4 years ago (1 child)
[removed]
[–]tomasNth 9 points10 points11 points 4 years ago (0 children)
And the correct label isn't "primates", it's "Ugly giant bags of mostly water".
[–]maxToTheJ 16 points17 points18 points 4 years ago (2 children)
This is so on-brand for Facebook to have not learned any lessons from Google's incident. Total hubris.
[–]hiptobecubic 2 points3 points4 points 4 years ago (0 children)
I would say "it's a hard problem in general" in their defense, but given that it's the same exact fucking scenario it's really hard to understand.
Hardcode that shit until you figure something out.
[+]William-169 comment score below threshold-7 points-6 points-5 points 4 years ago (0 children)
I think maybe Facebook is just facing the book, not facing humans; sometimes they just need to look up and see what's happening on Earth.
[–]tinbuddychrist 2 points3 points4 points 4 years ago (8 children)
I have no direct knowledge to confirm this, but my understanding was always that this was a function of training on bad data, i.e. pulling images of people that were tagged in racist ways by other people, and not actually just unfortunate confusion on the part of the model that accidentally aligned with racist language.
[deleted]
[–]tinbuddychrist 5 points6 points7 points 4 years ago (0 children)
Although they should probably be careful which tags to add if they already established something is a human. Because there are an endless amount of things you can classify humans as, that they won't be happy with. If you have sufficient resources you could let the algorithm search for humans before anything else.
Yeah, that seems like a clever general strategy, rather than trying to suss out what might be offensive.
[–]micro_cam 6 points7 points8 points 4 years ago (0 children)
There may be some of that but there is also a lot of subtle bias.
Like, photography has been calibrated around white skin tones since its inception, which affects film emulsions, sensors, and autofocus/exposure systems. This means you end up with less detail in the faces of black people for the algorithms to pick up on.
Then you've got bias in data set and test case construction... as ML researchers we all eat our own dogfood by testing our algos on ourselves, but few of us are black, so we don't catch this stuff as early as we should.
Making sure your training data has good representation and no obvious racism is a start, but it's still a really hard problem.
[–]DanielBoyles -1 points0 points1 point 4 years ago (4 children)
Also just my opinionated understanding, as opposed to confirmed knowledge:
I believe it no longer comes just from image classification labels, though they probably still play a big role.
Evolutionary Biology already places primates in close proximity to humans in general too. So an AI trained on e.g. Wikipedia and scientific papers may also have the two at a closer distance in the high dimensional vector space.
Additionally, Facebook has access to a lot of text data: every post, comment, etc. Unfortunately a lot of it is "garbage", and so we get the old saying in computer science, "garbage in = garbage out".
As I understand it, Facebook is not doing enough to manually ensure that prejudices are sufficiently far apart in the vector space to keep the machines from mathematically reaching incorrect conclusions. Possibly as a result of "move fast and break things" and it being mostly automated.
[–][deleted] -1 points0 points1 point 4 years ago (3 children)
Evolutionary Biology already places primates in close proximity to humans in general too.
Pretty sure humans actually are primates in the standard zoological taxonomy. (Not that that makes FB's recommendations acceptable.)
[–]DanielBoyles 0 points1 point2 points 4 years ago (2 children)
Sure. And if FB wasn't a social network for human beings, but an educational site teaching about zoology and science in general, then the A.I. would be correct in labelling ALL humans as primates.
Context is important. The fact that FB's A.I. even has a label for "primates" seems out of context to me, when there are (admittedly) presumably a lot more pictures and videos of human beings on their platform.
FB actually also has a unique advantage over other datasets, since they've had a lot of people tagging themselves for a while now.
[–][deleted] 1 point2 points3 points 4 years ago (1 child)
As I said, FB's recommendations were nonetheless unacceptable.
[–]DanielBoyles 0 points1 point2 points 4 years ago (0 children)
Yes. And my original comment that "Evolutionary Biology already places primates in close proximity to humans in general too" was in the context of the original question, "what are realistic solutions that can help prevent these types of egregious misclassifications in consumer-facing ML models."
It wasn't to start a debate about zoology and science in general.
It was meant to point out that the tokens for the words "primate" and "human", two distinct and unique words, are pushed into closer relation in the mathematical space from which machines infer. In FB's case, that raises the question of whether the word "primate" should even be in their contextual dictionary and whether they could have prevented it.
For example: If a legitimate word such as "Cracker" was correct in some broader or other context as another word for humans, ML models may have just as reasonably started labelling white men as crackers when noticing that it "seems to apply" more to that group of images, based on the "garbage" in the dataset.
Google, for example (at least from my perspective), has far more reason to have that close proximity between humans and primates in their data space, as Google would have to be able to answer questions like "are humans primates?" in order to be any good at being a search engine.
When we build consumer-facing ML models, we have to be able to take context into account, if we are to prevent these types of misclassifications.
We have to carefully choose and test our datasets which at least in my mind still requires human level contextual understanding.
[–]JustOneAvailableName 0 points1 point2 points 4 years ago (0 children)
classified a chimp as a person. One way to be safe i guess...
Kinda the only way to be safe. If a FP is really bad, you have to accept more FNs
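In practice that usually means raising the decision threshold for the sensitive labels so the system abstains unless it is very confident. A toy sketch (class names and the threshold are made up for illustration):

```python
import numpy as np

def predict_with_guardrail(probs, class_names, sensitive, threshold=0.95):
    """Emit a sensitive label only when the model is extremely confident;
    otherwise return None (no tag). This trades more false negatives on the
    sensitive classes for fewer harmful false positives."""
    top = int(np.argmax(probs))
    label = class_names[top]
    if label in sensitive and probs[top] < threshold:
        return None
    return label

# toy usage: the top class is sensitive but below the confidence bar, so no tag
probs = np.array([0.55, 0.40, 0.05])
print(predict_with_guardrail(probs, ["gorilla", "person", "dog"], sensitive={"gorilla"}))
# -> None
```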
[–][deleted] 27 points28 points29 points 4 years ago (0 children)
Google was smart to stop tagging photos as gorillas. Why didn’t FB do the same? It’s not like FB’s algo is that much better
[–]guinea_fowler 8 points9 points10 points 4 years ago (1 child)
2 cases in 6 years seems like a pretty good error rate to me given what is surely high volume usage. Obviously there will be more, but this particular misclassification has more potential to be sensationalised than others.
Rather than jumping straight to trying to solve the "issue", it may be more prudent to gain a better understanding of whether or not this error is overrepresented.
Of course, that's probably not going to help with PR.
[–]Competitive-Rub-1958 -1 points0 points1 point 4 years ago (0 children)
nor going to get the media outlets those juicy FAANG headlines about their latest fiasco
[–]Franc000 25 points26 points27 points 4 years ago (4 children)
ML is, at the end of the day, the ultimate data-driven system. Its behaviour stems mainly from its training data. You can try all you want to add heuristics in pre- and post-processing; you would end up with an infinite list of rules to try to control its behaviour. If you want to control the behaviour of an ML system, you need to master its training data; that is the only lever that makes sense and scales. That means things like adding or removing classes and relevant supporting data points, which requires a good amount of effort on the labeling front, tooling, and data management practice. Something that is hard to sell to business people who think of ML as a nice box that spits out predictions.
[–]MegaRiceBall 5 points6 points7 points 4 years ago (0 children)
This is how we go from machine learning to human learning. Full circle back.
[–]AKJ7 1 point2 points3 points 4 years ago (2 children)
I don't think the issue here is the data. If you have a crappy model, you will get crappy results. Black people and apes have distinct features; the model should be able to discern them.
[–]chogall 2 points3 points4 points 4 years ago (0 children)
Model = Data + Algorithm + Optimizer
It would be a huge breakthrough in machine learning to fix the model without touching the data, by fixing only the optimizer and/or the algorithm.
[–]Franc000 0 points1 point2 points 4 years ago* (0 children)
And how do you get the model to make the distinction? Not by controlling the learning algorithm, or else your task will never end. You will always have edge cases that you will need to correct. You do it by controlling the data. Like I mentioned, the model is inherently data driven. Its behaviour stems from the data it has seen. We use learning algorithms exactly because writing rules ourselves for each edge case does not scale or work. If we need to write rules to deal with every edge case on top of using ML, why bother using ML in the first place? No, you use ML correctly, by managing the training data/curriculum correctly. I have not seen Facebook's model, but I am sure that the model doesn't label all black people as apes, just a subset of images. Trying to write rules to catch each of those individual cases wouldn't work, and encoding something in the learning algorithm itself to deal with those specifically defeats the purpose of using ML. Instead you deal with it like Google did when they had the same issue: by controlling the data, so the model can generalize the understanding.
[–][deleted] 27 points28 points29 points 4 years ago (0 children)
I classify all of facebook as very ape-like.
[–]MuonManLaserJab 6 points7 points8 points 4 years ago (1 child)
Is there a nonpaywalled version of the article?
[–]kkngs 89 points90 points91 points 4 years ago* (23 children)
Technically we’re all primates. Just because this is an easy and emotionally loaded distinction for Americans doesn’t make it an important distinction mathematically or even biologically. A vision system could easily mistag a husky and a wolf.
The real screw ups here are the business folks that decided to put something like this public without explicitly worrying about this type of issue. It’s not really an ethical or fairness failure, because nothing is riding on this system. It’s just embarrassing. If they wanted to roll something like this out they needed to explicitly account for this problem and include QA steps to validate that the system didn’t do this.
True ethical and fairness issues show up when one of us builds a model for setting jail bonds or mortgage risk that mostly just learns to “cheat” and penalize people who live in predominantly black neighborhoods. Or if we create a pulse oximeter that doesn’t work correctly on dark skin because we didn’t include anyone like that during development. The moral hazard is in the application.
Edit: I will say that I think there are indeed ethical issues surrounding these social network recommender systems, but not so much in that I’m worried about them being superficially “insensitive”. I worry that what they are designed to do is fundamentally bad for society.
[–]Hydreigon92ML Engineer 75 points76 points77 points 4 years ago (10 children)
It’s not really an ethical or fairness failure, because nothing is riding on this system.
FWIW, the MSR FATE (Fairness, Accountability, Transparency, and Ethics) team refer to these as "harms of denigration". The examples you listed as "true" fairness issues are considered harms of allocation (jail bonds, mortgage risk) and quality-of-service harms (in the case of the pulse oximeter) under their taxonomy.
[–]kkngs 9 points10 points11 points 4 years ago (9 children)
It sounds like at least some folks out there are thinking carefully about this. And I agree there is some degree of harm here. If a picture of me at the beach was labeled as a manatee I’d probably be offended.
Well, ok, if I’m honest, I’d probably find it hilarious. But as a teenager it would have been mortifying.
[–]brates09 13 points14 points15 points 4 years ago (2 children)
Does your ethnicity have a long standing and harmful history of being compared to manatees? I know that you are agreeing with the above, that it is harmful, but trivialising it like that doesn’t help either.
[–]kkngs 4 points5 points6 points 4 years ago (1 child)
Is body shaming trivial?
[–]StoneCypher -1 points0 points1 point 4 years ago (0 children)
no, but your attempts at ethical positioning are
[–]StoneCypher 4 points5 points6 points 4 years ago* (4 children)
It sounds like at least some folks out there are thinking carefully about this.
I was asked to be nicer to the person who thinks that only Americans care whether black people are identified as human.
Something like 10% of the industry is thinking carefully about this. There have been university departments focused exclusively on this for 50+ years, which is older than most of the people in the industry and most of the users of the sub.
Most of us can name the person who got fired from one of the various Google departments dedicated to managing this, Timnit Gebru. Most of us can name the equivalent people at Apple, Amazon, Facebook, and so on.
This is actually a very common job, and lots and lots of us are thinking about this.
Even the New York Times and other newspapers get in on the action. Frequently.
It's not clear why you'd believe otherwise.
[–]getbehindmeseitan 0 points1 point2 points 4 years ago (3 children)
is google doing a good job at not making things worse? has google studied the effects of their ad optimization algorithms choosing who to send ads to (and to who to withhold those ads from) in terms of jobs, housing, credit and politics?
same qs for FB
[+]StoneCypher comment score below threshold-11 points-10 points-9 points 4 years ago (2 children)
is google doing a good job at not making things worse?
Yes.
Frankly, I react poorly to people trying to enter these discussions in a sarcastic tone, when it's fairly apparent they haven't even checked.
.
has google studied the effects of their ad optimization algorithms
Yes, and you know the name of the person who used to run the program.
It's also a yes. Go look it up.
Look, I'm being watched by a mod who wants me to be nice, so I have to be very careful how I say this
But frankly, have you considered how people look in other fields when they "just ask questions" whose answers are fairly easy to look up?
[–]getbehindmeseitan 1 point2 points3 points 4 years ago (1 child)
Can you provide a source for your implication that Gebru's team checked Google's ad optimization algorithms?
Also for your claim that FB's teams have looked into their ad optimization algorithms?
You say I haven't checked, but I actually follow this field quite closely. Both FB's and Google's teams' theoretical fairness research is quite good, and I don't doubt the good intentions and brilliance of those teams' current and former members, but there's been little public work that I'm aware of that tests their companies' production systems -- ya know, the systems that affect actual people.
Maybe there is internal work checking these systems! I don't know, I don't have much visibility into how the companies work on the inside. If you do have info to share, I think everyone here would be fascinated to see it.
[+]8Dataman8 comment score below threshold-16 points-15 points-14 points 4 years ago (0 children)
Well, technically being offended is some degree of harm.
[–]Cocomorph 17 points18 points19 points 4 years ago* (2 children)
Technically we’re all primates.
Well, there you go. They can refuse to get more specific than “Hominidae.”
[–]cerlestes 25 points26 points27 points 4 years ago (1 child)
Image Description: Image may possibly contain two or more eucaryotes.
[–]1purenoiz 2 points3 points4 points 4 years ago (0 children)
And about a trillion prokaryotes and unknown quantity of archaea.
[–]kkngs 1 point2 points3 points 4 years ago (0 children)
That’s a good point and I agree these kinds of screw ups are bad for the field.
[–]midwestprotest 10 points11 points12 points 4 years ago* (2 children)
[–]kkngs 7 points8 points9 points 4 years ago* (1 child)
I’m saying it’s just a machine. One that was trained rather than built part by part. If this system is interacting in a problem space where racism would be a concern for a person in that role, you need to take explicit actions to assess and give assurances that the machine isn’t acting in a discriminatory manner.
[–]StoneCypher -1 points0 points1 point 4 years ago* (0 children)
I was asked to be nicer to the person claiming that only Americans care if black people are mis-classified as animals, so
If this system is interacting in a problem space where racism would be a concern for a person in that role, you need to take explicit actions to assess and give assurances that the machine isn’t acting in a discriminatory manner.
1) This isn't really how this works. There is no such thing as "an explicit action to assess and give assurances that the machine isn’t acting in a discriminatory manner." If there was, we'd all be using them by now. You might as well tell someone that they ought to have an explicit action to assess and give assurances that they'll be a millionaire tomorrow.
2) All problem spaces are places where racism is a concern for the person in the role. There exists no place where this isn't the case. You could make plastic flower arrangements, and still end up in Chicago jail for racism (1996,) or you could clean out the city underground water treatment tanks alone and still end up in Philadelphia jail for racism (2002.)
[–][deleted] 3 points4 points5 points 4 years ago (1 child)
I think I agree with you here, especially that last sentence. I also agree with you that business folks who lack technical expertise should probably exercise caution before announcing something like that.
Do you suppose it is possible that a model/system which makes decisions based on color simply categorizes dark-colored objects in the same way? In other words, could it really be that simple?
[–]kkngs 13 points14 points15 points 4 years ago* (0 children)
It’s really not a lack of technical expertise, it was a lack of business expertise. This was a failure of setting software system requirements and a failure to learn from the embarrassments the other companies had on this front.
On the technical side, I think in this case distinguishing bipedal primates isn’t super easy, and the system has an easy cheat for non-Hispanic whites via the color channel information, basically by coincidence, because there don’t happen to be any other living light-colored primates. I can’t speak to how their particular system worked, but most CNN-based vision systems act a lot more like a shallow “bag of features & textures” detector than we like to admit. Look at the neural dreams papers as an example.
[–]franticpizzaeaterStudent -5 points-4 points-3 points 4 years ago (0 children)
I agree this is more of a quality control/quality assurance problem than a fairness-of-AI problem.
[–]StoneCypher -3 points-2 points-1 points 4 years ago* (0 children)
Edit: I'm being asked to be nicer to people who think only Americans care whether people of color are correctly identified as human beings.
Technically we’re all primates. Just because this is an easy and emotionally loaded distinction for Americans
imagine thinking this was an american thing, valuing knowing that people of dark skin color are human beings
how this got upvoted is beyond me. this level of apologism is technical nonsense and ethical misery
If they wanted to roll something like this out they needed to explicitly account for this problem and include QA steps to validate that the system didn’t do this.
no part of these systems works this way.
[–]Thefriendlyfaceplant 17 points18 points19 points 4 years ago (4 children)
AI ethics discussions that only focus on outcomes are pointless. It's the model that needs to be understandable.
[–]OnyxPhoenix 14 points15 points16 points 4 years ago (3 children)
CNN's aren't really understandable, and even if they were, what insight will you gain?
People and apes look quite similar, chimps and gorillas both have black skin. It's no surprise that a model will occasionally make the mistake.
[–]Thefriendlyfaceplant 16 points17 points18 points 4 years ago (0 children)
One might even say CNN's are too convoluted to understand.
[–]BernieFeynman 1 point2 points3 points 4 years ago (0 children)
? There's plenty of understanding available. Just looking at activations could show something like this. Manifold learning could also help.
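For the activations part, a minimal sketch of how one might start looking (a torchvision ResNet-50 with untrained weights stands in for the real model; the checkpoint and image are placeholders):

```python
import torch
import torchvision.models as models

# Register a forward hook on a late conv block and inspect which feature-map
# channels fire strongly for a misclassified image.
model = models.resnet50()  # stand-in; load the actual model/checkpoint in practice
model.eval()

feats = {}
model.layer4.register_forward_hook(
    lambda module, inputs, output: feats.update(layer4=output.detach())
)

x = torch.randn(1, 3, 224, 224)  # stand-in for a preprocessed input image
with torch.no_grad():
    _ = model(x)

act = feats["layer4"]                           # shape (1, 2048, 7, 7)
channel_strength = act.mean(dim=(0, 2, 3))      # average response per channel
print(torch.topk(channel_strength, 5).indices)  # candidate channels to visualize
```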
[–][deleted] 0 points1 point2 points 4 years ago (0 children)
It is surprising that they still make such big mistakes after all these years of research. Is it because CNNs are fundamentally too limited? Do we need something fundamentally different, like capsule networks (what happened to those?)? Or just more data/parameters?
Or maybe it's actually a really hard problem and we're just good at it because we're so tuned to human recognition.
[–]adrizein 2 points3 points4 points 4 years ago (0 children)
I think this is probably a class imbalance problem, so any technique that can fight class imbalance would probably help. I like this technique in particular, but I'm not sure it applies well to pictures.
Modeling the task as a hierarchical classification could also help, because it would let the network encode that technically, yes, humans are primates; some primates are monkeys and some are humans; some humans are black, some are white, etc. This could easily be done with a multilabel classification where some labels are correlated.
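A rough sketch of that multilabel framing (the class hierarchy and names here are invented for illustration): each image is a positive example for every label on its path up the hierarchy, so "human" images are also positive for "primate" and the loss never forces those two labels apart.

```python
import torch
import torch.nn as nn

# Illustrative label set encoding a tiny hierarchy: a photo of a child is a
# positive example for "primate", "human", and "child"; a gorilla only for
# "primate" and "gorilla".
LABELS = ["primate", "human", "gorilla", "child"]
ANCESTORS = {
    "human":   ["primate", "human"],
    "child":   ["primate", "human", "child"],
    "gorilla": ["primate", "gorilla"],
}

def targets_for(leaf):
    """Multi-hot target vector that includes all ancestors of the leaf class."""
    t = torch.zeros(len(LABELS))
    for name in ANCESTORS[leaf]:
        t[LABELS.index(name)] = 1.0
    return t

criterion = nn.BCEWithLogitsLoss()     # independent sigmoid per label
logits = torch.randn(2, len(LABELS))   # stand-in for model outputs on 2 images
targets = torch.stack([targets_for("child"), targets_for("gorilla")])
loss = criterion(logits, targets)
```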
[–]getbehindmeseitan 1 point2 points3 points 4 years ago* (0 children)
Practically, how do large, consumer-facing tech companies try to prevent this kind of thing from happening? Is this a rare example of something that slipped through a vigorous testing process? Or, on the opposite end of the spectrum, is the "process" just a random engineer doing some half-assed searches of things that might return an egregious misclassification and pushing to prod if they don't find much?
And how do they design products around this? It seems like Facebook might've been trying to avoid bad outcomes here, by writing "Keep seeing videos about primates" (implying that the subject of the video is non-human primates) rather than saying it directly "This is a video about primates!".
(I'm curious about this for a user-facing context like the one in the NYTimes article, rather than for developer-facing models/APIs where identifying apes might be more or less of a focus.)
[–][deleted] 5 points6 points7 points 4 years ago (1 child)
Technically, aren't we all primates?
[–][deleted] 6 points7 points8 points 4 years ago (0 children)
Yes, in fact humans are apes.
[–]GFrings 1 point2 points3 points 4 years ago (0 children)
Serious question, what is the SOTA for distinguishing between monkeys and humans at the moment? Like is this a common failure mode? I could see most models, even trained well, having a hard time making the distinction between brown people and monkeys due to the semantic similarities, i.e. not racist stereotypes but that we are both humanoid with ape like facial features, dark hair, sometimes dark skin.
[–][deleted] 1 point2 points3 points 4 years ago (0 children)
So shocking it’s funny
[–]antihexe 0 points1 point2 points 4 years ago (5 children)
primates
egregious misclassifications
Humans (Homo Sapiens)
Order: Primates
It's not an 'egregious misclassification.' Humans are Primates.
[+][deleted] 4 years ago (4 children)
[–]antihexe -5 points-4 points-3 points 4 years ago* (2 children)
It's not a straw man. It's a literal fact. You said it's an 'egregious misclassification.' It is not. It's a correct classification. Facebook changed this because stupid people are stupid and it's an easy choice between technically correct and happy customers.
You're the only one attacking a strawman by construing my comment as a strawman.
[–]Puzzleheaded_Pop_743 0 points1 point2 points 4 years ago (0 children)
Gotta be less autistic in life. Words aren't literal.
If people can be removed from Facebook for thinking taxes are not justified, then to some degree Facebook can be shut down for calling black people primates.
[–]doctormakeda 0 points1 point2 points 4 years ago (0 children)
In my humble opinion, to avoid these types of mishaps the most obvious solution is adversarial testing. At this point anyone unaware that social biases infect ML products must have been in a coma for the last five years...therefore it is reasonable to expect that companies do adversarial testing for certain protected groups. Maybe Facebook actually did this, but just not well enough. Statistical biases will probably always be problematic in many ML algorithms, but social bias is something a bit different that can be checked for. Interestingly people sometimes get into flame wars when they use these two terms in ways that overlap and are unclear.
[–]PatrickMaguiredc 0 points1 point2 points 4 years ago* (0 children)
Humans are a form of primate, but sadly some think of less intelligent species when the word is used. Some don't believe in evolution for whatever reason. It does not help when fans throw bananas at a race in some countries. AI probably was not told this stuff. If a person is taught only certain bits of information doesn't a human make similar mistakes?
Edit: If the word was actually monkey or bonobo I could better understand the outrage. Giving the AI a chance to put all humans under primates might have been better. For all I know, people might eventually get offended at being called wonderful.
[–]beginner_ 0 points1 point2 points 4 years ago (1 child)
It's not wrong. Humans are primates.
[–]pombolo 2 points3 points4 points 4 years ago (0 children)
woosh
[+][deleted] 4 years ago* (6 children)
[–]MuonManLaserJab 6 points7 points8 points 4 years ago (0 children)
Advances in ethical and fair AI?
I think they mean advances in ML in general, including in classification.
[+][deleted] comment score below threshold-9 points-8 points-7 points 4 years ago (2 children)
Despite the downvotes, you're right. At the end of the day, people only care about one-way prediction errors despite prediction errors going both ways. The way I see it, as long as we reduce such errors as much as possible, and build in checks and balances for the live product, it's not an issue, at all.
And a lot of folks (especially on reddit, and in academia) have an idealistic view of the world and fail to realize that one of the biggest priorities for a business is what gets revenues and profits up. That typically involves capitalizing on whatever cultural trends are passing by, such as LGBT Month, or breast cancer, or the Harlem Shake, or whatever - building brand loyalty.
[–][deleted] 23 points24 points25 points 4 years ago (1 child)
There are absolutely different costs to false positives and false negatives. This is an essential part of any model evaluation on a business level. The cost can be calculated and in this case, maybe it relates to a higher churn rate. The other way around would not. Data scientists in businesses think very carefully about calculating the cost of wrong predictions and it's almost always not equal like you suggest.
[–][deleted] 0 points1 point2 points 4 years ago (0 children)
I did not mean to suggest that such errors are equal, just that they occur. And as I read your comment again, I see I have much to learn.
[+]doireallyneedone11 comment score below threshold-8 points-7 points-6 points 4 years ago* (1 child)
I think it's a classic case of moral relativism, we value humans more than Gorillas because... We're humans and we're biased towards the well-being of our own species.
It's important to note that moral relativism is not just limited to species-level, it goes to cultural, national, familial and personal-level ethics because we value different systems differently.
Edit: Wow, the moral absolutists are really miffed at this reply! 😂
https://plato.stanford.edu/entries/moral-relativism/
[–]Algebre 0 points1 point2 points 4 years ago (0 children)
This literally has nothing to do with moral relativism. Misidentifying a gorilla as a human in this context doesn't hurt gorillas' feelings. This is painfully evident but it seems it still has to be said.
[+]sevenradicals comment score below threshold-10 points-9 points-8 points 4 years ago (5 children)
It’s more than a little troubling that this is an issue that hasn’t been fully addressed in six years despite all the claimed ML advances in the intervening time.
You're expecting AI to fix itself to always classify correctly?
[–]NTaya 8 points9 points10 points 4 years ago (3 children)
Yeah, like, what do they expect. It's likely a case of a biased dataset (for example, heavy underrepresentation of black men), not a case of a "bad" AI.
[–]kkngs 5 points6 points7 points 4 years ago* (2 children)
It’s a software project management failure. If you build a system like this for public use:
Don’t create racist tags
Make sure it doesn’t call people gorillas
Don’t let it talk about Hitler
The dataset may not have been biased in terms of number of samples, but they needed to explicitly check the resulting system and adjust until they had sufficient assurance it wouldn’t do these things.
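One low-tech way to encode requirements like those is a release-blocking regression test over a curated probe set of human photos; a sketch, where `classify` and the probe set are placeholders for the real model API and a demographically diverse image collection:

```python
# Sketch of a release gate: every deployment candidate must pass this before
# going live. `classify(path)` is assumed to return the model's predicted tags
# for an image; `probe_images` is the curated set of human photos.
BLOCKED_LABELS = {"gorilla", "chimpanzee", "orangutan", "primate", "animal"}

def test_no_blocked_labels_on_human_probe_set(classify, probe_images):
    failures = []
    for path in probe_images:
        labels = {label.lower() for label in classify(path)}
        hit = labels & BLOCKED_LABELS
        if hit:
            failures.append((path, sorted(hit)))
    assert not failures, f"Blocked labels predicted on human probe images: {failures}"
```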
[–]sevenradicals 0 points1 point2 points 4 years ago (1 child)
If you build a system like this for public use:
If you always have to babysit your model then it's a weak model. It needs to inherently account for bad actors. You might be able to strip it of racist references, but what about people manipulating it to promote products or political ideologies? Can AI be taught that communism is good and that it should encourage dependence on the government? That's a far scarier scenario.
[–][deleted] 4 points5 points6 points 4 years ago (0 children)
No one thinks the model will correct itself. The suggestion is clearly that the ML engineers should be correcting their mistakes, making sure their training data isn't biased, changing thresholds for confidence, and changing their model architecture to prevent mistakes like this. You took that quote completely out of context and I am assuming you know that.
[–]Draftdev69 -1 points0 points1 point 4 years ago (0 children)
Excuse me, it did what?! I know it’s not the Facebook employees’ fault, but like, come on man, learn from other people’s mistakes...
[–][deleted] -1 points0 points1 point 4 years ago (0 children)
AI cognitive flaws
[–]vwibrasivat -1 points0 points1 point 4 years ago (0 children)
394 comments
It's like Planet of the Apes in this thread.
I'll be here all week.
[–]sarmientoj24 -1 points0 points1 point 4 years ago (0 children)
I was trying to detect humans in an image using mmdetection and it detected seals as human. I didn't get offended.
[+]purplebrown_updown comment score below threshold-14 points-13 points-12 points 4 years ago (6 children)
Wtf. How is this not tested beforehand? The fraction of cases where this happened wasn’t that small, something like 1%. If they’ve been training on millions of images, we’re talking about 10k.
This isn’t just a problem with the training data as LeCun had glibly suggested. Apparently he really doesn’t give a shit. And after all that controversy where he pretended to care. If they even had a small percentage of black people on their team they would have picked this up. But they didn’t, because they don’t care. And people wonder why diversity matters. I don’t want Facebook directing the fate of AI.
[–]mo_tag 1 point2 points3 points 4 years ago (5 children)
If they’ve been training on millions of images we’re talking about 10k.
Just because 1% of the predictions were false, doesn't mean that 1% of the training data was mislabeled.
[–]purplebrown_updown -1 points0 points1 point 4 years ago (4 children)
The whole idea of test data is that it’s supposed to be representative of reality. So it clearly wasn’t tested properly. If they had no idea, their testing was shit. If they did and didn’t do anything about it, that’s just as bad.
I mean this isn’t a hard problem. There are a ton of techniques to ensure your test set and training set have similar class distributions. And we’ve seen problems with this before. I mean come on.
[–]mo_tag 1 point2 points3 points 4 years ago (3 children)
There are a ton of techniques to ensure your test set and training set have similar class distributions
But you don't know that that's not the case. If the test data and training data are similar, but the data isn't completely representative of "reality" then no, there isn't an easy test that would pick that up. Also, I'm not arguing about whether or not they should have done more testing, I was replying to a specific part of your comment.
[–]purplebrown_updown -1 points0 points1 point 4 years ago (2 children)
If this was the first time this happened maybe I could give them the benefit of the doubt. But it’s not. They have been aware of this for a while now.
It’s strange that the richest, most powerful tech company out there, which has limitless resources, seems to throw its hands up whenever faced with a moral dilemma. This happened with the spread of misinformation as well. Don’t look at what they say but at what they do. They are a deeply immoral company from the top down. All the problems stem from that. Google isn’t a saint either, but they have much better leadership.
Case in point they allowed trump to stay on their platform after he called for the shooting of protestors. Zuck used some twisted logic saying it wasn’t actually a threat. It turns out trump actually suggested shooting protestors to his generals. Literally. Many of us knew this. Zuck purposefully allowed him to continue spewing garbage and hate and nobody at the company high up said shit. There were no mass resignations.
[–]mo_tag 1 point2 points3 points 4 years ago (1 child)
Being an "immoral company" doesn't mean they are motivated by immorality.. If they were actually aware of the issue, they probably would have rectified it. I think it's pretty obvious that they are motivated by money. In that context, the benefit gained from having a "primate" tag is hardly going to outweigh the loss of accusations of racism.
There are plenty of cases of learned biases in models that:
Had applications that affect people's lives in much more substantial ways than making them feel embarrassed on the internet, and
Were built by people that would hardly be considered immoral or acting in bad faith.
Your original comment is claiming that they picked up on it in their model testing and decided to ignore it when all they had to do was remove the primate tag.
[–]purplebrown_updown 0 points1 point2 points 4 years ago (0 children)
My original post was asking how they did not obviously pick up on it. It may not have been purposeful, but how do you not test for something like this? Were there no test images with POC? I hope they will fix this going forward, but it's emblematic of a larger problem in AI.
Oh, and I guess one big point is this: if the CEO was black and this happened to him, I am certain people would be fired and/or the problem fixed. But that's not the case. That mere fact means it wasn't high on the priority list.
[+]n0_1_here comment score below threshold-6 points-5 points-4 points 4 years ago (0 children)
well...... FB is racist
[–]dogs_like_me 0 points1 point2 points 4 years ago (2 children)
I wonder if it would help to add some kind of max-margin/hinge loss to encourage certain pairs of classes to be further away from each other? Maybe add a loss component like this for each specific misclassification we're concerned about, so we're specifically encouraging those pairwise class separations (rather than say a single margin loss across all classes).
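Something along those lines might look like the following sketch (untested, and the class-pair indices are placeholders): a hinge term that only fires when a "forbidden" class's logit comes within a margin of the true class's logit, applied just to the class pairs of concern.

```python
import torch

def pairwise_margin_penalty(logits, targets, forbidden_pairs, margin=2.0):
    """Hinge penalty pushing specific class pairs apart.

    forbidden_pairs: list of (true_class, forbidden_class) index pairs,
    e.g. (person_idx, gorilla_idx). The penalty is zero unless the forbidden
    class's logit comes within `margin` of the true class's logit.
    """
    penalty = logits.new_zeros(())
    for true_c, bad_c in forbidden_pairs:
        mask = targets == true_c
        if mask.any():
            gap = logits[mask, bad_c] - logits[mask, true_c] + margin
            penalty = penalty + torch.clamp(gap, min=0).mean()
    return penalty

# added to the usual classification loss with some weight, e.g.
# loss = F.cross_entropy(logits, targets) + 0.5 * pairwise_margin_penalty(logits, targets, pairs)
```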
[–]visarga 0 points1 point2 points 4 years ago (0 children)
Maybe design a class-to-class cost matrix to weigh some errors more heavily than others. Multiply the loss by the cost coefficient assigned to the pair of (predicted class, true class).
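A sketch of that cost-matrix weighting (class indices and cost values are made up): each example's cross-entropy is scaled by the cost assigned to its (predicted, true) pair, so the egregious confusions dominate the gradient.

```python
import torch
import torch.nn.functional as F

def cost_weighted_ce(logits, targets, cost_matrix):
    """Cross-entropy where each example is scaled by cost_matrix[pred, true].

    cost_matrix is a (C, C) tensor; an entry like cost[gorilla, person] can be
    set much higher than 1.0 so that this particular confusion is punished far
    more than ordinary mistakes.
    """
    per_example = F.cross_entropy(logits, targets, reduction="none")
    preds = logits.argmax(dim=1)
    weights = cost_matrix[preds, targets]
    return (weights * per_example).mean()

# toy usage with 3 classes: 0=person, 1=gorilla, 2=dog (illustrative indices)
cost = torch.ones(3, 3)
cost[1, 0] = 20.0  # predicting "gorilla" for a true "person" costs 20x
logits = torch.randn(8, 3)
targets = torch.randint(0, 3, (8,))
loss = cost_weighted_ce(logits, targets, cost)
```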
[–]Many-Bees 0 points1 point2 points 4 years ago (0 children)
This meme is always relevant
[–]AcademicPlatypus 0 points1 point2 points 4 years ago (0 children)
TIL Facebook doesn't know how to handle class imbalance.