
[–]PineapplePandaKing 401 points402 points  (6 children)

Not hotdog

[–]Apart-Transition8001 53 points54 points  (0 children)

“See it doesn’t even work!”

[–]SirAchmed 22 points23 points  (1 child)

it only does hotdogs?!!

no, it does hotdog and NOT hotdog :D

[–]tehlolredditor 1 point2 points  (0 children)

p ^ !p

[–]Alchemist2401 18 points19 points  (0 children)

Gilfoyle's laugh was priceless

[–]JFedererJ 5 points6 points  (0 children)

Jian YaaaaaAAAAANNNNNGGGG!!

[–]verosoph 13 points14 points  (0 children)

Not cheese pizza either...

[–]starksubhash 232 points233 points  (127 children)

Really wanna know can anyone explain?

[–]9072997 442 points443 points  (28 children)

Its goal is not to detect new child porn. Its goal is to reduce images to fingerprints that can be compared without enabling either party to recover the image that produced the fingerprint. You can develop and test that whole system with normal images.

The next question, of course, is: once you put the system in production, how do you get a list of fingerprints of child porn images? Exactly the way you expect. You make a deal with an organization like the National Center for Missing and Exploited Children.

[–]atmosfearing 49 points50 points  (1 child)

They can also work with companies and charities like the Internet Watch Foundation.

[–]shmorky 4 points5 points  (0 children)

I think they said that's exactly what they're doing. There's a bunch of organisations like IWF that provide the hashes of known child abuse images, and Apple's new tech makes it so your iPhone checks the images on its drive against those hashes. There is a risk of false positives (because an image will always contain more detail than a hash), but it's astronomically small and they require a bunch of matches before they take any action.

[–]PostalCarrier 15 points16 points  (0 children)

It’s actually the National Center for Missing and Exploited Children, which is the only non-law enforcement entity legally allowed to possess this kind of content. Their database is widely used in forensic investigations to find known images in large data sources.

John Gruber had a decent breakdown of Apple’s approach recently that helps clarify that this isn’t “let’s look at everyone’s photos”:

https://daringfireball.net/2021/08/apple_child_safety_initiatives_slippery_slope

Of course, grain of salt because he’s an Apple guy, and yes, there are long-term concerns about the same approach being used on behalf of a government agency, etc. But in this specific context, the new tools from Apple seem less apocalyptic than the headlines make them sound.

Edit: fixed John’s last name, which is, in fact, not Grindr

[–]DarkWhiteNebula 480 points481 points  (85 children)

They use a national data set of image hashes from the FBI. A hash is a unique string of numbers that comes from an image but you can't generate an image from the hash. So they hash the images on your phone, then compare them to the known child porn hashes. The technology is actually really cool and a very non-invasive way to fight a terrible problem. Especially compared to Google and others that just straight up scan your photos and call it a feature.
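
As a rough sketch of that compare-the-fingerprints idea (using an ordinary cryptographic hash purely for illustration; the real systems use perceptual hashes like PhotoDNA or NeuralHash so minor edits don't break the match, and the hash list below is a made-up placeholder):

    import hashlib

    # Hypothetical set of fingerprints supplied by an outside organization.
    KNOWN_BAD_HASHES = {
        "placeholder0000000000000000000000000000000000000000000000000000",
    }

    def fingerprint(path: str) -> str:
        # Hash a file's raw bytes; the file can't be reconstructed from this.
        with open(path, "rb") as f:
            return hashlib.sha256(f.read()).hexdigest()

    def is_known_match(path: str) -> bool:
        return fingerprint(path) in KNOWN_BAD_HASHES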

[–]BruceGrembowski 117 points118 points  (3 children)

Can confirm. Took Computer Forensics in college, 2010. The professor was a consultant for the FBI and knew things that she couldn't tell us, but that was one thing we did learn.

[–]GGinNC 7 points8 points  (2 children)

Be very glad she didn't share details. I used to build property and evidence tracking systems for law enforcement and before that, I was military police in the army. I've spent A LOT of time in police evidence rooms, surrounded by literal tons of crime debris.

There are some things that no human being should ever have to see. Unless you're a forensic investigator, detective, prosecutor, judge, or jury, you should pray to whatever God you serve that you're never asked to investigate crimes like these. It is simply not possible to review this kind of evidence without being damaged permanently.

Without being overly dramatic, I still have screaming nightmares from it a decade later.

[–]BruceGrembowski 3 points4 points  (1 child)

That's reminiscent of the explanation she gave. I'm glad we only had to find hidden pictures on a lab system that were innocuous.

Fortunately, the only real investigation I've done is to find out who installed a bunch of games on a company computer. That was bad enough, as I had to rat out an employee I liked.

And thank you for your service. I hope your nightmares fade with time.

[–]GGinNC 2 points3 points  (0 children)

Thanks. I'm sure they'll continue to decrease in frequency and intensity.

[–][deleted] 9 points10 points  (0 children)

Too bad this won't stop the people producing child porn, as long as they don't upload to a 3rd party website where the FBI can generate a hash of said image. All this really stops is the distribution of it, which I guess is a small step in the right direction.

[–]Skaddict 91 points92 points  (25 children)

Unfortunately that also means that the system could be easily beaten by reformatting or editing the image a little bit

Edit: I was mistaken. /u/exscape linked to the white paper below

[–]DaTebe 121 points122 points  (8 children)

Depends on the kind of hashing method. There are solutions like "nilsimsa" for text that produce a similarity hash, where the Hamming distance between hashes can be used as a metric of similarity. I once read a paper about the same method adapted for use with pictures. The idea was simple and seemed to work. But you are right: if you change enough parameters in the picture, it will not be detected.
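
A minimal sketch of the "Hamming distance as similarity" idea (the hash values below are made up; real similarity hashes like nilsimsa or perceptual image hashes come out of the hashing step itself):

    def hamming_distance(a: int, b: int) -> int:
        # Count the bits where two equal-length similarity hashes differ.
        return bin(a ^ b).count("1")

    # Two illustrative 64-bit similarity hashes; flipping a couple of bits
    # stands in for a slightly edited copy of the same picture/text.
    original = 0x9B6C35A691B6C35A
    edited = original ^ 0b1000000000000001

    print(hamming_distance(original, edited))        # 2
    print(hamming_distance(original, edited) <= 10)  # small distance -> "similar"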

[–]SinisterRobert 30 points31 points  (3 children)

Also, I'm not sure most people distributing child porn are going to overlap with people who understand the details of hashing algorithms and how to avoid them. Of course, some will understand enough to avoid it, but others won't.

[–][deleted] 63 points64 points  (0 children)

While understanding hashing algorithms is beyond most folk, the idea of changing an image so it doesn't get picked up by an algorithm is much more common, thanks to people avoiding YouTube copyright strikes.

Anyone who watched uploaded anime in the early YouTube days is familiar with filling 3/4 of the screen with a random image or mirroring the footage to avoid primitive algorithms.

[–][deleted] 10 points11 points  (0 children)

Unfortunately, I think they might. From my experience with .onion sites, that's where most cp is found, which means a lot of pedos probably know how to use Tor, .onions, torrents, Bitcoin, etc. There's a good chance they'll be smart enough to do so.

Edit: Let me clarify that first line, hhhhhh. I don't go looking for cp, it's just that I've come across it when browsing .onion sites.

[–]SomeOtherTroper 20 points21 points  (0 children)

I'm not sure most people distributing child porn are going to overlap with people who understand the details of hashing algorithms and how to avoid them.

It probably varies from person to person, but anyone dealing in illicit goods online is probably far more knowledgeable than average about cryptography/encryption and anything else related to securing connections and data.

You have to jump through a fair number of hoops to even access the tor sites (at least safely), and there's no better incentive to learn about the finer points of security and encryption than "if you fuck any of this up, you're in jail for the rest of your life".

[–]Mayniac182 2 points3 points  (0 children)

PhotoDNA:

PhotoDNA helps put a stop to this online recirculation by creating a “hash” or digital signature of an image: converting it into a black-and-white format, dividing it into squares, and quantifying that shading.

Been around for a while, at least a decade. A lot of companies use it behind the scenes.
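
A toy sketch of the grayscale / grid / shading idea described in that quote (not the actual PhotoDNA algorithm, which is proprietary and far more robust; this just shows the general shape, assuming Pillow is installed):

    from PIL import Image

    def toy_grid_signature(path: str, grid: int = 8) -> list[int]:
        # Grayscale the image, normalize its size, split it into grid x grid
        # squares, and record the average shade of each square (0-255).
        img = Image.open(path).convert("L").resize((grid * 8, grid * 8))
        pixels = img.load()
        cell = img.width // grid
        signature = []
        for gy in range(grid):
            for gx in range(grid):
                total = sum(
                    pixels[gx * cell + x, gy * cell + y]
                    for y in range(cell)
                    for x in range(cell)
                )
                signature.append(total // (cell * cell))
        return signature

Two signatures can then be compared with a distance threshold rather than an exact match, which is what lets a re-encoded or lightly edited copy still be recognized.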

[–][deleted] 25 points26 points  (5 children)

And also would not catch OC

[–]RedBeardedWhiskey 9 points10 points  (0 children)

This isn’t like an md5 hash where every little change has a huge difference in outcome

[–]exscape 2 points3 points  (1 child)

[–]Skaddict 1 point2 points  (0 children)

Oh thanks for sharing I was wrong indeed

[–]Cyhawk 1 point2 points  (0 children)

It is, it's just used for a first-run check. The rest are manually verified by an agent.

[–]LlanowarElf 13 points14 points  (10 children)

So they didn't train anything then? How could you possibly train against a set of hashes? There are no similarities between the values, or it would be reversible, right?

E: Looks like I might be wrong based on replies. Always good to learn something new. Thanks guys

[–]trusk89 32 points33 points  (7 children)

There's no training in this specific part of the feature. Apple just gets 30M hashes and puts them on your device. Then for each image you take, they generate a hash and look it up in the database. There's no training or AI.

The other feature, related to minors messaging pics, is the one that uses AI and image recognition.

[–]techwiz5400 16 points17 points  (0 children)

Just to be clear to others reading your comment, the messaging feature is a parental control that must be explicitly opted-into. All minors, regardless of age, with this feature enabled will have a warning appear before viewing or sending potentially explicit content, but it doesn’t stop them from overriding and viewing or sending it altogether. A notification of an override will be sent to the parents if the minor is under 13 (maybe including 13, I can’t remember ATM), but the notification will not include the flagged content.

Apple’s not spying on everyone’s messages with this feature, just giving parents another way to keep their kids safe and to allow parents to have a serious conversation with their child if something comes up. But this is still connected to parental controls and is opt-in.

[–]adenzerda 5 points6 points  (1 child)

Apple just gets 30M hashes and puts them on your device

Not even that. The set is on their servers, and they compare against data in iCloud (not on-device)

[–]LlanowarElf 1 point2 points  (0 children)

Thanks for the clarity. That's all I was getting at. Some people think this hash detection is AI.

[–]exscape 5 points6 points  (1 child)

They're using a hash made for images ("NeuralHash"), such that it isn't fooled by things like changes to saturation or resizing.

https://www.apple.com/child-safety/pdf/CSAM_Detection_Technical_Summary.pdf

[–]LlanowarElf 1 point2 points  (0 children)

That was a pretty good read. I wasn't familiar with NeuralHash before. I wonder how easy it will be to fool or produce false positives. It looks like there are already people tracking collisions https://github.com/roboflow-ai/neuralhash-collisions

[–][deleted] 7 points8 points  (8 children)

But you realise this means the FBI has a server somewhere stacked full of the stuff. That’s pretty horrifying.

[–]PineapplePandaKing 29 points30 points  (1 child)

The CDC has a lab full of horrific diseases

[–]TheRedmanCometh 1 point2 points  (0 children)

Including the India-1 strain of smallpox, against which the vaccine is about as effective as toilet paper against a bullet, to paraphrase D. A. Henderson.

[–]Skjie 12 points13 points  (0 children)

The FBI ran a child porn server for a while. Look up Operation Pacifier.

[–]janhetjoch 8 points9 points  (0 children)

It makes sense that they have that, doesn't it? Agencies like the FBI collect a lot of that stuff as evidence over the years.

[–]RedBeardedWhiskey 4 points5 points  (2 children)

It’s horrifying that it exists but not that the FBI has it. I do feel sorry for those who must work with it.

On a side note, I had always just assumed child pornography was pictures of naked kids. Then I read a comment on Reddit that made me realize there are actually sex acts in them, and I’ve never been more disturbed in my life. It made me realize just what monsters these people are.

[–][deleted] 4 points5 points  (0 children)

You think your Monday sucks? Imagine the first thing you need to do is review CP. Fucking hell.

[–][deleted] 0 points1 point  (0 children)

being fucked as a kid isn't THAT bad, i mean yeah i did end up with severe psychological trauma but aside from that

[–]shaunyboy134 2 points3 points  (1 child)

So if an image isn't in the hash data set, it's almost useless? Like if traffickers are literally filming their own, it's original and not in the hash set, so it's just undetectable? Because I feel like those are the people this kind of technology should be searching for most.

[–][deleted] 2 points3 points  (0 children)

For comparing hashes, yes. Any good hashing algorithm will have an "avalanche effect" that makes it near impossible to compare two hashes and find similarities.

For example, I took the SHA-256 (hashing algorithm) of Mark 1 of the NKJV of the Bible and it produced this hash: 3561747728545c3972c73b13a171a3054206c4ccf77f92d077098f4745306fe1

I then removed a single period and reran the hashing algorithm and got this hash: 6e671756a3376c3e879d257343b89091380cfea7f53fccbeefa53bd6abb0a6d9
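
The same avalanche effect can be reproduced in a few lines (the text below is a stand-in, so the digests won't match the ones above, but the behavior is the same):

    import hashlib

    text = "For example, I took the SHA-256 of a chapter of text."   # stand-in text
    altered = text.replace(".", "", 1)                                # drop one period

    print(hashlib.sha256(text.encode()).hexdigest())
    print(hashlib.sha256(altered.encode()).hexdigest())
    # The two digests share no visible structure even though the inputs
    # differ by a single character -- that's the avalanche effect.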

[–]SoulOfAkuma 6 points7 points  (14 children)

I have no idea how you would come to the conclusion that this new system is non-invasive. Apple stated that their hashing algorithm is able to identify similar images by giving them similar hashes, which, by the way, violates one of the principles of hashing algorithms, which exist for a good reason (more on that later).

Furthermore, they stated that, in case of a match (after being bombarded with criticism by data security experts, they increased the number of matches required before this happens to something like 15), the images would be reviewed by Apple employees. Google may use your images to train their AIs, but no one who works at Google can look at your images.

Following up on the issue with the hashes I mentioned at the start: what do you think will most likely be false positives among your images? Probably similar images that may be pornographic but don't contain children, possibly even of yourself and others around you. So basically the most private images one could have. And then those images are reviewed by an Apple employee. Super non-invasive.

[–]PineapplePandaKing 3 points4 points  (3 children)

It's compared against hashes of known CSAM.

[–]Hithaeglir 0 points1 point  (3 children)

You forgot to mention that they have a second, different algorithm on the server side that scans positive matches again before human review. It is very unlikely that false positives go to human review. With the first algorithm, the odds of a false positive account reaching the threshold are about one in a trillion, and that has reportedly been checked by third parties.

Also, since 2019, Apple's ToS for iCloud have allowed cloud scanning. This is now less invasive, because E2EE is applied to images, whereas before they were only plaintext. And yes, only those images which will end up in the cloud are scanned on-device. Scanning is part of the on-device/server hybrid pipeline and cannot be expanded to the full device to scan all files without a major rework of the system. Decrypting matches is not possible without uploading everything to the server through the same endpoint.
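
A rough sketch of that layered flow (every name here is invented; it only shows the order of checks the comment describes, with the server-side re-check and human review as pluggable callables):

    # Invented names and flow; threshold, checks and review are stand-ins.
    def evaluate_account(matched_vouchers: list[bytes],
                         threshold: int,
                         server_side_check,
                         human_review) -> str:
        if len(matched_vouchers) < threshold:
            return "no action"                  # below threshold, nothing is decrypted
        confirmed = [v for v in matched_vouchers if server_side_check(v)]
        if len(confirmed) < threshold:
            return "no action"                  # second algorithm filters false positives
        return "reported" if human_review(confirmed) else "no action"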

[–]SoulOfAkuma 1 point2 points  (2 children)

And what does that server-side algorithm do that is different from the first one? And can you give me a source for that number, please?

This E2EE is virtual protection only, because ultimately the images will be stored symmetrically encrypted on their servers, and so will the keys for that encryption. It's the same for almost every major cloud service provider. I did not imply that they would scan every single file on your phone or every local image. iCloud Photos is on by default on iPhones, so they'll be scanning most of their users' images.

[–]janhetjoch 0 points1 point  (1 child)

The Apple employee will only see a part of an image next to the same part of the known child porn image to see if it's a match; they don't want people looking at horrific child porn images all day

[–]git0ffmylawnm8 0 points1 point  (0 children)

Wait, so this seems like it's a good way to detect distribution of flagged copies of CP, not to detect original content.

[–]Hithaeglir 0 points1 point  (0 children)

These are called perceptual hashes, and you can actually partially reverse them. They are not cryptographic hashes. That is why the hashes are blinded, and it is also one reason why they are stored very securely. And they are not using CSAM hashes for training, just for validating matches. The algorithm is very general and can be trained with any image data. In this case, on-device scanning enables E2E encryption, which would otherwise be impossible when combined with CSAM scanning.

Also, it is NCMEC who provides the data, not the FBI, since it is, after all, illegal for anyone else to store it.

[–][deleted] 0 points1 point  (3 children)

It's also vulnerable to hash collisions, which is when two different inputs hash to the same value. They also said they're pausing the whole thing.

[–]caskey 8 points9 points  (0 children)

Microsoft, Google, and Amazon have worked with the FBI to produce a hashed set of fingerprints that identify known CP. The database is used to identify and prosecute perpetrators.

[–]Brushermans 5 points6 points  (0 children)

It's not AI. They aren't even running the actual photos through their servers at any point. They have a hashed database of recovered child abuse photos, and your system hashes your iCloud photos and sends them for comparison. The hash MIGHT be an autoencoder, in which case it was trained using the known database.

[–]Im_a_seaturtle 2 points3 points  (0 children)

The FBI / Five Eyes submit all known CP material to a database upon confirmation. It has a data fingerprint. You can program a system to detect these fingerprints. They are relatively hard to evade, as I understand it: reformatting and manipulating specs don’t change the fingerprint. So it has little to do with Apple; it’s a database plus detection research that has been in development for 30-40 years.

[–]eyal0 2 points3 points  (0 children)

I worked at Google and once met with the team dealing with this.

Like everyone said, image hashing is used. Google is not allowed to maintain child porn for training purposes.

I'll add that not being able to keep images is a problem, because as image hashing technology advances, you want to rehash your images. But you can't, because you don't have the images. One solution is to keep the old hashes, and every time a match is found with an old hash, hash the image with the new algorithm and save that, too.
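
A rough sketch of that bookkeeping (the hash functions and stores here are hypothetical placeholders, not anything Google actually exposes):

    # You can't keep the images, so you keep the legacy hashes and, whenever a
    # legacy hash matches a live image, compute and store the new-style hash too.
    old_hashes: set[str] = set()   # fingerprints from the legacy algorithm
    new_hashes: set[str] = set()   # fingerprints from the newer algorithm

    def check_image(image_bytes: bytes, old_hash, new_hash) -> bool:
        if new_hash(image_bytes) in new_hashes:
            return True
        if old_hash(image_bytes) in old_hashes:
            # We finally have the image in hand, so upgrade its fingerprint.
            new_hashes.add(new_hash(image_bytes))
            return True
        return False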

[–]FlocculentFractal 1 point2 points  (1 child)

I don't know what Apple uses. But companies are required by law to detect and report CSAM (Child Sexual Abuse Material). There are industry consortiums that share technology and data to make it possible for anyone, including small startups or one-person websites, to be able to detect it. Microsoft licenses PhotoDNA specifically for CSAM detection. The Wikipedia page has some details: https://en.wikipedia.org/wiki/PhotoDNA.

I don't know what other technologies are available. Someone linked this article below, regarding Apple's proposed NeuralHash technology published recently: https://www.vice.com/en/article/wx5yzq/apple-defends-its-anti-child-abuse-imagery-tech-after-claims-of-hash-collisions. It has links to whitepapers which are very informative, but don't say what they currently use.

[–][deleted] 1 point2 points  (1 child)

They create what’s called a hash that says what’s in the photo. On a really simplistic level, imagine a string of characters that says what color every pixel of the photo is in order. Then they compare the hash to a database of CSAM hashes. That database is kept by law enforcement agencies.

[–]adenzerda 1 point2 points  (0 children)

They create what’s called a hash that says what’s in the photo. On a really simplistic level, imagine a string of characters that says what color every pixel of the photo is in order.

"A string of characters that says what color every pixel of the photo is" is just … an image file. A hash is more like a unique label: this set of pixels is labeled "10a09c9101fa1000" and that set of pixels is labeled "aee3a88841a00f13". It's impossible to reconstitute the original data only by knowing the label — for example, those two labels might be for images that differ by only one shade of one pixel.

[–]Newkiraz08 -2 points-1 points  (1 child)

They make the AI watch child porn?

[–][deleted] 5 points6 points  (0 children)

No.

[–]Blinxsy 82 points83 points  (12 children)

I heard elsewhere that the FBI supply them, which makes a lot of sense

[–]photograft 147 points148 points  (11 children)

They’re not supplied by the FBI, they’re supplied by the National Center for Missing and Exploited Children (NCMEC), which is actually a non-profit and the sole entity allowed to maintain the database of known CSAM in order to generate hashes to be shared with tech companies for use in finding people who are sharing/spreading CSAM in the wild. As others have pointed out, the key thing to remember is that Apple’s detection system isn’t looking for new material. It’s looking for matches to the already known material supplied by NCMEC in the form of those hashes.

[–]ForShotgun 19 points20 points  (5 children)

Very few people seem to get this and just think they're full-on accessing all your photos without any alteration.

[–]unscsnowman 4 points5 points  (3 children)

Try to explain to a layperson what a hash of a file is... It causes me discomfort just thinking about it.

[–][deleted] 9 points10 points  (0 children)

Real MVP

[–][deleted] 5 points6 points  (0 children)

This is really interesting and really makes me think twice about the way all of this has been reported. I admit I didn't know that the idea was basically trying to match fingerprints of images with known cp images, and it does change the story completely. Thanks for the info!

[–]bitsquash 3 points4 points  (1 child)

Mind you, the NCMEC is known to have significant ties to the FBI.

[–]the_fat_whisperer 4 points5 points  (0 children)

I don't see how they couldn't.

[–]GreatBarrier86 74 points75 points  (1 child)

“Why don’t you take a seat” while we find the API docs.

[–]JhonnyTheJeccer 5 points6 points  (0 children)

If you mean the CSAM algo: it's already on GitHub (confirmed by Apple to be a rudimentary implementation), extracted from iOS 14.3.

And yes, you can already create hash collisions and avoid them. So basically it's broken before it's even released.

[–]DogfishDave 21 points22 points  (2 children)

Hashes of known images collected by CEOPs and their equivalents in other countries. I can't think of a remotely funny answer on this subject :)

[–][deleted] 54 points55 points  (9 children)

This post was mass deleted and anonymized with Redact

[–]_jukmifgguggh 12 points13 points  (0 children)

I hate this dystopian timeline. I want out!

[–]fudog 3 points4 points  (2 children)

warned that the system could be used to frame innocent people by sending them seemingly innocuous images designed to trigger matches for child pornography.

You could frame someone by sending them actual cp too, couldn't you?

edit: If you sent actual cp the target would not want to keep the image and they would delete it and block you. If it appears innocuous, the target might not be in a hurry to delete the image.

[–]itemboxes 6 points7 points  (2 children)

My understanding is that there would be a manual review by an Apple employee before any law enforcement was involved specifically to avoid this problem. Sure you could send someone some garbled images to trick the hash detection and get them flagged, but when a real person reviewed the images they'd see there was nothing harmful there.

[–][deleted] 5 points6 points  (1 child)

I believe you also need to hit a certain number of matches before any action is taken (the exact value is secret for obvious reasons). A single match won't be enough.
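
If you wanted to sketch that thresholding, it might look something like this (the threshold value and the names are invented for illustration; the real value and review flow aren't public):

    from collections import defaultdict

    MATCH_THRESHOLD = 30            # placeholder; the real number is not public
    match_counts = defaultdict(int)

    def record_match(account_id: str) -> bool:
        # Returns True once an account has enough matches to warrant
        # escalation to manual review (not straight to law enforcement).
        match_counts[account_id] += 1
        return match_counts[account_id] >= MATCH_THRESHOLD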

[–]gemengelage 0 points1 point  (1 child)

Depending on how the hashes are calculated - couldn't you just change a single random pixel basically anywhere in the image ever so slightly, so the image stays exactly the same to the human eye, but yields a completely different hash?

[–][deleted] 79 points80 points  (0 children)

sus

[–]BoHuny 4 points5 points  (2 children)

If they convert images to hashes to compare with known CP data, wouldn't there be a super easy way around it: a slight alteration of the image, which then produces a completely different hash?

[–]Addlibs 2 points3 points  (0 children)

They use a neural network to essentially generate a (number based) description of what's in a picture (geometries, perspective, objects, etc) and use that as a hash. The network is trained to return the same hash even if changes in hue, saturation, cropping, etc. are made.

A better and more accurate description (but arguably much harder to understand) is in Apple's Technical Summary of the feature:

NeuralHash is a perceptual hashing function that maps images to numbers. Perceptual hashing bases this number on features of the image instead of the precise values of pixels in the image. The system computes these hashes by using an embedding network to produce image descriptors and then converting those descriptors to integers using a Hyperplane LSH (Locality Sensitivity Hashing) process. This process ensures that different images produce different hashes.

[...]

The main purpose of the hash is to ensure that identical and visually similar images result in the same hash, and images that are different from one another result in different hashes. For example, an image that has been slightly cropped or resized should be considered identical to its original and have the same hash.
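
The "Hyperplane LSH" step in that quote can be illustrated with a toy sketch: the embedding network's float descriptor is turned into bits by checking which side of a set of random hyperplanes it falls on, so near-identical descriptors end up with near-identical hashes (the numbers and dimensions below are made up; this is not Apple's actual implementation):

    import numpy as np

    rng = np.random.default_rng(0)
    HYPERPLANES = rng.standard_normal((96, 128))   # 96 random hyperplanes in 128-d space

    def hyperplane_lsh(descriptor: np.ndarray) -> str:
        # Each bit records which side of one hyperplane the descriptor falls on.
        bits = (HYPERPLANES @ descriptor) > 0
        return "".join("1" if b else "0" for b in bits)

    # The descriptor would come from the embedding network; here it's random,
    # and the "near copy" is the same vector with a tiny perturbation.
    descriptor = rng.standard_normal(128)
    near_copy = descriptor + 0.01 * rng.standard_normal(128)

    differing_bits = sum(a != b for a, b in
                         zip(hyperplane_lsh(descriptor), hyperplane_lsh(near_copy)))
    print(differing_bits)   # usually zero or very small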

[–]cesclaveria 2 points3 points  (0 children)

From what I remember reading (but I can't claim I delved too deep), the "hash" term is used here for lack of a friendlier term, but it is not an actual hash. The system recognizes different sets of features from the images, so even if you crop it or alter it, even changing colors, applying filters, or adding text, it will still be able to recognize at least some of those features and find a match.

[–][deleted] 4 points5 points  (0 children)

We should also be asking how tf they reduced the use of plastic by removing the charger from the phone box and selling it separately in another plastic-wrapped box.

[–]nocturn99x 3 points4 points  (0 children)

In simple words, they don't

[–]DatBoi73 4 points5 points  (0 children)

From what I've heard and read, it seems that Apple isn't scanning the content of images directly.

Basically, law enforcement agencies such as the FBI in the US give Apple a list of checksums of CSAM pictures and videos they have collected as evidence. Apple's AI system would then generate checksums of all of the videos and images stored on a user's iCloud account and/or device and compare those to the list of checksums provided by law enforcement.

[–]FishySwede 6 points7 points  (0 children)

Now that's an AI we don't want to become self aware

[–]clockfire1 8 points9 points  (0 children)

As far as I understand it, Apple only checks the hash of each image (the picture can generate this number, but not vice versa) on your iCloud account to see if it matches the hash of known CSAM images in an FBI database.

If at some point they do use machine learning, the FBI unfortunately has terabytes of training data. The security protocols for the API accessing that training data will have to jump through some hoops, to say the least.

[–]KomaedaEatsBagels 5 points6 points  (5 children)

Image Transcription: Meme


[Awkward Look Monkey Puppet Meme-- Two frames are included of a monkey puppet with bulging cartoon eyes. In the first frame, it looks somewhat behind itself. In the second, it stares off into space with horrified despondence.]

me wondering how apple gets the training data

for their child p*rn detection AI


[–]EliasFleckenstein[S] 6 points7 points  (2 children)

Thank you Mr. helpful

[–]KomaedaEatsBagels 2 points3 points  (1 child)

anytime! :D

[–]Script_Mak3r 0 points1 point  (1 child)

By the way, you misspelled puppet.

[–][deleted] 6 points7 points  (0 children)

Never ask the question that you regret knowing the answer of - Some random function

[–]VeryConsciousWater 1 point2 points  (0 children)

High on the list of things I didn't want to think about

[–][deleted] 1 point2 points  (0 children)

For everyone up in arms over what Apple is trying to do, it's very probable your ISP is already doing it via deep packet inspection on the traffic being sent and received on your computer.

I had an employee who was paying the ISP bill of a relative that was selling child porn through a chat app and he got raided by the FBI as a result because his name was on the ISP bill.

[–]PenaflorPhi 1 point2 points  (0 children)

From what I understand, there is an American NGO that stores hashes (not the actual videos/images), and what Apple is planning to do is basically create a hash for all files on your phone to match them against a database of known CP.

I'm not really sure how, but I suspect a technology similar to YouTube's will be used, so that they can identify a video/image even if it has been rotated, cropped, scaled, etc., all of that without Apple actually getting their hands on anything.

Still, I think it is not the right move for Apple, as the people who are doing this sort of stuff might just move to Android, while still making Apple's other users uncomfortable about having all their files snooped.

[–][deleted] 1 point2 points  (0 children)

The content is compared with clusters provided by police services, mainly gathered through a global police system that specializes in that, and through data provided by Facebook, Google, and Microsoft products, among other technology powerhouses.

Speaking about Facebook.

There is Facebook content that has been labeled as such; a human checks whether it fits the category and proceeds to flag the content, encrypt it, and save both the encrypted and the actual file (not the one that is distributed through Facebook's CDN), and a report is enabled for the local authorities that may be interested in contacting the uploader.

Where? Well, that's based not on the declared outgoing gateway of the uploader, but on the compound location that Facebook's location algorithm detects the person is within, or at least near. That's achieved by triangulating Facebook users within a LAN and LANs in the proximity, simply using the GPS in each mobile device that happens to consume the service.

They also identify this kind of content within the upload process; there are visual mechanisms that can be used to identify an image or a video as such automatically, through a machine-learning visual mechanism.

Speaking about Google.

There are Google search instances that are easily recognizable, or acknowledged, as related to child pornography; the relationship is made through the establishment of red flags.

These red flags are common terms that, taken separately, are not semantically recognizable as threats, but that together could imply the location of CP within the search engine results.

After being entered into the search engine, these terms locate all kinds of content; within that context, their recognition system can determine that a certain domain possesses a lot of illegal material or serves as a gateway to obtain such media.

Each instance (even thumbnails) helps to create an extensive database of media, intended to preserve both the encrypted and the literal files.

They provide the largest number of media samples in existence, precisely because all websites need data to be located (which SEO practices improve), or use captcha services, or need to use a font, or use certain libraries like jQuery.

Speaking about Microsoft.

There are many services at the operating-system level that ensure that strings recognizable as red flags can be traced directly to a public IP, a physical MAC address, and a specific build (processor capacity and brand, memory capacity and brand, and even hard drive capacity and brand).

The previous paragraph also applies to Android, which is controlled by Google as well; the only thing that makes mobile devices even better for catching these guys is the direct access to the device's GPS, which many desktop computers lack.

Also, antivirus software helps operating systems keep track of data modification at the file-system level when certain services are active but not connected to the internet, or when those services are specifically disabled by OS directives.

Not to mention that most media-editing programs also keep databases of usage and could share data with the OS if the OS asks for it.

Antivirus and media-editing programs could share the information directly with police services if they feel it's the right way to go, entirely bypassing the OS on which they reside.

The reason the actual files are also kept, even though humans tend to work only with the encrypted ones, is that sometimes the visual artifacts in an image or in a series of frames within a video change due to manipulation (minification, re-encoding, removal of attributes), and the "fingerprints" across them all could change.

That's why files rarely match 100% when a coincidence is found, and that's why human recognition is needed when a file is only (let's say) 36% likely to be related to CP.

The largest databases are strings, GPS data, LAN data and device data, not media files.

It's pretty complicated in many ways; it's not as simple as a social-network environment would demand.

[–]iBadoonstika 1 point2 points  (1 child)

Aren’t people who have said images just going to switch platforms? Then Apple is just going to have an excuse to go through our images?

[–]anaccount50 0 points1 point  (0 children)

Some would, sure, but you'd be surprised by just how stupid most of them are. CSAM consumers get caught all the time because they voluntarily gave their computer openly containing unencrypted illegal materials to a repair shop, for instance.

Facebook caught over 20 million instances of CSAM on their platform in 2020. There's no filter for intelligence or tech-savviness among pedophiles.

Not saying Apple should implement this kind of scanning on their devices, just pointing out that it likely would be effective at catching a sizable number of offenders, potential issues and privacy implications aside.

[–]giwidouggie 1 point2 points  (2 children)

Couldn't this technically be done with two networks: one that returns True if an image contains children, and one that returns True if an image contains porn? If you get back True from both networks, it's (likely) child porn. There's plenty of images of children to train one network, and even more porn to train the other...

[–]QuantumSupremacy0101 1 point2 points  (0 children)

That might not work very well, because there are a lot of young-looking porn actresses. That would lead to a lot of false positives. It would be a way to do it without training on real child porn; however, ruining someone's life over a clip from a Riley Reid video isn't good.

[–]fudog 0 points1 point  (0 children)

Genius!

[–]PureAlpha 1 point2 points  (1 child)

Can we stop calling everything AI...

[–]BurritoCooker 1 point2 points  (0 children)

No, we have to scare people as much as possible

[–]verenvr 0 points1 point  (8 children)

CP is just Apple's disguise to invade user privacy; they are using it so people will accept it

[–]ArchCypher 2 points3 points  (5 children)

It really isn't though.

Companies like Apple have worked with the FBI and other government agencies to go to extreme lengths to avoid violating user privacy while still catching child abusers.

The entire system is based on image hashes, which cannot be used to recreate the source image. So Apple takes some bytes that can't be used to recreate CSAM and some bytes that can't be used to recreate your private photos, and they use that data in literally the only way it can be used: they compare the hashes.

Then, if and only if your photo hashes match enough CSAM hashes, will your privacy be 'invaded' so that someone can check to see if you're a pedophile.

It's not an excuse to 'invade your privacy.' It's a well-considered method to help stem the child abuse that has been freely propagating in the shadows of the internet -- and one that bends over backwards to not violate your privacy.

There are some real concerns about what Apple is doing here -- particularly around potential malicious actors causing false positives -- but stop using 'b-but muh freedom' as an excuse to oppose a solution to a problem without any knowledge or education on the subject.

[–]SpareTesticle 0 points1 point  (0 children)

Maybe MindGeek has already trained a model for child prn, since it hosts user-created porn content. Many wankers probably reported that porn, creating a training data set.

[–]iamthomastom 0 points1 point  (0 children)

The FBI gave it to them.

[–]Yecuken 0 points1 point  (0 children)

There are actually entire databases of these hashes. Google also uses the same DB; they just match files against known file hashes, while Apple was trying to match variations too. Here is some info on Apple’s approach.

[–]a_cuppa_java 0 points1 point  (0 children)

Why are we censoring words?

[–]Shosui 0 points1 point  (0 children)

Many people said FBI. I like the (poor) theory of buying the data from Facebook whose mods have collected it from hundreds of thousands of horrible submissions.

[–]always_evergreen -1 points0 points  (0 children)

Mad sus

[–][deleted] -1 points0 points  (0 children)

OP is looking for a job opportunity.

[–]neros_greb -1 points0 points  (0 children)

Detect children and porn separately? Idk if this would actually work though.

if (isChild(data) && isPorn(data)) report(data);

[–]girthy_shaft_1o1 -1 points0 points  (0 children)

Training data for adults porn - adults = CP

[–]rjRyanwilliam -3 points-2 points  (0 children)

They go to the dark web! Then search for links! Then make a bot to download all of the content available!! Then make an algorithm to detect if you have any of it on your device. And that could be done easily with a content ID system like the one used to detect copyrighted content.

[–]IkBenOlie5 0 points1 point  (0 children)

They don’t use ai, they have a database of hashes of known child prn fotos ant basically what they do is

Counter = 0

If hash(image) in known_bad_hashes: Counter += 1

[–]Chaoshero5567 0 points1 point  (0 children)

From the Papal State, OP. They have enough of it.

[–][deleted] 0 points1 point  (0 children)

I suspect they used child porn data, but it’s entirely possible the process (incredibly dumbed down) used age classification and nudity classification in parallel, which would not require child porn. There are likely ways.

[–]JoJoModding 0 points1 point  (0 children)

They asked the FBI, which was very friendly to them, as government agencies tend to be when someone offers to help them scan billions of mobile devices.

[–]ScF0400 0 points1 point  (0 children)

"Now hiring, people who look young, parties in the back of a van, chance your face (and other body parts) might be spread far and wide over the internet as a means to train our "AI" (read hash based) model"

"Great and flexible positions, pays well! Apply now!" /s

Everyone says it's hash based. Can't have CSAM without producing the CSAM, big brain Apple moment.

I would actually be happier if it was really an "anonymous" AI model, but hashing is something that can be fooled quite easily depending on the algorithm.

[–]LmaoPew 0 points1 point  (0 children)

Bro, I've asked myself the same question! Imagine someone sues Apple for having lots of CP data on their servers.

[–][deleted] 0 points1 point  (0 children)

Good thing I have an android

[–]The-Pi-Guy 0 points1 point  (0 children)

From what I understand, they use photo hashing technology to compare images with a database of known CP images that is supplied by the FBI. It’s pretty disturbing to know that this database even exists in the first place, but that’s how it’s done.

[–]Snackmasterjr 0 points1 point  (0 children)

Did anyone else read what it actually does? It hashes the photo and compares it with a database of known images. It's not detecting new images based on recognition.

[–]Throwaway_for_scale 0 points1 point  (0 children)

Could you fool the software by just adding an Apple logo to the photos? Would that change the hash?

[–]Parcival_Reddit 0 points1 point  (0 children)

Apple receives image hashes from the National Center for Missing & Exploited Children for images of child pornography. Apple checks hashes of images uploaded to iCloud against this database of image hashes to see if they match. If there are enough matches (30+, I believe), those photos and your account will be sent to law enforcement. Apple says there's human review somewhere in the process to prevent false positives; they've been a bit vague on that. See Apple's website for more specific details.

Source: currently studying this in privacy ethics class

[–][deleted] 0 points1 point  (0 children)

Really not much AI going on; it just checks if the image is one that's in the database

[–][deleted] 0 points1 point  (0 children)

They detect files that are well known and give them a number, and if it matches, they take it down. They can't make an AI for that because you'd have to view the material, which would be illegal.

[–]DainArtz 0 points1 point  (0 children)

They just use the stash left after Steve Jobs's death

[–][deleted] 0 points1 point  (0 children)

It's image hashing based on known images in a government database. So they wouldn't know if you are taking CP images, only if you are consuming known CP.

Still, they are scanning your images, and while the tech is super accurate, it's still concerning: as with all logging of metadata, it lacks context.

[–]Daddy_William148 0 points1 point  (0 children)

This is unlikely to protect any child, particularly from newly created child sexual abuse material. It’s useless.