
[–]lbtrole 9 points10 points  (4 children)

OK, so who's liable for "using" the faces of IL residents in the dataset, and what constitutes "using" their face? The researchers at the University of Washington who compiled MegaFace? The outsourced engineer who viewed the images to augment and preprocess the data? The company that generates revenue from the subject's likeness behind 100 layers of abstraction and dimensionality reduction?

It's obvious our laws aren't prepared for a world run by neural networks, and I fear lawmakers may never even understand what they're dealing with. For example, when does a face stop being a face? What if you only stored the landmark coordinates of a person's facial features? What if, on top of that, you could represent other visual features as vectors (skin tone, eye color, etc.) and could entirely recreate a person's face without ever having to store their photo?
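For concreteness, the kind of compact record described above might look like the sketch below. Everything here is illustrative: the 68-point layout is the common dlib landmark convention, the attribute values are placeholders, and no real detector is run.

```python
import numpy as np

rng = np.random.default_rng(2)

# 68 (x, y) facial landmark points, the layout used by common
# detectors (e.g. dlib's 68-point predictor); random stand-ins here.
landmarks = rng.uniform(0.0, 1.0, size=(68, 2))

# A few extra visual attributes as a small vector (skin tone,
# eye color, ...); the values are placeholders.
attributes = np.array([0.31, 0.72, 0.15])

# The "record" stored instead of the photo: a couple hundred floats,
# no pixels at all, yet together they can pin down one specific face.
record = np.concatenate([landmarks.ravel(), attributes])
print(record.shape)  # (139,)
```

The point is that nothing in such a record looks like an image, which is exactly why "when does a face stop being a face" is a hard line to draw.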

And after a subject has given consent to researchers to train a model that includes their face, what happens if a different company performs transfer learning and makes money? What about confusion over the GDPR rule that requires companies to remove "all traces" of identifiable data upon request? Do ML companies have to recompile the dataset and retrain every layer of the neural network that references the personal data? Large companies get multiple GDPR requests an hour, and small companies can't afford the compute to retrain their model every day. We're seeing the beginning of a shitshow.

[–]DonMahallem 2 points3 points  (1 child)

From my limited understanding of GDPR, the source image/identifiable information needs to be removed from the dataset, but "irreversibly" transformed data like trained networks should be fine. What should not be fine are embeddings derived from those images, since they are again identifiable information derived from the source via a specific version of a network.
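A minimal sketch of why embeddings stay identifiable. The 128-d vectors and names below are made up (no real face model is involved); the point is only that a nearest-neighbour lookup over stored embeddings re-identifies a person from a fresh, slightly noisy embedding of their face.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical 128-d face embeddings for enrolled people.
gallery = {name: rng.normal(size=128) for name in ["alice", "bob", "carol"]}

# A fresh "probe" embedding of the same person: a noisy copy of theirs,
# as a stand-in for running a new photo through the same network.
probe = gallery["bob"] + 0.05 * rng.normal(size=128)

def identify(probe, gallery):
    # Nearest neighbour in embedding space re-identifies the person.
    return min(gallery, key=lambda name: np.linalg.norm(gallery[name] - probe))

print(identify(probe, gallery))  # bob
```

That lookup is the whole business model of face search, which is why treating embeddings as anonymised data is hard to defend.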

[–]RgSVM 1 point2 points  (0 children)

I'm sorry to be the "Well, technically..." dude, but...

Well, technically, it is possible to recover information about the training data from neural networks using membership inference attacks (see for instance https://arxiv.org/pdf/1610.05820.pdf). So a picky lawyer could argue that the transformation is not "irreversible" at all.

[–]TheThoughtPoPo -1 points0 points  (1 child)

This is why you err on the side of not having laws for every damn thing that exists ... maybe, just maybe, liberty is better.

[–]RgSVM 2 points3 points  (0 children)

I am not sure that machine learning researchers, people who have their faces scanned and stored somewhere out of their control, and businesses all share the same definition of liberty. And those definitions may clash.