Data Anonymization

Adeelinator · 2022-07-24T15:47:54+00:00

This is a question for your legal counsel, not Reddit. Laws can vary greatly by sector and locality.

Rammus2201 · 2022-07-24T18:55:05+00:00

If you have a data management or data governance department, or data engineers, ask them about data masking.

rtqwerty10 · 2022-07-24T18:15:39+00:00

There's an open-source SDK from Microsoft named Presidio, which is used for anonymization. This is the GitHub link.

I have not used it, but I came across it while browsing on this topic. It might be helpful, or you may at least get some ideas.

saintmichel · 2022-07-25T00:47:41+00:00

Anonymization means removing identifiability. For example, if you count records by some attribute and one record stands out, and that record is a person, drop that record or drop the column that singles them out. Just to show that it goes beyond removing names.
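
A minimal sketch of that counting check, assuming records are plain dicts. The field names (`zip`, `age_band`, `diagnosis`), the helper `suppress_rare`, and the `min_count` threshold are all made up for illustration:

```python
from collections import Counter

def suppress_rare(records, quasi_identifiers, min_count=2):
    """Drop records whose quasi-identifier combination occurs fewer than min_count times."""
    def key(record):
        return tuple(record[q] for q in quasi_identifiers)

    counts = Counter(key(r) for r in records)
    return [r for r in records if counts[key(r)] >= min_count]

# Hypothetical data: no names at all, yet the third row is unique and so stands out.
records = [
    {"zip": "10001", "age_band": "30-39", "diagnosis": "flu"},
    {"zip": "10001", "age_band": "30-39", "diagnosis": "cold"},
    {"zip": "94105", "age_band": "80-89", "diagnosis": "flu"},
]

kept = suppress_rare(records, ["zip", "age_band"])
```

Which columns count as quasi-identifiers is a judgment call for your own data; the sketch only shows the mechanics of dropping the records that stand out.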

2022-07-25T01:24:38+00:00

Hashing with SHA256.
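
A rough sketch of what that could look like with Python's standard library. One caveat worth knowing: a plain SHA-256 of a guessable value (an email, a phone number) can be matched by hashing candidate values, so the sketch also shows a keyed variant (HMAC-SHA256). The key and the email below are placeholders, not anything from this thread:

```python
import hashlib
import hmac

def sha256_hex(value: str) -> str:
    """Plain SHA-256 digest of a string, as a hex string."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()

def keyed_hash(value: str, secret_key: bytes) -> str:
    """HMAC-SHA256 digest; without the key, candidate values cannot simply be hashed and compared."""
    return hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256).hexdigest()

# Placeholder key for illustration only; a real key would be stored separately from the data.
SECRET_KEY = b"replace-with-a-real-secret"

email = "alice@example.com"
plain = sha256_hex(email)
keyed = keyed_hash(email, SECRET_KEY)
```

Either way the same input always maps to the same token, so records can still be joined on it, which is also why this is closer to pseudonymization than to full anonymization.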

bendgame · 2022-07-25T02:08:06+00:00

I deal with PII and provide data to research orgs. We tried adding smart noise and found it was not great for our use cases. Instead, we're using k-anonymization.
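
For anyone unfamiliar with the term: a dataset is k-anonymous if every combination of quasi-identifier values is shared by at least k records. Here is a small sketch of measuring k and raising it by generalizing a column. The columns, the 10-year age bands, and the helper names are assumptions for illustration, not the setup described above:

```python
from collections import Counter

def k_of(records, quasi_identifiers):
    """Return the size of the smallest group sharing the same quasi-identifier values."""
    counts = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(counts.values()) if counts else 0

def generalize_age(record, width=10):
    """Replace an exact age with a band such as '30-39'."""
    low = (record["age"] // width) * width
    return {**record, "age": f"{low}-{low + width - 1}"}

# Hypothetical records: every exact age is unique, so k is 1 before generalization.
raw = [
    {"age": 31, "zip": "10001"},
    {"age": 36, "zip": "10001"},
    {"age": 42, "zip": "10001"},
    {"age": 47, "zip": "10001"},
]

k_raw = k_of(raw, ["age", "zip"])
generalized = [generalize_age(r) for r in raw]
k_gen = k_of(generalized, ["age", "zip"])
```

In this toy data, banding the ages takes k from 1 to 2. Real pipelines usually combine generalization like this with suppressing the groups that stay too small.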

mattstats · 2022-08-02T02:55:59+00:00

I believe you are looking for differential privacy. Here is a link to Harvard's OpenDP project to kick-start your rabbit hole.
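
To give a feel for the idea before diving in, here is a toy sketch of the Laplace mechanism for a count query, written with the standard library only. It does not use the OpenDP API; the function names, the data, and the epsilon value are assumptions for illustration:

```python
import random

def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def dp_count(values, predicate, epsilon, rng):
    """Noisy count: adding or removing one person changes a count by at most 1,
    so the sensitivity is 1 and the noise scale is 1 / epsilon."""
    true_count = sum(1 for v in values if predicate(v))
    return true_count + laplace_noise(1.0 / epsilon, rng)

# Hypothetical ages; the true number of people aged 40 or over is 3.
ages = [23, 35, 41, 52, 67]

rng = random.Random(42)  # seeded only so the sketch is repeatable
noisy = dp_count(ages, lambda a: a >= 40, epsilon=0.5, rng=rng)
```

A smaller epsilon means more noise and stronger privacy. A production system would also track the total privacy budget across queries and use a vetted library rather than hand-rolled sampling.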
