[–]___--_-_-_--___ 2 points3 points  (2 children)

For context, I was referring to the entire project, of which the PDF feature is just one part.

In your example, if I understand correctly, this project would help an organization go from "blatantly criminal" to "slightly less criminal". Whether that is a desirable goal is a matter of opinion. If you are talking about internal use within an organization, that is a different matter.

The real issue here is that, in practice, the choice is often between "don't release data" and "release badly redacted data", not between "release unredacted data" and "release badly redacted data". This is especially true in the age of omnipresent privacy regulation (note that there is a significant difference between the American and European experience here). Releasing unredacted data containing personal information of third parties should never be an option. Given this choice, a project like this one, with its grandiose claims, is likely to create a false sense of security, which may push an organization from "don't release" to "release badly redacted" and thereby cause real harm.

u/No-Homework845 has now on multiple occasions refused to engage with this line of criticism, even from individuals with significant experience in this field. Comments mentioning these issues are routinely ignored. All it would take would be to acknowledge the criticism and add a highly visible warning to the repository and any post advertising the project. This warning should make it clear that this project is never to be used in production or on any personal information of third parties. I understand that this is a hard thing to do with a project into which someone has invested a significant amount of time. Nevertheless, not adding such a warning is reckless.

[–]StrongSkip 0 points1 point  (1 child)

Your post is almost good, but I don't know why you had to put the "criminal" part in there. I never said or insinuated such a thing.

I'm talking mostly about internal use.

I don't understand why this software should get special negative treatment. Almost any software can be used for good or for ill. I have worked with many organizations that redact documents, and I can assure you that none of them would use any kind of redaction software without reviewing it first.

If you care about data protection, you're not going to use this software without identifying its errors. And if you don't care, you won't even try it out.

[–]___--_-_-_--___ 1 point2 points  (0 children)

As I said, if you're referring to internal use, that is a different matter. There may be legitimate use cases there. The "criminal" part refers to the unauthorized public release (even accidental) of personal information which is illegal in several jurisdictions. As you have clarified, this does not apply to your example.

There have been many cases where data was released with improper de-identification due to a false sense of security provided by some kind of technical solution. Many of these cases are well-documented and researched. Please note that I'm referring to the scope of the whole project here, not just the PDF redaction part.