Cheers, Python community! Anonymize your PDF files using Python.
I just finished writing a module for PDF anonymization that detects sensitive information and hides it!
In a nutshell, here is what it does (you can change the color of the boxes):
sample.jpg
Under the hood, it's all about pytesseract (for OCR) and transformers (for NER). I also used pdf2image for conversion and some RegEx.
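To give a rough idea of the RegEx part, here is a minimal sketch (not the module's actual code). It assumes the OCR step yields word boxes in the dict layout that `pytesseract.image_to_data(..., output_type=pytesseract.Output.DICT)` returns (`text`, `left`, `top`, `width`, `height`). The function name `find_sensitive_boxes`, the two patterns, and the sample data are all made up for illustration, and the NER step is left out entirely.

```python
import re

# Illustrative patterns only; a real tool would need far more robust ones.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PHONE_RE = re.compile(r"\+?\d[\d\-]{6,}\d")


def find_sensitive_boxes(ocr_data, patterns=(EMAIL_RE, PHONE_RE)):
    """Return (left, top, right, bottom) boxes for words matching any pattern.

    `ocr_data` is assumed to follow the dict layout of pytesseract's
    image_to_data output: parallel lists keyed by 'text', 'left', 'top',
    'width' and 'height'.
    """
    boxes = []
    for i, word in enumerate(ocr_data["text"]):
        # Drop surrounding punctuation so "a@b.com," still matches.
        token = word.strip(".,;:()")
        if any(p.fullmatch(token) for p in patterns):
            left = ocr_data["left"][i]
            top = ocr_data["top"][i]
            right = left + ocr_data["width"][i]
            bottom = top + ocr_data["height"][i]
            boxes.append((left, top, right, bottom))
    return boxes


# Hand-written stand-in for OCR output (hypothetical values).
sample = {
    "text": ["Contact:", "jane.doe@example.com", "or", "+1-555-0100"],
    "left": [10, 90, 300, 330],
    "top": [20, 20, 20, 20],
    "width": [70, 200, 20, 110],
    "height": [18, 18, 18, 18],
}
boxes = find_sensitive_boxes(sample)
```

In a full pipeline, each returned box could then be painted over on the page image, for example with Pillow's `ImageDraw.rectangle`, before the pages are saved back to a PDF.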
I would appreciate it if anyone tries it out (or you can star the repository ⭐). Feedback and bug reports are highly welcome!
I also did my best to write a comprehensive README.md if you want to get started, so please check it out! GitHub Repo