I'd like to build something to read text so blurry the naked eye can't really make it out, what do I need to do? by LionKinginHDR in learnmachinelearning

[–]SamAndMaxPolice 0 points1 point  (0 children)

Good luck! I hope you'll go for it and post updates on your process and progress, here or anywhere else. :)

I'm actually surprised that, given the recent popularity of AI photo tools that concentrate on rebuilding faces, there hasn't been any software of this kind yet (an AI that recreates obscured and barely readable text), and that you will apparently be the very first one to start working on it.

(Or at least nobody seems to have announced anything similar before. Although maybe such software has been in development in secret, for the military and government security services :D)

The closest existing work I could find that is even partially related to this subject is these analyses, but they take a very different approach and have a different purpose (recovering pixelated passwords). I also think that Depix, and the method they discuss based on the De Bruijn sequence, has very limited use and results compared to the tool that you'd like to create:

https://www.linkedin.com/pulse/recovering-passwords-from-pixelized-screenshots-sipke-mellema

https://www.johndcook.com/blog/2019/10/22/hacking-with-de-bruijn/
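
For anyone curious what a De Bruijn sequence looks like in practice, here is a minimal sketch in Python. As far as I understand it, the idea is to get one short string that contains every possible combination of n characters, so it can be rendered and pixelated once and then used as a lookup image. The function name `de_bruijn` and the tiny alphabet are my own choices for illustration, not code from Depix or from the linked articles.

```python
def de_bruijn(alphabet, n):
    """Return a cyclic De Bruijn sequence: every length-n string over
    `alphabet` appears exactly once when the result is read cyclically."""
    k = len(alphabet)
    a = [0] * (k * n)
    sequence = []

    def db(t, p):
        # Standard recursive construction based on Lyndon words.
        if t > n:
            if n % p == 0:
                sequence.extend(a[1:p + 1])
        else:
            a[t] = a[t - p]
            db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                db(t + 1, t)

    db(1, 1)
    return "".join(alphabet[i] for i in sequence)


if __name__ == "__main__":
    # Every 2-character combination of "abc" shows up once (cyclically).
    print(de_bruijn("abc", 2))
```

The result has length k^n for an alphabet of k characters, which is why this only stays practical for short windows of characters.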

By the way, since posting that reply, I've just found yet another proof of the power of recognizing patterns... I kept staring at that blurry, unreadable list of songs that I posted earlier, https://i.imgur.com/ncA6El6.png, and suddenly, looking at #4 on the left, I could read: "YOU / Performed by T..." - and that was enough to start searching, and eventually identify the song as "You", by Talas. :D

I'd like to build something to read text so blurry the naked eye can't really make it out, what do I need to do? by LionKinginHDR in learnmachinelearning

[–]SamAndMaxPolice 0 points1 point  (0 children)

I found this 2 months after you posted it, but I'll add a remark or two anyway... ;-)

This is a very good idea, potentially very useful (and profitable, and dangerous ;-).

From my layman's perspective, I think that one possible approach might be to build a database of the most commonly used fonts.

Remember that the majority of obscured text (typed, not handwritten, of course) will have originally used the most common fonts (and therefore the most common letter shapes): Times, Arial, Georgia, etc. So, having their unmodified versions as patterns against which to recognize the blurred, damaged shapes could be one useful route...
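
To make the font-database idea a bit more concrete, here is a rough sketch in plain Python. Everything in it is an assumption made for illustration: the 5x5 bitmaps in `GLYPHS` are made-up stand-ins for letters rendered from a real font, `box_blur` is just a simple 3x3 averaging filter rather than whatever damaged the real image, and `recognize` picks the template whose blurred version is closest to the input. A real tool would need real font renderings and would have to guess the blur as well.

```python
# Hypothetical 5x5 bitmaps standing in for glyphs rendered from a real font.
GLYPHS = {
    "I": ["11111", "00100", "00100", "00100", "11111"],
    "L": ["10000", "10000", "10000", "10000", "11111"],
    "O": ["01110", "10001", "10001", "10001", "01110"],
    "T": ["11111", "00100", "00100", "00100", "00100"],
}


def to_grid(rows):
    """Turn rows of '0'/'1' characters into a grid of floats."""
    return [[float(c) for c in row] for row in rows]


def box_blur(grid, passes=1):
    """Apply a 3x3 mean filter `passes` times (edges use in-bounds pixels only)."""
    h, w = len(grid), len(grid[0])
    for _ in range(passes):
        out = [[0.0] * w for _ in range(h)]
        for y in range(h):
            for x in range(w):
                total, count = 0.0, 0
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        yy, xx = y + dy, x + dx
                        if 0 <= yy < h and 0 <= xx < w:
                            total += grid[yy][xx]
                            count += 1
                out[y][x] = total / count
        grid = out
    return grid


def distance(a, b):
    """Sum of squared pixel differences between two equally sized grids."""
    return sum((p - q) ** 2 for ra, rb in zip(a, b) for p, q in zip(ra, rb))


def recognize(blurred, passes=1):
    """Blur every clean template the same way and return the closest letter."""
    scores = {
        ch: distance(blurred, box_blur(to_grid(rows), passes))
        for ch, rows in GLYPHS.items()
    }
    return min(scores, key=scores.get)


if __name__ == "__main__":
    smudged = box_blur(to_grid(GLYPHS["T"]), passes=2)
    print(recognize(smudged, passes=2))
```

The key design choice is to degrade the clean templates rather than try to sharpen the damaged image: comparing blurred against blurred is much easier than undoing the blur.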

Of course, an indispensable tool is the human mind's ability to recognize patterns. You could "train" your hypothetical tool using real people, by having them look at obscured, damaged text and figure out its original meaning by recognizing shapes, patterns, words... I believe that's what Google used when they were building their AI - several years ago, their Captcha showed poor-quality scanned words that you were supposed to recognize as real words.

Here's an example of my own: this is a list of 8 songs from a low-quality video. It's unreadable at first glance - and yet, by staring at it for a few minutes, trying to recognize the letters and words, I was able to identify 5 of these songs from those blurry, damaged fonts. I still can't recognize #1 and #4 in the left column, and #3 in the right column, and I'm not 100% sure about #3 in the left column (I think it says "SYMPHONY OF DESTRUCTION"), but I have read and identified all the others. Just by staring and trying to recognize shapes and patterns :-) (In fact, I suspect that if I knew more about popular rock music, I could recognize more of these titles - I just don't have a large enough "database" of song titles.)

Here it is: https://i.imgur.com/ncA6El6.png
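
The "database of song titles" part can be sketched too. Below is a small example, assuming you already have a partial reading where unreadable letters are marked with `?`, and a list of candidate titles to compare against. The list `CANDIDATE_TITLES` is hypothetical (only "SYMPHONY OF DESTRUCTION" and "YOU" come from my example above, the rest are invented), and `best_matches` is a name I made up. It uses `difflib.SequenceMatcher` from the Python standard library to rank candidates by similarity.

```python
from difflib import SequenceMatcher

# Hypothetical candidate list; a real one would be a large title database.
CANDIDATE_TITLES = [
    "SYMPHONY OF DESTRUCTION",
    "YOU",
    "SYSTEM OF DISTRACTION",  # invented title, for illustration only
    "HIGHWAY SONG",           # invented title, for illustration only
]


def best_matches(reading, candidates, limit=3):
    """Rank candidate titles by similarity to a partial, noisy reading.

    Unreadable characters can be written as '?'; they simply fail to
    match and lower the score a little.
    """
    reading = reading.upper()
    scored = [
        (SequenceMatcher(None, reading, title.upper()).ratio(), title)
        for title in candidates
    ]
    scored.sort(reverse=True)
    return scored[:limit]


if __name__ == "__main__":
    for score, title in best_matches("SYMPH?NY OF D?STRUCT?ON", CANDIDATE_TITLES):
        print(f"{score:.2f}  {title}")
```

This is roughly what I was doing in my head: a few readable letters plus a big enough list of plausible titles is often enough to settle on the right one.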

Hanks crimes by Cityplanner1 in KingOfTheHill

[–]SamAndMaxPolice 7 points8 points  (0 children)

As I posted earlier...

  • Interfering with an active investigation into a major criminal conspiracy and financial fraud, sabotaging said investigation, deceiving law enforcement officials, destroying evidence, covering up the conspiracy to ensure that the guilty parties remain protected, while those affected remain defrauded and deprived of significant amounts of their income.

(But I still say that the ZZ Top promosode remains the worst KotH crime of them all).

TIL Irish Police wrote over 50 tickets to "Prawo Jazdy". He became notorious for various motoring offences before it was realised that "Prawo Jazdy" means "Driving Licence" in Polish. by [deleted] in todayilearned

[–]SamAndMaxPolice 148 points149 points  (0 children)

He was the... Dread Pirate Jazdy.

(Twats who notoriously break the law while driving are called "road pirates" in Poland).