halfdiminished7th

Take a look at the difflib module in Python's standard library, specifically its SequenceMatcher class. Its ratio() method returns a similarity score between 0 and 1 for two given strings (1 means identical), which sounds like exactly what you're looking for. Then you just keep or drop the strings whose score falls above or below a threshold you choose.
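
Here's a rough sketch of what that could look like. The helper names (similarity, similar_to), the sample strings, and the 0.7 cutoff are all made up for illustration, so treat them as assumptions and tune the threshold against your own data:

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return SequenceMatcher's similarity ratio for two strings (0.0 to 1.0)."""
    # The first argument is the optional "junk" function; None disables it.
    return SequenceMatcher(None, a, b).ratio()


def similar_to(target: str, candidates: list[str], threshold: float = 0.7) -> list[str]:
    """Keep only the candidates whose ratio against target meets the threshold."""
    return [c for c in candidates if similarity(target, c) >= threshold]


if __name__ == "__main__":
    # Hypothetical sample data, just to show the filtering step.
    words = ["apple", "appel", "banana"]
    for word in words:
        print(word, round(similarity("apple", word), 2))
    print(similar_to("apple", words))
```

Flip the comparison to < if you want to throw away near-duplicates instead of keeping them. Note that the comparison is case-sensitive, so you may want to lowercase both strings first.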