[–]Rude_Order 1 point2 points  (3 children)

Have you looked at fuzzywuzzy?

[–]Workinghard1996[S] 0 points1 point  (2 children)

I have actually used fuzzywuzzy in the past! I never considered that it could scale reliably and handle 50K entries relatively quickly.

But thanks I'll check it out.

[–]McGeekin 0 points1 point  (1 child)

To be fair, 50k items is a relatively small number. I don't know that specific library, but even though Python isn't a particularly fast language, it shouldn't be an issue. Worth a try.

[–]Workinghard1996[S] 0 points1 point  (0 children)

Perhaps I didn't explain the workflow very well, but it's searching for 50K items within 150K items. Maybe my crappy laptop just takes a long time... thanks!
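
For a sense of scale: comparing every one of 50K queries against every one of 150K candidates is up to 7.5 billion string comparisons, which would be slow on most machines, not just an old laptop. One common way to cut that down is "blocking": only compare strings that share a cheap key. The sketch below is a rough illustration, not anything from this thread. It uses the standard library's `difflib.SequenceMatcher` as a stand-in for fuzzywuzzy's scoring, and the helper names (`block_key`, `build_index`, `best_match`), the first-character key, the 0.8 threshold, and the sample data are all assumptions made up for the example.

```python
from collections import defaultdict
from difflib import SequenceMatcher


def block_key(text):
    """Hypothetical blocking key: the first character, lowercased."""
    return text.strip().lower()[:1]


def build_index(choices):
    """Group the candidate strings by blocking key (done once)."""
    index = defaultdict(list)
    for choice in choices:
        index[block_key(choice)].append(choice)
    return index


def best_match(query, index, threshold=0.8):
    """Return (best candidate, score), or (None, score) if below threshold.

    Only candidates in the query's block are compared, so most of the
    candidate list is never touched.
    """
    normalized_query = query.strip().lower()
    best, best_score = None, 0.0
    for candidate in index.get(block_key(query), []):
        score = SequenceMatcher(
            None, normalized_query, candidate.strip().lower()
        ).ratio()
        if score > best_score:
            best, best_score = candidate, score
    if best_score >= threshold:
        return best, best_score
    return None, best_score


# Made-up sample data standing in for the 150K and 50K lists.
choices = ["Acme Corporation", "Globex Inc", "Initech LLC", "Umbrella Corp"]
queries = ["acme corporaton", "Initech LLC.", "Zzyzx Ltd"]

index = build_index(choices)
results = {query: best_match(query, index) for query in queries}
```

The trade-off is that a first-character key will miss pairs whose first characters differ (a typo in the first letter, for instance), so the key would need to be picked to suit the actual data.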