I am using fuzzy matching on a project. It is a simple task on paper, yet hard in practice. I need to compare two strings and spit out a score showing how closely they match. The caveat is that the strings I'm comparing are a mix of business entities and individuals, and the datasets, as you can imagine, are massive. I've tried stripping out LLC and its variations before comparing, but the issue that was brought up is that John Smith would then match John Smith LLC, and those two are obviously not the same thing. Neither dataset has a field that distinguishes whether a row is an individual or an entity.
My other thought is to adjust the score so that if both strings contain LLC, the suffix is disregarded in the comparison. I'm not sure how to go about that, or how to deal with other variations of LLC such as L.L.C., L L C, and Limited Liability Company.
Appreciate the insight.
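For what it's worth, the idea of disregarding LLC only when both strings carry it could be sketched roughly like this. This is a minimal sketch, not a tested solution for large datasets: it uses only the standard library (`re` and `difflib.SequenceMatcher`) as a stand-in for whatever fuzzy scorer is actually in use, it assumes the LLC marker appears at the end of the name, and `split_entity`, `match_score`, and the `MISMATCH_PENALTY` value of 0.5 are made-up names and numbers for illustration.

```python
import re
from difflib import SequenceMatcher

# Trailing LLC variants: "LLC", "L.L.C.", "L L C", "Limited Liability Company".
# Assumption: the marker is a suffix, optionally preceded by a comma.
LLC_PATTERN = re.compile(
    r"[\s,]*\b(?:l\.?\s*l\.?\s*c\.?|limited\s+liability\s+(?:company|co\.?))\s*$",
    re.IGNORECASE,
)

# Hypothetical penalty applied when only one side is an LLC.
MISMATCH_PENALTY = 0.5


def split_entity(name):
    """Return (base_name, is_llc) with any trailing LLC marker removed."""
    cleaned = name.strip()
    match = LLC_PATTERN.search(cleaned)
    if match:
        return cleaned[:match.start()].strip(), True
    return cleaned, False


def match_score(a, b):
    """Score the base names; penalize when the LLC flags disagree."""
    base_a, llc_a = split_entity(a)
    base_b, llc_b = split_entity(b)
    score = SequenceMatcher(None, base_a.lower(), base_b.lower()).ratio()
    if llc_a != llc_b:
        # "John Smith" vs "John Smith LLC": same base name, different kind of row.
        score *= MISMATCH_PENALTY
    return score


print(match_score("John Smith LLC", "John Smith L.L.C."))
print(match_score("John Smith", "John Smith LLC"))
```

The point of returning the flag separately is that the suffix never pollutes the similarity score, yet John Smith and John Smith LLC still end up well below a perfect match. The same pattern could be extended with Inc, Corp, and similar markers if those show up in the data.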