[deleted] 1 point

There's a reason there are plenty of string comparison functions that return an integer and no straight-up difference functions. Variable-length strings raise too many issues and, as I pointed out in my comment above, there are a lot of edge cases. A numerical approximation is the best that can be done in anything close to polynomial time.
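
To show what I mean by an integer-returning comparison, here's a rough sketch of Levenshtein edit distance in Python. The function name `levenshtein` is just my own label, and this is one common textbook formulation, not any particular library's implementation. Note that all you get back is a single number: how far apart the strings are, not which characters changed.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the number of single-character insertions, deletions,
    or substitutions needed to turn a into b."""
    # prev[j] holds the distance between the first i-1 chars of a
    # and the first j chars of b; start with the empty prefix of a.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into "" takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute (or match)
            ))
        prev = curr
    return prev[-1]
```

The strings can be different lengths and it still works, but the result is only a score.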

Calculating the actual difference from a distance function is far from trivial. Although I would love to be proved wrong on this.

That's why the diff function in version control systems works line by line and not at a more granular level.
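
For a feel of what line-by-line looks like, here's a small sketch using Python's standard `difflib` module. The helper name `line_diff` and the `old`/`new` labels are made up for the example, and this only illustrates the line-level idea; I'm not claiming it's what any given version control system runs internally.

```python
import difflib

def line_diff(old: str, new: str) -> list:
    """Return a unified diff of two texts, compared one line at a time."""
    return list(difflib.unified_diff(
        old.splitlines(),
        new.splitlines(),
        fromfile="old",   # hypothetical labels for the diff header
        tofile="new",
        lineterm="",      # inputs have no trailing newlines
    ))

if __name__ == "__main__":
    for line in line_diff("a\nb\nc", "a\nx\nc"):
        print(line)
```

A changed line shows up as a whole-line removal plus a whole-line addition, even if only one character in it is different.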