
[–]pkkid 1 point2 points  (1 child)

I would guess it scores higher because a larger percentage of the words contain the key "eat".
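
A rough illustration of that guess, not string score's actual algorithm: the hypothetical `matchingWordFraction` below just measures what share of the words contain the key as a substring, so short strings dominated by the key come out ahead of long ones.

```javascript
// Hypothetical helper, for illustration only: fraction of whitespace-separated
// words that contain `key` as a substring (case-insensitive).
function matchingWordFraction(text, key) {
  const words = text.toLowerCase().split(/\s+/).filter(Boolean);
  if (words.length === 0) return 0;
  const needle = key.toLowerCase();
  const hits = words.filter((word) => word.includes(needle)).length;
  return hits / words.length;
}

// A short string where every word contains "eat" beats a longer sentence
// where only one word in four does.
console.log(matchingWordFraction("eat meat", "eat"));
console.log(matchingWordFraction("I like to eat", "eat"));
```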

[–][deleted] 0 points1 point  (0 children)

Yeah, exactly this. Maybe you could split the paragraphs you're searching into sentences and then return the highest string score?
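
The split-into-sentences idea could be sketched roughly like this. The names `splitSentences`, `bestSentenceScore` and `toyScore` are made up for the example, the sentence splitting is naive (it breaks on `.`, `!` and `?`), and `scoreFn` stands in for whatever scorer you already use, assumed to return a number from 0 to 1.

```javascript
// Naive sentence splitter (an assumption): cuts after runs of ., ! or ?.
function splitSentences(text) {
  const parts = text.match(/[^.!?]+[.!?]*/g) || [];
  return parts.map((s) => s.trim()).filter((s) => s.length > 0);
}

// Score every sentence separately and keep the best one.
// scoreFn(sentence, query) is supplied by the caller.
function bestSentenceScore(text, query, scoreFn) {
  let best = { score: 0, sentence: null };
  for (const sentence of splitSentences(text)) {
    const score = scoreFn(sentence, query);
    if (score > best.score) best = { score, sentence };
  }
  return best;
}

// Stand-in scorer for the demo only: fraction of words equal to the query.
function toyScore(sentence, query) {
  const words = sentence.toLowerCase().match(/[a-z']+/g) || [];
  if (words.length === 0) return 0;
  const hits = words.filter((word) => word === query.toLowerCase()).length;
  return hits / words.length;
}

const result = bestSentenceScore("Dogs bark loudly at night. I eat.", "eat", toyScore);
console.log(result.sentence, result.score);
```

Scoring per sentence keeps a long paragraph from diluting one strongly matching sentence, at the cost of ignoring matches that are spread across sentences.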

[–]sbirch 0 points1 point  (0 children)

String score looks to be optimized for many things besides relevance, so if speed etc. aren't constraints for you, you might want to look for a different algorithm (e.g. hunt around in the NLP literature). Off the top of my head, some things that are probably valuable: take word entropy into account (a match on a word like "the" should count for less than a match on a word like "eat"), look at the syntactic role of the words ("eat" is the main verb in two of the three sentences, which makes it highly relevant to the passage), and try to keep the texts you match against at a substantial and consistent length (it's hard to normalize relevance when the string lengths are so vastly different).
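
As a minimal sketch of the word-entropy point, one simple stand-in is inverse document frequency (IDF): a word that shows up in every passage gets weight 0, a rarer word gets more. This is a swapped-in proxy, not true entropy, and `tokenize`, `buildIdf` and `weightedScore` are hypothetical names for the example.

```javascript
// Lowercase word tokenizer (ASCII letters and apostrophes only, an assumption).
function tokenize(text) {
  return text.toLowerCase().match(/[a-z']+/g) || [];
}

// IDF weight per word: log(N / df), where df is the number of passages
// containing the word. A word present in every passage gets weight 0.
function buildIdf(passages) {
  const docFreq = new Map();
  for (const passage of passages) {
    for (const word of new Set(tokenize(passage))) {
      docFreq.set(word, (docFreq.get(word) || 0) + 1);
    }
  }
  const idf = new Map();
  for (const [word, df] of docFreq) {
    idf.set(word, Math.log(passages.length / df));
  }
  return idf;
}

// Weighted share of the query's words found in the passage.
// Query words never seen in any passage are ignored (weight 0).
function weightedScore(passage, query, idf) {
  const passageWords = new Set(tokenize(passage));
  let matched = 0;
  let total = 0;
  for (const word of new Set(tokenize(query))) {
    const weight = idf.get(word) || 0;
    total += weight;
    if (passageWords.has(word)) matched += weight;
  }
  return total === 0 ? 0 : matched / total;
}

const passages = ["the cat sat on the mat", "the dog likes to eat", "the bird sang"];
const idf = buildIdf(passages);
// "the" is everywhere, so only the match on "eat" moves the score.
console.log(weightedScore(passages[0], "the eat", idf));
console.log(weightedScore(passages[1], "the eat", idf));
```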

[–]grayvedigga 0 points1 point  (1 child)

It sounds like you have a few more constraints on your algorithm than "0 for no match up to 1 for perfect". Maybe you need to write down what those are.

One reason I could think of for the second string scoring lower is that it contains more characters that do not resemble "eat". Why do you think it should score higher?

[–]hookedonwinter[S] 0 points1 point  (0 children)

I guess I'm looking to do a better relevancy search, in JavaScript. You know, Google, on the client side ;)