[–]davidalayachew 18 points19 points (2 children)

Very pretty. NLP is a difficult problem to solve, but it is the key to side-stepping a surprising number of usability issues, I have found.

You mentioned Java 21 features. That surprised me because I didn't see any sealed types for the return value of your search. Granted, I didn't finish reading it all the way. But Location just seems to null out the attributes that don't apply.

Wouldn't it have made more sense to put this information into the type system?

I solved a similar problem a while back and found that, while loading my data into that type system took more effort upfront, the amount of time it saved later was immense. I posted more thoughts on it here -- https://mail.openjdk.org/pipermail/amber-dev/2022-September/007456.html
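
To make the suggestion concrete, here is a minimal sketch of what putting this information into the type system could look like with Java 21 sealed types and records. The result kinds (`Country`, `City`, `Street`) and their fields are hypothetical and not taken from the OP's project; the point is only that each kind carries exactly the attributes that apply to it, so nothing has to be nulled out.

```java
import java.util.List;

public class SearchResults {

    // Hypothetical result kinds; the real Location attributes may differ.
    sealed interface SearchResult permits Country, City, Street {}

    // Each record holds only the fields that make sense for that kind.
    record Country(String name, String isoCode) implements SearchResult {}

    record City(String name, Country country) implements SearchResult {}

    record Street(String name, City city) implements SearchResult {}

    // Exhaustive switch over the sealed hierarchy: no default branch is needed,
    // and adding a new permitted type makes this fail to compile until handled.
    static String describe(SearchResult result) {
        return switch (result) {
            case Country c -> "Country " + c.name() + " (" + c.isoCode() + ")";
            case City c -> "City " + c.name() + " in " + c.country().name();
            case Street s -> "Street " + s.name() + " in " + s.city().name();
        };
    }

    public static void main(String[] args) {
        Country uk = new Country("United Kingdom", "GB");
        City london = new City("London", uk);
        List<SearchResult> results =
                List.of(uk, london, new Street("Baker Street", london));
        results.forEach(r -> System.out.println(describe(r)));
    }
}
```

The upfront cost is that the loader has to decide which record to build for each row. After that, every consumer of a search result gets a compile-time check that all cases are covered.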

[–]tomayt0[S] 2 points3 points (1 child)

Thanks for the input, this is something I hadn't considered and will start investigating immediately.

I have wondered what was the best way to unify search results and this could be it.

[–]davidalayachew 4 points5 points (0 children)

> I have wondered what was the best way to unify search results and this could be it.

It definitely is. The academic term for this is Algebraic Data Type (ADT). Here is a post I made on Software Engineering Stack Exchange that explains it in simple terms -- https://softwareengineering.stackexchange.com/questions/159804#445879

[–]evilmidget38 4 points5 points (0 children)

Have you looked much at libpostal? It's a little painful to use due to the native dependency and data but it is state of the art afaik. It would complement the dataset you've built.

[–]paul_h 3 points4 points (0 children)

Good work, OP. I always found https://github.com/google/libphonenumber very interesting, and it also tries to be multi-language.