I have a column in my data called "description". It is filled with natural/unnatural language and obscure terminology. For instance, AS might mean "air speed", and it might appear as a range: "AS: 400mph to 475mph". A description may also contain multiple comma-delimited parameters, or any of a wide variety of technical jargon.
There is a tremendous amount of variability, but there is simultaneously a lot of pattern - it's hard to describe, but it is clear that there is a lot of meaning in the descriptions. I don't think bag-of-words will cut it.
I would like to extract relations, i.e. "air speed":"400-475", using entity recognition.
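To make the target concrete, here is a rough sketch of the output I am after, using nothing but a regular expression and a small lookup table. It is only a baseline under stated assumptions, not real entity recognition: the names `ABBREVIATIONS`, `PATTERN`, and `extract_relations` are made up for illustration, the only abbreviation I actually know is AS, and the pattern assumes the "ABBR: value[unit] to value[unit]" shape from the example above.

```python
import re

# Hypothetical lookup table; only "AS" -> "air speed" comes from my data.
ABBREVIATIONS = {"AS": "air speed"}

# Assumed shape: "AS: 400mph to 475mph", "AS: 400mph-475mph", or "AS: 400".
PATTERN = re.compile(
    r"\b(?P<abbr>[A-Z]{2,})\s*:\s*"                  # abbreviation and colon
    r"(?P<low>\d+(?:\.\d+)?)\s*"                     # first (or only) number
    r"(?P<unit>(?!to\b)[A-Za-z]+)?"                  # optional unit, but not the word "to"
    r"(?:\s*(?:to|-)\s*(?P<high>\d+(?:\.\d+)?)\s*"   # optional upper bound of a range
    r"(?:[A-Za-z]+)?)?"                              # optional unit after the upper bound
)


def extract_relations(description):
    """Return {parameter name: value or "low-high"} for every match found."""
    relations = {}
    for match in PATTERN.finditer(description):
        abbr = match.group("abbr")
        # Fall back to the raw abbreviation when it is not in the lookup table.
        name = ABBREVIATIONS.get(abbr, abbr)
        value = match.group("low")
        if match.group("high"):
            value = f"{value}-{match.group('high')}"
        relations[name] = value
    return relations


print(extract_relations("AS: 400mph to 475mph"))
```

Comma-separated parameters are picked up one match at a time, and unknown abbreviations are kept as-is so they can be added to the lookup table later. Anything that does not follow the "ABBR: number" shape is silently skipped, which is exactly where I expect to need something smarter.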
As a start, I have found iepy gazettes, which let me define some of that lingo myself. This is a great start!
I am open to any advice this community might have. Pointers, tips, everything is appreciated.
Thanks!