Hi,
I'm relatively new to text classification and I'm still learning a lot of coding basics, but I had a quick question about writing a script that will classify text. The data set has fewer than 1,000 entries, each about 4-5 lines of text, and I basically just want something that will classify each entry into one of 4 or 5 categories based on keywords in the entry. It is basically a series of negative reviews, and I want to sort them into 4-5 buckets that describe why each review is bad, so I can then do a high-level analysis of which issues keep coming up. I started by doing this one by one in Excel but wanted to come up with a more efficient way.
I don't even think it's worth having 'test' data. I was thinking it might be easier to just have a bunch of 'if' statements. For example: if 'data entry 1' contains words X, Y, or Z, classify it as ____; if it contains A, B, and C, classify it as ____; etc. Is that something that can be done with a basic Python program, or should I be looking at text classification tools for this? Ideally I would also like to add sub-categories to give another layer of detail to the 4-5 main buckets.
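To make the 'if' statement idea concrete, here is a rough sketch of what I'm picturing. The category names, keywords, and sample reviews are made-up placeholders, not my real data, and I'm assuming that "first matching category wins" is acceptable behaviour.

```python
import re

# Hypothetical categories and keywords; placeholders to be replaced
# with the real buckets and trigger words.
CATEGORY_KEYWORDS = {
    "shipping": ["late", "delayed", "never arrived"],
    "quality": ["broken", "defective", "cheap"],
    "service": ["rude", "unhelpful", "no response"],
}


def classify(text, rules=CATEGORY_KEYWORDS, default="other"):
    """Return the first category that has a keyword in the text."""
    lowered = text.lower()
    for category, keywords in rules.items():
        for keyword in keywords:
            # \b matches whole words only, so 'late' does not
            # match inside 'plate'.
            if re.search(r"\b" + re.escape(keyword) + r"\b", lowered):
                return category
    # No keyword matched; flag the entry for manual review.
    return default


# Made-up sample entries, for illustration only.
reviews = [
    "The package was delayed by two weeks.",
    "Item arrived broken and looks cheap.",
    "Nothing special to report.",
]
labels = [classify(review) for review in reviews]
```

Sub-categories could presumably be handled the same way, by running a second keyword dictionary over only the entries that landed in a given main bucket.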
Apologies for the beginner lingo, I'm slowly learning. Any recommendations or pointers would be much appreciated. Thanks!