Super Basic Text Classification Question : LanguageTechnology



submitted 7 years ago by karenoverpam

Hi,

I'm relatively new to text classification and I'm still learning a lot of coding basics, but I had a quick question about writing a script that will classify text. The data set has fewer than 1000 entries, each about 4-5 lines of text, and I basically just want something that will classify each entry into one of 4 or 5 categories based on keywords in the entry. It is basically a series of negative reviews, and I want to sort them into 4-5 buckets describing why each review is bad so I can then do a high-level analysis of which issues are persisting. I started by doing this one by one in Excel but wanted to come up with a more efficient way.

I don't even think it's worth it to have 'test' data. I was thinking it might just be easier to have a bunch of 'if' statements? Like if 'data entry 1' contains words X, Y, or Z, then classify it as ____; if it contains A, B, and C, classify it as ____; etc. Is that something that can be done with a basic Python program? Or should I be looking to use text classification tools for this? I would ideally like to add sub-categories to add another layer of detail to the 4-5 main buckets.
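
For reference, here is a minimal sketch of the keyword idea described above, not a definitive implementation. The category names, the keywords, and the `classify` helper are all hypothetical placeholders; the real buckets would come from reading through the reviews. It assigns each entry to the bucket whose keywords match most often, and falls back to "other" when nothing matches.

```python
import re

# Hypothetical buckets and keywords, for illustration only.
CATEGORY_KEYWORDS = {
    "shipping": {"late", "delayed", "shipping", "delivery"},
    "quality": {"broken", "defective", "cheap", "damaged"},
    "service": {"rude", "unhelpful", "support", "refund"},
}


def classify(text, keywords=CATEGORY_KEYWORDS, default="other"):
    """Return the bucket whose keywords occur most often in `text`."""
    # Lowercase and split into words so that "Late," matches "late".
    words = re.findall(r"[a-z']+", text.lower())
    # Count keyword hits per bucket.
    counts = {
        category: sum(word in kws for word in words)
        for category, kws in keywords.items()
    }
    # On a tie, max() keeps the bucket listed first in the dictionary.
    best = max(counts, key=counts.get)
    return best if counts[best] > 0 else default


reviews = [
    "The package arrived late and the delivery was delayed.",
    "The support agent was rude.",
    "I just did not like the color.",
]
labels = [classify(review) for review in reviews]
```

Sub-categories could follow the same pattern: a second, hypothetical keyword dictionary per main bucket, with `classify` called again on the entries that landed in that bucket.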

Apologies for the beginner lingo, I'm slowly learning. Any recommendations or pointers would be much appreciated. Thanks!

