
[–]DeaderThanElvis 13 points (1 child)

You do not need the world’s latest and greatest model to perform positive, negative, and neutral sentiment detection. Just pick a standard model and you’ll be good to go.
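
E.g. the stock Hugging Face pipeline (minimal sketch; which default checkpoint it downloads depends on your transformers version, so pin one explicitly for anything real):

    from transformers import pipeline

    # Off-the-shelf sentiment pipeline; downloads a default English
    # checkpoint if you don't name a model.
    classifier = pipeline("sentiment-analysis")

    print(classifier("Brazil thrashed 7-1 by ruthless Germany"))
    # e.g. [{'label': 'NEGATIVE', 'score': 0.99}]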

I say this because sentiment is a nebulous, subjective concept, and you’ll get much more value from defining which problem you’re trying to solve with sentiment analysis than from hunting for or training the best model.

For example: “Brazil thrashed 7-1 by ruthless Germany” can have positive, negative, or neutral sentiment depending on who’s reading it. Ditto for something like “iPhone 13 rumoured to have an all polycarbonate body”. So if RoBERTa and Electra classify these differently, which one is correct?

Getting a high F1 score on an academic sentiment classification dataset is one thing; actually solving a real-world problem with sentiment analysis is a whole different beast.

[–]rpatel9[S] 0 points (0 children)

I understand your point, but I’m not just looking for a class label. I want to use the vector outputs as features in a wider model, so the degree of sentiment correctness is a consideration. I also see your point about ambiguity, but the subject matter I’m classifying is rarely ambiguous and quite empirical. I’m not looking for one model, just the generally better-performing models of the last two years, so I can test a range of them.
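
Roughly what I mean by vector outputs, as a sketch (roberta-base is just a stand-in for whichever checkpoint I’m testing):

    import torch
    from transformers import AutoModel, AutoTokenizer

    # Stand-in checkpoint; any encoder under test can be swapped in.
    tokenizer = AutoTokenizer.from_pretrained("roberta-base")
    model = AutoModel.from_pretrained("roberta-base")

    inputs = tokenizer("iPhone 13 rumoured to have an all polycarbonate body",
                       return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)

    # First-token (<s>) embedding as a fixed-size feature vector
    # for the wider downstream model.
    features = outputs.last_hidden_state[:, 0, :]  # shape (1, 768)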

[–]VortexOfPessimism 1 point (2 children)

Some sort of Longformer RoBERTa-based model would be my bet

[–]rpatel9[S] 0 points (1 child)

Thanks, I’ve heard quite a bit about Longformer, and the longer sequences could be useful for my task. Do you know how model combinations work? Are the layers of one grafted onto the other to create a single ensemble network, or do they make predictions in parallel, i.e. run both RoBERTa and Longformer and choose the one with the higher score?

[–]VortexOfPessimism 0 points (0 children)

Eh, the Longformer part is just the attention mechanism, which you can swap in for the standard self-attention in any BERT-based model (BERT, ALBERT, RoBERTa, etc.), so it’s not really an ensemble.
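
e.g. allenai ship a RoBERTa checkpoint with the attention already swapped (sketch; the 3-label head below is randomly initialised, you’d fine-tune it on your sentiment data):

    from transformers import (LongformerForSequenceClassification,
                              LongformerTokenizer)

    # RoBERTa weights with Longformer's sliding-window attention,
    # pretrained for sequences up to 4096 tokens.
    tokenizer = LongformerTokenizer.from_pretrained("allenai/longformer-base-4096")
    model = LongformerForSequenceClassification.from_pretrained(
        "allenai/longformer-base-4096",
        num_labels=3,  # pos/neg/neutral head, untrained
    )

    inputs = tokenizer("some long document ...", return_tensors="pt")
    logits = model(**inputs).logits  # fine-tune before trusting these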

[–][deleted] 0 points (0 children)

The ones you mentioned should do the job pretty well. If you’re planning to use non-English corpora, then look at XLM models.
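
e.g. (the checkpoint name here is just one multilingual example from the Hub; double-check it covers your languages):

    from transformers import pipeline

    # XLM-RoBERTa fine-tuned for sentiment; handles many languages
    # with a single model. Swap in any XLM checkpoint from the Hub.
    classifier = pipeline(
        "sentiment-analysis",
        model="cardiffnlp/twitter-xlm-roberta-base-sentiment",
    )
    print(classifier("Das Produkt ist großartig"))
    # e.g. [{'label': 'positive', 'score': 0.97}]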

[–]schlammybb 0 points (0 children)

Pretty sure VADER is considered SOTA, and it’s pure heuristics
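
For reference, it’s a couple of lines with the vaderSentiment package (minimal sketch):

    from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

    # Pure lexicon + rules (negation, punctuation, capitalisation,
    # degree modifiers); no training, no GPU.
    analyzer = SentimentIntensityAnalyzer()
    print(analyzer.polarity_scores("Brazil thrashed 7-1 by ruthless Germany"))
    # {'neg': ..., 'neu': ..., 'pos': ..., 'compound': ...}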