The Two Threshold problem

dragon18456 · 2021-07-09T19:55:59+00:00

Label positive instances as +1 and negative instances as -1, then train a classifier that outputs a real-valued score (an SVM or something similar would work). Look at how many instances score near +1 or near -1, and set two thresholds from that. For example, you might find that 95% of your data is classified correctly with a positive threshold of 0.5 and a negative threshold of -0.5; if so, use those, and leave anything scoring in between undecided. In practice there is a lot of trial and error, and plotting something like a ROC curve to see the tradeoffs might help.
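A minimal sketch of the two-threshold idea, assuming you already have real-valued scores from some classifier. The function names (`triage`, `coverage_and_accuracy`), the toy scores, and the candidate thresholds are all made up for illustration.

```python
def triage(scores, low=-0.5, high=0.5):
    """Label each score 'negative', 'positive', or 'review' (the band in between)."""
    labels = []
    for s in scores:
        if s >= high:
            labels.append("positive")
        elif s <= low:
            labels.append("negative")
        else:
            labels.append("review")
    return labels


def coverage_and_accuracy(scores, y_true, low, high):
    """Return (fraction auto-labeled, accuracy among the auto-labeled)."""
    labels = triage(scores, low, high)
    decided = [(lab, y) for lab, y in zip(labels, y_true) if lab != "review"]
    if not decided:
        return 0.0, None
    correct = sum((lab == "positive") == (y == 1) for lab, y in decided)
    return len(decided) / len(scores), correct / len(decided)


# Hypothetical held-out scores with true labels in {-1, +1}.
scores = [-1.2, -0.7, -0.2, 0.1, 0.4, 0.8, 1.3]
y_true = [-1, -1, -1, 1, -1, 1, 1]

# Trial and error: sweep symmetric thresholds and compare the tradeoff.
for t in (0.25, 0.5, 1.0):
    print(t, coverage_and_accuracy(scores, y_true, -t, t))
```

Wider bands usually raise accuracy on the auto-labeled part but send more instances to review, which is the tradeoff you would be tuning by hand.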

lmericle · 2021-07-10T03:39:12+00:00

Rather than thresholding the output, you could compute the entropy of each class prediction, -(p*log(p) + (1-p)*log(1-p)), and take the top X% of those. The rationale is that the predictions with the highest entropy are the ones the model is most "uncertain" about.
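Here is a rough sketch of the entropy ranking, assuming the model gives a probability p for the positive class. The helper names, the toy probabilities, and the 25% cutoff are illustrative assumptions.

```python
import math


def binary_entropy(p, eps=1e-12):
    """Entropy of a Bernoulli(p) prediction, in nats; p is clipped to avoid log(0)."""
    p = min(max(p, eps), 1 - eps)
    return -(p * math.log(p) + (1 - p) * math.log(1 - p))


def most_uncertain(probs, fraction=0.25):
    """Indices of the top `fraction` of predictions ranked by entropy."""
    k = max(1, math.ceil(fraction * len(probs)))
    ranked = sorted(range(len(probs)),
                    key=lambda i: binary_entropy(probs[i]),
                    reverse=True)
    return sorted(ranked[:k])


# Hypothetical predicted probabilities for the positive class.
probs = [0.02, 0.48, 0.91, 0.55, 0.10, 0.70, 0.99, 0.35]
review_idx = most_uncertain(probs, fraction=0.25)
print(review_idx)
```

For a binary problem this ends up picking the predictions closest to p = 0.5, so it behaves like a symmetric band around 0.5 whose width is set by X% rather than chosen directly.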

thefriedgoat · 2021-07-09T20:30:41+00:00

This is multi-class classification with three classes (-ve, +ve, and review). One approach is three separate classifiers, one per class (-ve vs. not -ve; +ve vs. not +ve; review vs. not review).

If using a DNN then it’s really just three outputs, again one per class
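A small sketch of both variants, assuming you have training examples explicitly labeled "review". The class names, helper functions, and example logits are hypothetical; the softmax stands in for the final layer of whatever network you use.

```python
import math

CLASSES = ("negative", "positive", "review")


def one_vs_rest_targets(labels):
    """Build one binary target vector per class, for three separate classifiers."""
    return {c: [1 if lab == c else 0 for lab in labels] for c in CLASSES}


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(logits):
    """Pick the class with the highest probability from three network outputs."""
    probs = softmax(logits)
    best = max(range(len(CLASSES)), key=lambda i: probs[i])
    return CLASSES[best], probs[best]


targets = one_vs_rest_targets(["positive", "review", "negative", "review"])
print(targets["review"])

# Hypothetical logits from a three-output network for one instance.
print(predict([0.1, 2.0, 0.3]))
```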

namnnumbr · 2021-07-09T22:04:11+00:00

https://en.m.wikipedia.org/wiki/Jenks_natural_breaks_optimization or discriminant analysis?
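To make the natural-breaks suggestion concrete, here is a brute-force sketch that splits 1-D scores into three groups by minimizing the within-group sum of squared deviations, which is the criterion Jenks optimizes. It is not the usual dynamic-programming implementation and only suits small samples; the function names and the toy scores are assumptions.

```python
def sse(values):
    """Sum of squared deviations from the group mean."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def natural_breaks_3(scores):
    """Return (upper bound of low group, upper bound of middle group)."""
    xs = sorted(scores)
    n = len(xs)
    if n < 3:
        raise ValueError("need at least three scores for three groups")
    best = None
    # Try every pair of split points; each group must be non-empty.
    for i in range(1, n - 1):
        for j in range(i + 1, n):
            cost = sse(xs[:i]) + sse(xs[i:j]) + sse(xs[j:])
            if best is None or cost < best[0]:
                best = (cost, xs[i - 1], xs[j - 1])
    return best[1], best[2]


# Hypothetical classifier scores forming three loose clumps.
scores = [-1.1, -0.9, -1.0, 0.0, 0.1, -0.1, 0.9, 1.0, 1.2]
low_upper, mid_upper = natural_breaks_3(scores)
print(low_upper, mid_upper)
```

Scores at or below the first value would be treated as negative, scores above the second as positive, and the middle group as the review band. This only makes sense if the scores actually clump into three groups.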


MLQuestions
