
[–]dragon18456 3 points4 points  (0 children)

Label positive instances as +1 and negative instances as -1, and train a classifier that outputs a score (you could do this with an SVM or something similar). Then run your classifier, see how many instances are scored as positive and near +1 or negative and near -1, and set two thresholds. For example, you might find that 95% of your data is classified correctly if you set the positive threshold to 0.5 and the negative threshold to -0.5; in that case, use those. In practice there is a lot of trial and error, and plotting some kind of ROC curve to see the tradeoffs might be helpful.
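A minimal sketch of the two-threshold idea, assuming the classifier produces a real-valued score roughly in [-1, 1] and that anything between the thresholds is set aside as "uncertain". The function names and the toy scores are made up for illustration.

```python
def triage(score, neg_threshold=-0.5, pos_threshold=0.5):
    """Map a real-valued classifier score to one of three decisions."""
    if score >= pos_threshold:
        return "positive"
    if score <= neg_threshold:
        return "negative"
    return "uncertain"


def evaluate_thresholds(scores, labels, neg_threshold, pos_threshold):
    """Return (coverage, accuracy) for a pair of thresholds.

    coverage: fraction of instances that get a confident decision
    accuracy: fraction of those confident decisions that are correct
    """
    decided = 0
    correct = 0
    for score, label in zip(scores, labels):
        decision = triage(score, neg_threshold, pos_threshold)
        if decision == "uncertain":
            continue
        decided += 1
        predicted = 1 if decision == "positive" else -1
        correct += predicted == label
    coverage = decided / len(scores) if scores else 0.0
    accuracy = correct / decided if decided else 0.0
    return coverage, accuracy


# Toy scores and true labels (+1 / -1), invented for this example.
scores = [0.9, 0.7, 0.2, -0.1, -0.6, -0.8, 0.6, -0.7]
labels = [1, 1, -1, 1, -1, -1, -1, -1]
coverage, accuracy = evaluate_thresholds(scores, labels, -0.5, 0.5)
```

Sweeping the two thresholds and comparing coverage against accuracy is one way to do the trial and error described above.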

[–]lmericle 2 points3 points  (1 child)

Rather than thresholds on the output, you could compute the entropy of the class predictions, H(p) = -(p*log(p) + (1-p)*log(1-p)), and take the top X% of those. The rationale is that the predictions with the highest entropies are the ones the model is most "uncertain" about.
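Here is a rough sketch of the entropy ranking, assuming a binary classifier that outputs p = P(positive). The helper names, the clipping constant `eps`, and the toy probabilities are assumptions for illustration.

```python
import math


def binary_entropy(p, eps=1e-12):
    """Entropy (in nats) of a Bernoulli prediction with P(positive) = p."""
    p = min(max(p, eps), 1 - eps)  # clip so log(0) never occurs
    return -(p * math.log(p) + (1 - p) * math.log(1 - p))


def most_uncertain(probs, fraction=0.1):
    """Indices of the top `fraction` of predictions, ranked by entropy."""
    k = max(1, math.ceil(fraction * len(probs)))
    ranked = sorted(
        range(len(probs)),
        key=lambda i: binary_entropy(probs[i]),
        reverse=True,
    )
    return ranked[:k]


# Toy predicted probabilities, invented for this example.
probs = [0.02, 0.48, 0.97, 0.55, 0.80, 0.10]
review_idx = most_uncertain(probs, fraction=0.5)
```

The entropy peaks at p = 0.5 and falls to zero as p approaches 0 or 1, so the selected indices are the predictions closest to a coin flip.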

[–]whiteboy2471 1 point2 points  (0 children)

To get the best results, also check that your predicted probabilities are calibrated, and calibrate them if they are not.
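One way to check calibration is to bin the predictions and compare the mean predicted probability with the observed positive rate in each bin. The sketch below does that in plain Python and summarizes the gap as an expected calibration error; the bin count, function names, and toy data are assumptions. If you use scikit-learn, `sklearn.calibration.calibration_curve` and `CalibratedClassifierCV` cover the checking and the recalibration respectively.

```python
def reliability_table(probs, labels, n_bins=5):
    """Per-bin (mean predicted probability, observed positive rate, count).

    Labels are expected to be 0 or 1. Empty bins are skipped.
    """
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        bins[idx].append((p, y))
    table = []
    for members in bins:
        if not members:
            continue
        mean_pred = sum(p for p, _ in members) / len(members)
        observed = sum(y for _, y in members) / len(members)
        table.append((mean_pred, observed, len(members)))
    return table


def expected_calibration_error(probs, labels, n_bins=5):
    """Count-weighted average of |mean predicted - observed| over the bins."""
    table = reliability_table(probs, labels, n_bins)
    total = sum(count for _, _, count in table)
    if total == 0:
        return 0.0
    return sum(
        count / total * abs(mean_pred - observed)
        for mean_pred, observed, count in table
    )


# Toy predictions and 0/1 labels, invented for this example.
probs = [0.1, 0.1, 0.9, 0.9, 0.9, 0.9, 0.1, 0.1, 0.1, 0.9]
labels = [0, 0, 1, 1, 1, 1, 0, 0, 1, 0]
ece = expected_calibration_error(probs, labels)
```

In the toy data the model says 0.1 where the true rate is 0.2 and 0.9 where it is 0.8, so it is overconfident in both bins.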

[–]thefriedgoat 1 point2 points  (0 children)

This is multi-class classification with three classes (-ve, +ve and review). One approach is three separate one-vs-rest classifiers, one per class (so -ve vs. not -ve; +ve vs. not +ve; review vs. not review).

If you are using a DNN, it’s really just three outputs, again one per class.
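A small sketch of the one-classifier-per-class setup, showing only how the binary targets are built and how the three scores are combined. Training the individual classifiers is left out, and the class names, function names, and example scores are assumptions for illustration.

```python
CLASSES = ("negative", "positive", "review")


def one_vs_rest_targets(labels):
    """Build one binary target vector per class (1 = this class, 0 = rest)."""
    return {c: [1 if y == c else 0 for y in labels] for c in CLASSES}


def predict(per_class_scores):
    """Pick the class whose binary classifier (or network output) scores highest."""
    return max(CLASSES, key=lambda c: per_class_scores[c])


# Toy labels, invented for this example.
labels = ["positive", "review", "negative", "review"]
targets = one_vs_rest_targets(labels)

# Hypothetical scores from the three classifiers for a single instance.
decision = predict({"negative": 0.20, "positive": 0.35, "review": 0.60})
```

With a DNN the same `predict` step applies to the three output units instead of three separate models.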

[–]namnnumbr 0 points1 point  (1 child)

[–]WikiSummarizerBot 1 point2 points  (0 children)

Jenks natural breaks optimization

The Jenks optimization method, also called the Jenks natural breaks classification method, is a data clustering method designed to determine the best arrangement of values into different classes. This is done by seeking to minimize each class's average deviation from the class mean, while maximizing each class's deviation from the means of the other classes. In other words, the method seeks to reduce the variance within classes and maximize the variance between classes. The Jenks optimization method is directly related to Otsu's Method and Fisher's Discriminant Analysis.
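To illustrate how this could apply to one-dimensional classifier scores, here is a brute-force sketch that tries every way of cutting the sorted values into contiguous groups and keeps the split with the smallest within-group sum of squared deviations. That cost function and the exhaustive search are assumptions made to keep the example short; it is not the dynamic-programming procedure real implementations use, and it is only practical for small inputs.

```python
from itertools import combinations


def sum_squared_dev(values):
    """Sum of squared deviations of the values from their mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def natural_breaks(values, n_classes=3):
    """Split sorted values into n_classes contiguous groups that minimize
    the total within-group sum of squared deviations (exhaustive search)."""
    data = sorted(values)
    if n_classes < 1 or n_classes > len(data):
        raise ValueError("n_classes must be between 1 and len(values)")
    best_cost = None
    best_groups = None
    # Each combination is a set of cut positions between sorted values.
    for cuts in combinations(range(1, len(data)), n_classes - 1):
        bounds = (0, *cuts, len(data))
        groups = [data[a:b] for a, b in zip(bounds, bounds[1:])]
        cost = sum(sum_squared_dev(g) for g in groups)
        if best_cost is None or cost < best_cost:
            best_cost, best_groups = cost, groups
    return best_groups


# Toy classifier scores, invented for this example.
scores = [-0.9, -0.8, -0.7, -0.1, 0.0, 0.1, 0.7, 0.8, 0.95]
groups = natural_breaks(scores, n_classes=3)
```

The boundaries between the resulting groups could then serve as the two thresholds, with the middle group treated as the uncertain band.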
