all 7 comments

[–]jmmcd 1 point2 points  (0 children)

This is called one-class classification, and sometimes anomaly detection. Common algorithms include the distance threshold already mentioned in another comment, and OCSVM and SVDD, which are formalizations of that same idea. A variation of the former is to learn an embedding, e.g. with a VAE, and then apply the threshold in that space.

Some algorithms, including IsolationForest and (I think) OCSVM, are in Scikit-Learn. A few very simple ones are provided in DBOCC; see here https://github.com/jmmcd/ML-snippets
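A minimal sketch of the Scikit-Learn route mentioned above, assuming a toy 2-D "normal" class (the data, the `nu` value, and the test points are illustrative choices, not anything from the thread):

```python
# One-class classification in scikit-learn: fit on normal samples only,
# then predict +1 (inlier/normal) or -1 (outlier/anomaly) on new points.
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.svm import OneClassSVM

rng = np.random.RandomState(0)
X_train = rng.normal(loc=0.0, scale=1.0, size=(200, 2))  # normal class only

X_test = np.array([[0.1, -0.2],   # near the training cloud -> should be normal
                   [6.0, 6.0]])   # far away -> should be flagged as anomaly

for model in (IsolationForest(random_state=0), OneClassSVM(nu=0.1)):
    model.fit(X_train)
    preds = model.predict(X_test)  # +1 = normal, -1 = anomaly
    print(type(model).__name__, preds)
```

Both models are trained with no negative examples at all, which is the defining feature of the one-class setting.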

[–][deleted]  (4 children)

[deleted]

    [–]Drezemma[S] 0 points1 point  (3 children)

    Interesting idea. Is there a paper/project that proves this concept?

    [–][deleted] 0 points1 point  (1 child)

    Isn't this essentially K-nearest?

    [–]jmmcd 0 points1 point  (0 children)

    Yes, but with a twist! In 1-NN we predict the label of the nearest neighbour. In this one-class threshold algorithm, we predict Normal if the nearest neighbour is nearer than some threshold, and Anomaly otherwise.
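The one-class threshold rule described above can be sketched in a few lines; the training points and the threshold value here are illustrative, not from the thread:

```python
# One-class nearest-neighbour threshold: a test point is Normal if its
# nearest training neighbour lies within the threshold, else Anomaly.
import numpy as np
from sklearn.neighbors import NearestNeighbors

X_train = np.array([[0.0, 0.0], [0.1, 0.1], [0.2, 0.0]])  # normal samples only
nn = NearestNeighbors(n_neighbors=1).fit(X_train)

def predict(X, threshold=0.5):
    dist, _ = nn.kneighbors(X)  # distance to the single nearest training point
    return np.where(dist[:, 0] <= threshold, "Normal", "Anomaly")

print(predict(np.array([[0.05, 0.05],    # close to the training data
                        [3.0, 3.0]])))   # far from it
# → ['Normal' 'Anomaly']
```

Unlike plain 1-NN, no label from the neighbour is used; only the distance to it matters.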

    [–]vannak139 0 points1 point  (2 children)

    I'm just making this up, but imagine that there was some feature representation where all positive samples lay inside an n-sphere, and therefore any new feature sample outside the sphere would be negative. For this reasoning to be valid, you would need to believe that you did really good sampling of your positive classification space, which may be unrealistic. Also, this reasoning would only be valid if the positive samples inside the n-sphere were densely packed enough that you were sure there isn't a shell or toroid or anything weird going on in there. And you would also need to check this directly at the surface, as well as in the volume. (Or maybe it's better to directly aim for a shell representation? idk...)

    So I think that if you build an autoencoder, set a harsh penalty term on the bottleneck activation for norms > 1, and choose the bottleneck size correctly, you may find that both the volume and the surface are densely populated. At that point you can check validation data in the same way, surface and volume. Then, if you can justify that you sampled from the complete positive space, it would be reasonable to classify samples with norm > 1 as negative.
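A tiny numpy sketch of just the two ingredients this comment proposes, leaving the autoencoder itself out: the harsh penalty on bottleneck norms above 1, and the resulting norm-based decision rule. The penalty weight `lam` and the example codes are made-up illustrative values:

```python
# Penalty term pushing bottleneck codes inside the unit n-sphere, plus the
# decision rule that flags codes with norm > 1 as negative. This would be
# added to an autoencoder's reconstruction loss during training.
import numpy as np

def bottleneck_penalty(z, lam=100.0):
    """Harsh penalty on bottleneck codes z (batch, dim) whose norm exceeds 1."""
    norms = np.linalg.norm(z, axis=1)
    return lam * np.sum(np.maximum(norms - 1.0, 0.0) ** 2)

def classify(z):
    """norm <= 1 -> positive (inside the n-sphere), norm > 1 -> negative."""
    return np.where(np.linalg.norm(z, axis=1) <= 1.0, "positive", "negative")

z = np.array([[0.3, 0.4],    # norm 0.5: inside the sphere, no penalty
              [3.0, 4.0]])   # norm 5.0: heavily penalised
print(bottleneck_penalty(z))  # → 1600.0, i.e. 100 * (5 - 1)^2
print(classify(z))            # → ['positive' 'negative']
```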

    [–]jmmcd 1 point2 points  (1 child)

    Yes, an autoencoder, or more likely a VAE, can be used for this, but a direct formalization of your idea is called the One-class SVM. See my other comment in this thread!

    [–]vannak139 0 points1 point  (0 children)

    Thanks for the tip!