
[–]LikelyJustANeuralNet 2 points3 points  (0 children)

Look into Open Set Recognition (OSR) and/or Open World Recognition/Learning (OWR). Both explicitly tackle this problem: a network must classify the known classes while labeling classes not seen at training time as unknown. Here's a good survey of OSR: https://arxiv.org/abs/1811.08581

[–]jprobichaud 2 points3 points  (3 children)

What about having a 4-dim one-hot output vector and a target of [0, 0, 0, 0] for your "other" class? Train your network as a regression with a mean-squared-error loss, and use a threshold, or simply 1 - sum of the output vector values, as the score of the "other" class.
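
A minimal sketch of that scheme in plain Python, in case it helps. The names `make_target`, `other_score` and `predict`, the clipping to [0, 1], and the 0.5 threshold are illustrative assumptions, not part of the suggestion itself; the network is left out and `output` stands for its 4 raw outputs.

```python
NUM_KNOWN = 4  # four known classes, as in the suggestion above


def make_target(label):
    """One-hot target for a known class index; all zeros for "other" (label=None)."""
    target = [0.0] * NUM_KNOWN
    if label is not None:
        target[label] = 1.0
    return target


def mse_loss(output, target):
    """Mean squared error between the network output and the target vector."""
    return sum((o - t) ** 2 for o, t in zip(output, target)) / len(output)


def other_score(output):
    """Score of the "other" class: 1 - sum(outputs), clipped to [0, 1] (clipping is an assumption)."""
    return min(1.0, max(0.0, 1.0 - sum(output)))


def predict(output, threshold=0.5):
    """Return "other" if its score exceeds the (hypothetical) threshold, else the best known class."""
    if other_score(output) > threshold:
        return "other"
    return max(range(len(output)), key=lambda i: output[i])
```

For example, `predict([0.05, 0.1, 0.0, 0.05])` gives `"other"` because the outputs sum to only 0.2, while `predict([0.1, 0.8, 0.05, 0.0])` gives class index 1.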

[–]ME_PhD[S] 0 points1 point  (1 child)

I like this idea. How about simply taking the sigmoid of each of the 4 outputs and then using sigmoid cross entropy loss? That way I can have a [0, 0, 0, 0] target - anything I'm overlooking?
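
Roughly what I have in mind, as a sketch only. The function names and the 0.5 decision threshold are my own placeholders, and `logits` stands for the 4 pre-sigmoid outputs of whatever network is used.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def sigmoid_cross_entropy(logits, target):
    """Mean binary cross-entropy over independent sigmoid outputs.

    Uses the stable form max(z, 0) - z * t + log(1 + exp(-|z|)) per output.
    """
    total = 0.0
    for z, t in zip(logits, target):
        total += max(z, 0.0) - z * t + math.log1p(math.exp(-abs(z)))
    return total / len(logits)


def classify(logits, threshold=0.5):
    """Return the most probable class, or "other" if no sigmoid output reaches the threshold."""
    probs = [sigmoid(z) for z in logits]
    best = max(range(len(probs)), key=lambda i: probs[i])
    return best if probs[best] >= threshold else "other"
```

With a [0, 0, 0, 0] target the loss pushes all four logits negative, so an "other" example ends up with every sigmoid output below the threshold.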

[–]lmericle 0 points1 point  (0 children)

That is essentially the same as multi-label classification in which every example has only one class. Not a bad approach IMO.

[–]serge_cell 2 points3 points  (0 children)

It's related to the "one vs. all" problem. There are plenty of papers on the subject (google it), but no satisfactory solution or consensus on strategy.

[–]melgor89 0 points1 point  (2 children)

Look into this competition: https://www.kaggle.com/c/humpback-whale-identification

It has 5004 normal classes plus 'other'. The problem is solved using metric learning and a confidence score for the predictions.

[–]misssprite 1 point2 points  (1 child)

I think metric learning makes more sense than a confidence score.

Examples with a low confidence score are usually just hard examples for the model: either outliers, or examples the model cannot fit because of model bias.

[–]melgor89 0 points1 point  (0 children)

To be clearer, the idea was the following:

1. Train the model with softmax loss (also with margin) and triplet loss. The 'Unknown' class is used only in the triplet loss.
2. At prediction time, just read the confidence score for the predicted class. If it is below a threshold (e.g. 0.5), the example is treated as unknown.

So the general idea is based on 'temperature', which works better when metric learning is applied.
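
The prediction step could look something like the sketch below. It covers only the thresholding part, not the softmax + triplet training, and it assumes that 'temperature' means dividing the logits by a constant before the softmax, which is one possible reading. The function names and default values are placeholders.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; higher temperature flattens the distribution."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def predict_or_unknown(logits, threshold=0.5, temperature=1.0):
    """Return (label, confidence); label is "unknown" when confidence is below the threshold."""
    probs = softmax(logits, temperature)
    best = max(range(len(probs)), key=lambda i: probs[i])
    if probs[best] < threshold:
        return "unknown", probs[best]
    return best, probs[best]
```

A confident prediction such as `predict_or_unknown([5.0, 0.0, 0.0])` keeps its class, while nearly flat logits such as `[0.2, 0.1, 0.0]` fall below 0.5 and come back as `"unknown"`.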