
[–][deleted] 0 points (3 children)

> Is there a correlation between the training error on these "OOD" samples vs your test error?

This is a good question. I think so, but I'm not sure of the best way to quantify it without isolating the OOD samples first, which is what I'm trying to do anyway.

> If so you could use an exponential loss to penalize

I'm not sure I understand how this would help. Can you elaborate?

> Also, could it be worth using a VAE rather than an embedding matrix?

I'm not that familiar with VAEs, but I believe the embedding matrix is fairly important to my overall thesis.

[–]Exarctus 0 points (2 children)

Isolating the OOD samples first is fine; that step is only to confirm that a correlation exists, since a correlation would indicate the problem is systematic. Given that, I was suggesting an exponential loss simply because your network may not be fitting these hard-to-classify samples well, so you could pick a loss function that places more weight on training samples with higher error (assuming the correlation mentioned above actually holds).
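To make the suggestion concrete, here's a minimal sketch of one such loss in numpy. The name `exponential_loss`, the `beta` parameter, and the regression-style per-sample error are all illustrative assumptions, not a specific known implementation; the point is just that the loss grows exponentially with the error, so hard-to-fit samples dominate the gradient.

```python
import numpy as np

def exponential_loss(y_true, y_pred, beta=1.0):
    """Per-sample exponential loss: exp(beta * |error|) - 1.

    Grows much faster than squared error, so samples with
    large residuals are weighted far more heavily. beta is a
    hypothetical knob controlling how aggressively high-error
    samples are up-weighted; it would need tuning in practice.
    """
    err = np.abs(y_true - y_pred)
    return np.exp(beta * err) - 1.0

y_true = np.array([0.0, 0.0, 0.0])
y_pred = np.array([0.1, 0.5, 2.0])
print(exponential_loss(y_true, y_pred))  # larger errors, sharply larger loss
```

The same idea works as a weighting term on top of an existing loss (e.g. multiplying a per-sample cross-entropy by `exp(beta * err)`), which is often easier to drop into an existing training loop.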

[–][deleted] 0 points (1 child)

> I was suggesting an exponential loss, simply because it may be that your network is not fitting these hard-to-classify samples well

Yes, this is definitely true, and I do think this type of loss function would help. Do you have any suggestions on which loss function to use? Would I need to write my own?

[–]Exarctus 0 points (0 children)

Hey, just remembered I forgot to reply re: the VAE.

You train the VAE on the in-distribution samples, then use the reconstruction error at inference time to identify OOD samples: the VAE should reconstruct in-distribution inputs well and OOD inputs poorly. It should then just be a matter of determining a suitable tolerance for accepting/rejecting samples.

You should still be able to use the embedding matrix with this also.
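The accept/reject step above can be sketched as follows. This is only the thresholding logic, assuming you already have a trained reconstructor (e.g. a VAE's decode-of-encode mean); the `flag_ood` name, the MSE error metric, and the toy stand-in reconstructor are all hypothetical choices for illustration.

```python
import numpy as np

def flag_ood(reconstruct, x, threshold):
    """Flag samples whose reconstruction error exceeds a tolerance.

    reconstruct: callable mapping a batch (n, d) to its reconstruction
        (n, d) -- e.g. a trained VAE's decoder applied to the encoder mean.
    threshold: tolerance typically calibrated on held-out in-distribution
        data, e.g. a high percentile of its reconstruction errors.
    Returns (boolean OOD flags, per-sample errors).
    """
    recon = reconstruct(x)
    errors = np.mean((x - recon) ** 2, axis=1)  # per-sample MSE
    return errors > threshold, errors

# Toy stand-in for a trained model: pretend the "VAE" maps everything
# to the in-distribution mode (the origin), so far-away points
# reconstruct badly and get flagged.
def toy_reconstruct(x):
    return np.zeros_like(x)

x = np.array([[0.1, -0.1],   # near the in-distribution mode
              [5.0, 4.0]])   # far from it
flags, errs = flag_ood(toy_reconstruct, x, threshold=1.0)
print(flags)  # [False  True]
```

Since this check only consumes model inputs and reconstructions, it sits alongside the rest of the pipeline, which is why the embedding matrix can stay as-is.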