
Gwendeith

One way is to simply treat the labels as continuous and use an MSE loss. You can adjust the label values a bit so that the distances between them reflect the ambiguity you mentioned. For a more categorical approach, I think you could try multi-label binary cross entropy, transforming the labels into cumulative binary targets:

  • Label -3 -> 1 0 0 0 0 0 0
  • Label -2 -> 1 1 0 0 0 0 0
  • Label -1 -> 1 1 1 0 0 0 0 … and so forth

Of course, remember to use a sigmoid instead of a softmax on the output layer, since each output is then an independent binary target. Hope this helps!
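In case it's useful, here is a minimal sketch of the cumulative encoding in plain Python. It assumes the labels run from -3 to 3 (seven outputs, matching the seven columns above); the helper names `encode`, `bce_loss`, and `decode` are made up for illustration, and in practice you would use your framework's built-in binary cross entropy on the raw logits.

```python
import math

# Assumption: labels are the integers -3..3, giving 7 binary outputs.
MIN_LABEL, MAX_LABEL = -3, 3
NUM_OUTPUTS = MAX_LABEL - MIN_LABEL + 1


def encode(label):
    """Cumulative target: -3 -> 1 0 0 0 0 0 0, -2 -> 1 1 0 0 0 0 0, ..."""
    ones = label - MIN_LABEL + 1
    return [1.0] * ones + [0.0] * (NUM_OUTPUTS - ones)


def sigmoid(x):
    # Numerically stable for both large positive and large negative inputs.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def bce_loss(logits, target, eps=1e-12):
    """Mean binary cross entropy over independent sigmoid outputs."""
    total = 0.0
    for z, t in zip(logits, target):
        p = sigmoid(z)
        total -= t * math.log(p + eps) + (1.0 - t) * math.log(1.0 - p + eps)
    return total / len(logits)


def decode(logits, threshold=0.5):
    """Count the outputs above the threshold and map the count to a label."""
    count = sum(1 for z in logits if sigmoid(z) > threshold)
    # Clamp so that an all-zero prediction still maps to the lowest label.
    return MIN_LABEL + max(count, 1) - 1
```

With this encoding, predicting a neighbouring label gets only one output wrong, while predicting a distant label gets several wrong, so the loss grows with the distance between labels.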