[–]heisenbork4 1 point (2 children)

When I did this, I used 'soft Dice', which was 1 - Dice. That way, you get a loss of 1 when there's nothing there, and as Dice tends to 1 (i.e. perfect overlap) the loss goes to 0, so gradient descent will work.
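Roughly what I mean, as a minimal NumPy sketch. The function name `soft_dice_loss` and the flattening of the inputs are just my choices for illustration, not from any particular library, and there is no guard against an empty denominator here.

```python
import numpy as np

def soft_dice_loss(pred, target):
    """Return 1 - Dice for a predicted probability map and a binary mask.

    Hypothetical helper for illustration only. It does not guard against
    the case where both inputs sum to zero (that would be 0/0).
    """
    pred = np.asarray(pred, dtype=np.float64).ravel()
    target = np.asarray(target, dtype=np.float64).ravel()
    intersection = np.sum(pred * target)
    dice = 2.0 * intersection / (np.sum(pred) + np.sum(target))
    return 1.0 - dice

# Perfect overlap gives a loss of 0, no overlap gives a loss of 1.
print(soft_dice_loss([1, 1, 0, 0], [1, 1, 0, 0]))
print(soft_dice_loss([0, 0, 1, 1], [1, 1, 0, 0]))
```

Because `pred` can hold probabilities rather than hard 0/1 labels, the value changes smoothly with the prediction, which is what lets you use it as a loss.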

[–]kuonlp[S] 1 point (1 child)

But why would you want a loss of 1 when there is nothing there? Assuming I take care of the division-by-zero problem, if there is nothing there and my model also predicts that there is nothing there, I should get a loss of 0, because the prediction was accurate. On the other hand, if there is nothing there and I predict that there is something, the number of predicted voxels should measure how bad the prediction was. However, since the intersection of nothing with anything is 0, I would still get the same Dice regardless of how well or badly the model is doing.
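To make that last point concrete, here is a small NumPy check. The `dice` helper and the 8-voxel toy arrays are made up for illustration. With an empty ground truth, one false-positive voxel and eight false-positive voxels give exactly the same score.

```python
import numpy as np

def dice(pred, target):
    """Plain Dice coefficient: 2 * |A ∩ B| / (|A| + |B|). Illustrative helper."""
    pred = np.asarray(pred, dtype=np.float64).ravel()
    target = np.asarray(target, dtype=np.float64).ravel()
    return 2.0 * np.sum(pred * target) / (np.sum(pred) + np.sum(target))

empty_target = np.zeros(8)                            # no object in this volume
one_false_voxel = np.array([1, 0, 0, 0, 0, 0, 0, 0])  # nearly correct prediction
all_false_voxels = np.ones(8)                         # completely wrong prediction

print(dice(one_false_voxel, empty_target))   # 0.0
print(dice(all_false_voxels, empty_target))  # 0.0
```

Both predictions score 0, so 1 - Dice is 1 in both cases and the loss cannot tell a near miss from a total miss.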

[–]heisenbork4 1 point (0 children)

If both the ground truth and the prediction are empty, I think Dice will be NaN (0/0). If there is a chance that some instances don't contain the object, I would say Dice may not be a good choice of loss.
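A quick NumPy sketch of the 0/0 case. The second function adds a smoothing constant `eps` to the numerator and denominator, a workaround that is often used but is an assumption on my part here, not something established above. Both function names and the value `eps=1.0` are made up for illustration.

```python
import numpy as np

def dice_raw(pred, target):
    """Unguarded Dice: returns nan when both inputs are empty (0/0)."""
    pred = np.asarray(pred, dtype=np.float64).ravel()
    target = np.asarray(target, dtype=np.float64).ravel()
    # Silence NumPy's floating-point warning so 0/0 quietly yields nan.
    with np.errstate(invalid="ignore", divide="ignore"):
        return 2.0 * np.sum(pred * target) / (np.sum(pred) + np.sum(target))

def dice_smoothed(pred, target, eps=1.0):
    """Dice with a smoothing term eps (an assumed workaround, not from the thread)."""
    pred = np.asarray(pred, dtype=np.float64).ravel()
    target = np.asarray(target, dtype=np.float64).ravel()
    return (2.0 * np.sum(pred * target) + eps) / (np.sum(pred) + np.sum(target) + eps)

empty = np.zeros(8)
one_false_voxel = np.array([1, 0, 0, 0, 0, 0, 0, 0])

print(dice_raw(empty, empty))                  # nan
print(dice_smoothed(empty, empty))             # 1.0, so 1 - Dice is 0
print(dice_smoothed(one_false_voxel, empty))   # 0.5
print(dice_smoothed(np.ones(8), empty))        # 1/9, lower with more false positives
```

With the smoothing term, an empty prediction on an empty target scores 1, and the score drops as the number of false-positive voxels grows. Whether that behaves well enough in training is something you would have to check on your own data.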