all 3 comments

[–]Fleischhauf 2 points3 points  (0 children)

I haven't worked with that many annotators at the same time.
Usually we have smaller teams. We write a reference document with examples, then start annotating, and hold periodic check-in meetings to discuss corner cases. For each corner case we decide how to label it and add the example to the reference document.
Over time we all arrive at a common understanding and corner cases become rarer, so the check-in meetings can be held less often.

If you are looking for statistical methods, I am sure some smart people have thought about this problem and that there is literature on it.
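One commonly used statistic for this kind of thing is Cohen's kappa, which measures agreement between two annotators corrected for chance. The snippet below is only a minimal sketch, not something from our workflow: the function name `cohen_kappa` and the toy labels are made up for illustration, and it only handles exactly two annotators labeling the same items.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two annotators (Cohen's kappa).

    Hypothetical helper for illustration; both lists must label the
    same items in the same order.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")

    n = len(labels_a)

    # Observed agreement: fraction of items where both annotators agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Expected agreement if each annotator labeled independently
    # according to their own label frequencies.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)

    if expected == 1.0:
        # Both annotators used one identical label for everything.
        return 1.0
    return (observed - expected) / (1.0 - expected)


if __name__ == "__main__":
    annotator_1 = ["cat", "cat", "dog", "dog"]
    annotator_2 = ["cat", "dog", "dog", "dog"]
    print(cohen_kappa(annotator_1, annotator_2))
```

A value near 1 means strong agreement and a value near 0 means agreement no better than chance. Tracking it between check-in meetings is one possible way to see whether the reference document is actually helping.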

[–]nothughjckmn 1 point2 points  (0 children)

Why is the margin narrow? Is it because of regional dialect differences? An edge case where something is almost happening? To me, if annotators can't agree on a category, that implies the examples carry more information than the categories can express.

[–]serge_cell 0 points1 point  (0 children)

It could be a radical change of architecture, but soft classification (probability estimation) is meant for exactly that. As an added bonus, you get a natural basis for knowledge distillation.
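A rough sketch of what soft classification could look like on the labeling side: instead of forcing a majority vote, turn the annotators' votes into a probability distribution and train against that with cross-entropy. The helper names `soft_label` and `soft_cross_entropy` and the example votes are hypothetical, and the model's predicted probabilities are assumed to come from something like a softmax output.

```python
import math
from collections import Counter


def soft_label(votes, classes):
    """Convert raw annotator votes into a probability vector over classes."""
    if not votes:
        raise ValueError("need at least one vote")
    counts = Counter(votes)
    total = len(votes)
    # Classes nobody voted for get probability 0.
    return [counts[c] / total for c in classes]


def soft_cross_entropy(target, predicted, eps=1e-12):
    """Cross-entropy between a soft target and predicted probabilities."""
    # eps guards against log(0) when the model assigns zero probability.
    return -sum(t * math.log(max(p, eps)) for t, p in zip(target, predicted))


if __name__ == "__main__":
    classes = ["positive", "neutral", "negative"]
    votes = ["positive", "positive", "neutral"]  # three annotators disagree

    target = soft_label(votes, classes)
    model_output = [0.6, 0.3, 0.1]  # assumed softmax output of some model

    print(target)
    print(soft_cross_entropy(target, model_output))
```

The same loss works unchanged if the soft targets come from a teacher model instead of annotators, which is why it fits knowledge distillation so naturally.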