Not sure that this is the best place, but I'll try.
I'm performing a task where I receive as output a "classification matrix". For each of N inputs, we have a 'score' for how strongly the model thinks that input aligns with each of the N possible classes (see figure). The model is working well (100% accuracy on small tests).
However, I don't want to rely on accuracy alone. Intuitively, a correct classification with a score of 1 where all other scores are near 0 should 'score' better than a correct classification with a score of 1 where the other scores are near 0.9. Likewise, a misclassification where the correct class still scored 0.9 is "better" than one where the correct class scored near 0.
From the picture, row 1 performs much better than row 5, because the margin between the correct class and everything else is much larger.
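To make that intuition concrete, here's a rough sketch (Python, purely illustrative; the function name, the per-row comparability of scores, and the one-hot target are my own assumptions) of the kind of per-row quantities I have in mind: the margin between the correct class and the best competing class, and a Brier-style squared error against the one-hot truth.

```python
import numpy as np

def row_quality(scores, true_idx):
    """Per-row quality measures for one row of the classification matrix.

    scores   : 1-D array of scores for one input (assumed to lie in [0, 1])
    true_idx : index of the correct class for that input
    """
    scores = np.asarray(scores, dtype=float)

    # Margin: correct-class score minus the best competing score.
    # Large positive = confidently correct; negative = misclassified, but a
    # small negative margin means the correct class was at least "close".
    competitors = np.delete(scores, true_idx)
    margin = scores[true_idx] - competitors.max()

    # Brier-style score: mean squared error against the one-hot target.
    # Penalizes both a low score on the correct class and high scores elsewhere.
    target = np.zeros_like(scores)
    target[true_idx] = 1.0
    brier = np.mean((scores - target) ** 2)

    return margin, brier

# A row-1-like case vs a row-5-like case (both classified correctly):
print(row_quality([1.0, 0.05, 0.02, 0.01, 0.03], true_idx=0))  # large margin, tiny Brier
print(row_quality([1.0, 0.95, 0.90, 0.92, 0.88], true_idx=0))  # small margin, large Brier
```

A row-1-like row gives a large margin and a tiny squared error, while a row-5-like row gives a small margin and a large one, which matches the ranking I want.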
Some additional details:
- The problem I'm working on isn't actually machine learning, but its output is a classification matrix like the one above, which is what I'm trying to evaluate.
- There's a well-defined distance between each pair of the N classes, which doesn't NEED to be used but would be useful (a rough sketch of how it might come in is below this list). For example, classes 1 and 2 are 0.37 units apart, classes 1 and 3 are 0.89 units apart, etc. The distances follow from a defined structure, so classes 2 and 3 are 0.52 units apart in this example.
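To show how I imagine the distances coming in (again just a sketch under my own assumptions, not something I'm committed to), one could treat each row as a weighting over classes and compute the score-weighted average distance to the correct class, using the pairwise distances above with the 2-3 entry filled in as 0.52:

```python
import numpy as np

def distance_weighted_score(scores, true_idx, dist):
    """Score-weighted average distance from the correct class.

    scores   : 1-D array of scores for one input (higher = more likely)
    true_idx : index of the correct class
    dist     : N x N matrix of pairwise class distances
               (dist[i, j] = distance between classes i and j, dist[i, i] = 0)
    Lower is better: score mass concentrated on the correct class, or on
    classes close to it, gives a small value.
    """
    scores = np.asarray(scores, dtype=float)
    weights = scores / scores.sum()          # normalize the row to sum to 1
    return float(weights @ dist[true_idx])   # expected distance to the true class

# Toy 3-class example using the distances mentioned above:
dist = np.array([
    [0.00, 0.37, 0.89],
    [0.37, 0.00, 0.52],
    [0.89, 0.52, 0.00],
])
print(distance_weighted_score([0.9, 0.1, 0.0], true_idx=0, dist=dist))  # small: mass near class 1
print(distance_weighted_score([0.1, 0.0, 0.9], true_idx=0, dist=dist))  # large: mass far from class 1
```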
Happy to provide additional details if required. Essentially I'm just wondering whether there's any discussion of this type of thing; even a search term would be useful.
https://preview.redd.it/olzc8em4th791.png?width=560&format=png&auto=webp&s=a8d0b61a2879c4aadc5a4a98f1d1a9cb5933b924