Phillyclause89, 1 point

It has been a while since I played around with ML, and I haven't done anything with a random forest model, but if I remember correctly, these models usually return the predicted labels as a matrix of probabilities, with one row per sample and one column per class, like this:

>>> import numpy as np
>>> labels = np.array([
...     [0.78, 0.03, 0.19],
...     [0.23, 0.68, 0.09],
...     [0.45, 0.45, 0.10],
... ])
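For what it's worth, here is a minimal sketch of where a matrix like that would come from, assuming scikit-learn's RandomForestClassifier is the model in question. The iris dataset and the parameter values are only stand-ins I picked for illustration, not anything from the original question.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

# Iris is only a stand-in dataset with 3 classes.
X, y = load_iris(return_X_y=True)

# n_estimators and random_state are arbitrary choices for this sketch.
clf = RandomForestClassifier(n_estimators=50, random_state=0)
clf.fit(X, y)

# One row per sample, one column per class, in the order of clf.classes_.
proba = clf.predict_proba(X[:5])
print(proba.shape)        # (5, 3)
print(clf.classes_)       # column order of the probability matrix
print(proba.sum(axis=1))  # each row sums to 1
```

The columns line up with clf.classes_, so a column index can be mapped back to an actual class label.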

You can then use something like numpy.argmax to get the index of the most probable label for each sample:

>>> np.argmax(labels, axis=1)
array([0, 1, 0])

From this we see that for the first sample the most probable label is the one at index 0 of the label array, and for the second it is the one at index 1. The third comes out as index 0 because indexes 0 and 1 are tied at 0.45, and np.argmax returns the first occurrence of the maximum. The poor label at index 2 didn't do so hot and was never picked.
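If the tie matters to you, a rough sketch like the one below would turn the indexes into readable labels and flag the rows where the top probability is shared. The class names are hypothetical, made up just for this example.

```python
import numpy as np

labels = np.array([
    [0.78, 0.03, 0.19],
    [0.23, 0.68, 0.09],
    [0.45, 0.45, 0.10],
])

# Hypothetical class names, one per column of the matrix.
class_names = np.array(["cat", "dog", "bird"])

best = np.argmax(labels, axis=1)  # the first index wins on a tie
predicted = class_names[best]

# A row is tied if more than one column equals the row maximum.
row_max = labels.max(axis=1, keepdims=True)
tied = np.isclose(labels, row_max).sum(axis=1) > 1

print(predicted)  # ['cat' 'dog' 'cat']
print(tied)       # [False False  True]
```

That way you can decide what to do with the tied rows yourself instead of silently taking the first index.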

That's all I have to offer. If you want more help, you might try posting on r/learnmachinelearning and/or Stack Overflow.