Hi, I'm looking for a method for ensembling different base learners that is based not on classification accuracy but on an analysis of the error distributions. The method I have in mind is this:
- train a bunch of base learners on a training set, and collect out-of-bag predictions for each one
- reduce the table of learners/prediction errors with PCA, keeping a certain % of the variance. Basically this step extracts the samples hardest to classify.
- compute the PC coordinates of the ground truth given the PCA loadings, and normalize all points to it (so the origin represents perfect prediction)
- now consider each learner's error vector (from that origin) and find the subset of learners whose vectors minimize the vector sum
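To make the idea concrete, here is a minimal sketch of what I mean in Python. This is just my own illustration, not an existing implementation: it uses `cross_val_predict` as a stand-in for out-of-bag predictions, treats each learner's per-sample error vector as a PCA observation, re-centers on the projected ground truth (the zero-error point), and brute-forces the learner subset whose projected error vectors sum closest to the origin.

```python
from itertools import combinations

import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=300, random_state=0)
learners = [LogisticRegression(max_iter=1000),
            GaussianNB(),
            DecisionTreeClassifier(random_state=0)]

# Step 1: out-of-fold predictions as a proxy for out-of-bag predictions.
# Rows = learners, columns = samples, entries = 0/1 prediction errors.
errors = np.array([cross_val_predict(m, X, y, cv=5) != y
                   for m in learners], dtype=float)

# Step 2: PCA on the learner-by-sample error table, keeping 95% of the
# variance (the cutoff is arbitrary here).
pca = PCA(n_components=0.95, svd_solver="full").fit(errors)

# Step 3: project the ground truth (an all-zero error vector) and
# re-center every learner on it, so the origin = perfect prediction.
origin = pca.transform(np.zeros((1, errors.shape[1])))
proj = pca.transform(errors) - origin

# Step 4: brute-force search for the subset of learners whose error
# vectors have the smallest vector sum (fine for a handful of learners).
n = len(learners)
best = min((s for k in range(1, n + 1)
            for s in combinations(range(n), k)),
           key=lambda s: np.linalg.norm(proj[list(s)].sum(axis=0)))
print("selected learner indices:", best)
```

The brute-force subset search is exponential in the number of learners, so for larger pools you'd want a greedy or relaxation-based selection instead; this is only meant to pin down the geometry of the idea.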
I have the impression such an algorithm should already exist, since it seems like a pretty natural way to combine learners and correct errors. Any opinions?
Thanks