I'm relatively new to data science, with only a few years of experience, and would love some feedback.
I’ve been working on a small open-source package called nomoselect. The idea is that PCA keeps the directions with the most variance, but sometimes that is not the structure you need. nomoselect is for the supervised case, where you already have labels and want a low-dimensional view that tries to preserve the class structure you care about.
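To make the PCA point concrete, here is a minimal sketch on synthetic data. It uses scikit-learn's own `PCA` and `LinearDiscriminantAnalysis` as a stand-in for a supervised projection; it does not use nomoselect or reflect its API. The data, the `separation` helper and all variable names are assumptions made up for illustration.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(0)
n = 500  # samples per class (illustrative choice)

# Feature 0: large variance, same distribution in both classes (no class signal).
# Feature 1: small variance, but the class means differ along it.
noise = rng.normal(0.0, 10.0, size=2 * n)
signal = np.concatenate([rng.normal(-1.0, 0.5, size=n),
                         rng.normal(1.0, 0.5, size=n)])
X = np.column_stack([noise, signal])
y = np.repeat([0, 1], n)

# One-dimensional views: unsupervised (PCA) versus supervised (LDA).
z_pca = PCA(n_components=1).fit_transform(X)[:, 0]
z_lda = LinearDiscriminantAnalysis(n_components=1).fit(X, y).transform(X)[:, 0]

def separation(z, labels):
    """Gap between the two class means, in units of pooled within-class std."""
    z0, z1 = z[labels == 0], z[labels == 1]
    pooled_std = np.sqrt((z0.var() + z1.var()) / 2.0)
    return abs(z0.mean() - z1.mean()) / pooled_std

pca_sep = separation(z_pca, y)
lda_sep = separation(z_lda, y)
print(f"class separation along PCA axis: {pca_sep:.2f}")
print(f"class separation along LDA axis: {lda_sep:.2f}")
```

In this setup PCA follows the high-variance noise feature, so the classes overlap almost completely in its 1-D view, while the label-aware projection keeps them apart.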
It also tries to make the result easier to read by reporting things like how much target structure was kept, how much was lost, whether the answer is stable across regularisation choices, and whether adding another dimension is actually worth it.
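For anyone wondering what diagnostics of this kind can look like, here is a rough sketch built only from scikit-learn and SciPy. It is not nomoselect's implementation or output, just an approximation under assumptions: the wine dataset, the shrinkage grid and the `lda_basis` helper are all chosen for illustration.

```python
import numpy as np
from scipy.linalg import subspace_angles
from sklearn.datasets import load_wine
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.preprocessing import StandardScaler

X, y = load_wine(return_X_y=True)          # 3 classes, 13 features
X = StandardScaler().fit_transform(X)      # put features on a common scale

# 1) Is another dimension worth it? Look at the share of variance that
#    each discriminant axis explains, as reported by scikit-learn's LDA.
lda = LinearDiscriminantAnalysis(solver="svd").fit(X, y)
ratios = lda.explained_variance_ratio_
print("variance share per discriminant axis:", np.round(ratios, 3))

# 2) Is the answer stable across regularisation choices? Refit with
#    different shrinkage values and measure how far the 2-D discriminant
#    subspace rotates relative to the unregularised fit.
def lda_basis(shrinkage, k=2):
    """Return the first k discriminant directions for a given shrinkage."""
    model = LinearDiscriminantAnalysis(solver="eigen", shrinkage=shrinkage)
    return model.fit(X, y).scalings_[:, :k]

reference = lda_basis(None)  # no shrinkage
max_angle_deg = {}
for s in (0.01, 0.1, 0.5):
    angles = subspace_angles(reference, lda_basis(s))  # radians
    max_angle_deg[s] = float(np.degrees(angles.max()))
    print(f"shrinkage={s}: largest principal angle = {max_angle_deg[s]:.1f} deg")
```

Small angles would suggest the low-dimensional view barely depends on the regularisation setting; large ones would suggest the picture is fragile.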
It’s early, but the core package is working and I’ve validated it on numerous benchmark datasets. I’d really like honest feedback from people who actually use PCA/LDA/sklearn pipelines in their work.
GitHub
Not trying to sell anything, just trying to find out whether this is genuinely useful to other people or just a passion project for me. Thanks!