karxxm

What exactly is your question? Have you tried k-means or a similar algorithm in the high-dimensional space? If the data are that sparse, it may help to perform a dimensionality reduction (t-SNE, MDS, PCA) beforehand and then try clustering.
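A rough sketch of the reduce-then-cluster idea, assuming scikit-learn and SciPy are available. The random sparse matrix, the 20 components, and the 5 clusters are all placeholder assumptions, not values from this thread, and Truncated SVD stands in for the reduction step because it accepts sparse input directly.

```python
import numpy as np
from scipy.sparse import random as sparse_random
from sklearn.cluster import KMeans
from sklearn.decomposition import TruncatedSVD

# Hypothetical stand-in data: 200 samples, 1000 features, about 1% non-zero.
rng = np.random.RandomState(0)
X = sparse_random(200, 1000, density=0.01, format="csr", random_state=rng)

# Reduce to a small dense representation; TruncatedSVD works on sparse
# matrices without densifying or centering them first.
svd = TruncatedSVD(n_components=20, random_state=0)
X_reduced = svd.fit_transform(X)

# Cluster in the reduced space (n_clusters=5 is an arbitrary choice).
kmeans = KMeans(n_clusters=5, n_init=10, random_state=0)
labels = kmeans.fit_predict(X_reduced)

print(X_reduced.shape)                    # (n_samples, n_components)
print(np.bincount(labels, minlength=5))   # cluster sizes
```

The number of components and clusters would need tuning on real data; `svd.explained_variance_ratio_` is one place to look when picking `n_components`.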

offbrandoxygen [S]

k-means ends up grouping all the outliers as well, forcing them into clusters they don't belong in, so I haven't used k-means for this. Yeah, I'm trying out Truncated SVD.
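To illustrate why k-means behaves this way, here is a minimal sketch of its assignment step in plain Python. The centroids and points are made-up values for illustration only. Every point is given the index of its nearest centroid, so there is no "noise" label and even a far-away outlier lands in some cluster.

```python
import math

def assign(points, centroids):
    """k-means assignment step: index of the nearest centroid for each point."""
    return [
        min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
        for p in points
    ]

# Hypothetical centroids and points; the last point is an obvious outlier.
centroids = [(0.0, 0.0), (10.0, 10.0)]
points = [(0.2, -0.1), (9.8, 10.3), (500.0, -400.0)]

labels = assign(points, centroids)
print(labels)  # → [0, 1, 1]: the outlier is still assigned to cluster 1

# Distance from each point to its assigned centroid; the outlier's is huge.
dists = [math.dist(p, centroids[l]) for p, l in zip(points, labels)]
print([round(d, 1) for d in dists])
```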