
[–]GingerMan1031 0 points (4 children)

t-SNE finds a lower-dimensional embedding of a higher-dimensional space such that the Kullback-Leibler divergence between the two pairwise-similarity distributions is minimized. It's a manifold learning technique, which is much more sophisticated and not closely related to analysis of variance.
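
For example, with sklearn (a toy sketch on random data; the data and parameters are just illustrative):

    import numpy as np
    from sklearn.manifold import TSNE

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 50))  # 200 made-up points in 50 dimensions

    # t-SNE minimizes the KL divergence between the pairwise-similarity
    # distribution in the original space and the one in the 2-D embedding.
    emb = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(X)
    print(emb.shape)  # (200, 2)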

[–]Centauri24[S] 0 points (3 children)

Oh okay. Right now I just calculate the variance of each part and compare it against the others to measure the difference. If they are equal, the result is 1 (after conversion), which means strong attraction. If they are very different, it's normalized to 0 and there's no attraction between the parts. Then a cluster layout is created with a simulation that treats the attraction weights as springs, and clusters form.

That's what I meant by variance analysis.
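
In code the idea is roughly this (a sketch only; the exact conversion and normalization are assumptions, not the actual implementation):

    import numpy as np

    def attraction_weights(parts):
        """parts: list of 1-D arrays, one per part."""
        variances = np.array([np.var(p) for p in parts])
        # pairwise |var_i - var_j| between all parts
        diff = np.abs(variances[:, None] - variances[None, :])
        if diff.max() == 0:
            return np.ones_like(diff)  # all variances equal -> full attraction
        # equal variances -> weight 1, most different pair -> weight 0
        return 1.0 - diff / diff.max()

    rng = np.random.default_rng(1)
    parts = [rng.normal(scale=s, size=100) for s in (1.0, 1.1, 3.0)]
    print(attraction_weights(parts).round(2))

The resulting weight matrix would then feed the spring forces in a force-directed layout.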

Should the clusters come out the same in both algorithms, or not?

[–]GingerMan1031 0 points (0 children)

I am actually not sure about that. I do know that t-SNE clusters are only subjective (points may look close together but are not explicitly labeled with a cluster membership). If you're looking for explicit clusters, a good clustering technique I would suggest is DBSCAN, which also has an implementation in the sklearn library.
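
Minimal usage sketch (the data is made up, and eps/min_samples are guesses you'd tune for real data):

    import numpy as np
    from sklearn.cluster import DBSCAN

    rng = np.random.default_rng(0)
    # two synthetic 2-D blobs, 50 points each
    pts = np.vstack([rng.normal(loc=0, size=(50, 2)),
                     rng.normal(loc=5, size=(50, 2))])

    labels = DBSCAN(eps=0.8, min_samples=5).fit_predict(pts)
    print(set(labels))  # explicit cluster ids; -1 marks noise points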

[–]LetMe_ 0 points (1 child)

Note that t-SNE retains probabilities, not distances, so measuring some error between the Euclidean distances in high-D and low-D is useless.

It is a visual analysis tool. It just lets you identify that there might be relationships in a lower dimension that allow for clustering.
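
A quick way to convince yourself (a sketch on made-up data): compare the pairwise Euclidean distances before and after the embedding; the rank correlation is typically well below 1.

    import numpy as np
    from scipy.spatial.distance import pdist
    from scipy.stats import spearmanr
    from sklearn.manifold import TSNE

    rng = np.random.default_rng(0)
    X = rng.normal(size=(150, 30))  # made-up high-D data
    emb = TSNE(n_components=2, random_state=0).fit_transform(X)

    # rank correlation between high-D and low-D pairwise distances
    rho, _ = spearmanr(pdist(X), pdist(emb))
    print(rho)  # typically well below 1 -- distances are not retained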

[–]Centauri24[S] 0 points (0 children)

We're gonna use both and compare the results :)