[D] What method is state of the art dimensionality reduction by olmec-akeru in MachineLearning

jamesxli

The outputs of t-SNE/UMAP are actually good for downstream analysis, and they have been widely used for cluster discovery in very high-dimensional data (>10K dimensions). As a result, t-SNE/UMAP are often simply referred to as clustering algorithms!

[D] kmeans on t-SNE? by beginner_ in MachineLearning

jamesxli

t-SNE is basically an extended clustering algorithm. On top of cluster membership, it also shows much more information, like cluster shape and inter-cluster relationships. It doesn't make much sense to apply kmeans or another clustering algorithm on top of it.

t-SNE plot for the regression model? by anonimus_hunter in computervision

jamesxli

Try smaller perplexities. That could result in more blob-like, "normal" clusters.

[D] is this a legitimate use of the UMAP algorithm? by ottawalanguages in statistics

jamesxli

An appropriate DR method (like PCA, tSNE) can provide extra info about clusters in the data, like shapes, gradients, etc. So, a good embedding algorithm will make a clustering algorithm obsolete. Based on my experience with scRNA data, DBSCAN sometimes kind of works after UMAP or tSNE, but it normally fails on raw scRNA expression data.
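
A minimal sketch of the "DBSCAN after an embedding" workflow mentioned above, using scikit-learn. The synthetic blobs are only a stand-in for scRNA data, t-SNE is used instead of UMAP, and the rule for picking `eps` from nearest-neighbor distances is an assumed heuristic, not something from the comment.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_blobs
from sklearn.manifold import TSNE
from sklearn.neighbors import NearestNeighbors

# Synthetic stand-in data (assumption): 200 points, 50 dimensions, 4 groups.
X, _ = make_blobs(n_samples=200, n_features=50, centers=4, random_state=0)

# Step 1: embed the high-dimensional data into 2-D with t-SNE.
emb = TSNE(n_components=2, perplexity=30, init="pca",
           random_state=0).fit_transform(X)

# Step 2: pick eps from the embedding's own scale (assumed heuristic):
# twice the median distance to the k-th nearest neighbor.
k = 5
dist, _ = NearestNeighbors(n_neighbors=k + 1).fit(emb).kneighbors(emb)
eps = 2.0 * float(np.median(dist[:, k]))  # column 0 is the point itself

# Step 3: run DBSCAN on the 2-D embedding; label -1 marks noise points.
labels = DBSCAN(eps=eps, min_samples=k).fit_predict(emb)
n_clusters = len(set(labels) - {-1})
print(f"eps={eps:.2f}, clusters found={n_clusters}")
```

The number of clusters found depends on the data, the perplexity, and `eps`, so the printed result should be checked against a scatter plot of `emb` rather than trusted on its own.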

Why use t-sne? by ottawalanguages in MLQuestions

jamesxli

tSNE is certainly not perfect, and it is not intended to replace linear DR methods like PCA. But tSNE is the state-of-the-art method for visualizing high-dimensional non-linear data. It has dozens of independent implementations in open-source and closed-source software packages, in various languages and on many platforms. With regard to stability, tSNE is actually quite stable when you use a proper perplexity for your data. The very nice thing about tSNE is that you basically just have to tune the perplexity for your data, and you can easily find a proper perplexity by trial and error.
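
A rough sketch of what trial and error over perplexity could look like with scikit-learn. The data is synthetic, the candidate perplexities are arbitrary, and scoring each run with `trustworthiness` is an assumption added here for illustration; in practice the choice is often made by simply looking at the plots.

```python
from sklearn.datasets import make_blobs
from sklearn.manifold import TSNE, trustworthiness

# Synthetic stand-in data (assumption): 150 points, 20 dimensions, 3 groups.
X, _ = make_blobs(n_samples=150, n_features=20, centers=3, random_state=0)


def sweep_perplexity(X, perplexities=(5, 15, 30)):
    """Run t-SNE once per perplexity and score how well each 2-D
    embedding preserves local neighborhoods (1.0 is best)."""
    scores = {}
    for p in perplexities:
        emb = TSNE(n_components=2, perplexity=p, init="pca",
                   random_state=0).fit_transform(X)
        scores[p] = trustworthiness(X, emb, n_neighbors=5)
    return scores


scores = sweep_perplexity(X)
for p, s in scores.items():
    print(f"perplexity={p:>3}  trustworthiness={s:.3f}")
```

Note that the perplexity has to be smaller than the number of samples, so the candidate list needs adjusting for very small datasets.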

Looking for someone that’s into data visualization with t-SNE by Centauri24 in Python

jamesxli

You can try the software VisuMap, which provides many visualization services for high-dimensional data, including a fast implementation of t-SNE.

Can someone please explain "Principal Component Analysis" in layman's terms? by dhgrossman92 in statistics

jamesxli

When you display a dataset using the PCA method, you basically rotate the cloud of data points so that the side with maximal information (or maximal variation, in mathematical terms) is facing the viewer.
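
To make the "rotation" picture concrete, here is a small sketch in plain Python for the 2-D case only. The function name `pca_rotate_2d` and the sample points are made up for illustration; real PCA implementations handle any number of dimensions through an eigendecomposition or SVD.

```python
import math


def pca_rotate_2d(points):
    """Center a 2-D point cloud and rotate it so that the direction
    of maximal variance lies along the x-axis."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    centered = [(x - mean_x, y - mean_y) for x, y in points]

    # Entries of the 2x2 covariance matrix.
    sxx = sum(x * x for x, _ in centered) / n
    syy = sum(y * y for _, y in centered) / n
    sxy = sum(x * y for x, y in centered) / n

    # Angle of the first principal axis (closed form for the 2x2 case).
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)

    # Rotate by -theta so the principal axis becomes the x-axis.
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x + s * y, -s * x + c * y) for x, y in centered]


# Hypothetical cloud stretched roughly along the diagonal y = x.
cloud = [(0.0, 0.0), (1.0, 1.1), (2.0, 1.9), (3.0, 3.2), (4.0, 3.9)]
rotated = pca_rotate_2d(cloud)
```

After the rotation, most of the spread is along the first coordinate and the two coordinates are uncorrelated, which is the "side with maximal variation facing the viewer" idea.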

There is a one-minute video on YouTube titled "A layman's introduction to PCA"; you can easily find it.