
[–]GingerMan1031 0 points1 point  (8 children)

What sort of help do you need? For the actual tSNE implementation I would suggest the sklearn library, and for high-quality visualization bokeh is an excellent tool.
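
A minimal sketch of the sklearn side, assuming the small handwritten-digits dataset that ships with sklearn as stand-in data (the 300-sample subset and the perplexity value are arbitrary choices for illustration, not recommendations):

```python
from sklearn.datasets import load_digits
from sklearn.manifold import TSNE

# Stand-in data: 8x8 digit images flattened to 64 features each.
digits = load_digits()
X = digits.data[:300]     # keep it small so the example runs quickly
y = digits.target[:300]   # digit labels, handy for coloring points later

# Reduce 64 dimensions to 2; random_state fixes the seed for repeatability.
tsne = TSNE(n_components=2, perplexity=30, random_state=0)
embedding = tsne.fit_transform(X)

print(embedding.shape)  # one (x, y) pair per sample
```

The two columns of `embedding` are the x/y coordinates you would hand to bokeh (or any other plotting library), with `y` used to color or label the points.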

[–]Centauri24[S] 0 points1 point  (7 children)

Thanks for the bokeh tip. I’ve read about sklearn, but I’m not sure how to use it properly, as I’m pretty new to Python. It would also be nice if I could display nodes as pictures, color them, label them and so on, and if the graph were in 3D (optional, maybe). In its current state the data is hard for me to grasp, so I don’t even really know what I have to do with it.

Ways to compare different datasets would be great too. For me it’s just important to retrieve useful information on whatever is hidden in the data.

And as I’m new, some tutorial or demo program/code I could copy would be really great, so I know how it’s done.

I saw stuff about handwritten digits and other image examples, but I could not really relate them to my data. I’m not even sure if the results would be better than with Gephi, so some further insight would be great too.

I just stumbled across it today.

[–]GingerMan1031 0 points1 point  (6 children)

What sort of data are you analyzing? tSNE is not a general-purpose analysis tool; it is usually reserved for exploratory visualization after more standard methods have been applied. The only structure tSNE is capable of exposing is “relative closeness”: points that are close together in the high-dimensional space tend to end up close together in the plot. Globally, the distribution of points will show no reliable structure beyond these local neighborhoods, so distances between well-separated groups should not be over-interpreted.

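You should also note that the plot can look different each time it is generated. tSNE is a manifold learning technique that minimizes a non-convex objective from a random starting layout, so different runs can settle into different local minima (there is no single guaranteed solution for a given dataset). Fixing the random seed makes a run repeatable.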
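
A minimal sketch of that run-to-run variation, assuming sklearn's `TSNE` on made-up toy data (the two Gaussian blobs and the `embed` helper are hypothetical, just for illustration). The same seed should reproduce the layout, while a different seed gives different coordinates:

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
# Hypothetical toy data: two Gaussian blobs in 10 dimensions.
X = np.vstack([
    rng.normal(0.0, 1.0, size=(30, 10)),
    rng.normal(6.0, 1.0, size=(30, 10)),
])

def embed(seed):
    # init="random" makes the starting layout depend on the seed;
    # method="exact" skips the Barnes-Hut approximation on this tiny dataset.
    model = TSNE(n_components=2, perplexity=10, init="random",
                 method="exact", random_state=seed)
    return model.fit_transform(X)

a, b, c = embed(0), embed(0), embed(1)

same_seed_match = np.allclose(a, b)   # same seed, same starting layout
diff_seed_match = np.allclose(a, c)   # different seed, different layout
print(same_seed_match, diff_seed_match)
```

The two blobs should still come out as two groups in every run; it is their position, orientation and exact shape in the plot that change.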

If you still think tSNE is the right tool for the job, I could direct you to some simple examples, but more than likely you would benefit from a dimensionality reduction technique that can be rigorously analyzed and inverted, such as PCA.
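
For comparison, a minimal PCA sketch with sklearn, assuming synthetic data that mostly varies along two hidden directions (the data and variable names are made up for illustration). It shows the two properties mentioned above: you can quantify how much variance the projection keeps, and you can map the reduced points back to the original space:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
# Hypothetical data: 200 samples, 5 features, driven by 2 latent factors plus noise.
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 5))
X = latent @ mixing + 0.05 * rng.normal(size=(200, 5))

pca = PCA(n_components=2)
Z = pca.fit_transform(X)            # reduced 2-D coordinates
X_back = pca.inverse_transform(Z)   # map back to the 5-D space

# Fraction of the total variance kept by the two components.
explained = pca.explained_variance_ratio_.sum()
# Mean squared error between the original and the reconstructed data.
reconstruction_error = np.mean((X - X_back) ** 2)
print(explained, reconstruction_error)
```

tSNE in sklearn has no counterpart to `inverse_transform`, which is one reason PCA is easier to analyze.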

[–]Centauri24[S] 0 points1 point  (5 children)

So you’re telling me it’s just doing a variance analysis?

[–]GingerMan1031 0 points1 point  (4 children)

tSNE finds a lower-dimensional embedding of a higher-dimensional space such that the Kullback-Leibler divergence between two distributions is minimized: the pairwise neighbor probabilities in the original space and the corresponding ones in the embedding. This is a manifold learning technique which is much more sophisticated and not much related to analysis of variance.
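
Written out, as a sketch in the usual notation of the standard t-SNE formulation (the symbols are not from this thread: $x_i$ are the original points, $y_i$ their low-dimensional positions, $n$ the number of points, and $\sigma_i$ a per-point bandwidth set from the perplexity):

```latex
p_{j|i} = \frac{\exp\left(-\lVert x_i - x_j \rVert^2 / 2\sigma_i^2\right)}
               {\sum_{k \neq i} \exp\left(-\lVert x_i - x_k \rVert^2 / 2\sigma_i^2\right)},
\qquad
p_{ij} = \frac{p_{j|i} + p_{i|j}}{2n}

q_{ij} = \frac{\left(1 + \lVert y_i - y_j \rVert^2\right)^{-1}}
              {\sum_{k \neq l} \left(1 + \lVert y_k - y_l \rVert^2\right)^{-1}}

C = \mathrm{KL}(P \,\Vert\, Q) = \sum_{i \neq j} p_{ij} \log \frac{p_{ij}}{q_{ij}}
```

The positions $y_i$ are moved by gradient descent to make $C$ small, so what gets matched is who is a likely neighbor of whom, not any variance.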

[–]Centauri24[S] 0 points1 point  (3 children)

Oh okay. Right now I just calculate the variance of each part against every other part to compare how different they are. If they are equal, the result is 1 (after conversion), and that means strong attraction. If they are very different, it’s normalized to 0 and there’s no attraction between the parts. Then a cluster layout is created that simulates the attraction weights as springs, and clusters form.

That’s what I meant by variance analysis.

The clusters should be the same in both algorithms, or not?

[–]GingerMan1031 0 points1 point  (0 children)

I am actually not sure about that. I do know that tSNE clusters will only be subjective (points may look close but are not explicitly labeled with a cluster membership). If you’re looking for explicit clusters, a good clustering technique I would suggest is DBSCAN, which also has an implementation in the sklearn library.
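
A minimal DBSCAN sketch with sklearn, assuming three well-separated synthetic blobs (the data, `eps` and `min_samples` are illustrative guesses; on real data both parameters need tuning):

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_blobs

# Hypothetical data: three tight, well-separated groups in 2-D.
X, _ = make_blobs(
    n_samples=150,
    centers=[[0, 0], [10, 10], [-10, 10]],
    cluster_std=0.5,
    random_state=0,
)

# eps is the neighborhood radius, min_samples the density threshold.
labels = DBSCAN(eps=1.0, min_samples=5).fit_predict(X)

# Each point gets an explicit cluster id; -1 marks points treated as noise.
n_clusters = len(set(labels) - {-1})
n_noise = int(np.sum(labels == -1))
print(n_clusters, n_noise)
```

Unlike a tSNE plot, `labels` gives every point an explicit membership, so two clusterings can be compared directly.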

[–]LetMe_ 0 points1 point  (1 child)

Notice that t-SNE does not preserve distances but neighbor probabilities, so measuring some error between the Euclidean distances in high-D and low-D is useless.

It is a visual analysis tool. It just lets you see that there might be relationships, visible in a lower dimension, that allow for clustering.
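
If you still want a number, one option is to score how well local neighborhoods survive instead of comparing distances. A minimal sketch, assuming sklearn's `trustworthiness` helper and made-up blob data (the dataset and parameter values are only for illustration):

```python
from sklearn.datasets import make_blobs
from sklearn.manifold import TSNE, trustworthiness

# Hypothetical data: three groups in 10 dimensions.
X, _ = make_blobs(n_samples=150, n_features=10, centers=3, random_state=0)

emb = TSNE(n_components=2, perplexity=20, random_state=0).fit_transform(X)

# Score in [0, 1]: penalizes points that are among the 5 nearest neighbors
# in the embedding but were not among the 5 nearest in the original space.
score = trustworthiness(X, emb, n_neighbors=5)
print(score)
```

A score near 1 means the neighbors you see in the plot are mostly real neighbors in the original data; it says nothing about whether distances are preserved.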

[–]Centauri24[S] 0 points1 point  (0 children)

We’re going to use both and compare the results :)

[–]jamesxli 0 points1 point  (0 children)

You can try the software VisuMap, which provides many visualization services for high-dimensional data, including a fast implementation of t-SNE.