a quick look at how t-SNE uses random walks on graphs to compute affinities
code verifying that the gradient of the KL divergence computed by PyTorch's automatic differentiation agrees with the formula in the t-SNE paper
an animation of t-SNE on 2500 MNIST digits
an animated Bokeh app
a close reading of a simple Python implementation of t-SNE
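
To give a flavour of the gradient check listed above, here is a minimal sketch that uses only the Python standard library. It is not the PyTorch code itself: automatic differentiation is swapped for central finite differences, and all function names, the toy sizes, and the random inputs are assumptions made for illustration. It compares the closed-form gradient from the t-SNE paper, dC/dy_i = 4 * sum_j (p_ij - q_ij) (y_i - y_j) / (1 + |y_i - y_j|^2), against a numerical derivative of the KL divergence.

```python
import math
import random


def low_dim_affinities(Y):
    """Return (W, Q): Student-t weights w_ij = 1/(1 + |y_i - y_j|^2) and Q = W / sum(W)."""
    n = len(Y)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                d2 = sum((a - b) ** 2 for a, b in zip(Y[i], Y[j]))
                W[i][j] = 1.0 / (1.0 + d2)
    Z = sum(sum(row) for row in W)
    Q = [[w / Z for w in row] for row in W]
    return W, Q


def kl_divergence(P, Y):
    """KL(P || Q) summed over all ordered pairs i != j."""
    _, Q = low_dim_affinities(Y)
    n = len(Y)
    return sum(P[i][j] * math.log(P[i][j] / Q[i][j])
               for i in range(n) for j in range(n) if i != j)


def analytic_gradient(P, Y):
    """Closed-form gradient of the KL divergence from the t-SNE paper."""
    W, Q = low_dim_affinities(Y)
    n, dim = len(Y), len(Y[0])
    G = [[0.0] * dim for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                c = 4.0 * (P[i][j] - Q[i][j]) * W[i][j]
                for k in range(dim):
                    G[i][k] += c * (Y[i][k] - Y[j][k])
    return G


def numeric_gradient(P, Y, eps=1e-5):
    """Central finite differences; a stand-in for autograd in this sketch."""
    n, dim = len(Y), len(Y[0])
    G = [[0.0] * dim for _ in range(n)]
    for i in range(n):
        for k in range(dim):
            old = Y[i][k]
            Y[i][k] = old + eps
            up = kl_divergence(P, Y)
            Y[i][k] = old - eps
            down = kl_divergence(P, Y)
            Y[i][k] = old  # restore the coordinate
            G[i][k] = (up - down) / (2.0 * eps)
    return G


# Toy problem (sizes and inputs are arbitrary): 6 points embedded in 2 dimensions.
random.seed(0)
n, dim = 6, 2
R = [[random.random() + 0.1 for _ in range(n)] for _ in range(n)]
# Symmetric, strictly positive off the diagonal, normalised to sum to 1.
P = [[0.0 if i == j else R[i][j] + R[j][i] for j in range(n)] for i in range(n)]
total = sum(sum(row) for row in P)
P = [[p / total for p in row] for row in P]
Y = [[random.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n)]

G_formula = analytic_gradient(P, Y)
G_numeric = numeric_gradient(P, Y)
max_err = max(abs(a - b)
              for row_a, row_b in zip(G_formula, G_numeric)
              for a, b in zip(row_a, row_b))
print("largest disagreement:", max_err)
```

The same comparison with PyTorch would replace `numeric_gradient` by a backward pass through the KL divergence; the finite-difference version is used here only so the sketch runs without extra dependencies.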