Animating Random Projections of High Dimensional Data by Andreas Mueller.
From the post:
Recently Jake showed some pretty cool videos in his blog.
This inspired me to go back to an idea I had some time ago, about visualizing high-dimensional data via random projections.
I love to do exploratory data analysis with scikit-learn, using the manifold, decomposition and clustering modules. But in the end, I can only look at two (or three) dimensions. And I really like to see what I am doing.
So I go and look at the first two PCA directions, then at the first and third, then at the second and third… and so on. That is a bit tedious and looking at more would be great. For example using time.
There is software out there, called ggobi, which does a pretty good job at visualizing high dimensional data sets. It is possible to take interactive tours of your high dimensions, set projection angles and whatnot. It has a UI and tons of settings.
I used it a couple of times and I really like it. But it doesn’t really fit into my usual workflow. It has good R integration, but no Python integration that I know of. And it also seems a bit overkill for “just looking around a bit”.
It’s hard to overestimate the value of “just looking around a bit.”
As opposed to defending a fixed opinion about data, data structures, or processing.
Who knows?
Practice at “just looking around a bit” may make your opinions less fixed.
Chance you will have to take.
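If you want to try “just looking around a bit” before reading the full post, here is a minimal sketch of the random projection idea. It is not Andreas's code: the function names (`random_2d_basis`, `project`, `frames`) are made up for illustration, and it uses only the Python standard library so it runs anywhere. Each frame picks a fresh random orthonormal pair of directions and projects every point onto it. A smooth animation would presumably interpolate between successive bases rather than jump, which this sketch does not do.

```python
import math
import random


def random_2d_basis(n_dims, rng):
    """Return two orthonormal vectors in n_dims dimensions (n_dims >= 2).

    Hypothetical helper: draws two Gaussian vectors and applies one
    Gram-Schmidt step so the second is orthogonal to the first.
    """
    u = [rng.gauss(0.0, 1.0) for _ in range(n_dims)]
    v = [rng.gauss(0.0, 1.0) for _ in range(n_dims)]

    # Normalize the first direction.
    norm_u = math.sqrt(sum(x * x for x in u))
    u = [x / norm_u for x in u]

    # Remove the component of v along u, then normalize what is left.
    dot_uv = sum(a * b for a, b in zip(u, v))
    v = [b - dot_uv * a for a, b in zip(u, v)]
    norm_v = math.sqrt(sum(x * x for x in v))
    v = [x / norm_v for x in v]
    return u, v


def project(points, basis):
    """Project each high-dimensional point onto the 2-D plane spanned by basis."""
    u, v = basis
    return [
        (sum(p * a for p, a in zip(point, u)),
         sum(p * b for p, b in zip(point, v)))
        for point in points
    ]


def frames(points, n_frames, seed=0):
    """Yield n_frames independent random 2-D views of the same points."""
    rng = random.Random(seed)
    n_dims = len(points[0])
    for _ in range(n_frames):
        yield project(points, random_2d_basis(n_dims, rng))
```

Each item yielded by `frames` is a list of `(x, y)` pairs, one per input point, that you can hand to whatever scatter plot you normally use. Redrawing the plot once per frame gives a crude version of looking at more dimensions “using time.”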