imMens: Real-Time Interactive Visual Exploration of Big Data by Zhicheng Liu.
From the post:
Interactive visualization of large datasets is key in making big data technologies accessible to a wide range of data users. However, as datasets expand in size, they challenge traditional methods of interactive visual analysis, forcing data analysts and enthusiasts to spend more time on “data munging” and less time on analysis. Or to abandon certain analyses altogether.
At the Stanford Visualization Group, as part of the Intel Science and Technology Center for Big Data, we are developing imMens, a system that enables real-time interaction of billion+ element databases by using scalable visual summaries. The scalable visual representations are based on binned aggregation and support a variety of data types: ordinal, numeric, temporal and geographic. To achieve interactive brushing & linking between the visualizations, imMens precomputes multivariate data projections and stores these as data tiles. The browser-based front-end dynamically loads appropriate data tiles and uses WebGL to perform data processing and rendering on the GPU.
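The data-tile idea in the paragraph above can be illustrated with a small sketch: precompute binned counts for pairs of dimensions so that brushing & linking only ever touches compact aggregate tables, never the raw records. This is a conceptual illustration only, with made-up dimension names and binning; it is not the imMens tile format or its WebGL pipeline.

```python
# Sketch of precomputed "data tiles": one 2-D bin-count table per pair
# of dimensions. Brushing a range in one dimension is answered from the
# tile alone, without revisiting the raw data. Illustrative only.
from collections import Counter
from itertools import combinations

def bin_index(value, lo, width):
    """Map a raw value to its fixed-width bin index."""
    return int((value - lo) // width)

def precompute_tiles(records, dims):
    """dims maps a dimension name to (lo, bin_width). Returns one
    sparse 2-D count table (a 'tile') per pair of dimensions."""
    tiles = {}
    for d1, d2 in combinations(dims, 2):
        counts = Counter()
        for rec in records:
            i = bin_index(rec[d1], *dims[d1])
            j = bin_index(rec[d2], *dims[d2])
            counts[(i, j)] += 1
        tiles[(d1, d2)] = counts
    return tiles

def brushed_histogram(tile, selected_bins):
    """Histogram of the second dimension, restricted to a brushed set
    of bins in the first dimension, computed from the tile alone."""
    hist = Counter()
    for (i, j), n in tile.items():
        if i in selected_bins:
            hist[j] += n
    return hist

# Tiny example: two numeric dimensions, each binned with width 10.
records = [{"x": 5, "y": 12}, {"x": 15, "y": 18}, {"x": 7, "y": 31}]
tiles = precompute_tiles(records, {"x": (0, 10), "y": (0, 10)})
# y-histogram for records whose x falls in bin 0, i.e. x in [0, 10)
print(brushed_histogram(tiles[("x", "y")], {0}))
```

The point of the precomputation is that the brushing query runs over the tile (whose size depends only on the bin counts), not over the billion-element dataset.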
The first challenge we faced in designing imMens was how to make visualizations with a huge number of data points interpretable. Over-plotting is a typical problem even with thousands of data points. We considered various data reduction techniques. Sampling, for example, picks a subset of the data, but is still prone to visual cluttering. More importantly, sampling can miss interesting patterns and outliers. Another idea is binned aggregation: we define bins over each dimension, count the number of data points falling within each bin, and then visualize the density of data distribution using histograms or heatmaps. Binned aggregation can give a complete overview of the data without omitting local features such as outliers.
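The contrast drawn above, binned aggregation versus sampling, can be sketched in a few lines. The bin layout and synthetic data here are invented for illustration: binning yields a fixed-size summary that still registers the lone outlier, while a small random sample will usually drop it.

```python
# Sketch of binned aggregation: count how many values fall in each
# fixed-width bin, so the visual summary has bounded size no matter
# how many raw points there are. Illustrative data and bin widths.
import random

def binned_counts(values, lo, hi, n_bins):
    """Return per-bin counts for values in [lo, hi)."""
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        if lo <= v < hi:
            counts[int((v - lo) / width)] += 1
    return counts

# 100,000 points near 10, plus a single outlier at 95.
values = [10 + random.random() for _ in range(100_000)] + [95.0]
counts = binned_counts(values, 0, 100, 10)
print(counts[1], counts[9])  # all the mass in [10,20), a count of 1 in [90,100)

# A 100-point random sample, by contrast, will usually miss the outlier.
sample = random.sample(values, 100)
print(95.0 in sample)
```

The outlier survives binning because every record contributes to some bin count; it survives sampling only with probability proportional to the sample rate.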
(…)
If you want to know more about imMens, we encourage you to visit the project website, which showcases our EuroVis ’13 paper, video and online demos.
imMens will be released on Github soon. Stay tuned!
Bearing in mind that these are pre-computed data tiles along only a few projections, the video is still a rocking demonstration of interactivity.
Or to put it another way, the interactivity is “real-time” but the data processing to support the interactivity is not.
Not a criticism but an observation. An observation that should make you ask which data projections have been computed and which ones have not.
The answers you get, and their reliability, will depend upon the choices that were made and on the data that was omitted and thus never displayed by the interface.
Still, the video makes me wonder what interactive merging would be like along a similar number of axes.
Are pre-computed data projections in your topic map future?