AggreSet: Rich and Scalable Set Exploration using Visualizations of Element Aggregations by M. Adil Yalçın, Niklas Elmqvist, and Benjamin B. Bederson.
Datasets commonly include multi-value (set-typed) attributes that describe set memberships over elements, such as genres per movie or courses taken per student. Set-typed attributes describe rich relations across elements, sets, and the set intersections. Increasing the number of sets results in a combinatorial growth of relations and creates scalability challenges. Exploratory tasks (e.g. selection, comparison) have commonly been designed in separation for set-typed attributes, which reduces interface consistency. To improve on scalability and to support rich, contextual exploration of set-typed data, we present AggreSet. AggreSet creates aggregations for each data dimension: sets, set-degrees, set-pair intersections, and other attributes. It visualizes the element count per aggregate using a matrix plot for set-pair intersections, and histograms for set lists, set-degrees and other attributes. Its non-overlapping visual design is scalable to numerous and large sets. AggreSet supports selection, filtering, and comparison as core exploratory tasks. It allows analysis of set relations including subsets, disjoint sets and set intersection strength, and also features perceptual set ordering for detecting patterns in set matrices. Its interaction is designed for rich and rapid data exploration. We demonstrate results on a wide range of datasets from different domains with varying characteristics, and report on expert reviews and a case study using student enrollment and degree data with assistant deans at a major public university.
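To make the aggregations named in the abstract concrete, here is a minimal Python sketch of the three set-related counts: elements per set, elements per set-degree (how many sets an element belongs to), and elements per set-pair intersection. It is not AggreSet's code. The sample movies and the `aggregate` function are made up for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical sample data: each element (a movie) maps to its set
# memberships (genres). None of this comes from the AggreSet datasets.
movies = {
    "Alien": {"Horror", "Sci-Fi"},
    "Blade Runner": {"Sci-Fi", "Thriller"},
    "Se7en": {"Thriller"},
    "The Thing": {"Horror", "Sci-Fi", "Thriller"},
}


def aggregate(elements):
    """Count elements per set, per set-degree, and per set pair."""
    set_counts = Counter()     # set name -> number of elements in the set
    degree_counts = Counter()  # degree -> number of elements with that many sets
    pair_counts = Counter()    # (set A, set B) -> size of their intersection
    for memberships in elements.values():
        set_counts.update(memberships)
        degree_counts[len(memberships)] += 1
        # Sorting gives each pair a single canonical key.
        pair_counts.update(combinations(sorted(memberships), 2))
    return set_counts, degree_counts, pair_counts


set_counts, degree_counts, pair_counts = aggregate(movies)
```

The first two counters correspond to what the abstract describes as histograms, and the pair counter to the cells of the set-pair matrix. A pair that never appears in `pair_counts` is a disjoint pair of sets.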
These two videos will give you a better overview of AggreSet than I can. The first one is about 30 seconds and the second one about 5 minutes.
The visualization of characters from Les Misérables (the second video) is a dynamite demonstration of how you could explore pre-topic map data with an eye towards creating roles and associations between characters as well as with the text.
The first use case that comes to mind is harvesting the fan posts on Harry Potter and crossing them with a listing of characters from the Harry Potter book series, capturing author, date, book, character, and similar relationships.
While you are at the GitHub site: https://github.com/adilyalcin/Keshif/tree/master/AggreSet, be sure to bounce up a level to Keshif:
Keshif is a web-based tool that lets you browse and understand datasets easily.
To start using Keshif:
- Get the source code from github,
- Explore the existing datasets and their source codes, and
- Check out the wiki.
Or just go directly to the Keshif site, with 110 datasets (as of today).
For the impatient, see Loading Data.
For the even more impatient:
You can load data to Keshif from:
- Google Sheets
- Text File
  - On Google Drive
  - On Dropbox
  - File on your webserver
Text File Types
Keshif can be used with the following data file types:
- CSV / TSV
Hint: The dataset explorer at the frontpage indexes demos by file type and resource. Filter by data source to find example source code on how to apply a specific file loading approach.
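If your data starts life as a CSV file, the set-typed attribute usually has to be packed into a single column. The Python sketch below shows one way to read such a file back into sets before exploring it. The column names, the `|` delimiter, and the `load_set_column` helper are all assumptions for illustration. How Keshif itself expects multi-value columns to be encoded is covered in its own Loading Data documentation, not here.

```python
import csv
import io

# Hypothetical CSV content; the "genres" column packs several values
# into one cell, separated by "|" (an assumed convention).
SAMPLE_CSV = """title,year,genres
Alien,1979,Horror|Sci-Fi
Se7en,1995,Thriller
Clerks,1994,
"""


def load_set_column(text, column, delimiter="|"):
    """Parse CSV text and turn one column into a set of values per row."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        raw = row[column] or ""
        # Drop empty fragments so a blank cell becomes an empty set.
        row[column] = {v.strip() for v in raw.split(delimiter) if v.strip()}
        rows.append(row)
    return rows


rows = load_set_column(SAMPLE_CSV, "genres")
```

Each row comes back as a dictionary whose `genres` entry is a Python set, so a row with a blank cell has degree zero rather than a single empty-string membership.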
The critical factor, in addition to its obvious usefulness, is that it works in a web browser. You don’t have to install software, set Java paths, download additional libraries, etc.
Are you using the modern web browser as your target for user-facing topic map applications?
I first saw this in a tweet by Christophe Lalanne.