Archive for the ‘Venn Diagrams’ Category

The 3 Vs of Big Data revisited: Venn diagrams and visualization

Monday, June 3rd, 2013

The 3 Vs of Big Data revisited: Venn diagrams and visualization by Vincent Granville.

From the post:

This discussion is about visualization. The three Vs of big data (volume, velocity, variety) or the three skills that make a data scientist (hacking, statistics, domain expertise) are typically visualized using a Venn diagram, representing all the potential 8 combinations through set intersections. In the case of big data, I believe (visualization, veracity, value) are more important than (volume, velocity, variety), but that’s another issue. Except that one of my Vs is visualization and all these Venn diagrams are visually wrong: the color at the intersection of two sets should be the blending of both colors of the parent sets, for easy interpretation and easy generalization to 4 or more sets. For instance, if we have three sets A, B, C painted respectively in red, green, blue, the intersection of A and B should be yellow, the intersection of the three should be white.
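The blending rule in the quote can be sketched in a few lines. This is a minimal illustration, assuming "blending" means additive RGB mixing, which is the reading consistent with the quoted example (red plus green gives yellow, all three give white). The names `BASE_COLORS`, `blend`, and `region_colors` are hypothetical and not taken from Vincent's post.

```python
from itertools import combinations

# Assumed base colors for the three sets, matching the quoted example:
# A is red, B is green, C is blue.
BASE_COLORS = {"A": (255, 0, 0), "B": (0, 255, 0), "C": (0, 0, 255)}


def blend(sets):
    """Additively mix the base colors of the given sets, clamping at 255."""
    r = g = b = 0
    for name in sets:
        cr, cg, cb = BASE_COLORS[name]
        r, g, b = min(r + cr, 255), min(g + cg, 255), min(b + cb, 255)
    return (r, g, b)


def region_colors():
    """Return a color for each of the 2**3 = 8 membership combinations."""
    names = sorted(BASE_COLORS)
    return {
        combo: blend(combo)
        for size in range(len(names) + 1)
        for combo in combinations(names, size)
    }


if __name__ == "__main__":
    for combo, color in region_colors().items():
        label = " & ".join(combo) or "(outside all sets)"
        print(f"{label}: #{color[0]:02x}{color[1]:02x}{color[2]:02x}")
```

Under this assumption the region outside all three sets comes out black, since nothing is added to it. With pure primaries, a fourth set has no distinct channel left, so generalizing to four or more sets would need some other blending scheme.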

Sorry to disappoint fans of the “3 Vs of Big Data,” but as Vincent points out, there are at least six (6). (Probably more. Post your suggestions.)

It is a helpful review of Venn diagrams until Vincent says:

For most people, the brain has a hard time quickly processing more than 4 dimensions at once, and this should be kept in mind when producing visualizations. Beyond 5 dimensions, any additional dimension probably makes your visual less and less useful for value extraction, unless you are a real artist!

I don’t think four dimensions is going to be easy:


[Image: 3D projection of a tesseract undergoing a simple rotation in four-dimensional space.]

CXAIR Dynamic Venn Diagrams

Monday, July 11th, 2011

CXAIR Dynamic Venn Diagrams

From the description:

Brief overview of how Venn diagrams work in CXAIR. The Venn’s are created on the fly and allow you to find relationships in your data in an extremely visual and easy to use way.

Take the five (5) minutes it will take to watch this video.

This is an example of a good interface. Good, not great.

Works well with crisp data and sharp boundaries. Fuzzy or uncertain data, perhaps not as good.

Still, there is a lot of data with (alleged anyway) sharp boundaries. CXAIR is a good tool for exploring such data in order to create a topic map.

See: for more details.

Research question: To what extent is the overlapping of properties the specification of complex identities? In the demonstration, it works to identify collective subjects (a subject that is a collection of subjects). It should also work for singleton subjects (depending on your definition).

Would it be helpful to have associations with other subjects displayed while constructing an identification for a singleton subject? How would you decide which ones to display?