Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

February 24, 2014

Word Storms:…

Filed under: Text Analytics,Text Mining,Visualization,Word Cloud — Patrick Durusau @ 1:58 pm

Word Storms: Multiples of Word Clouds for Visual Comparison of Documents by Quim Castellà and Charles Sutton.

Abstract:

Word clouds are popular for visualizing documents, but are not as useful for comparing documents, because identical words are not presented consistently across different clouds. We introduce the concept of word storms, a visualization tool for analyzing corpora of documents. A word storm is a group of word clouds, in which each cloud represents a single document, juxtaposed to allow the viewer to compare and contrast the documents. We present a novel algorithm that creates a coordinated word storm, in which words that appear in multiple documents are placed in the same location, using the same color and orientation, across clouds. This ensures that similar documents are represented by similar-looking word clouds, making them easier to compare and contrast visually. We evaluate the algorithm using an automatic evaluation based on document classification, and a user study. The results confirm that a coordinated word storm allows for better visual comparison of documents.
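To make the coordination idea concrete, here is a minimal sketch in Python. It is not the authors' algorithm: the abstract does not say how their layout is computed, so this stand-in derives each word's color, orientation and grid slot from a hash of the word alone, which guarantees that a shared word looks the same in every cloud. The names `shared_style`, `coordinated_storm`, `PALETTE` and the 16x16 grid are all made up for illustration, and the sketch ignores overlapping words entirely.

```python
import hashlib

# Hypothetical palette and orientations, chosen only for illustration.
PALETTE = ["#1b9e77", "#d95f02", "#7570b3", "#e7298a", "#66a61e"]
ORIENTATIONS = [0, 90]  # degrees


def shared_style(word):
    """Derive color, angle and grid slot from the word alone.

    Because nothing here depends on the document, the same word gets
    the same style in every cloud. (Assumption: a hash stands in for
    the paper's layout algorithm, which this sketch does not reproduce.)
    """
    digest = hashlib.sha256(word.encode("utf-8")).digest()
    return {
        "color": PALETTE[digest[0] % len(PALETTE)],
        "angle": ORIENTATIONS[digest[1] % len(ORIENTATIONS)],
        "position": (digest[2] % 16, digest[3] % 16),  # slot on a 16x16 grid
    }


def coordinated_storm(documents):
    """Build one cloud per document: word -> shared style plus local size."""
    storm = {}
    for name, text in documents.items():
        counts = {}
        for word in text.lower().split():
            counts[word] = counts.get(word, 0) + 1
        # Only the size (word frequency) varies from document to document.
        storm[name] = {
            word: {**shared_style(word), "size": count}
            for word, count in counts.items()
        }
    return storm


docs = {
    "a": "topic maps merge topic",
    "b": "word clouds and topic maps",
}
storm = coordinated_storm(docs)
```

In this toy version, "topic" appears in both clouds with identical color, angle and position, and only its size differs (2 in document "a", 1 in document "b"). That is the property the abstract is after: documents that share vocabulary end up with clouds that share appearance.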

I never have cared for word clouds all that much, but word storms as presented by the authors look quite useful.

The paper examines the use of word storms at the corpus, document and single-document levels.

You will find Word Storms: Multiples of Word Clouds for Visual Comparison of Documents (website) of particular interest, including its link to GitHub for the source code used in this project.

Of particular interest to topic mappers is the observation:

similar documents should be represented by visually similar clouds (emphasis in original)

Now imagine for a moment visualizing topics and associations with “similar” appearances. Even if limited to colors that are easy to distinguish, that could be a very powerful display/discovery tool for topic maps.

Not the paper’s use case but one that comes to mind with regard to display/discovery in a heterogeneous data set (such as a corpus of documents).
