Seeing beyond reading: a survey on visual text analytics by Aretha B. Alencar, Maria Cristina F. de Oliveira, Fernando V. Paulovich. (Alencar, A. B., de Oliveira, M. C. F. and Paulovich, F. V. (2012), Seeing beyond reading: a survey on visual text analytics. WIREs Data Mining Knowl Discov, 2: 476–492. doi: 10.1002/widm.1071)
Abstract:
We review recent visualization techniques aimed at supporting tasks that require the analysis of text documents, from approaches targeted at visually summarizing the relevant content of a single document to those aimed at assisting exploratory investigation of whole collections of documents. Techniques are organized considering their target input material—either single texts or collections of texts—and their focus, which may be at displaying content, emphasizing relevant relationships, highlighting the temporal evolution of a document or collection, or helping users to handle results from a query posed to a search engine. We describe the approaches adopted by distinct techniques and briefly review the strategies they employ to obtain meaningful text models, discuss how they extract the information required to produce representative visualizations, the tasks they intend to support and the interaction issues involved, and strengths and limitations. Finally, we show a summary of techniques, highlighting their goals and distinguishing characteristics. We also briefly discuss some open problems and research directions in the fields of visual text mining and text analytics.
Papers like this one make me wish for a high resolution color printer. 😉
With three tables of representations, twenty-nine (29) entries and sixty (60) footnotes, it isn’t really possible to provide a useful summary beyond quoting the authors’ conclusion:
This survey has provided an overview of the lively field of visual text analytics. The variety of tasks and situations addressed introduces a demand for many domain-specific and/or task-oriented solutions. Nonetheless, despite the impressive number of contributions and wide variety of approaches identified in the literature, the field is still in its infancy. Deployment of existing and novel techniques to a wider audience of users performing real-life tasks remains a challenge that requires tackling multiple issues.
One issue is to foster tighter integration with traditional text mining tasks and algorithms. Various contributions are found in the literature reporting usage of visual interfaces or visualizations to support interpretation of the output of traditional text mining algorithms. Still, visualization has the potential to give users a much more active role in text mining tasks and related activities, and concrete examples of such usage are still scarce. Many rich possibilities remain open to further exploration. Better visual text analytics will also likely require more sophisticated text models, possibly integrating results and tools from research on natural language processing. Finally, providing usable tools also requires addressing several issues related to scalability, i.e., the capability of effectively handling very large text documents and textual collections.
What I can do, however, is track down the cited literature and point back to this article as the starting point for my searching.
It merits wider readership than its publisher’s access policies are likely to permit.