Scientists’ sense making when hypothesizing about disease mechanisms from expression data and their needs for visualization support by Barbara Mirel and Carsten Görg.
Abstract:
A common class of biomedical analysis is to explore expression data from high throughput experiments for the purpose of uncovering functional relationships that can lead to a hypothesis about mechanisms of a disease. We call this analysis expression driven, -omics hypothesizing. In it, scientists use interactive data visualizations and read deeply in the research literature. Little is known, however, about the actual flow of reasoning and behaviors (sense making) that scientists enact in this analysis, end-to-end. Understanding this flow is important because if bioinformatics tools are to be truly useful they must support it. Sense making models of visual analytics in other domains have been developed and used to inform the design of useful and usable tools. We believe they would be helpful in bioinformatics. To characterize the sense making involved in expression-driven, -omics hypothesizing, we conducted an in-depth observational study of one scientist as she engaged in this analysis over six months. From findings, we abstracted a preliminary sense making model. Here we describe its stages and suggest guidelines for developing visualization tools that we derived from this case. A single case cannot be generalized. But we offer our findings, sense making model and case-based tool guidelines as a first step toward increasing interest and further research in the bioinformatics field on scientists’ analytical workflows and their implications for tool design.
From the introduction:
In other domains, improvements in data visualization designs have relied on models of analysts’ actual sense making for a complex analysis [2]. A sense making model captures analysts’ cumulative, looped (not linear) “process [es] of searching for a representation and encoding data in that representation to answer task-specific questions” relevant to an open-ended problem [3]: 269. As an end-to-end flow of application-level tasks, a sense making model may portray and categorize analytical intentions, associated tasks, corresponding moves and strategies, informational inputs and outputs, and progression and iteration over time. The importance of sense making models is twofold: (1) If an analytical problem is poorly understood developers are likely to design for the wrong questions, and tool utility suffers; and (2) if developers do not have a holistic understanding of the entire analytical process, developed tools may be useful for one specific part of the process but will not integrate effectively in the overall workflow [4,5].
As the authors admit, one case isn’t enough to be generalized but their methodology, with its focus on the work flow of a scientist, is a refreshing break from imagined and/or “ideal” work flows for scientists.
Until now semantic software has followed someone’s projection of an “ideal” work flow.
The next generation of semantic software should follow the actual work flows of people working with their data.
I first saw this in a tweet by Neil Saunders