Needles in Stacks of Needles: genomics + data mining by Martin Krzywinski. (ICDM2012 Keynote)
Abstract:
In 2001, the first human genome sequence was published. Now, just over 10 years later, we are capable of sequencing a genome in just a few days. Massive parallel sequencing projects now make it possible to study the cancers of thousands of individuals. New data mining approaches are required to robustly interrogate the data for causal relationships among the inherently noisy biology. How does one identify genetic changes that are specific and causal to a disease within the rich variation that is either natural or merely correlated? The problem is one of finding a needle in a stack of needles. I will provide a non-specialist introduction to data mining methods and challenges in genomics, with a focus on the role visualization plays in the exploration of the underlying data.
This page links to the slides Martin used in his presentation.
Excellent graphics and a number of amusing points, even without the presentation itself:
Cheap Date: A fruit fly that expresses high sensitivity to alcohol.
Kenny: A fruit fly without this gene dies in two days, named for the South Park character who dies in each episode.
Ken and Barbie: Fruit flies that fail to develop external genitalia.
One observation that rings true across disciplines:
Literature is still largely composed and published opaquely.
I searched for a video recording of the presentation but came up empty.