Practical tools for exploring data and models by Hadley Alexander Wickham. (PDF)
From the introduction:
This thesis describes three families of tools for exploring data and models. It is organised in roughly the same way that you perform a data analysis. First, you get the data in a form that you can work with; Section 1.1 introduces the reshape framework for restructuring data, described fully in Chapter 2. Second, you plot the data to get a feel for what is going on; Section 1.2 introduces the layered grammar of graphics, described in Chapter 3. Third, you iterate between graphics and models to build a succinct quantitative summary of the data; Section 1.3 introduces strategies for visualising models, discussed in Chapter 4. Finally, you look back at what you have done, and contemplate what tools you need to do better in the future; Chapter 5 summarises the impact of my work and my plans for the future.
The tools developed in this thesis are firmly based in the philosophy of exploratory data analysis (Tukey, 1977). With every view of the data, we strive to be both curious and sceptical. We keep an open mind towards alternative explanations, never believing we have found the best model. Due to space limitations, the following papers only give a glimpse at this philosophy of data analysis, but it underlies all of the tools and strategies that are developed. A fuller data analysis, using many of the tools developed in this thesis, is available in Hobbs et al. (To appear).
The thesis focuses on R tools, including ggplot2, and builds on Wilkinson’s The Grammar of Graphics.
The “…never believing we have found the best model” approach works for me!
I first saw this at Data Scholars.