Acknowledging Errors in Data Quality by Jim Harris.
From the post:
The availability heuristic is a mental shortcut that occurs when people make judgments based on the ease with which examples come to mind. Although this heuristic can be beneficial, such as when it helps us recall examples of a dangerous activity to avoid, sometimes it leads to availability bias, where we’re affected more strongly by the ease of retrieval than by the content retrieved.
In his thought-provoking book “Thinking, Fast and Slow,” Daniel Kahneman explained how availability bias works by recounting an experiment where different groups of college students were asked to rate a course they had taken the previous semester by listing ways to improve the course — while varying the number of improvements that different groups were required to list.
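Jim applies the result of Kahneman’s experiment (students who had to list more improvements rated the course higher, because struggling to come up with them suggested the course was already good) to data quality issues and concludes: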
- Isolated errors – Management chooses one-time data cleaning projects.
- Ten errors – Management concludes overall data quality must not be too bad (availability heuristic).
I need to re-read Kahneman, but have you seen any suggestions for overcoming the availability heuristic?