Nuts and Bolts of Data Mining: Correlation & Scatter Plots by Tim Graettinger.
From the post:
In this article, I continue the “Nuts and Bolts of Data Mining” series. We will tackle two, intertwined tools/topics this time: correlation and scatter plots. These tools are fundamental for gauging the relationship (if any) between pairs of data elements. For instance, you might want to view the relationship between the age and income of your customers as a scatter plot. Or, you might compute a number that is the correlation between these two customer demographics. As we’ll soon see, there are good, bad, and ugly things that can happen when you apply a purely computational method like correlation. My goal is to help you avoid the usual pitfalls, so that you can use correlation and scatter plots effectively in your own work.
You will smile at the examples, but if the popular press is any indication, correlation is no laughing matter!
Tim’s post won’t turn the tide, but it is short enough to forward to the local broadside folks.
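If you want to try the idea from the quoted passage yourself, here is a minimal Python sketch of computing a correlation between two columns of numbers. It is not taken from Tim's article: the `pearson` helper, the age and income figures, and the curved example are all made up for illustration. The second data set hints at one well-known pitfall of treating correlation as purely computational: a perfect but non-linear relationship can produce a coefficient of zero.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Deviations of each value from its mean
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    cov = sum(a * b for a, b in zip(dx, dy))
    var_x = sum(a * a for a in dx)
    var_y = sum(b * b for b in dy)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical customer data, invented for this sketch.
ages = [25, 32, 41, 48, 56, 63]
incomes = [31000, 42000, 55000, 61000, 72000, 68000]
r_linear = pearson(ages, incomes)  # strongly positive for this made-up data

# A perfect relationship (y = x squared) that is not linear.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x * x for x in xs]
r_curved = pearson(xs, ys)  # zero, even though ys is fully determined by xs

print(f"age vs. income: r = {r_linear:.2f}")
print(f"x vs. x squared: r = {r_curved:.2f}")
```

A scatter plot of the second data set would show the U-shaped pattern at a glance, which is presumably why the article pairs the two tools: the number alone can hide what the picture makes obvious.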