If You Can’t See The Data, The Statistics Are False

The headline, "If You Can’t See The Data, The Statistics Are False," is my one-line summary of "73.6% of all Statistics are Made Up – How to Interpret Analyst Reports" by Mark Suster.

You should read Suster’s post in full, if for no other reason than his accounts of how statistics are created (that’s right, created) for reports:

But all of the data projections were so different so I decided to call some of the research companies and ask how they derived their data. I got the analyst who wrote one of the reports on the phone and asked how he got his projections. He must have been about 24. He said, literally, I sh*t you not, “well, my report was due and I didn’t have much time. My boss told me to look at the growth rate average over the past 3 years and increase it by 2% because mobile penetration is increasing.” There you go. As scientific as that.

I called another agency. They were more scientific. They had interviewed telecom operators, handset manufacturers and corporate buyers. They had come up with a CAGR (compounded annual growth rate) that was 3% higher than the other report, which in a few years makes a huge difference. I grilled the analyst a bit. I said, “So you interviewed the people to get a plausible story line and then just did a simple estimation of the numbers going forward?”

“Yes. Pretty much”

How many stories have you enjoyed over the past six months with “scientific” statistics like those?
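To make the "huge difference" in Suster's second story concrete, here is a minimal sketch of how a 3-point gap in CAGR compounds. The starting value, the two rates, and the five-year horizon are all hypothetical numbers picked for illustration; none of them come from Suster's post. It also assumes "3% higher" means three percentage points.

```python
def project(base, cagr, years):
    """Compound `base` forward at annual rate `cagr` for `years` years."""
    return base * (1 + cagr) ** years


# Hypothetical inputs, for illustration only (not from any report).
base = 100.0   # market size today, arbitrary units
years = 5

low = project(base, 0.10, years)    # assumed 10% CAGR
high = project(base, 0.13, years)   # same start, 3 points higher

gap = high / low - 1  # relative gap between the two projections
print(f"{low:.1f} vs {high:.1f}: {gap:.1%} apart after {years} years")
```

With these made-up inputs the two projections end up roughly 14% apart after five years, and the gap keeps widening with every additional year.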

Suster has five tips for being a more informed consumer of data, all of which require effort on your part. I suggest a simpler test:

Can you see the data for the statistic? By that I mean: are the original data, who collected it, how it was collected, when it was collected, etc., available to the reader?

If not, the statistic is either false or inflated.

The test I suggest can be applied at the point where you encounter the statistic. It puts the burden on the author: if they want their statistic to be credited, they must give the reader the means to evaluate it.

Imagine the data analyst story where the growth rate statistic had this footnote:

1. Averaged growth rate over past three (3) years and added 2% at direction of management.

The report carries the same statistic, but the footnote warns the reader that the result is a management fantasy. Might be right, might be wrong.
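The footnoted method is simple enough to write down, which is rather the point. The sketch below is one reading of it: average the past three years' growth rates and add two percentage points. The function name, the sample rates, and the interpretation of "2%" as percentage points (rather than a 2% relative increase) are all assumptions made for illustration.

```python
def projected_growth(past_rates, bump=0.02):
    """Average the past growth rates, then add a flat bump.

    Returns the projected rate together with a footnote describing
    how it was derived, so the method travels with the number.
    """
    average = sum(past_rates) / len(past_rates)
    rate = average + bump
    footnote = (
        f"Averaged growth rate over past {len(past_rates)} years "
        f"({average:.1%}) and added {bump:.0%} at direction of management."
    )
    return rate, footnote


# Hypothetical growth rates for the last three years (not real data).
past = [0.08, 0.10, 0.12]

rate, footnote = projected_growth(past)
print(f"Projected growth: {rate:.1%}")
print(f"1. {footnote}")
```

Returning the footnote alongside the rate is a deliberate choice here: a reader who sees both can judge the number for what it is.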

Patronize publications that print the underlying data alongside their statistics. Authors and publishers will get the idea soon enough.