Statistical Reporting Errors and Collaboration on Statistical Analyses in Psychological Science, by Coosje L. S. Veldkamp et al. (PLOS ONE, published December 10, 2014, DOI: 10.1371/journal.pone.0114876)
Abstract:
Statistical analysis is error prone. A best practice for researchers using statistics would therefore be to share data among co-authors, allowing double-checking of executed tasks just as co-pilots do in aviation. To document the extent to which this ‘co-piloting’ currently occurs in psychology, we surveyed the authors of 697 articles published in six top psychology journals and asked them whether they had collaborated on four aspects of analyzing data and reporting results, and whether the described data had been shared between the authors. We acquired responses for 49.6% of the articles and found that co-piloting on statistical analysis and reporting results is quite uncommon among psychologists, while data sharing among co-authors seems reasonably but not completely standard. We then used an automated procedure to study the prevalence of statistical reporting errors in the articles in our sample and examined the relationship between reporting errors and co-piloting. Overall, 63% of the articles contained at least one p-value that was inconsistent with the reported test statistic and the accompanying degrees of freedom, and 20% of the articles contained at least one p-value that was inconsistent to such a degree that it may have affected decisions about statistical significance. Overall, the probability that a given p-value was inconsistent was over 10%. Co-piloting was not found to be associated with reporting errors.
If you are relying on statistical reports from psychology publications, you need to keep the last part of that abstract firmly in mind:
Overall, 63% of the articles contained at least one p-value that was inconsistent with the reported test statistic and the accompanying degrees of freedom, and 20% of the articles contained at least one p-value that was inconsistent to such a degree that it may have affected decisions about statistical significance. Overall, the probability that a given p-value was inconsistent was over 10%. Co-piloting was not found to be associated with reporting errors.
That is an impressive error rate. Imagine your GPS reporting an incorrect location 63% of the time and your car starting only 80% of the time. You would take that as a sign that something was seriously wrong.
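To make the notion of an "inconsistent" p-value concrete, here is a minimal sketch of the kind of recomputation check the abstract describes: take the reported test statistic and degrees of freedom, recompute the p-value they imply, and compare it with the p-value printed in the article. This is only an illustration, not the authors' actual automated procedure; the assumption of a two-tailed t test, the rounding tolerance, and the function name are mine.

```python
# Sketch of a p-value consistency check (illustrative only, not the paper's tool).
# Assumes a two-tailed t test; `tol` is a rough allowance for rounding in the article.

from scipy import stats

def check_t_report(t_value, df, reported_p, alpha=0.05, tol=0.005):
    """Return (recomputed_p, inconsistent, gross_error) for a reported t test.

    inconsistent : recomputed p differs from the reported p by more than `tol`.
    gross_error  : the discrepancy changes which side of `alpha` the result falls on,
                   i.e. it could have affected the decision about statistical significance.
    """
    recomputed_p = 2 * stats.t.sf(abs(t_value), df)   # two-tailed p from |t| and df
    inconsistent = abs(recomputed_p - reported_p) > tol
    gross_error = inconsistent and ((recomputed_p <= alpha) != (reported_p <= alpha))
    return recomputed_p, inconsistent, gross_error

# Example: an article reports t(28) = 2.20, p = .02
p, bad, gross = check_t_report(t_value=2.20, df=28, reported_p=0.02)
print(f"recomputed p = {p:.4f}, inconsistent = {bad}, gross error = {gross}")
```

Run over every test reported in an article, a check like this is what produces the "at least one inconsistent p-value" and "may have affected decisions about statistical significance" figures quoted above.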
Not an amazing result, considering reports of contamination in genome studies and bad HR data, not to mention that only 6% of landmark cancer research projects could be replicated.
At the root of the problem are people. People just like you and me.
People who did not follow (or in some cases record) a well-defined process that included independent verification of the results they obtained.
Independent verification is never free, but then neither are the consequences of errors. Choose carefully.