Archive for the ‘Replication’ Category

Is Failing to Attempt to Replicate, “Just Part of the Whole Science Deal”?

Tuesday, February 16th, 2016

Genomeweb posted this summary of Stuart Firestein’s op-ed on failure to replicate:

Failure to replicate experiments is just part of the scientific process, writes Stuart Firestein, author and former chair of the biology department at Columbia University, in the Los Angeles Times. The recent worries over a reproducibility crisis in science are overblown, he adds.

“Science would be in a crisis if it weren’t failing most of the time,” Firestein writes. “Science is full of wrong turns, unconsidered outcomes, omissions and, of course, occasional facts.”

Failures to repeat experiments and the struggle to figure out what went wrong has also fed a number of discoveries, he says. For instance, in 1921, biologist Otto Loewi studied beating hearts from frogs in saline baths, one with the vagus nerve removed and one with it still intact. When the solution from the heart with the nerve still there was added to the other bath, that heart also slowed, suggesting that the nerve secreted a chemical that slowed the contractions.

However, Firestein notes Loewi and other researchers had trouble replicating the results for nearly six years. But that led the researchers to find that seasons can affect physiology and that temperature can affect enzyme function: Loewi’s first experiment was conducted at night and in the winter, while the follow-up ones were done during the day in heated buildings or on warmer days. This, he adds, also contributed to the understanding of how synapses fire, a finding for which Loewi shared the 1936 Nobel Prize.

“Replication is part of [the scientific] process, as open to failure as any other step,” Firestein adds. “The mistake is to think that any published paper or journal article is the end of the story and a statement of incontrovertible truth. It is a progress report.”

You will need to read Firestein’s comments in full: just part of the scientific process, to appreciate my concerns.

For example, Firestein says:


Absolutely not. Science is doing what it always has done — failing at a reasonable rate and being corrected. Replication should never be 100%. Science works beyond the edge of what is known, using new, complex and untested techniques. It should surprise no one that things occasionally come out wrong, even though everything looks correct at first.

I don’t know, would you say an 85% failure to replicate rate is significant? Drug development: Raise standards for preclinical cancer research, C. Glenn Begley & Lee M. Ellis, Nature 483, 531–533 (29 March 2012) doi:10.1038/483531a. Or over half of psychology studies? Over half of psychology studies fail reproducibility test. just to name two studies on replication.

I think we can agree with Firestein that replication isn’t at 100% but at what level are the attempts to replicate?

From what Firestein says,

“Replication is part of [the scientific] process, as open to failure as any other step,” Firestein adds. “The mistake is to think that any published paper or journal article is the end of the story and a statement of incontrovertible truth. It is a progress report.”

Systematic attempts at replication (and its failure) should be part and parcel of science.

Except…, that it’s obviously not.

If it were, there would have been no earth shaking announcements that fundamental cancer research experiments could not be replicated.

Failures to replicate would have been spread out over the literature and gradually resolved with better data, methods, if not both.

Failure to replicate is a legitimate part of the scientific method.

Not attempting to replicate, “I won’t look too close at your results if you don’t look too closely at mine,” isn’t.

There an ugly word for avoiding looking too closely at your own results or those of others.

Why Use Make

Thursday, November 12th, 2015

Why Use Make by Mike Bostock.

From the post:

I love Make. You may think of Make as merely a tool for building large binaries or libraries (and it is, almost to a fault), but it’s much more than that. Makefiles are machine-readable documentation that make your workflow reproducible.

To illustrate with a recent example: yesterday Kevin and I needed to update a six-month old graphic on drought to accompany a new article on thin snowpack in the West. The article was already on the homepage, so the clock was ticking to republish with new data as soon as possible.

Shamefully, I hadn’t documented the data-transformation process, and it’s painfully easy to forget details over six months: I had a mess of CSV and GeoJSON data files, but not the exact source URL from the NCDC; I was temporarily confused as to the right Palmer drought metric (Drought Severity Index or Z Index?) and the corresponding categorical thresholds; finally, I had to resurrect the code to calculate drought coverage area.

Despite these challenges, we republished the updated graphic without too much delay. But I was left thinking how much easier it could have been had I simply recorded the process the first time as a makefile. I could have simply typed make in the terminal and be done!

Remember how science has been losing the ability to replicate experiments due to computers? How Computers Broke Science… [Soon To Break Businesses …]

So you are trying to remember and explain to an opponent’s attorney the process you went through in processing data, after about 3 hours of sharp questioning, how clear do you think you will be? Will you really remember every step? The source of every file?

Had you documented your workflow you can read from your Make file and say exactly what happened, in what order and with what sources. You do need to do that every time if you want anyone to believe the make file represents what actually happened.

You will be on more solid ground than trying to remember which files, the dates on those files, their content, etc.

Mike concludes his post with:

So do your future self and coworkers a favor, and use Make!

Let’s modify that to read:

So do your future self, coworkers, and lawyer a favor, and use Make!

I first saw this in a tweet by Christophe Lalanne.

How Computers Broke Science… [Soon To Break Businesses …]

Tuesday, November 10th, 2015

How Computers Broke Science — and What We can do to Fix It by Ben Marwick.

From the post:

Reproducibility is one of the cornerstones of science. Made popular by British scientist Robert Boyle in the 1660s, the idea is that a discovery should be reproducible before being accepted as scientific knowledge.

In essence, you should be able to produce the same results I did if you follow the method I describe when announcing my discovery in a scholarly publication. For example, if researchers can reproduce the effectiveness of a new drug at treating a disease, that’s a good sign it could work for all sufferers of the disease. If not, we’re left wondering what accident or mistake produced the original favorable result, and would doubt the drug’s usefulness.

For most of the history of science, researchers have reported their methods in a way that enabled independent reproduction of their results. But, since the introduction of the personal computer — and the point-and-click software programs that have evolved to make it more user-friendly — reproducibility of much research has become questionable, if not impossible. Too much of the research process is now shrouded by the opaque use of computers that many researchers have come to depend on. This makes it almost impossible for an outsider to recreate their results.

Recently, several groups have proposed similar solutions to this problem. Together they would break scientific data out of the black box of unrecorded computer manipulations so independent readers can again critically assess and reproduce results. Researchers, the public, and science itself would benefit.

Whether you are looking for specific proposals to make computed results capable of replication or quotes to support that idea, this is a good first stop.

FYI for business analysts, how are you going to replicate results of computer runs to establish your “due diligence” before critical business decisions?

What looked like a science or academic issue has liability implications!

Changing a few variables in a spreadsheet or more complex machine learning algorithms can make you look criminally negligent if not criminal.

The computer illiteracy/incompetence of prosecutors and litigants is only going to last so long. Prepare defensive audit trails to enable the replication of your actual* computer-based business analysis.

*I offer advice on techniques for such audit trails. The audit trails you choose to build are up to you.