Underhyped – Big Data as an Advance in the Scientific Method

Underhyped – Big Data as an Advance in the Scientific Method by Yanpei Chen.

From the post:

Big data is underhyped. That’s right. Underhyped. The steady drumbeat of news and press talk about big data only as a transformative technology trend. It is as if big data’s impact goes only as far as creating tremendous commercial value for a selected few vendors and their customers. This view could not be further from the truth.

Big data represents a major advance in the scientific method. Its impact will be felt long after the technology trade press turns its attention to the next wave of buzzwords.

I am fortunate to work at a leading data management vendor as a big data performance specialist. My job requires me to “make things go fast” by observing, understanding, and improving big data systems. Specifically, I am expected to assess whether the insights I find represent solid information or partial knowledge. These processes of “finding out about things”, more formally known as empirical observation, hypothesis testing, and causal analysis, lie at the heart of the scientific method.

My work gives me some perspective on an under-appreciated aspect of big data that I will share in the rest of the article.

Searching for “big data” and “philosophy of science” returns almost 80,000 “hits” today. It is a connection I had not considered, and if you know of any survey papers on the literature, I would appreciate a pointer.

I enjoyed reading this essay, but I don’t consider tracking medical treatment results or managing residential heating costs to be examples of the scientific method. Both are examples of observation and analysis made easier by big data techniques, but they don’t involve formulating hypotheses, making predictions, testing them, or causal analysis.

Big data techniques are useful for such cases. But the use of big data techniques for all the steps of the scientific method (observation, formulation of hypotheses, prediction, testing, and causal analysis) would be far more exciting.

Any pointers to such uses?
