Big Data – The New Science of Complexity by Wolfgang Pietsch.
Abstract:
Data-intensive techniques, now widely referred to as ‘big data’, allow for novel ways to address complexity in science. I assess their impact on the scientific method. First, big-data science is distinguished from other scientific uses of information technologies, in particular from computer simulations. Then, I sketch the complex and contextual nature of the laws established by data-intensive methods and relate them to a specific concept of causality, thereby dispelling the popular myth that big data is only concerned with correlations. The modeling in data-intensive science is characterized as ‘horizontal’—lacking the hierarchical, nested structure familiar from more conventional approaches. The significance of the transition from hierarchical to horizontal modeling is underlined by a concurrent paradigm shift in statistics from parametric to non-parametric methods.
This is a serious investigation of the “science” of big data, the kind I noted was needed in Underhyped – Big Data as an Advance in the Scientific Method.
From the conclusion:
The knowledge established by big-data methods will consist in a large number of causal laws that generally involve numerous parameters and that are highly context-specific, i.e. instantiated only in a small number of cases. The complexity of these laws and the lack of a hierarchy into which they could be integrated prevent a deeper understanding, while allowing for predictions and interventions. Almost certainly, we will experience the rise of entire sciences that cannot leave the computers and do not fit into textbooks.
This essay and the references therein are a good vantage point from which to observe the development of a new science and its philosophy of science.