Data Analysis for the Life Sciences – a book written entirely in R Markdown by Rafael Irizarry.

From the post:

Data analysis is now part of practically every research project in the life sciences. In this book we use data and computer code to teach the necessary statistical concepts and programming skills to become a data analyst. Following in the footsteps of Stat Labs, instead of showing theory first and then applying it to toy examples, we start with actual applications and describe the theory as it becomes necessary to solve specific challenges. We use simulations and data analysis examples to teach statistical concepts. The book includes links to computer code that readers can use to program along as they read the book.

It includes the following chapters: Inference, Exploratory Data Analysis, Robust Statistics, Matrix Algebra, Linear Models, Inference for High-Dimensional Data, Statistical Modeling, Distance and Dimension Reduction, Practical Machine Learning, and Batch Effects.

…

Have you ever wondered about the growing proliferation of data analysis books?

The absence of one Ur-Data Analysis book that everyone could read and use?

I have a longer post coming on this idea, but if each discipline needs its own view on data analysis, is it really surprising that no one system of semantics satisfies all communities?

In other words, is the evidence of heterogeneous semantics so strong that we should abandon attempts at uniform semantics and focus on communicating across systems of semantics?

I’m sure there are other examples where every niche has its own vocabulary: tables in relational databases or column headers in spreadsheets, for example.

What is your favorite example of heterogeneous semantics?

Assuming heterogeneous semantics are here to stay (they have been around since the start of human-to-human communication, possibly earlier), what solution do you suggest?

I first saw this in a tweet by Christophe Lalanne.