Statistical Learning with Sparsity: The Lasso and Generalizations by Trevor Hastie, Robert Tibshirani, and Martin Wainwright.
From the introduction:
“I never keep a scorecard or the batting averages. I hate statistics. What I got to know, I keep in my head.”
This is a quote from baseball pitcher Dizzy Dean, who played in the major leagues from 1930 to 1947.
How the world has changed in the 75 or so years since that time! Now large quantities of data are collected and mined in nearly every area of science, entertainment, business, and industry. Medical scientists study the genomes of patients to choose the best treatments, to learn the underlying causes of their disease. Online movie and book stores study customer ratings to recommend or sell them new movies or books. Social networks mine information about members and their friends to try to enhance their online experience. And yes, most major league baseball teams have statisticians who collect and analyze detailed information on batters and pitchers to help team managers and players make better decisions.
Thus the world is awash with data. But as Rutherford D. Roger (and others) has said:
“We are drowning in information and starving for knowledge.”
There is a crucial need to sort through this mass of information, and pare it down to its bare essentials. For this process to be successful, we need to hope that the world is not as complex as it might be. For example, we hope that not all of the 30, 000 or so genes in the human body are directly involved in the process that leads to the development of cancer. Or that the ratings by a customer on perhaps 50 or 100 different movies are enough to give us a good idea of their tastes. Or that the success of a left-handed pitcher against left-handed batters will be fairly consistent for different batters. This points to an underlying assumption of simplicity. One form of simplicity is sparsity, the central theme of this book. Loosely speaking, a sparse statistical model is one in which only a relatively small number of parameters (or predictors) play an important role. In this book we study methods that exploit sparsity to help recover the underlying signal in a set of data.
The delightful style of the authors had me going until they said:
…we need to hope that the world is not as complex as it might be.
What? “…not as complex as it might be?
Law school and academia both train you to look for complexity so “…not as complex as it might be” is as close to apostasy as any statement I can imagine. 😉 (At least I can say I am honest about my prejudices. Some of them at any rate.)
Not for the mathematically faint of heart but it may certainly be a counter to the intelligence communities’ mania about collecting every scrap of data.
Finding a needle in a smaller haystack could be less costly and more effective. Both of those principles run counter to well established government customs but there are those in government who wish to be effective. (Article of faith on my part.)
I first saw this in a tweet by Chris Diehl.