I can’t give you a clean URL, but on Monday (21 May 2012) the Washington Post ran a story on the Health Care Cost Institute that included the following quotes:
This morning a new nonprofit called the Health Care Cost Institute will roll out a database of 5 billion health insurance claims (all stripped of the individual health plan’s identity, to address privacy concerns).
…
This is the first study to use the HCCI data, although more are in the works. Gaynor has been inundated with about 130 requests from health policy researchers to use the database. While his team sifts through those, three approved studies are already tackling big health policy questions.
…
“There is immense interest in gaining access,” says HCCI executive director David Newman. “We’re having trouble keeping up with that.” (emphasis added)
Sorry, that went by a little fast. The data has already been scrubbed, so why make the Health Care Cost Institute a choke point for access to it?
Spin it up to one or more clouds that support free public storage for data sets of public interest.
Problem of sorting through access requests is solved.
Just maybe researchers will want to address other questions, ones that aren’t necessarily about costs. And/or combine this data with other data. Like data on local pollution. (Although you would need historical data to make that work.)
Mapping this data set to other data sets could only magnify its importance.
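To make the mapping idea concrete, here is a minimal sketch in Python of joining claims to pollution readings on a shared region and year. Everything in it is an assumption for illustration: the field names (county, year, cost, pm25), the function name and the toy records are invented, and nothing here reflects the actual layout of the HCCI database, which may not expose geography at all.

```python
from collections import defaultdict

# Hypothetical records: field names and values are invented for illustration
# and do not come from the HCCI database or any real pollution data set.
claims = [
    {"county": "A", "year": 2010, "cost": 120.0},
    {"county": "A", "year": 2010, "cost": 80.0},
    {"county": "B", "year": 2010, "cost": 200.0},
]
pollution = [
    {"county": "A", "year": 2010, "pm25": 12.5},
    {"county": "B", "year": 2010, "pm25": 8.0},
]


def join_on_region_year(claims, pollution):
    """Total claim cost per (county, year), paired with that key's pollution reading."""
    totals = defaultdict(float)
    for claim in claims:
        totals[(claim["county"], claim["year"])] += claim["cost"]

    readings = {(p["county"], p["year"]): p["pm25"] for p in pollution}

    # Inner join: keep only the keys that appear in both data sets.
    return {
        key: {"total_cost": totals[key], "pm25": readings[key]}
        for key in totals
        if key in readings
    }


merged = join_on_region_year(claims, pollution)
```

The hard part in practice is not the join itself but agreeing on what the shared key means in each data set, which is exactly the kind of question outside researchers can only explore if they can get at the data.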
Many thanks are owed to the Health Care Cost Institute for securing the data set.
But our thanks should not include electing the HCCI as censor of uses of this data set.