Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

March 30, 2016

Google BigQuery Public Datasets

Filed under: Books,Google BigQuery — Patrick Durusau @ 8:15 pm

An amazing set of public datasets, from the post:

  • A Social Security Administration dataset that contains all names from Social Security card applications for births that occurred in the United States after 1879.
  • Data collected by the NYC Taxi and Limousine Commission (TLC) that includes trip records from all trips completed in yellow and green taxis in NYC from 2009 to 2015.
  • A dataset that contains all stories and comments from Hacker News since its launch in 2006.
  • A dataset published by the US Department of Health and Human Services that includes all weekly surveillance reports of nationally notifiable diseases for all U.S. cities and states published between 1888 and 2013.
  • A dataset that contains 3.5 million digitized books stretching back two centuries, encompassing the complete English-language public domain collections of the Internet Archive (1.3M volumes) and HathiTrust (2.2 million volumes).
  • This public dataset was created by the National Oceanic and Atmospheric Administration (NOAA) and includes global data obtained from the USAF Climatology Center. This dataset covers GSOD data between 1929 and 2016, collected from over 9000 stations.

I can readily see myself losing serious time in the GDELT Book Corpus!

Enjoy!
