Big data sets available for free by Vincent Granville.
From the post:
A few data sets are accessible from our data science apprenticeship web page.
(graphic omitted)
- Source code and data for our Big Data keyword correlation API (see also the section in a separate chapter of our book)
- Great statistical analysis: forecasting meteorite hits (see also the section in a separate chapter of our book)
- Fast clustering algorithms for massive datasets (see also the section in a separate chapter of our book)
- 53.5 billion clicks dataset available for benchmarking and testing
- Over 5,000,000 financial, economic and social datasets
- New pattern to predict stock prices, multiplying returns by a factor of 5 (stock market data, S&P 500; see also the section in a separate chapter of our book)
- 3.5 billion web pages: The graph has been extracted from the Common Crawl 2012 web corpus and covers 3.5 billion web pages and the 128 billion hyperlinks between them.
- Another large data set – 250 million data points: This is the full-resolution GDELT event dataset, running from January 1, 1979 through March 31, 2013 and containing all data fields for each event record.
- 125 Years of Public Health Data Available for Download
Just in case you are looking for data for a 2014 demo or data project!