Publicly available large data sets for database research by Daniel Lemire.
Daniel summarizes large (> 20 GB) data sets that may be useful for database research.
If you know of any data sets that have been overlooked or that become available, please post a note on this entry at Daniel’s blog.