Indexing StackOverflow In Solr by John Berryman.
From the post:
One thing I really like about Solr is that its super easy to get started. You just download solr, fire it up, and then after following the 10 minute tutorial you’ll have a basic understand of indexing, updating, searching, faceting, filtering, and generally using Solr. But, you’ll soon get bored of playing with the 50 or so demo documents. So, quit insulting Solr with this puny, measly, wimpy dataset; Index something of significance and watch what Solr can do.
One of the most approachable large datasets is the StackExchange data set which most notably includes all of StackOverflow, but also contains many of the other StackExchange sites (Cooking, English Grammar, Bicycles, Games, etc.) So if StackOverflow is not your cup of tea, there’s bound to be a data set in there that jives more with your interests.
Once you’ve pulled down the data set, then you’re just moments away from having your own SolrExchange index. Simply unzip the dataset that you’re interested in (7-zip format zip files), pull down this git repo which walks you through indexing the data, and finally, just follow the instructions in the README.md.
Interesting data set for Solr.
More importantly, a measure of how easy it needs to be to get started with software.
Software like topic maps.
Suggestions?