Notes on DIH Architecture: Solr’s Data Import Handler by Mark Bennett.
From the post:
What the world really needs are some awesome examples of extending DIH (Solr DataImportHandler), beyond the classes and unit tests that ship with Solr. That’s a tall order given DIH’s complexity, and sadly this post ain’t it either! After doing a lot of searches online, I don’t think anybody’s written an “Extending DIH Guide” yet – everybody still points to the Solr wiki, quick start, FAQ, source code and unit tests.
However, in this post, I will review a few concepts to keep in mind. And who knows, maybe in a future post I’ll have some concrete code.
When I make notes, I highlight the things that are different from what I’d expect and why, so I’m going to start with that. Sure, DIH has an XML config where you tell it about your database or filesystem or RSS feed, and map those things into your Solr schema, so no surprise there. But the layering of that configuration really surprised me (and it turns out there are good reasons for it).
If you aspire to be proficient in Solr, print this article and work through it.
It will be time well spent.