Combining Neo4J and Hadoop (part I) by Kris Geusebroek.
From the post:
Why combine these two different things?
Hadoop is good for data crunching, but the end results in flat files don’t present well to the customer. Also, it’s hard to visualize your network data in Excel.
Neo4J is perfect for working with our networked data. We use it a lot when visualizing our different sets of data.
So we prepare our dataset with Hadoop and import it into Neo4J, the graph database, to be able to query and visualize the data.
We have a lot of different ways we want to look at our dataset, so we tend to create a new extract of the data with some new properties to look at every few days.
This blog is about how we combined Hadoop and Neo4J and describes the phases we went through in our search for the optimal solution.
The post mostly covers slow load speeds into Neo4j and the attempts to improve them.
A future post will cover use of a distributed batchimporter process.
I first saw this at DZone.