Node.js, Neo4j, and usefulness of hacking ugly code by Justin Mandzik.
From the post:
My primary application has a ton of data, even in its infancy. Hundreds of millions of distinct entities (and growing fast), each with many properties, and many relationships. Numbers in the billions start to be really easy to hit, and then that's still not accounting for organic growth. Most of the data is hierarchical for now, but there's a need in the near term for arbitrary relationships and the quick traversing thereof. Vanilla MySQL in particular is annoying to work with when it comes to hierarchical data. Moving to Oracle gets us some nicer toys to play with (CONNECT_BY_ROOT and such), but ultimately, the need for a complementary database solution emerges.
While my non-relational db experience is limited to MongoDB (which I love dearly), a graph db seemed to be the better theoretical fit. Requirements: Manage dense, interconnected data that has to be traversed fast, a query language that supports a root cause analysis use case, and some kind of H.A. plan of attack. Signals of Neo4j, OrientDB, and Titan started emerging from the noise. Randomly, I started in with Neo4j with the intent of repeating the test cases on the other contenders assuming any of the 3 met the requirements (in theory, at least). Neo4j has a GREAT “2 minutes to get up and running” experience. Untar, bin/neo4j start, and go to localhost:7474 and you’re off and running. A decent interface waits for you and you can dive right in.
Proof of concept code for testing Neo4j with project data.
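To make the hierarchical-traversal pain point concrete, here is a minimal sketch of my own (not Justin's code). It assumes the hierarchy sits in (id, parent_id) rows, the way an adjacency-list table in MySQL would hold it, and walks the descendants of a node in application code. The row data, the function names, and the Entity label and PARENT_OF relationship type in the Cypher string are all made up for illustration.

```python
from collections import defaultdict, deque

# Hypothetical (id, parent_id) rows, as an adjacency-list table would
# store a hierarchy. A parent_id of None marks a root.
ROWS = [
    ("root", None),
    ("a", "root"),
    ("b", "root"),
    ("a1", "a"),
    ("a2", "a"),
    ("a1x", "a1"),
]


def build_children(rows):
    """Index the rows by parent so each lookup is a dict access."""
    children = defaultdict(list)
    for node_id, parent_id in rows:
        if parent_id is not None:
            children[parent_id].append(node_id)
    return children


def descendants(children, start):
    """Breadth-first walk returning every node below `start`."""
    seen = set()
    order = []
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for child in children.get(current, []):
            if child not in seen:  # guards against cycles in bad data
                seen.add(child)
                order.append(child)
                queue.append(child)
    return order


# Roughly the same question as a Cypher variable-length pattern.
# The label and relationship type are assumptions, not from the post.
CYPHER = "MATCH (n:Entity {id: $id})-[:PARENT_OF*1..]->(d:Entity) RETURN d.id"

if __name__ == "__main__":
    print(descendants(build_children(ROWS), "a"))
```

Without recursive query support, the relational version of this walk turns into one self-join or one round trip per level, which is the annoyance the post describes. In a graph database the traversal is a single pattern in the query.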
The presumption of normalization in Neo4j continues to nag at me.
The broader the reach for data, the less likely it is that normalization will be possible, or affordable even where it is possible in theory.
It may be that normalization is a presentation aspect of results. Will have to think about that over the holidays.