imGraph: A distributed in-memory graph database by Salim Jouili.
From the post:
Eura Nova contribution
Having these challenges in mind, we introduce a new graph database system called imGraph. We have considered the random access requirement for large graphs as a key factor on deciding the type of storage. Then, we have designed a graph database where all data is stored in memory so the speed of random access is maximized. However, as large graphs can not be completely loaded in the RAM of a single machine, we designed imGraph as distributed graph database. That is, the vertices and the edges are partitioned into subsets, and each subset is located in the memory of one machine belonging to the involved machines (see the following figure). Furthermore, we implemented on imGraph a graph traversal engine that takes advantage of distributed parallel computing and fast in-memory random access to gain performance.
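To make the partitioning idea in the quote concrete, here is a minimal single-process sketch. It is not imGraph's code: the excerpt does not say how imGraph assigns vertices to machines, so the modulo placement, the `PartitionedGraph` class, and all of its method names are assumptions made purely for illustration. Each "machine" is simulated by an in-memory dictionary, and the traversal counts how many followed edges would cross a machine boundary.

```python
from collections import defaultdict, deque


class PartitionedGraph:
    """Toy in-memory graph whose vertices are spread over simulated machines.

    Hypothetical illustration only; not the imGraph implementation.
    """

    def __init__(self, num_machines):
        self.num_machines = num_machines
        # One adjacency map per simulated machine: vertex id -> neighbor ids.
        self.partitions = [defaultdict(list) for _ in range(num_machines)]

    def owner(self, vertex_id):
        # Assumption: integer vertex ids placed by simple modulo hashing.
        return vertex_id % self.num_machines

    def add_edge(self, src, dst):
        # Store the outgoing edge on the machine that owns the source vertex.
        self.partitions[self.owner(src)][src].append(dst)
        # Make sure the destination vertex exists on its own machine.
        self.partitions[self.owner(dst)].setdefault(dst, [])

    def neighbors(self, vertex_id):
        # In a real cluster this lookup would be a local RAM access or a
        # network request, depending on which machine owns the vertex.
        return self.partitions[self.owner(vertex_id)].get(vertex_id, [])

    def bfs(self, start):
        """Return (visit order, number of followed edges that cross machines)."""
        seen = {start}
        order = []
        remote_hops = 0
        queue = deque([start])
        while queue:
            vertex = queue.popleft()
            order.append(vertex)
            for nxt in self.neighbors(vertex):
                if self.owner(nxt) != self.owner(vertex):
                    remote_hops += 1
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        return order, remote_hops
```

For example, building a graph with `PartitionedGraph(2)`, adding the edges 0→1, 0→2, 1→3 and 2→3, and calling `bfs(0)` visits all four vertices and reports how many of the followed edges landed on the other simulated machine. In a real deployment those cross-machine hops are where network latency would show up, which is why how the graph is partitioned matters for a distributed traversal.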
I haven’t verified the numbers, but imGraph is reported to have beaten Titan and Neo4j by factors of 150 and 200, respectively, on particular data sets.
Enough to justify reading the paper.
The test machines each had 7.5 GB of memory, which seems a little light to me.
Particularly since the IBM Power 770 server can expand to hold 4 TB of memory.
Imagine the performance on five (5) machines where each has 4 TB of memory.
True, it would be more expensive, but at some point there is only so much performance you can squeeze out of a commodity box.
BTW, the paper: imGraph: A distributed in-memory graph database.