From Graph (batch) processing towards a distributed graph data base by René Pickhardt.
From the post:
Yesterday’s meeting of the reading club was quite nice. We all agreed that the papers were of good quality and we gained some nice insights. The only drawback of the papers was that they did not directly tell us how to achieve our goal of a real-time distributed graph database technology. In the readings for the next meeting (which will take place Wednesday March 7th 2pm CET) we tried to choose papers that don’t discuss these distributed graph / data processing techniques but focus more on speed or point out the general challenges in parallel graph processing.
Reading list for the next meeting (Wednesday March 7th 2pm CET)
- memcached paper: to understand how distributed shared memory works, which could essentially speed up approaches like Signal Collect.
- Beehive: to see a p2p approach to graph distribution.
- Challenges in parallel graph processing: for obvious reasons, since it points out the big picture.
- http://www.boost.org/doc/libs/1_48_0/libs/graph_parallel/doc/html/index.html The Boost library is a general parallel graph processing framework. In any case it is interesting and good to understand what is going on there.
- Topology partitioning applied to SPARQL, HADOOP and TripleStores: shows how a speedup of 1000x can be achieved through smart partitioning of a graph.
Again, while reading and preparing, feel free to add more reading wishes to the comments of this blog post or drop me a mail!
That’s two weeks from yesterday: Wednesday March 7th 2pm CET.