TripleRush: A Fast and Scalable Triple Store

Monday, October 21st, 2013

TripleRush: A Fast and Scalable Triple Store by Philip Stutz, Mihaela Verman, Lorenz Fischer, and Abraham Bernstein.

Abstract:

TripleRush is a parallel in-memory triple store designed to address the need for efficient graph stores that quickly answer queries over large-scale graph data. To that end it leverages a novel, graph-based architecture.

Specifically, TripleRush is built on our parallel and distributed graph processing framework Signal/Collect. The index structure is represented as a graph where each index vertex corresponds to a triple pattern. Partially matched copies of a query are routed in parallel along different paths of this index structure.

We show experimentally that TripleRush takes less than a third of the time to answer queries compared to the fastest of three state-of-the-art triple stores, when measuring time as the geometric mean of all queries for two benchmarks. On individual queries, TripleRush is up to three orders of magnitude faster than other triple stores.

If the abstract hasn’t already caught your interest, consider the following:

The index graph we just described is different from traditional index structures, because it is designed for the efficient parallel routing of messages to triples that correspond to a given triple pattern. All vertices that form the index structure are active parallel processing elements that only interact via message passing.

That is the beginning of section “3.2 Query Processing.” It has a worked example that will repay a close reading.
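
To make the routing idea concrete, here is a rough Python sketch of how partially matched copies of a query might travel through an index keyed by triple patterns. It is not the TripleRush or Signal/Collect implementation: it runs sequentially with a single queue standing in for message passing, it uses a flat dictionary rather than the paper's index graph, and every name in it (`build_index`, `bind`, `answer`, the `?x` variable convention, the sample triples) is my own assumption for illustration.

```python
from collections import defaultdict, deque
from itertools import product


def is_var(term):
    # Assumption: query variables are strings that start with "?".
    return term.startswith("?")


def build_index(triples):
    """Map every triple pattern (None = wildcard) to the triples it matches."""
    index = defaultdict(list)
    for triple in triples:
        # Each of the 8 wildcard masks yields one index entry for this triple.
        for mask in product((True, False), repeat=3):
            key = tuple(t if keep else None for t, keep in zip(triple, mask))
            index[key].append(triple)
    return index


def bind(pattern, triple, bindings):
    """Extend bindings so that pattern equals triple; None on a conflict."""
    new = dict(bindings)
    for term, value in zip(pattern, triple):
        if is_var(term):
            if new.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return new


def answer(index, query):
    """Route partially matched query copies until no patterns remain."""
    results = []
    # Each message is (bindings so far, patterns still to be matched).
    mailbox = deque([({}, list(query))])
    while mailbox:
        bindings, remaining = mailbox.popleft()
        if not remaining:
            results.append(bindings)
            continue
        pattern, rest = remaining[0], remaining[1:]
        # Substitute known bindings, then address the matching index entry.
        bound = tuple(bindings.get(t, t) for t in pattern)
        key = tuple(None if is_var(t) else t for t in bound)
        for triple in index.get(key, []):
            extended = bind(bound, triple, bindings)
            if extended is not None:
                # One copy of the query per matching triple.
                mailbox.append((extended, rest))
    return results


# Hypothetical sample data, not taken from the paper.
triples = [
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("alice", "knows", "dave"),
    ("dave", "likes", "carol"),
]
query = [("alice", "knows", "?x"), ("?x", "knows", "?y")]
matches = answer(build_index(triples), query)
```

In the sketch, the copy that binds `?x` to `dave` dies out because no index entry matches its next pattern, while the copy that binds `?x` to `bob` goes on to produce a result. In a real parallel setting each index entry would be its own processing element and the copies would be in flight at the same time.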

The processing model outlined here is triple specific, but I don’t see any reason why the principles would not work for other graph structures.

This is going to the top of my reading list.

I first saw this in a tweet by Stefano Bertolo.