SpiderStore: A Native Main Memory Approach for Graph Storage by Robert Binna, Wolfgang Gassler, Eva Zangerle, Dominic Pacher, and Günther Specht.
The ever increasing amount of linked open data results in a demand for high performance graph databases. In this paper we therefore introduce a memory layout which is tailored to the storage of large RDF data sets in main memory. We present the memory layout SpiderStore. This layout features a node centric design which is in contrast to the prevailing systems using triple focused approaches. The benefit of this design is a native mapping between the nodes of a graph onto memory locations connected to each other. Based on this native mapping an addressing schema which facilitates relative addressing together with a snapshot mechanism is presented. Finally a performance evaluation, which demonstrates the capabilities, of the SpiderStore memory layout is performed using an RDF-data set consisting of about 190 mio triples.
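To make the contrast between "node centric" and "triple focused" a little more concrete, here is a minimal sketch in Python. It is my own illustration, not the authors' implementation: the names `Node`, `NodeStore`, `to_relative`, and `follow` are hypothetical, and the real SpiderStore works on raw memory locations rather than Python objects. The sketch assumes only what the abstract says: nodes link directly to each other, and addresses can be expressed relative to a node's own position so that a block of nodes can be moved or snapshotted as a whole.

```python
from collections import defaultdict


class Node:
    """A graph node that holds direct references to its neighbours."""

    def __init__(self, label):
        self.label = label
        # predicate -> list of target Node objects (outgoing edges)
        self.out = defaultdict(list)
        # predicate -> list of source Node objects (incoming edges)
        self.inc = defaultdict(list)


class NodeStore:
    """Hypothetical node-centric store: a triple becomes two pointers."""

    def __init__(self):
        self.nodes = {}  # label -> Node, in insertion order

    def node(self, label):
        if label not in self.nodes:
            self.nodes[label] = Node(label)
        return self.nodes[label]

    def add(self, s, p, o):
        subj, obj = self.node(s), self.node(o)
        subj.out[p].append(obj)
        obj.inc[p].append(subj)

    def objects(self, s, p):
        # Traversal follows references; no scan over a triple table.
        n = self.nodes.get(s)
        return [] if n is None else [t.label for t in n.out.get(p, [])]


def to_relative(store):
    """Flatten the store so each edge is an offset from its own node's slot."""
    order = list(store.nodes.values())
    index = {id(n): i for i, n in enumerate(order)}
    flat = []
    for i, n in enumerate(order):
        edges = [(p, index[id(t)] - i)
                 for p, targets in n.out.items() for t in targets]
        flat.append((n.label, edges))
    return flat


def follow(flat, i, p):
    """Resolve relative offsets of slot i for predicate p back to labels."""
    return [flat[i + off][0] for q, off in flat[i][1] if q == p]


store = NodeStore()
store.add("alice", "knows", "bob")
store.add("alice", "knows", "carol")
store.add("bob", "knows", "carol")
flat = to_relative(store)
print(store.objects("alice", "knows"))
print(follow(flat, 1, "knows"))
```

Because the flattened form stores offsets instead of absolute positions, the whole list could in principle be copied elsewhere and still resolve correctly, which is the property I take the abstract's "relative addressing" and snapshot mechanism to rely on.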
I saw this in a tweet by Marko A. Rodriguez.
I am sure René Pickhardt will be glad to see the focus on edges in this paper. 😉
It is hard to say which experiments or lines of inquiry will lead to substantial breakthroughs, but focusing on smallish data sets is unlikely to push the envelope very hard, even if smallish experiments are sufficient for Linked Data scenarios.
The authors project that their technique might work for up to a billion triples. Yes, well, but by 2024, one science installation will be producing one exabyte of data per day. And that is just one source of data.
The science community isn’t going to wait for the W3C to catch up, nor should they.