S3G2: A Scalable Structure-Correlated Social Graph Generator by Minh-Duc Pham, Peter Boncz, Orri Erling. (The same text you will find at: Selected Topics in Performance Evaluation and Benchmarking Lecture Notes in Computer Science Volume 7755, 2013, pp 156-172. DOI: 10.1007/978-3-642-36727-4_11)
Benchmarking graph-oriented database workloads and graph-oriented database systems is increasingly becoming relevant in analytical Big Data tasks, such as social network analysis. In graph data, structure is not mainly found inside the nodes, but especially in the way nodes happen to be connected, i.e. structural correlations. Because such structural correlations determine join fan-outs experienced by graph analysis algorithms and graph query executors, they are an essential, yet typically neglected, ingredient of synthetic graph generators. To address this, we present S3G2: a Scalable Structure-correlated Social Graph Generator. This graph generator creates a synthetic social graph, containing non-uniform value distributions and structural correlations, which is intended as test data for scalable graph analysis algorithms and graph database systems. We generalize the problem by decomposing correlated graph generation in multiple passes that each focus on one so-called correlation dimension; each of which can be mapped to a MapReduce task. We show that S3G2 can generate social graphs that (i) share well-known graph connectivity characteristics typically found in real social graphs (ii) contain certain plausible structural correlations that influence the performance of graph analysis algorithms and queries, and (iii) can be quickly generated at huge sizes on common cluster hardware.
You may also want to see the slides.
What a nice way to start the week!
I first saw this at Datanami.