Archive for the ‘Graph Analytics’ Category

Faster Graph Processing [Almost Linear Time Construction Of Spectral Sparsifier For Any Graph]

Tuesday, October 20th, 2015

Constructing Linear-Sized Spectral Sparsification in Almost-Linear Time by Yin Tat Lee, He Sun.


We present the first almost-linear time algorithm for constructing linear-sized spectral sparsification for graphs. This improves all previous constructions of linear-sized spectral sparsification, which require \Omega(n^2) time.

A key ingredient in our algorithm is a novel combination of two techniques used in literature for constructing spectral sparsification: Random sampling by effective resistance, and adaptive constructions based on barrier functions.

Apologies to the paper authors for my liberties with their title: Constructing Linear-Sized Spectral Sparsification in Almost-Linear Time but I wanted to capture eyes that might glaze past their more formal title.

The PR release where I saw this article reads as follows:

In the second paper, Constructing linear-sized spectral sparsification in almost-linear time, Dr He Sun, Lecturer in the University’s Department of Computer Science, and Yin Tat Lee, a PhD student at MIT, have presented the first algorithm for constructing linear-sized spectral sparsifiers that runs in almost-linear time.

More and more applications in today’s big data scenario need to deal with graphs of millions of vertices. While traditional algorithms can be applied directly to these massive graphs, they are usually too slow to be practical when the graph contains millions of vertices. Also, storing these massive graphs is very expensive.

Dr He Sun said: “Over the past decade, there have been intensive studies in order to overcome these two bottlenecks. One notable approach is through the intermediate step called spectral sparsification, which is the approximation of any input graph by a very sparse graph that inherits many properties of the input graph. Since most algorithms run faster in sparse graphs, spectral sparsification is used as a key intermediate step in speeding up the runtime of many practical graph algorithms, including finding approximate maximum flows in an undirected graph, and approximately solving linear systems, among many others.”

Using spectral sparsification, the researchers ran many algorithms in a sparse graph, and obtained approximately correct results as well. This general framework allowed them to speed up the runtime of a wide range of algorithms by an order of magnitude. However, to make the overall approach practical, a key issue was to find faster constructions of spectral sparsification with fewer edges in the resulting sparsifiers. There have been many studies looking at this area in the past decade.

The researchers have proved that, for any graph, they can construct its spectral sparsifier in almost-linear time, and in the output sparsifier every vertex has only a constant number of neighbors. This result is almost optimal with respect to the time complexity of the algorithm and the number of edges in the spectral sparsifier.
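The effective-resistance sampling ingredient mentioned in the abstract is easy to illustrate. The sketch below is the classic Spielman–Srivastava scheme implemented with a dense pseudoinverse, which costs far more than almost-linear time; it shows the sampling idea, not the Lee–Sun algorithm, and the function name and parameters are my own.

```python
import numpy as np

def effective_resistance_sparsify(n, edges, q, rng=None):
    """Spectral sparsification of an unweighted graph by effective-resistance
    sampling (Spielman-Srivastava style). Dense and O(n^3) -- an illustration
    of the sampling idea only, not the almost-linear-time Lee-Sun algorithm."""
    rng = rng or np.random.default_rng(0)
    # Build the graph Laplacian L = D - A.
    L = np.zeros((n, n))
    for u, v in edges:
        L[u, u] += 1; L[v, v] += 1
        L[u, v] -= 1; L[v, u] -= 1
    Linv = np.linalg.pinv(L)  # Moore-Penrose pseudoinverse of L
    # Effective resistance of edge (u, v): (e_u - e_v)^T L^+ (e_u - e_v).
    R = np.array([Linv[u, u] + Linv[v, v] - 2 * Linv[u, v] for u, v in edges])
    # Sample q edges with probability proportional to R_e, reweighting each
    # kept edge by 1/(q * p_e) so the sparsifier is unbiased in expectation.
    p = R / R.sum()
    sparsifier = {}
    for idx in rng.choice(len(edges), size=q, p=p):
        e = edges[idx]
        sparsifier[e] = sparsifier.get(e, 0.0) + 1.0 / (q * p[idx])
    return sparsifier
```

Each sampled edge is reweighted so that the sparsifier’s Laplacian is an unbiased estimator of the original graph’s Laplacian.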

Very heavy sledding in the paper but you don’t have to be able to originate the insight in order to take advantage of the technique.


The impact of fast networks on graph analytics, part 1

Monday, July 13th, 2015

The impact of fast networks on graph analytics, part 1 by Frank McSherry.

From the post:

This is a joint post with Malte Schwarzkopf, cross-blogged here and at the CamSaS blog.

tl;dr: A recent NSDI paper argued that data analytics stacks don’t get much faster at tasks like PageRank when given better networking, but this is likely just a property of the stack they evaluated (Spark and GraphX) rather than generally true. A different framework (timely dataflow) goes 6x faster than GraphX on a 1G network, which improves by 3x to 15-17x faster than GraphX on a 10G network.

I spent the past few weeks visiting the CamSaS folks at the University of Cambridge Computer Lab. Together, we did some interesting work, which we – Malte Schwarzkopf and I – are now going to tell you about.

Recently, a paper entitled “Making Sense of Performance in Data Analytics Frameworks” appeared at NSDI 2015. This paper contains some surprising results: in particular, it argues that data analytics stacks are limited more by CPU than they are by network or disk IO. Specifically,

“Network optimizations can only reduce job completion time by a median of at most 2%. The network is not a bottleneck because much less data is sent over the network than is transferred to and from disk. As a result, network I/O is mostly irrelevant to overall performance, even on 1Gbps networks.” (§1)

The measurements were done using Spark, but the authors argue that they generalize to other systems. We thought that this was surprising, as it doesn’t match our experience with other data processing systems. In this blog post, we will look into whether these observations do indeed generalize.

One of the three workloads in the paper is the BDBench query set from Berkeley, which includes a “page-rank-like computation”. Moreover, PageRank also appears as an extra example in the NSDI slide deck (slides 38-39), used there to illustrate that at most a 10% improvement in job completion time can be had even for a network-intensive workload.

This was especially surprising to us because of the recent discussion around whether graph computations require distributed data processing systems at all. Several distributed systems get beat by a simple, single-threaded implementation on a laptop for various graph computations. The common interpretation is that graph computations are communication-limited; the network gets in the way, and you are better off with one machine if the computation fits.[footnote omitted]
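The “simple, single-threaded implementation on a laptop” that wins these comparisons really is simple. A minimal sketch (my own, not McSherry’s actual code) of PageRank over an in-memory edge list:

```python
def pagerank(edges, n, iters=20, alpha=0.85):
    """Minimal single-threaded PageRank over a directed edge list.
    No dangling-node handling: assumes every vertex that appears as a
    source has at least one out-edge (true by construction here)."""
    out_degree = [0] * n
    for src, _ in edges:
        out_degree[src] += 1
    ranks = [1.0 / n] * n
    for _ in range(iters):
        # Each vertex scatters alpha * rank/out_degree along its out-edges,
        # plus the uniform (1 - alpha) teleport mass.
        new = [(1.0 - alpha) / n] * n
        for src, dst in edges:
            new[dst] += alpha * ranks[src] / out_degree[src]
        ranks = new
    return ranks
```

On a directed 3-cycle the ranks stay uniform at 1/3, which is a handy sanity check.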

The authors introduce Rust and timely dataflow to achieve rather remarkable performance gains. That is, if you think a 4x-16x speedup over GraphX on the same hardware is a performance gain. (Most do.)

Code and instructions are available so you can test their conclusions for yourself. Hardware is your responsibility.

While you are waiting for part 2 to arrive, try Frank’s homepage for some fascinating reading.

Unicode 8 – Coming Next Week!

Friday, June 12th, 2015

Unicode 8 will be released next week. Rick McGowan has posted directions to code charts for final review:

For the complete archival charts, as a single ~100MB file, or as individual block files, please see the charts directory here:

For the set of “delta charts” only with highlighting for changes please see:

(NOTE: There is a known problem viewing the charts using the PDF Viewer plugin for Firefox on the Mac platform.)

And the 8.0 beta UCD files are also available for cross-reference:

The draft version page is here:

From the draft version homepage:

Unicode 8.0 adds a total of 7,716 characters, encompassing six new scripts and many new symbols, as well as character additions to several existing scripts. Notable character additions include the following:

  • A set of lowercase Cherokee syllables, forming case pairs with the existing Cherokee characters
  • A large collection of CJK unified ideographs
  • Emoji symbols and symbol modifiers for implementing skin tone diversity; see Unicode Emoji.
  • Georgian lari currency symbol
  • Letters to support the Ik language in Uganda, Kulango in the Côte d’Ivoire, and other languages of Africa
  • The Ahom script for support of the Tai Ahom language in India
  • Arabic letters to support Arwi—the Tamil language written in the Arabic script

Other important updates in Unicode Version 8.0 include:

  • Change in encoding model of New Tai Lue to visual order


Two other important Unicode specifications are maintained in synchrony with the Unicode Standard, and include updates for the repertoire additions made in Version 8.0, as well as other modifications:

If you have the time this weekend, take a quick look.

GPU-Accelerated Graph Analytics in Python with Numba

Thursday, March 19th, 2015

GPU-Accelerated Graph Analytics in Python with Numba by Siu Kwan Lam.


Numba is an open-source just-in-time (JIT) Python compiler that generates native machine code for X86 CPU and CUDA GPU from annotated Python code. (Mark Harris introduced Numba in the post “NumbaPro: High-Performance Python with CUDA Acceleration”.) Numba specializes in Python code that makes heavy use of NumPy arrays and loops. In addition to JIT compiling NumPy array code for the CPU or GPU, Numba exposes “CUDA Python”: the CUDA programming model for NVIDIA GPUs in Python syntax.

By speeding up Python, we extend its ability from a glue language to a complete programming environment that can execute numeric code efficiently.

Python enthusiasts, I would not take the “…from a glue language to a complete programming environment…” comment to heart.

The author also says:

Numba helps by letting you write pure Python code and run it with speed comparable to a compiled language, like C++. Your development cycle shortens when your prototype Python code can scale to process the full dataset in a reasonable amount of time.

and then summarizes the results of code in the post:

Our GPU PageRank implementation completed in just 163 seconds on the full graph of 623 million edges and 43 million nodes using a single NVIDIA Tesla K20 GPU accelerator. Our equivalent Numba CPU-JIT version took at least 5 times longer on a smaller graph.

plus points out techniques for optimizing the code.
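The style of code Numba targets is easy to show. Here is a hedged sketch (my own, not the post’s implementation) of one PageRank iteration in the NumPy-arrays-plus-loops form that @njit compiles well; the fallback decorator keeps it runnable even without Numba installed:

```python
import numpy as np

try:
    from numba import njit  # JIT-compile to native code when Numba is present
except ImportError:         # fall back to plain Python so the sketch still runs
    def njit(func):
        return func

@njit
def step_pagerank(ranks, src, dst, out_degree, alpha):
    """One PageRank iteration over a COO edge list (src[i] -> dst[i]) --
    exactly the NumPy-array-plus-loop shape Numba compiles efficiently."""
    n = ranks.shape[0]
    new = np.full(n, (1.0 - alpha) / n)
    for i in range(src.shape[0]):
        new[dst[i]] += alpha * ranks[src[i]] / out_degree[src[i]]
    return new
```

The same kernel body, decorated with numba.cuda.jit instead, is the kind of code the post moves to the GPU.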

I’d say no hard feelings. Yes? 😉

Tutorial Statistical Graph Analysis

Thursday, April 17th, 2014

Tutorial Statistical Graph Analysis by Aapo Kyrola.

From the post:

GraphChi-DB can be used for efficiently querying induced subgraphs from very large networks. Thus, you can, for example, easily sample a vertex and retrieve the induced neighborhood graph of that vertex. Or you can choose a random set of vertices and compute their induced subgraph.

Assuming you have the data loaded in the database, and the GraphChiDatabase object in a value “DB”, here is how you request edges for the induced subgraph of a set of vertices:
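The code sample itself doesn’t survive in this excerpt. As a stand-in, here is a generic sketch of the operation being described, inducing a subgraph on a vertex set; this is plain Python of my own, not GraphChi-DB’s Scala API:

```python
def induced_subgraph(edges, vertices):
    """Return the edges of the subgraph induced by `vertices`:
    exactly those edges with both endpoints inside the set."""
    vset = set(vertices)
    return [(u, v) for u, v in edges if u in vset and v in vset]
```

Sampling a vertex and taking its neighborhood, then calling this on {vertex} ∪ neighbors, gives the induced neighborhood graph the tutorial talks about.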

GraphChi-DB was released two days ago (GraphChi-DB [src released]) and today you have a tutorial written for it.

Not bad, not bad at all.

Faunus 0.4.1 Release

Monday, November 25th, 2013

Faunus 0.4.1 Release

I don’t find this change reflected in the 0.4.1 release notes, but elsewhere Marko Rodriguez writes:

I tested the new code on a subset of the Friendster data (6 node Hadoop and 6 node Cassandra cluster).

  • vertices: 7 minutes to write 39 million vertices at ~100mb/second from the Hadoop to the Cassandra cluster.

  • edges: 15 minutes to write 245 million edges at ~40mb/second from the Hadoop to the Cassandra cluster.

This is the fastest bulk load time I’ve seen to date. This means, DBPedia can be written in ~20 minutes! I’ve attached an annotated version of the Ganglia monitor to the email that shows the outgoing throughput for the various stages of the MapReduce job. In the past, I was lucky to get 5-10mb/second out of the edge writing stage (this had to do with how I was being dumb about how reduce worked in Hadoop — wasn’t considering the copy/shuffle aspect of the stage).

At this rate, this means we can do billion edges graphs in a little over 1 hour. I bet though I can now speed this up more with some parameter tweaking as I was noticing that Cassandra was RED HOT and locking up a few times on transaction commits. Anywho, Faunus 0.4.1 is going to be gangbusters!

Approximately one billion edges an hour?
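The quoted figures do support that estimate; a quick back-of-the-envelope check:

```python
# Sanity-check Marko's numbers: 245 million edges written in 15 minutes.
edges_per_minute = 245e6 / 15                    # ~16.3M edges/minute
minutes_for_a_billion = 1e9 / edges_per_minute   # time for a billion-edge graph
print(round(minutes_for_a_billion))              # ~61 minutes: "a little over 1 hour"
```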

It’s not > /dev/null speed but still quite respectable. 😉

Faunus 0.4.1 wikidoc.

Download Faunus 0.4.1.

New SOSP paper: a lightweight infrastructure for graph analytics

Monday, October 21st, 2013

New SOSP paper: a lightweight infrastructure for graph analytics by Danny Bickson.

Danny cites a couple of new graph papers that will be of interest:

I got this reference from my collaborator Aapo Kyrola, author of GraphChi.

A Lightweight Infrastructure for Graph Analytics. Donald Nguyen, Andrew Lenharth, Keshav Pingali (University of Texas at Austin), to appear in SOSP 2013.

It is an interesting paper which compares extensively against GraphLab, PowerGraph (GraphLab v2.1) and GraphChi.

One of the main claims is that dynamic and asynchronous scheduling can significantly speed up many graph algorithms (vs. bulk synchronous parallel model where all graph nodes are executed on each step).

One concern I have is the focus on multicore settings, which makes everything much easier and thus makes the comparison with PowerGraph less relevant.


Another relevant paper which improves on GraphLab is: Leveraging Memory Mapping for Fast and Scalable Graph Computation on a PC. Zhiyuan Lin, Duen Horng Chau, and U Kang, IEEE Big Data Workshop: Scalable Machine Learning: Theory and Applications, 2013. The basic idea is to speed up graph loading using the mmap() operation.
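The mmap() trick is easy to demonstrate in miniature. A sketch of my own (not the paper’s code): store an edge list as packed binary, then memory-map it back so the OS pages data in on demand instead of the program copying the whole file up front.

```python
import mmap
import struct

def write_edges(path, edges):
    """Pack an edge list as binary: two little-endian uint32s per edge."""
    with open(path, "wb") as f:
        for u, v in edges:
            f.write(struct.pack("<II", u, v))

def read_edges_mmap(path):
    """Memory-map the edge file and decode it in place. 'Loading' is nearly
    instant because pages are faulted in lazily and shared via the page cache."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            return [struct.unpack_from("<II", mm, off)
                    for off in range(0, len(mm), 8)]
```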

One of the things I like about Danny’s posts is that he is trying to improve graph processing for everyone, not cooking numbers for his particular solution.


Discovering Emerging Tech through Graph Analysis

Wednesday, June 19th, 2013

Discovering Emerging Tech through Graph Analysis by Henry Hwangbo.


With the torrent of data available to us on the Internet, it’s been increasingly difficult to separate the signal from the noise. We set out on a journey with a simple directive: Figure out a way to discover emerging technology trends. Through a series of experiments, trials, and pivots, we found our answer in the power of graph databases. We essentially built our “Emerging Tech Radar” on emerging technologies with graph databases being central to our discovery platform. Using a mix of NoSQL databases and open source libraries we built a scalable information digestion platform which touches upon multiple topics such as NLP, named entity extraction, data cleansing, cypher queries, multiple visualizations, and polymorphic persistence.

Sounds like Henry needs to go stand close to the NSA. 😉

Not really!

Henry’s use case isn’t trying to boil a random ocean of data to see if insights emerge.

See slide 19, for example, where he recommends cleaning up data and eliminating duplicates before loading it.

If you see a video link to this presentation, please post it.

Fast Collaborative Graph Exploration

Thursday, April 18th, 2013

Fast Collaborative Graph Exploration by Dariusz Dereniowski, Yann Disser, Adrian Kosowski, Dominik Pajak, Przemyslaw Uznanski.


We study the following scenario of online graph exploration. A team of $k$ agents is initially located at a distinguished vertex $r$ of an undirected graph. At every time step, each agent can traverse an edge of the graph. All vertices have unique identifiers, and upon entering a vertex, an agent obtains the list of identifiers of all its neighbors. We ask how many time steps are required to complete exploration, i.e., to make sure that every vertex has been visited by some agent. We consider two communication models: one in which all agents have global knowledge of the state of the exploration, and one in which agents may only exchange information when simultaneously located at the same vertex. As our main result, we provide the first strategy which performs exploration of a graph with $n$ vertices at a distance of at most $D$ from $r$ in time $O(D)$, using a team of agents of polynomial size $k = D n^{1+ \epsilon} < n^{2+\epsilon}$, for any $\epsilon > 0$. Our strategy works in the local communication model, without knowledge of global parameters such as $n$ or $D$. We also obtain almost-tight bounds on the asymptotic relation between exploration time and team size, for large $k$. For any constant $c>1$, we show that in the global communication model, a team of $k = D n^c$ agents can always complete exploration in $D(1+ \frac{1}{c-1} +o(1))$ time steps, whereas at least $D(1+ \frac{1}{c} -o(1))$ steps are sometimes required. In the local communication model, $D(1+ \frac{2}{c-1} +o(1))$ steps always suffice to complete exploration, and at least $D(1+ \frac{2}{c} -o(1))$ steps are sometimes required. This shows a clear separation between the global and local communication models.

Heavy going but seems important for graph exploration performance.
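One way to see what the O(D) bound is up against: agents traverse one edge per step, so even an idealized team with global knowledge and unlimited agents needs at least ecc(r) = D steps, since some vertex sits at distance D from r. A toy calculation of that ideal time (my own sketch, not the paper’s strategy):

```python
from collections import deque

def ideal_exploration_time(adj, r):
    """Toy model of collaborative exploration: with global knowledge and
    enough agents, the team can fan out one BFS layer per time step, so
    the ideal exploration time is the eccentricity of the start vertex r --
    the D that the paper's O(D) strategies match up to constants."""
    dist = {r: 0}
    queue = deque([r])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1  # first visit fixes the BFS layer
                queue.append(v)
    return max(dist.values())
```

The paper’s contribution is achieving O(D) with polynomially many agents and only local communication, which the toy model above takes for granted.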

See also the special case of exploring trees under related work.

Another possibility for exploring overlapping markup. Each agent has an independent view of one part of the markup trees.

S3G2: A Scalable Structure-Correlated Social Graph Generator

Sunday, February 24th, 2013

S3G2: A Scalable Structure-Correlated Social Graph Generator by Minh-Duc Pham, Peter Boncz, Orri Erling. (The same text you will find at: Selected Topics in Performance Evaluation and Benchmarking Lecture Notes in Computer Science Volume 7755, 2013, pp 156-172. DOI: 10.1007/978-3-642-36727-4_11)


Benchmarking graph-oriented database workloads and graph-oriented database systems is increasingly becoming relevant in analytical Big Data tasks, such as social network analysis. In graph data, structure is not mainly found inside the nodes, but especially in the way nodes happen to be connected, i.e. structural correlations. Because such structural correlations determine join fan-outs experienced by graph analysis algorithms and graph query executors, they are an essential, yet typically neglected, ingredient of synthetic graph generators. To address this, we present S3G2: a Scalable Structure-correlated Social Graph Generator. This graph generator creates a synthetic social graph, containing non-uniform value distributions and structural correlations, which is intended as test data for scalable graph analysis algorithms and graph database systems. We generalize the problem by decomposing correlated graph generation in multiple passes that each focus on one so-called correlation dimension; each of which can be mapped to a MapReduce task. We show that S3G2 can generate social graphs that (i) share well-known graph connectivity characteristics typically found in real social graphs (ii) contain certain plausible structural correlations that influence the performance of graph analysis algorithms and queries, and (iii) can be quickly generated at huge sizes on common cluster hardware.

You may also want to see the slides.

What a nice way to start the week!


I first saw this at Datanami.

Graph Databases, GPUs, and Graph Analytics

Wednesday, February 20th, 2013

Graph Databases, GPUs, and Graph Analytics by Bryan Thompson.

From the post:

For people who want to track what we’ve been up to on the XDATA project, there are three surveys articles that we’ve produced:

  • Literature Review of Graph Databases (Bryan Thompson, SYSTAP)
  • Large Scale Graph Algorithms on the GPU (Yangzihao Wang and John Owens, UC Davis)
  • Graph Pattern Matching, Search, and OLAP (Dr. Xifeng Yan, UCSB)

Simply awesome reading.

It may be too early to take off for a weekend of reading but I wish….