Making Graph Algorithms Fast, using Strongly Connected Components by Mark C. Chu-Carroll.
From the post:
One of the problems with many of the interesting graph algorithms is that they’re complex. As graphs get large, computations over them can become extremely slow. For example, graph coloring is NP-complete – so the time to run a true optimal graph coloring algorithm on an arbitrary graph grows exponentially in the size of the graph (at least with every algorithm known today).
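To make that exponential blowup concrete, here is a minimal brute-force sketch (not from Mark's post; the function name and graph representation are my own choices). It tests k-colorability by trying every possible assignment of k colors to n vertices – k^n assignments in the worst case, which is exactly the exponential growth described above:

```python
from itertools import product

def is_k_colorable(vertices, edges, k):
    """Brute-force k-coloring check.

    Tries all k**len(vertices) color assignments, so the running
    time grows exponentially in the number of vertices.
    """
    for colors in product(range(k), repeat=len(vertices)):
        assignment = dict(zip(vertices, colors))
        # A valid coloring gives adjacent vertices different colors.
        if all(assignment[u] != assignment[v] for u, v in edges):
            return True
    return False
```

For example, a triangle needs three colors: `is_k_colorable(['a', 'b', 'c'], [('a', 'b'), ('b', 'c'), ('a', 'c')], 2)` is `False`, while with `k=3` it is `True`.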
So, quite naturally, we look for ways to make it faster. We’ve already talked about using heuristics to get fast approximations. But what if we really need the true optimum? The other approach to making it faster, when you really want the true optimum, is parallelism: finding some way to subdivide the problem into smaller pieces which can be executed at the same time. This is obviously a limited technique – you can’t stop the exponential growth in execution time by adding a specific finite number of processors. But for particular problems, parallelism can make the difference between being able to do something fast enough to be useful, and not being able to.

For a non-graph-theoretic but compelling example, there’s been work in something called microforecasting, which is precise weather forecasting for extremely small areas. It’s useful for predicting things like severe lightning storms in local areas during events where there are large numbers of people. (For example, it was used at the Atlanta Olympics.) Microforecasting takes as input a huge amount of information about local conditions – temperature, air pressure, wind direction, wind velocity, humidity, and such – and computes a short-term forecast for that area. Microforecasting is completely useless if it takes longer to compute the forecast than it takes for the actual weather to develop. If you can find a way to split the computation among a hundred processors, and get a forecast in 5 minutes, you’ve got something really useful. If you have to run it on a single processor, and it takes 2 hours – well, by then, any storm that the forecast predicts is already over.
For graphs, there’s a very natural way of decomposing problems for parallelization. Many complex graphs for real problems can be divided into a set of subgraphs, which can be handled in parallel. In fact, this can happen on multiple levels: graphs can be divided into subgraphs, which can be divided into further subgraphs. But for now, we’ll mainly focus on doing it once – one decomposition of the graph into subgraphs for parallelization.
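The title points at strongly connected components as the decomposition. As a hedged sketch of that idea (my own illustration, not code from Mark's post), here is Kosaraju's algorithm, which finds the SCCs of a directed graph with two depth-first passes – one over the graph, one over its reverse. Each component is a subgraph that could then be handled in parallel:

```python
from collections import defaultdict

def strongly_connected_components(vertices, edges):
    """Kosaraju's algorithm: DFS on the graph to get a finish order,
    then DFS on the reversed graph in decreasing finish order.
    Each second-pass tree is one strongly connected component."""
    graph = defaultdict(list)
    reverse = defaultdict(list)
    for u, v in edges:
        graph[u].append(v)
        reverse[v].append(u)

    visited = set()
    order = []  # vertices, in order of DFS finishing time

    def dfs_finish_order(start):
        # Iterative DFS to avoid recursion limits on large graphs.
        stack = [(start, iter(graph[start]))]
        visited.add(start)
        while stack:
            node, neighbors = stack[-1]
            advanced = False
            for nxt in neighbors:
                if nxt not in visited:
                    visited.add(nxt)
                    stack.append((nxt, iter(graph[nxt])))
                    advanced = True
                    break
            if not advanced:
                order.append(node)
                stack.pop()

    for v in vertices:
        if v not in visited:
            dfs_finish_order(v)

    # Second pass: flood-fill the reversed graph, latest finisher first.
    assigned = set()
    components = []
    for v in reversed(order):
        if v in assigned:
            continue
        component, stack = [], [v]
        assigned.add(v)
        while stack:
            node = stack.pop()
            component.append(node)
            for nxt in reverse[node]:
                if nxt not in assigned:
                    assigned.add(nxt)
                    stack.append(nxt)
        components.append(component)
    return components
```

On a graph with edges a↔b and c↔d plus a one-way edge b→c, this returns the two components {a, b} and {c, d}; the condensation of those components is a DAG, which is what makes the per-component parallel (or at least independent) processing possible.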
Mark illustrates the power of graph decomposition.
Don’t skip the comments; there are references there that I am mining as well.