Archive for the ‘Graphic Processors’ Category

Presto: Distributed Machine Learning and Graph Processing with Sparse Matrices

Wednesday, April 24th, 2013

Presto: Distributed Machine Learning and Graph Processing with Sparse Matrices by Shivaram Venkataraman, Erik Bodzsar, Indrajit Roy, Alvin AuYoung, and Robert S. Schreiber.


It is cumbersome to write machine learning and graph algorithms in data-parallel models such as MapReduce and Dryad. We observe that these algorithms are based on matrix computations and, hence, are inefficient to implement with the restrictive programming and communication interface of such frameworks.

In this paper we show that array-based languages such as R [3] are suitable for implementing complex algorithms and can outperform current data parallel solutions. Since R is single-threaded and does not scale to large datasets, we have built Presto, a distributed system that extends R and addresses many of its limitations. Presto efficiently shares sparse structured data, can leverage multi-cores, and dynamically partitions data to mitigate load imbalance. Our results show the promise of this approach: many important machine learning and graph algorithms can be expressed in a single framework and are substantially faster than those in Hadoop and Spark.

Your mileage may vary but the paper reports that for PageRank, Presto is 40X faster than Hadoop and 15X faster than Spark.
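PageRank is a natural fit for the array formulation Presto targets: each iteration is essentially a sparse matrix-vector product. A minimal pure-Python sketch of that iteration (this is not Presto's distributed R code, and the tiny graph is made up for illustration):

```python
# PageRank as a sparse iteration, pure Python. Systems like Presto
# distribute the equivalent sparse matrix operations across machines.

def pagerank(edges, n, damping=0.85, iters=50):
    # Sparse representation: out-links per node.
    out = [[] for _ in range(n)]
    for src, dst in edges:
        out[src].append(dst)
    rank = [1.0 / n] * n
    for _ in range(iters):
        new = [(1.0 - damping) / n] * n
        for src in range(n):
            if out[src]:
                share = damping * rank[src] / len(out[src])
                for dst in out[src]:
                    new[dst] += share
            else:
                # Dangling node: spread its rank uniformly.
                for dst in range(n):
                    new[dst] += damping * rank[src] / n
        rank = new
    return rank

ranks = pagerank([(0, 1), (1, 2), (2, 0), (2, 1)], 3)
print(ranks)  # ranks sum to 1.0
```

The point of the matrix view is that the whole inner loop becomes one sparse multiply, which is exactly the operation a distributed array runtime can partition.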

Unfortunately I can’t point you to any binary or source code for Presto.

Still, the description is an interesting one at a time of rapid development of computing power.

GTC 2012

Monday, June 6th, 2011

GTC (GPU Technology Conference) 2012

Important Dates

GTC 2012 in San Jose, May 14-17, 2012

Session proposals have closed but poster proposals are open until June 27, 2011. Both will re-open September 27, 2011.

From the website:

GTC advances awareness of high performance computing, and connects the scientists, engineers, researchers, and developers who use GPUs to tackle enormous computational challenges.

GTC 2012 will feature the latest breakthroughs and the most amazing content in GPU-enabled computing. Spanning 4 full days of world-class education delivered by some of the greatest minds in GPU computing, GTC will showcase the dramatic impact that parallel computing is having on scientific research and commercial applications.

BTW, hundreds of hours of video are available from GTC 2010 at this website.

If you are concerned with scaling topic maps and other semantic technologies or just high performance computing in general, the 2010 recordings look like a good place to start while awaiting the 2012 conference.

Thrust Graph Library

Sunday, April 17th, 2011

Thrust Graph Library

From the website:

Thrust Graph Library provides graph containers, algorithms, and other concepts like the Boost Graph Library. This library is based on Thrust, a CUDA library of parallel algorithms with an interface resembling the C++ Standard Template Library (STL).

AllegroMCODE: GPU-accelerated Cytoscape Plugin
TM Explorer?

Sunday, February 20th, 2011

AllegroMCODE: GPU-accelerated Cytoscape Plugin

From the website:

AllegroMCODE is a high-performance Cytoscape plugin to find clusters, or highly interconnected groups of nodes, in a huge complex network such as a protein interaction network or a social network in real time. AllegroMCODE finds the same clusters as the MCODE plugin does, but the analysis usually takes less than a second even for a large complex network. The plugin user interface of AllegroMCODE is based on MCODE and has additional features. AllegroMCODE is open source software, freely available under the LGPL.

“Cluster” has various meanings depending on the source of the network. For instance, a protein-protein interaction network represents proteins as nodes and interactions between proteins as edges. Clusters in the network can be considered protein complexes and functional modules, which can be identified as highly interconnected subgraphs. For social networks, people and their relationships are represented as nodes and edges, respectively. A cluster in such a network can be considered a community with strong inter-relationships among its members.

AllegroMCODE exploits our high performance GPU computing architecture to make your analysis task faster than ever. The analysis task of the MCODE algorithm to find the clusters can be long for large complex networks, even though MCODE is a relatively fast method of clustering. AllegroMCODE provides our parallel algorithm implementation based on the original sequential MCODE algorithm. It can achieve two orders of magnitude speedup for the analysis of a large complex network by using the latest graphics card. You can also exploit the GPU acceleration without any special graphics hardware, since it provides seamless remote processing on our free GPU computing server.

You do not need to purchase any special GPU hardware or systems, nor worry about their tedious installation. All you have to do is install the AllegroMCODE plugin module on your computer and create a free account on our server.
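The “highly interconnected subgraph” idea above can be sketched with a greedy density heuristic. To be clear, this is only the flavor of MCODE-style clustering, not the actual MCODE algorithm; the graph and density threshold are invented for illustration:

```python
# Greedy dense-subgraph sketch (not the real MCODE algorithm):
# seed at the highest-degree node, then grow the cluster while the
# induced subgraph stays densely interconnected.

def dense_cluster(adj, min_density=0.7):
    seed = max(adj, key=lambda v: len(adj[v]))
    cluster = {seed}
    frontier = set(adj[seed])
    while frontier:
        v = min(frontier)  # deterministic visit order for the sketch
        frontier.remove(v)
        trial = cluster | {v}
        # Density = edges present / edges possible within the cluster.
        edges = sum(1 for a in trial for b in adj[a] if b in trial) / 2
        possible = len(trial) * (len(trial) - 1) / 2
        if edges / possible >= min_density:
            cluster = trial
            frontier |= set(adj[v]) - cluster
    return cluster

# Triangle 0-1-2 with a pendant node 3.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
print(sorted(dense_cluster(adj)))  # -> [0, 1, 2]
```

The pendant node is rejected because adding it drops the cluster density below the threshold; the GPU version's job is to evaluate many such candidate expansions in parallel.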

Simply awesome!

The ability to dynamically explore and configure topic maps will be priceless.

A greater gap than between hot-lead type and a modern word processor.

Will take weeks/months to fully explore but wanted to bring it to your attention.

Dynamic Indexes?

Friday, December 3rd, 2010

I was writing the post about the New York Times graphics presentation when it occurred to me how close we are to dynamic indexes.

After all, gaming consoles are export restricted.

What we now consider to be “runs,” static indexes and the like are computational artifacts.

They follow how we created indexes when they were done by hand.

What happens when the properties of what is being indexed, its identifications and merging rules can change on the fly and re-present itself to the user for further manipulation?

I don’t think the fundamental issues of index construction get any easier with dynamic indexes but how we answer them will determine how quickly we can make effective use of such indexes.

Whether crossing the line first to dynamic indexes will be a competitive advantage, only time will tell.

I would like for some VC to be interested in finding out.

Caveat to VCs. If someone pitches this as making indexes more quickly, that isn’t the point. “Quick” and “dynamic” aren’t the same thing. Related but different. Keep both hands on your wallet.

An Approach for Fast Hierarchical Agglomerative Clustering Using Graphics Processors with CUDA

Sunday, October 10th, 2010

An Approach for Fast Hierarchical Agglomerative Clustering Using Graphics Processors with CUDA

Authors: S.A. Arul Shalom, Manoranjan Dash, Minh Tue

Keywords: CUDA, hierarchical clustering, high performance computing, computations using graphics hardware, complete linkage


Graphics Processing Units in today’s desktops can well be thought of as high performance parallel processors. Each single processor within the GPU is able to execute different tasks independently but concurrently. Such computational capabilities of the GPU are being exploited in the domain of data mining. Two types of hierarchical clustering algorithms are realized on the GPU using CUDA. Speed gains from 15 times up to about 90 times have been realized. The challenges involved in invoking graphics hardware for such data mining algorithms and the effects of CUDA blocks are discussed. It is interesting to note that a block size of 8 is optimal for a GPU with 128 internal processors.
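The sequential baseline the paper accelerates, complete-linkage agglomerative clustering, repeatedly merges the two closest clusters, where the distance between clusters is the maximum pairwise distance between their members. A pure-Python sketch on invented 1-D data (the GPU version parallelizes the pairwise distance computations):

```python
# Sequential complete-linkage agglomerative clustering.
# Data and target cluster count are made up for illustration.

def complete_linkage(points, k):
    clusters = [[p] for p in points]
    while len(clusters) > k:
        # Complete linkage: cluster distance is the maximum
        # pairwise distance between members.
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = max(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] += clusters.pop(j)
    return clusters

print(complete_linkage([1.0, 1.1, 5.0, 5.2, 9.9], 3))
# -> [[1.0, 1.1], [5.0, 5.2], [9.9]]
```

The nested distance loop is O(n^2) per merge, which is exactly the part that maps well onto many GPU threads.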

GPUs offer a great deal of processing power and programming them may provoke deeper insights into subject identification and mapping.

Topic mappers may be able to claim NVIDIA based software/hardware and/or Sony PlayStation 3 units (Cell Broadband Engine) as a business expense (check with your tax advisor).

A GPU based paper for TMRA 2011 anyone?