
Big Game Hunting in the Database Jungle

Thursday, May 17th, 2012

If all these new DBMS technologies are so scalable, why are Oracle and DB2 still on top of TPC-C? A roadmap to end their dominance.

Alexander Thomson and Daniel Abadi write:

In the last decade, database technology has arguably progressed furthest along the scalability dimension. There have been hundreds of research papers, dozens of open-source projects, and numerous startups attempting to improve the scalability of database technology. Many of these new technologies have been extremely influential—some papers have earned thousands of citations, and some new systems have been deployed by thousands of enterprises.

So let’s ask a simple question: If all these new technologies are so scalable, why on earth are Oracle and DB2 still on top of the TPC-C standings? Go to the TPC-C Website with the top 10 results in raw transactions per second. As of today (May 16th, 2012), Oracle 11g is used for 3 of the results (including the top result), 10g is used for 2 of the results, and the rest of the top 10 is filled with various versions of DB2. How is technology designed decades ago still dominating TPC-C? What happened to all these new technologies with all these scalability claims?

The surprising truth is that these new DBMS technologies are missing from the TPC-C top ten results not because they do not care enough to enter, but rather because they would not win if they did.

The post is a preview of a paper that Alex is presenting at SIGMOD next week, introducing “Calvin,” a new approach to database processing.

So where does Calvin fall in the OldSQL/NewSQL/NoSQL trichotomy?

Actually, nowhere. Calvin is not a database system itself, but rather a transaction scheduling and replication coordination service. We designed the system to integrate with any data storage layer, relational or otherwise. Calvin allows user transaction code to access the data layer freely, using any data access language or interface supported by the underlying storage engine (so long as Calvin can observe which records user transactions access).
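To make the idea of a scheduling layer sitting above an arbitrary store a little more concrete, here is a toy sketch. It is not Calvin's design or API. It assumes, purely for illustration, that each transaction declares the records it touches up front (one way a layer could "observe which records user transactions access") and that transactions are applied serially in an agreed log order. Every name here (`Txn`, `DictStore`, `run_in_order`, `transfer`) is hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, List


@dataclass(frozen=True)
class Txn:
    """Hypothetical transaction that declares the keys it touches up front."""
    name: str
    read_set: FrozenSet[str]
    write_set: FrozenSet[str]
    # logic receives the current values of the declared keys and returns writes
    logic: Callable[[Dict[str, int]], Dict[str, int]]


class DictStore:
    """Stand-in for any storage layer that offers get/put on records."""

    def __init__(self, initial: Dict[str, int]) -> None:
        self.data = dict(initial)

    def get(self, key: str) -> int:
        return self.data.get(key, 0)

    def put(self, key: str, value: int) -> None:
        self.data[key] = value


def run_in_order(log: List[Txn], store: DictStore) -> None:
    """Apply transactions strictly in log order (an assumption of this sketch)."""
    for txn in log:
        visible = {k: store.get(k) for k in txn.read_set | txn.write_set}
        writes = txn.logic(visible)
        # Reject writes to records the transaction did not declare.
        undeclared = set(writes) - txn.write_set
        if undeclared:
            raise ValueError(f"{txn.name} wrote undeclared keys: {sorted(undeclared)}")
        for key, value in writes.items():
            store.put(key, value)


def transfer(src: str, dst: str, amount: int) -> Txn:
    """Build a transfer transaction; it writes nothing if funds are short."""
    def logic(view: Dict[str, int]) -> Dict[str, int]:
        if view[src] < amount:
            return {}
        return {src: view[src] - amount, dst: view[dst] + amount}

    keys = frozenset({src, dst})
    return Txn(f"transfer({src}->{dst},{amount})", keys, keys, logic)


# The same ordered log is handed to two independent replicas.
log = [transfer("a", "b", 30), transfer("b", "c", 50), transfer("a", "c", 100)]
replica_1 = DictStore({"a": 100, "b": 40, "c": 0})
replica_2 = DictStore({"a": 100, "b": 40, "c": 0})
run_in_order(log, replica_1)
run_in_order(log, replica_2)
```

Because both replicas see the same transactions in the same order, they end in the same state without exchanging any messages during execution. A real system would run non-conflicting transactions concurrently; the serial loop here only keeps the sketch short.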

What I find exciting about this report (and the paper) is the re-thinking of current assumptions concerning data processing. It may succeed or it may not. But the exciting part is the attempt to transcend decades of acceptance of the maxims of our forefathers.

BTW, Calvin is reported to support 500,000 transactions a second.

Big game hunting anyone?*


* I don’t mean that as an expression of preference for or against Oracle.

I suspect Calvin will be a wake-up call to R&D at Oracle to redouble their own efforts at groundbreaking innovation.

Breakthroughs in matching up multi-dimensional indexes would be attractive to users who need to match up disparate data sources.

Speed is great, but a useful purpose attracts customers.