Archive for the ‘Memcached’ Category

Masstree – Much Faster than MongoDB, VoltDB, Redis, and Competitive with Memcached

Tuesday, May 1st, 2012

Masstree – Much Faster than MongoDB, VoltDB, Redis, and Competitive with Memcached

From the post:

The EuroSys 2012 system conference has an excellent live blog summary of their talks for: Day 1, Day 2, Day 3 (thanks Henry at the Paper Trail blog). Summaries for each of the accepted papers are here.

One of the more interesting papers from a NoSQL perspective was Cache Craftiness for Fast Multicore Key-Value Storage, a wonderfully detailed description of the low level techniques used to implement Masstree:

A storage system specialized for key-value data in which all data fits in memory, but must persist across server restarts. It supports arbitrary, variable-length keys. It allows range queries over those keys: clients can traverse subsets of the database, or the whole database, in sorted order by key. On a 16-core machine Masstree achieves six to ten million operations per second on parts A–C of the Yahoo! Cloud Serving Benchmark benchmark, more than 30x as fast as VoltDB [5] or MongoDB [2].

An inspiration for anyone pursuing pure performance in the key-value space.

As the authors note when comparing Masstree to other systems:

Many of these systems support features that Masstree does not, some of which may bottleneck their performance. We disable other systems’ expensive features when possible.

The lesson here is to not buy expensive features unless you need them.

RethinkDB

Thursday, November 3rd, 2011

RethinkDB

From the features page:

RethinkDB is a persistent, industrial-strength key-value store with full support for the Memcached protocol.

Powerful technology

  • Ten times faster on solid-state
  • Linear scaling across cores
  • Fine-grained durability control
  • Instantaneous recovery on power failure

Supported core features

  • Point queries
  • Atomic increment/decrement
  • Arbitrary atomic operations
  • Append/prepend operations
  • Values up to 10MB in size
  • Pipelining support
  • Row expiration support
  • Multi-GET support

I particularly liked this line:

Can I use RethinkDB even if I don’t have solid-state drives in my infrastructure?

While RethinkDB performs best on dedicated commodity hardware that has a multicore processor and is backed by solid-state storage, it will still deliver a performance advantage both on rotational drives and in the cloud. (emphasis added to the answer)

Don’t worry your “rotational drives” and “cloud” account have not suddenly become obsolete. The skill you need to acquire before the next upgrade cycle is evaluating performance claims with your processes and data.

It doesn’t matter that all the UN documents can be retrieved in under sub-millisecond time, translated and served with a hot Danish if you don’t use the same format, have no need for translation and are more a custard tart fan. Vendor performance figures may attract your interest but your decision making should be driven by performance figures that represent your environment.

Build into the acquisition budget funding for your staff to replicate a representative subset of your data and processes for testing with vendor software/hardware. True enough, after the purchase you will probably toss that subset, but remember you will be living with the software purchase for years. And be known as the person who managed the project. Suddenly spending a little more money on making sure your requirements are met doesn’t sound so bad.