Under the Hood: Building and open-sourcing RocksDB by Dhruba Borthakur.
From the post:
Every time one of the 1.2 billion people who use Facebook visits the site, they see a completely unique, dynamically generated home page. There are several different applications powering this experience–and others across the site–that require global, real-time data fetching.
Storing and accessing hundreds of petabytes of data is a huge challenge, and we’re constantly improving and overhauling our tools to make this as fast and efficient as possible. Today, we are open-sourcing RocksDB, an embeddable, persistent key-value store for fast storage that we built and use here at Facebook.
Why build an embedded database?
Applications traditionally access their data via remote procedure calls over a network connection, but that can be slow–especially when we need to power user-facing products in real time. With the advent of flash storage, we are starting to see newer applications that can access data quickly by managing their own dataset on flash instead of accessing data over a network. These new applications are using what we call an embedded database.
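To make "embedded" concrete, here is a minimal sketch of a key-value store that lives inside the application process and persists to local disk, with no network hop between the code and its data. It is not RocksDB and does not use the RocksDB API: Python's standard-library dbm.dumb module stands in for it. The function name, file name, and key are all hypothetical.

```python
import dbm.dumb
import os


def demo_embedded_store(directory):
    """Write one key to a local, in-process store, then read it back.

    Assumption: dbm.dumb is only a stand-in for an embedded key-value
    store such as RocksDB; it shares the "library inside the process,
    files on local storage" shape, not the performance or feature set.
    """
    path = os.path.join(directory, "example_store")  # hypothetical file name

    # Open (creating if needed) and write. No server, no RPC: the
    # database is just a library call plus files on local storage.
    with dbm.dumb.open(path, "c") as db:
        db[b"user:42:name"] = b"Ada"  # hypothetical key and value

    # Reopen read-only to show that the data persisted to disk.
    with dbm.dumb.open(path, "r") as db:
        return db[b"user:42:name"]
```

Calling demo_embedded_store on a scratch directory returns the value that was just written, read back from local files rather than from a remote service.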
There are several reasons for choosing an embedded database. When database requests are frequently served from memory or from very fast flash storage, network latency can slow the query response time. Accessing the network within a data center can take about 50 microseconds, as can fast-flash access latency. This means that accessing data over a network could potentially be twice as slow as an application accessing data locally.
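The "twice as slow" figure follows from simple addition, which the sketch below spells out. The two 50-microsecond numbers come from the post; the model itself is my assumption: a remote read pays one network access plus one flash access, a local read pays only the flash access, and serialization, queueing, and caching are ignored.

```python
# Figures quoted in the post (approximate, in microseconds).
NETWORK_ACCESS_US = 50.0
FLASH_ACCESS_US = 50.0


def remote_latency_us(network_us, storage_us):
    """Assumed model: a remote read costs the network hop plus the storage access."""
    return network_us + storage_us


def slowdown(network_us, storage_us):
    """How many times slower a remote read is than a local one under that model."""
    return remote_latency_us(network_us, storage_us) / storage_us


if __name__ == "__main__":
    remote = remote_latency_us(NETWORK_ACCESS_US, FLASH_ACCESS_US)
    factor = slowdown(NETWORK_ACCESS_US, FLASH_ACCESS_US)
    print(f"local read:  {FLASH_ACCESS_US:.0f} us")
    print(f"remote read: {remote:.0f} us ({factor:.1f}x the local read)")
```

Under this model the ratio grows as storage gets faster: the network cost stays fixed while the local access shrinks, which is the trend the post is pointing at.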
Secondly, we are starting to see servers with an increasing number of cores and with storage-IOPS reaching millions of requests per second. Lock contention and a high number of context switches in traditional database software prevent it from saturating the storage-IOPS. We’re finding we need new database software that is flexible enough to be customized for many of these emerging hardware trends.
Like most of you, I don’t have 1.2 billion people visiting my site. 😉
However, understanding today’s “high-end” solutions will prepare you for tomorrow’s “middle-tier” solution and the day after tomorrow’s desktop solution.
A high-level overview of RocksDB.
Other resources to consider:
Update: Igor Canadi has posted a proposal on the RocksDB Facebook page to add the concept of ColumnFamilies to RocksDB: https://github.com/facebook/rocksdb/wiki/Column-Families-proposal Comments? (Direct comments on that proposal to the Facebook page for RocksDB.)