Report on XLDB Tutorial on Data Structures and Algorithms by Michael Bender.
From the post:
The tutorial was organized as follows:
- Module 0: Tutorial overview and introductions. We describe an observed (but not necessary) tradeoff in ingestion, querying, and freshness in traditional database.
- Module 1: I/O model and cache-oblivious analysis.
- Module 2: Write-optimized data structures. We give the optimal trade-off between inserts and point queries. We show how to build data structures that lie on this tradeoff curve.
- Module 2 continued: Write-optimized data structures perform writes much faster than point queries; this asymmetry affects the design of an ACID compliant database.
- Module 3: Case study – TokuFS. How to design and build a write-optimized file systems.
- Module 4: Page-replacement algorithms. We give relevant theorems on the performance of page-replacement strategies such as LRU.
- Module 5: Index design, including covering indexes.
- Module 6: Log-structured merge trees and fractional cascading.
- Module 7: Bloom filters.
These algorithms and data structures are used both in NoSQL implementations such as MongoDB, HBase and in SQL-oriented implementations such as MySQL and TokuDB.
The slides are available here.
A tutorial offered by Michael and Bradley C. Kuszmaul at the 6th XLDB conference.
If you are committed to defending your current implementation choices against all comers, don’t bother with the slides.
If you want a peek at one future path in data structures, get the slides. You won’t be disappointed.