Archive for the ‘TokuDB’ Category

Comparing MongoDB, MySQL, and TokuMX Data Layout

Tuesday, July 30th, 2013

Comparing MongoDB, MySQL, and TokuMX Data Layout by Zardosht Kasheff.

From the post:

A lot is said about the differences in the data between MySQL and MongoDB. Things such as “MongoDB is document based”, “MySQL is relational”, “InnoDB has a clustering key”, etc. Some may wonder how TokuDB, our MySQL storage engine, and TokuMX, our MongoDB product, fit in with these data layouts. I could not find anything describing the differences with a simple google search, so I figured I’d write a post explaining how things compare.

So who are the players here? With MySQL, users are likely familiar with two storage engines: MyISAM, the original default up until MySQL 5.5, and InnoDB, the current default since MySQL 5.5. MongoDB has only one storage engine, and we’ll refer to it as “vanilla Mongo storage”. And of course, there is TokuDB for MySQL, and TokuMX.

First, let’s get some quick terminology out of the way. Documents and collections in MongoDB can be thought of as rows and tables in MySQL, respectively. And while not identical, fields in MongoDB are similar to columns in MySQL. A full SQL to MongoDB mapping can be found here. When I refer to MySQL, what I say applies to TokuDB, InnoDB, and MyISAM. When I say MongoDB, what I say applies to TokuMX and vanilla Mongo storage.
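To make the terminology concrete, here is a minimal Python sketch of the mapping described above. The names and values are hypothetical, made up for illustration; nothing here comes from either database's API.

```python
# Hypothetical data, for illustration only.

# MySQL-style: a table has a fixed list of columns; each row is a tuple of values.
columns = ("id", "name", "city")
table = [
    (1, "Ada", "London"),
    (2, "Linus", "Helsinki"),
]

# MongoDB-style: a collection holds documents; each document names its own fields.
collection = [
    {"_id": 1, "name": "Ada", "city": "London"},
    {"_id": 2, "name": "Linus", "langs": ["C", "Bash"]},  # fields may differ per document
]


def row_to_document(columns, row):
    """Turn a row into a document by pairing column names with values."""
    return dict(zip(columns, row))


docs = [row_to_document(columns, r) for r in table]
```

The second document has no "city" field at all, which is the "similar but not identical" part of the fields-versus-columns comparison.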

Great contrast of MongoDB and MySQL data formats.

Data formats are essential to understanding the capabilities and limitations of any software package.

Why Unique Indexes are Bad [Caveat on Fractal Tree(R) Indexes]

Monday, July 15th, 2013

Why Unique Indexes are Bad by Zardosht Kasheff.

From the post:

Before creating a unique index in TokuMX or TokuDB, ask yourself, “does my application really depend on the database enforcing uniqueness of this key?” If the answer is ANYTHING other than yes, do not declare the index to be unique. Why? Because unique indexes may kill your write performance. In this post, I’ll explain why.

Unique indexes are a strange beast: they have no impact on standard databases that use B-Trees, such as MongoDB and MySQL, but may be horribly painful for databases that use write optimized data structures, like TokuMX’s Fractal Tree(R) indexes. How? They essentially drag the Fractal Tree index down to the B-Tree’s level of performance.

When a user declares a unique index, the user tells the database, “please help me and enforce uniqueness on this index.” So, before doing any insertion into a unique index, the database must first verify that the key being inserted does not already exist. If the possible location of the key is not in memory, which may happen if the working set does not fit in memory, then the database MUST perform an I/O to bring into memory the contents of the potential location (be it a leaf node in a tree, or an offset into a memory mapped file), in order to check whether the key exists in that location.
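The read-before-write cost can be shown with a toy model. This is a sketch under loud assumptions: `ToyIndex` is a hypothetical class, not TokuMX or TokuDB code, and it pretends that every lookup of a key that is not already buffered costs one disk read (the "working set does not fit in memory" case).

```python
class ToyIndex:
    """Toy write-buffered index that counts simulated disk reads (hypothetical)."""

    def __init__(self, unique):
        self.unique = unique
        self.on_disk = {}          # key -> list of values; stands in for leaves on disk
        self.pending = []          # buffered (key, value) inserts, held in memory
        self.pending_keys = set()  # keys currently in the buffer
        self.disk_reads = 0

    def _key_exists(self, key):
        # Buffered keys are checked for free; anything else costs one read.
        if key in self.pending_keys:
            return True
        self.disk_reads += 1
        return key in self.on_disk

    def insert(self, key, value):
        # Only a unique index has to look before it writes.
        if self.unique and self._key_exists(key):
            raise KeyError(f"duplicate key: {key!r}")
        self.pending.append((key, value))
        self.pending_keys.add(key)

    def flush(self):
        # Apply all buffered inserts in one batch.
        for key, value in self.pending:
            self.on_disk.setdefault(key, []).append(value)
        self.pending.clear()
        self.pending_keys.clear()


plain = ToyIndex(unique=False)
uniq = ToyIndex(unique=True)
for i in range(1000):
    plain.insert(i, "v")
    uniq.insert(i, "v")
print(plain.disk_reads, uniq.disk_reads)
```

In this model the non-unique index never reads before writing, while the unique index pays one read per new key, which is the B-tree-like behavior the post warns about.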


Zardosht closes by recommending that, if your application does require unique indexes, you consider rewriting it so that it doesn’t.


Not a mark against Fractal Tree(R) indexes but certainly a consideration in deciding to adopt technology using them.

Would be nice if this type of information could be passed along as more than sysadmin lore.

Like a plugin for your browser that at your request highlights products or technologies of interest and on mouse-over displays known limitations or bugs.

The sort of things that vendors are loath to disclose.

Open Source TokuDB Resources

Saturday, April 27th, 2013

Open Source TokuDB Resources

A quick summary of the Tokutek repositories at Github and pointers to Google groups for discussion of TokuDB.

Announcing TokuDB v7: Open Source and More

Tuesday, April 23rd, 2013

Announcing TokuDB v7: Open Source and More by Martin Farach-Colton.

From the post:

The free Community Edition is fully functional and fully performant. It has all the compression you’ve come to expect from TokuDB. It has hot schema changes: no-down-time column insertion, deletion, renaming, etc., as well as index creation. It has clustering secondary keys. We are also announcing an Enterprise Edition (coming soon) with additional benefits, such as a support package and advanced backup and recovery tools.

You may have noticed those screaming performance numbers I have cited from TokuDB posts?

Now the origin of those numbers is open source.

Curious, what questions are you going to ask differently or what different questions will you ask as processing power increases?

Or to ask it the other way, what questions have you not asked because of a lack of processing power?

NoSQL is Great, But You Still Need Indexes [MongoDB for example]

Wednesday, February 20th, 2013

NoSQL is Great, But You Still Need Indexes by Martin Farach-Colton.

From the post:

I’ve said it before, and, as is the nature of these things, I’ll almost certainly say it again: your database performance is only as good as your indexes.

That’s the grand thesis, so what does that mean? In any DB system — SQL, NoSQL, NewSQL, PostSQL, … — data gets ingested and organized. And the system answers queries. The pain point for most users is around the speed to answer queries. And the query speed (both latency and throughput, to be exact) depend on how the data is organized. In short: Good Indexes, Fast Queries; Poor Indexes, Slow Queries.

But building indexes is hard work, or at least it has been for the last several decades, because almost all indexing is done with B-trees. That’s true of commercial databases, of MySQL, and of most NoSQL solutions that do indexing. (The ones that don’t do indexing solve a very different problem and probably shouldn’t be confused with databases.)

It’s not true of TokuDB. We build Fractal Tree Indexes, which are much easier to maintain but can still answer queries quickly. So with TokuDB, it’s Fast Indexes, More Indexes, Fast Queries. TokuDB is usually thought of as a storage engine for MySQL and MariaDB. But it’s really a B-tree substitute, so we’re always on the lookout for systems where we can improve the indexing.

Enter MongoDB. MongoDB is beloved because it makes deployment fast. But when you peel away the layers, you get down to a B-tree, with all the performance headaches and workarounds that they necessitate.

That’s the theory, anyway. So we did some testing. We ripped out the part of MongoDB that takes care of secondary indices and plugged in TokuDB. We’ve posted the blogs before, but here they are again, the greatest hits of TokuDB+MongoDB: we show a 10x insertion performance, a 268x query performance, and a 532x (or 53,200% if you prefer) multikey index insertion performance. We also discussed covered indexes vs. clustered Fractal Tree Indexes.

Did somebody declare February 20th to be performance release day?

Did I miss that memo? 😉

Like every geek, I like faster. But, here’s my question:

Have there been any studies on the impact of faster systems on searching and decision making by users?

My assumption is the faster I get a non-responsive result from a search, the sooner I can improve it.

But that’s an assumption on my part.

Is that really true?

Concurrency Improvements in TokuDB v6.6 (Part 2)

Tuesday, February 5th, 2013

Concurrency Improvements in TokuDB v6.6 (Part 2)

From the post:

In Part 1, we showed performance results of some of the work that’s gone in to TokuDB v6.6. In this post, we’ll take a closer look at how this happened, on the engineering side, and how to think about the performance characteristics in the new version.


It’s easiest to think about our concurrency changes in terms of a Fractal Tree® index that has nodes like a B-tree index, and buffers on each node that batch changes for the subtree rooted at that node. We have materials that describe this available here, but we can proceed just knowing that:

  1. To inject data into the tree, you need to store a message in a buffer at the root of the tree. These messages are moved down the tree, so you can find messages in all the internal nodes of the tree (the mechanism that moves them is irrelevant for now).
  2. To read data out of the tree, you need to find a leaf node that contains your key, check the buffers on the path up to the root for messages that affect your query, and apply any such messages to the value in the leaf before using that value to answer your query.

It’s these operations that modify and examine the buffers in the root that were the main reason we used to serialize operations inside a single index.
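The two numbered steps above can be sketched in a few lines of Python. This is a toy, not Tokutek's implementation: `ToyBufferedTree` is a hypothetical two-level structure with a single buffer at the root, and the flush policy (flush when the buffer exceeds a small limit) is an assumption, since the post says the mechanism that moves messages is irrelevant for now.

```python
import bisect


class ToyBufferedTree:
    """Two-level toy: one root with a message buffer, plus leaves split by pivots."""

    def __init__(self, pivots, buffer_limit=4):
        self.pivots = pivots                              # e.g. [100] gives two leaves
        self.leaves = [dict() for _ in range(len(pivots) + 1)]
        self.root_buffer = []                             # pending (key, value) messages
        self.buffer_limit = buffer_limit                  # assumed flush threshold

    def _leaf_for(self, key):
        return self.leaves[bisect.bisect_right(self.pivots, key)]

    def insert(self, key, value):
        # Step 1: injecting data only touches the buffer at the root.
        self.root_buffer.append((key, value))
        if len(self.root_buffer) > self.buffer_limit:
            self.flush()

    def flush(self):
        # Move buffered messages down to their leaves, in arrival order.
        for key, value in self.root_buffer:
            self._leaf_for(key)[key] = value
        self.root_buffer.clear()

    def get(self, key):
        # Step 2: read the leaf value, then apply any newer messages for
        # this key that are still waiting in the buffer above the leaf.
        value = self._leaf_for(key).get(key)
        for k, v in self.root_buffer:
            if k == key:
                value = v
        return value


tree = ToyBufferedTree(pivots=[100])
tree.insert(5, "a")
print(tree.leaves[0], tree.get(5))
```

Every insert and every read touches `root_buffer`, which is why a real system that guards the root with a single lock ends up serializing operations on the index.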

Just so not everything today is “soft” user stuff. 😉

Interesting avoidance of the root node as an I/O bottleneck.

Sort of thing that gets me to thinking about distributed topic map writing/querying.

Tracking 5.3 Billion Mutations: Using MySQL for Genomic Big Data

Friday, February 1st, 2013

Tracking 5.3 Billion Mutations: Using MySQL for Genomic Big Data by Lawrence Schwartz.

From the post:

The Organization: The Philip Awadalla Laboratory is the Medical and Population Genomics Laboratory at the University of Montreal. Working with empirical genomic data and modern computational models, the laboratory addresses questions relevant to how genetics and the environment influence the frequency and severity of diseases in human populations. Its research includes work relevant to all types of human diseases: genetic, immunological, infectious, chronic and cancer. Using genomic data from single-nucleotide polymorphisms (SNP), next-generation re-sequencing, and gene expression, along with modern statistical tools, the lab is able to locate genome regions that are associated with disease pathology and virulence as well as study the mechanisms that cause the mutations.

The Challenge: The lab’s genomic research database is following 1400 individuals with 3.7 million shared mutations, which means it is tracking 5.3 billion mutations. Because the representation of genomic sequence is a highly compressible series of letters, the database requires less hardware than a typical one. However, it must be able to store and retrieve data quickly in order to respond to research requests.

Thibault de Malliard, the researcher tasked with managing the lab’s data, adds hundreds of thousands of records every day to the lab’s MySQL database. The database must be able to process the records ASAP so that the researchers can make queries and find information quickly. However, as the database grew to 200 GB, its performance plummeted. de Malliard determined that the database’s MyISAM storage engine was having difficulty keeping up with the fire hose of data, pointing out that a single sequencing batch could take days to run.

Anticipating that the database could grow to 500 GB or even 1 TB within the next year, de Malliard began to search for a storage engine that would maintain performance no matter how large his database got.

Insertion Performance: “For us, TokuDB proved to be over 50x faster to add or update data into big tables,” according to de Malliard. “Adding 1M records took 51 min for MyISAM, but 1 min for TokuDB. So inserting one sequencing batch with 48 samples and 1.5M positions would take 2.5 days for MyISAM but one hour with TokuDB.”
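The quoted figures hang together if you work them through. The sketch below assumes one record per (sample, position) pair, which the post does not state explicitly; everything else is taken from the quote.

```python
# Figures quoted in the post.
MYISAM_MIN_PER_MILLION = 51
TOKUDB_MIN_PER_MILLION = 1
SAMPLES = 48
POSITIONS_PER_SAMPLE = 1_500_000

# Assumption: one record per (sample, position) pair.
records_millions = SAMPLES * POSITIONS_PER_SAMPLE / 1_000_000

myisam_days = records_millions * MYISAM_MIN_PER_MILLION / 60 / 24
tokudb_hours = records_millions * TOKUDB_MIN_PER_MILLION / 60

print(f"records: {records_millions:.0f}M")
print(f"MyISAM: {myisam_days:.2f} days")
print(f"TokuDB: {tokudb_hours:.1f} hours")
```

Under that assumption a batch is 72M records, which works out to about 2.55 days for MyISAM and 1.2 hours for TokuDB, in line with the "2.5 days" and "one hour" in the quote.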

OK, so it’s not “big data.” But it was critical data to the lab.

Maybe instead of “big data” we should be talking about “critical” or even “relevant” data.

Remember the story of the data analyst with “830 million GPS records of 80 million taxi trips” whose analysis confirmed what taxi drivers already knew: they stop driving when it rains. Could have asked a taxi driver or two. See Starting Data Analysis with Assumptions.

Take a look at TokuDB when you need a “relevant” data solution.

Announcing TokuDB v6.6: Performance Improvements

Wednesday, January 9th, 2013

Announcing TokuDB v6.6: Performance Improvements

From the post:

We are excited to announce TokuDB® v6.6, the latest version of Tokutek’s flagship storage engine for MySQL and MariaDB.

This version offers three types of performance improvements: in-memory, multi-client and fast updates.

Although TokuDB is optimized for large tables, which are larger than memory, many workloads consist of a mix of large and small tables. TokuDB v6.6 offers improvements on in-memory performance, with a more than 100% improvement on Sysbench at many concurrency levels and more than 200% improvement on TPC-C at many concurrency levels. Details to follow.

We have also made improvements in multi-threaded performance. For example, single threaded trickle loads have always been fast in TokuDB. But now multi-threaded trickle loads are even faster. An iibench run with four writers shows an increase from ~18K insertions/sec to ~28K insertions/sec. With a writer and reader running concurrently, we achieve ~13K insertions/sec.

Leif Walsh, one of our engineers, will be posting some details of how this particular improvement was achieved. So stay tuned for this and posts comparing our concurrent iibench performance with InnoDB’s.

A bit late for Christmas but performance improvements on top of already impressive performance are always welcome!

Looking forward to hearing more of the details!

Fractal Tree Indexing Overview

Monday, December 10th, 2012

Fractal Tree Indexing Overview by Martin Farach-Colton.

From the post:

We get a lot of questions about how Fractal Tree indexes work. It’s a write-optimized index with fast queries, but which write-optimized indexing structure is it?

In this ~15 minute video (which uses these slides), I give a quick overview of how they work and what they are good for.

Suggestion: Watch the video along with the slides. (Some of the slides are less than intuitive. Trust me on this one.)

Martin Gardner explaining fractals in SciAm it’s not, but it will give you a better appreciation for fractal trees.

BTW, did you know B-Trees are forty years old this year?

Best Practices for a Successful TokuDB Evaluation (Webinar)

Thursday, November 29th, 2012

Best Practices for a Successful TokuDB Evaluation by Gerry Narvaja

Date: December 11th
Time: 2 PM EST / 11 AM PST

From the webpage:

In this webinar we will show step by step how to install, configure, and test TokuDB for a typical performance evaluation. We’ll also be flagging potential pitfalls that can ruin the eval results. It will describe the differences between installing from scratch and replacing an existing MySQL / MariaDB installation. It will also review the most common issues that may arise when running TokuDB binaries.

You have seen the TokuDB numbers on their data.

Now you can see what numbers you can get with your data.

Report on XLDB Tutorial on Data Structures and Algorithms

Tuesday, October 16th, 2012

Report on XLDB Tutorial on Data Structures and Algorithms by Michael Bender.

From the post:

The tutorial was organized as follows:

  • Module 0: Tutorial overview and introductions. We describe an observed (but not necessary) tradeoff in ingestion, querying, and freshness in traditional databases.
  • Module 1: I/O model and cache-oblivious analysis.
  • Module 2: Write-optimized data structures. We give the optimal trade-off between inserts and point queries. We show how to build data structures that lie on this tradeoff curve.
  • Module 2 continued: Write-optimized data structures perform writes much faster than point queries; this asymmetry affects the design of an ACID compliant database.
  • Module 3: Case study – TokuFS. How to design and build a write-optimized file system.
  • Module 4: Page-replacement algorithms. We give relevant theorems on the performance of page-replacement strategies such as LRU.
  • Module 5: Index design, including covering indexes.
  • Module 6: Log-structured merge trees and fractional cascading.
  • Module 7: Bloom filters.

These algorithms and data structures are used both in NoSQL implementations such as MongoDB, HBase and in SQL-oriented implementations such as MySQL and TokuDB.
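For readers who have not met the last item on the module list, here is a minimal Bloom filter sketch in Python. It is a generic textbook version, not taken from the tutorial slides; the sizes and the SHA-256-based hashing scheme are assumptions chosen for readability.

```python
import hashlib


class BloomFilter:
    """Set membership with no false negatives and some false positives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit: clear rather than compact

    def _positions(self, item):
        # Derive num_hashes positions by salting SHA-256 with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item):
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[pos] for pos in self._positions(item))


bf = BloomFilter()
for key in ("alpha", "beta", "gamma"):
    bf.add(key)
print(bf.might_contain("alpha"))
```

A "definitely absent" answer lets a storage engine skip a disk lookup entirely, which is why the structure turns up in write-optimized designs.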

The slides are available here.

A tutorial offered by Michael and Bradley C. Kuszmaul at the 6th XLDB conference.

If you are committed to defending your current implementation choices against all comers, don’t bother with the slides.

If you want a peek at one future path in data structures, get the slides. You won’t be disappointed.

Forbes: “Tokutek Makes Big Data Dance”

Saturday, October 6th, 2012

Forbes: “Tokutek Makes Big Data Dance” by Lawrence Schwartz.

From the post:

Recently, our CEO, John Partridge had a chance to talk about novel database technologies for “Big Data” with Peter Cohan of Forbes.

According to the article, “Fractal Tree indexing is helping organizations analyze big data more efficiently due to its ability to improve database efficiency thanks to faster ‘database insertion speed, quicker input/output performance, operational agility, and data compression.’” As a start-up based on “the first algorithm-based breakthrough in the database world in 40 years,” Tokutek is following in the footsteps of firms such as Google and RSA, which also relied on novel algorithm advances as core to their technology.

To read the full article, and to see how Tokutek is helping companies tackle big data, see here.

I would ignore Peter Cohan’s mistakes about the nature of credit card processing. You don’t wait for the “ok” on your account balance.

Remember “What if all transactions required strict global consistency?” by Matthew Aslett of the 451 Group? Eventual consistency works right now.

I would have picked “hot schema” changes as a feature to highlight but that might not play as well with a business audience.

Webinar: Introduction to TokuDB v6.5 (Oct. 10, 2012)

Saturday, October 6th, 2012

Webinar: Introduction to TokuDB v6.5

From the post:

TokuDB® is a proven solution that scales MySQL® and MariaDB® from GBs to TBs with unmatched insert and query speed, compression, replication performance and online schema flexibility. Tokutek’s recently launched TokuDB v6.5 delivers all of these features and more, not just for HDDs, but also for flash memory.

Date: October 10th
Time: 2 PM EST / 11 AM PST

TokuDB v6.5:

  • Stores 10x More Data – TokuDB delivers 10x compression without any performance degradation. Users can therefore take advantage of much greater amounts of available space without paying more for additional storage.
  • Delivers High Insertion Speed – TokuDB Fractal Tree® indexes continue to change the game with huge insertion rates and greater scalability. Our latest release delivers an order of magnitude faster insertion performance than the competition, ideal for applications that must simultaneously query and update large volumes of rapidly arriving data (e.g., clickstream analytics).
  • Allows Hot Schema Changes — Hot column addition/deletion/rename/resize provides the ability to add/drop/change a column to a database without taking the database offline, enabling database administrators to redefine or add new fields with no downtime.
  • Extends Wear Life for Flash– TokuDB’s proprietary Fractal Tree indexing writes fewer, larger blocks which reduces overall wear, and more efficiently utilizes the FTL (Flash Translation Layer). This extends the life of flash memory by an order of magnitude for many applications.

This webinar covers TokuDB features, latest performance results, and typical use cases.

You have seen the posts about fractal indexing! Now see the demos!

MySQL Schema Agility on SSDs

Saturday, September 29th, 2012

MySQL Schema Agility on SSDs by Tim Callaghan.

From the post:

TokuDB v6.5 adds the ability to expand certain column types without downtime. Users can now enlarge char, varchar, varbinary, and integer columns with no interruption to insert/update/delete statements on the altered table. Prior to this feature, enlarging one of these column types required a full table rebuild. InnoDB blocks all insert/update/delete operations to a table during column expansion as it rebuilds the table and all indexes.

Not sure how often you will need the ability to enlarge column types without downtime, but when you do, I suspect it is mission critical.

Something to keep in mind while planning for uncertain data futures.

Announcing TokuDB v6.5: Optimized for Flash [Disambiguation]

Tuesday, September 25th, 2012

Announcing TokuDB v6.5: Optimized for Flash

Semantic confusion follows me around. Like the harpies that tormented Phineus. Well, maybe not quite that bad. 😉

But I see in the news feed that TokuDB v6.5 has been optimized for Flash.

First thought: Why? Who would want a database optimized for Flash?

But they did not mean Flash, or one of the other seventy-five (75) meanings of Flash, but Flash.

I’m glad we had this conversation and cleared that up!

The “Flash” in this case refers to “flash memory.” And so this is an exciting announcement:

We are excited to announce TokuDB® v6.5, the latest version of Tokutek’s flagship storage engine for MySQL and MariaDB.

This version offers optimization for Flash as well as more hot schema change operations for improved agility.

We’ll be posting more details about the new features and performance, so here’s an overview of what’s in store.

TokuDB v6.5 continues the great Toku-tradition of fast insertions. On flash drives, we show an order-of-magnitude (9x) faster insertion rate than InnoDB. TokuDB’s standard compression works just as well on flash and helps you get the most out of your storage system. And TokuDB reduces wear on solid-state drives by more than an order of magnitude. The full technical details will be subject of a future blog post. In summary though, when TokuDB writes to disk, it updates many rows, whereas InnoDB may write a leaf to disk with a single modified row, in some circumstances. More changes per write means fewer writes, which makes the flash drive wear out much more slowly.
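The "more changes per write means fewer writes" argument can be checked with a toy count. This is a simplified model, not a description of either engine's actual I/O path: it assumes random row updates spread over a fixed number of leaves, an unbatched worst case of one leaf write per modified row, and a batched case that writes each touched leaf once per batch. The function name and parameters are hypothetical.

```python
import random


def count_block_writes(num_updates, num_leaves, batch_size, seed=0):
    """Return (unbatched_writes, batched_writes) for the same update stream."""
    rng = random.Random(seed)
    updates = [rng.randrange(num_leaves) for _ in range(num_updates)]

    # Unbatched worst case: every modified row forces one leaf write.
    unbatched = len(updates)

    # Batched: collect batch_size updates, then write each touched leaf once.
    batched = 0
    for start in range(0, len(updates), batch_size):
        batched += len(set(updates[start:start + batch_size]))
    return unbatched, batched


unbatched, batched = count_block_writes(
    num_updates=10_000, num_leaves=100, batch_size=1_000
)
print(unbatched, batched)
```

With these made-up parameters the batched case can never exceed 1,000 leaf writes against 10,000 unbatched, so the order-of-magnitude claim is at least plausible in the toy; real ratios depend on workload and block sizes.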

More Hot Schema Changes
TokuDB already has hot column addition, deletion and renaming. In this release we add hot column expansion, so you can change the size of the integers in a column or the number of characters in a field. These operations incur no down time and the changes are immediately available on the table. In this release, we have also extended hot schema changes to partitioned tables.

Every disambiguation page at Wikipedia, in every language, is testimony to a small part of the need for semantic disambiguation.

Did you know that as of today, there are 218,765 disambiguation pages in Wikipedia? Disambiguation Pages.

How many disambiguations could you use for an index at work, that don’t appear in Wikipedia?

You can stop at ten (10). Point made.

Announcing TokuDB v6.1

Saturday, July 21st, 2012

Announcing TokuDB v6.1

From the post:

TokuDB v6.1 is now generally available and can be downloaded here.

New features include:

  • Added support for MariaDB 5.5 (5.5.25)
    • The TokuDB storage engine is now available with all the additional functionality of MariaDB 5.5.
  • Added HCAD support to our MySQL 5.5 version (5.5.24)
    • Hot column addition/deletion was present in TokuDB v6.0 for MySQL 5.1 and MariaDB 5.2, but not in MySQL 5.5. This feature is now present in all MySQL and MariaDB versions of TokuDB.
  • Improved in-memory point query performance via lock/latch refinement
    • TokuDB has always been a great performer on range scans and workloads where the size of the working data set is significantly larger than RAM. TokuDB v6.0 improved the performance of in-memory point queries at low levels of concurrency. TokuDB v6.1 further increased the performance at all concurrency levels.
    • The following graph shows our sysbench.oltp.uniform performance on an in-memory data set (16 x 5 million row tables, server is 2 x Xeon 5520, 72GB RAM, Centos 5.8)

Go to the post to see impressive performance numbers.

I do wonder, when do performance numbers cease to be meaningful for the average business application?

Like a car that can go from 0 to 60 in under 3 seconds. (Yes, there is such a car, 2011 Bugatti.)

Nice to have, but where are you going to drive it?

As you can tell from this blog, I am all for the latest algorithms, software, hardware, but at the same time, the latest may not be the best for your application.

It may be that simpler, lower-performance solutions will be not only more appropriate but also more robust.

TokuDB v6.0: Download Available

Wednesday, May 2nd, 2012

TokuDB v6.0: Download Available by Martin Farach-Colton.

From the post:

TokuDB v6.0 is full of great improvements, like getting rid of slave lag, better compression, improved checkpointing, and support for XA.

I’m happy to announce that TokuDB v6.0 is now generally available and can be downloaded here.

Are you familiar with any independent benchmark testing on TokuDB?

Not that I doubt the TokuDB numbers.

Thinking that contributing standard numbers to a more centralized resource would help with evaluations.

1 Billion Insertions – The Wait is Over!

Monday, January 30th, 2012

1 Billion Insertions – The Wait is Over! by Tim Callaghan.

From the post:

iiBench measures the rate at which a database can insert new rows while maintaining several secondary indexes. We ran this for 1 billion rows with TokuDB and InnoDB starting last week, right after we launched TokuDB v5.2. While TokuDB completed it in 15 hours, InnoDB took 7 days.

The results are shown below. At the end of the test, TokuDB’s insertion rate remained at 17,028 inserts/second whereas InnoDB had dropped to 1,050 inserts/second. That is a difference of over 16x. Our complete set of benchmarks for TokuDB v5.2 can be found here.
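A quick check of the quoted numbers, as plain arithmetic. The elapsed times are the rounded "15 hours" and "7 days" from the quote, so the averages below are approximate.

```python
ROWS = 1_000_000_000
tokudb_seconds = 15 * 3600        # "15 hours"
innodb_seconds = 7 * 24 * 3600    # "7 days"

# Ratio of the insertion rates at the end of the test.
final_rate_ratio = 17_028 / 1_050

# Ratio of total elapsed time, and average rates over the whole run.
elapsed_ratio = innodb_seconds / tokudb_seconds
tokudb_avg = ROWS / tokudb_seconds
innodb_avg = ROWS / innodb_seconds

print(f"final-rate ratio: {final_rate_ratio:.1f}x")
print(f"elapsed-time ratio: {elapsed_ratio:.1f}x")
print(f"average inserts/second: {tokudb_avg:,.0f} vs {innodb_avg:,.0f}")
```

The "over 16x" figure is the ratio of the final insertion rates (about 16.2x). Measured by total elapsed time the gap is about 11.2x, because InnoDB's rate started higher and decayed over the run.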

Kudos to TokuDB team! Impressive performance!

Tim comments on iiBench:

iiBench [Indexed Insertion Benchmark] simulates a pattern of usage for always-on applications that:

  • Require fast query performance and hence require indexes
  • Have high data insert rates
  • Cannot wait for offline batch processing and hence require the indexes be maintained as data comes in

If this sounds familiar, could be an important benchmark to keep in mind.

BTW, do you know of any topic map benchmarks? Just curious.

TokuDB v5.2 Beta Program

Monday, November 21st, 2011

TokuDB v5.2 Beta Program

From the webpage:

With the release of TokuDB v5.0 last March, we delivered a powerful and agile storage engine that broke through traditional MySQL scalability and performance barriers. As deployments of TokuDB have grown more varied, one request we have repeatedly heard from customers and prospects, especially in areas such as online advertising, social media, and clickstream analysis, is for improved performance for multi-client workloads.

Tokutek is now pleased to announce limited beta availability for TokuDB v5.2. The latest version of our flagship product offers a significant improvement over TokuDB v5.0 in multi-client scaling as well as performance gains in point queries, range queries, and trickle load speed. There are a host of other smaller changes and improvements that are detailed in our release notes (available to beta participants).

Here’s your chance for your topic map backend to get a jump on your competitors. And to help make an impressive product even more so.

Your impressions or comments most welcome!

Scaling MySQL with TokuDB Webinar

Sunday, November 20th, 2011

Scaling MySQL with TokuDB Webinar – Video and Slides Now Available

From the post:

Thanks to everyone who signed up and attended the webinar I gave this week with Tim Callaghan on Scaling MySQL. For those who missed it and are interested, the video and slides are now posted here.


MySQL implementations are often kept relatively small, often just a few hundred GB or less. Anything beyond this quickly leads to painful operational problems such as poor insertion rates, slow queries, hours to days offline for schema changes, prolonged downtime for dump/reload, etc. The promise of scalable MySQL has remained largely unfulfilled, until TokuDB.

TokuDB v5.0 delivers

  • Exceptional Agility — Hot Schema Changes allow read/write operations during index creation or column/field addition
  • Unmatched Speed — Fractal Tree indexes perform 20x to 80x better on write intensive workloads
  • Maximum Scalability — Fractal Tree index performance scales even as the primary index exceeds available RAM

This webinar covers TokuDB v5.0 features, latest performance results, and typical use cases.

I haven’t run TokuDB, but it is advertised as a drop-in replacement for MySQL. A high-performance replacement.

Comments/suggestions? (I need to pre-order High-Performance MySQL, 2nd ed., 2012. Ignore the scams with the 1st edition copies still in stock at some sellers.)