Archive for the ‘FoundationDB’ Category

Flow: Actor-based Concurrency with C++ [FoundationDB]

Saturday, February 14th, 2015

Flow: Actor-based Concurrency with C++

From the post:

FoundationDB began with ambitious goals for both high performance per node and scalability. We knew that to achieve these goals we would face serious engineering challenges while developing the FoundationDB core. We’d need to implement efficient asynchronous communicating processes of the sort supported by Erlang or the Async library in .NET, but we’d also need the raw speed and I/O efficiency of C++. Finally, we’d need to perform extensive simulation to engineer for reliability and fault tolerance on large clusters.

To meet these challenges, we developed several new tools, the first of which is Flow, a new programming language that brings actor-based concurrency to C++11. To add this capability, Flow introduces a number of new keywords and control-flow primitives for managing concurrency. Flow is implemented as a compiler which analyzes an asynchronous function (actor) and rewrites it as an object with many different sub-functions that use callbacks to avoid blocking (see streamlinejs for a similar concept using JavaScript). The Flow compiler’s output is normal C++11 code, which is then compiled to a binary using traditional tools. Flow also provides input to our simulation tool, Lithium, which conducts deterministic simulations of the entire system, including its physical interfaces and failure modes. In short, Flow allows efficient concurrency within C++ in a maintainable and extensible manner, achieving all three major engineering goals:

  • high performance (by compiling to native code),
  • actor-based concurrency (for high productivity development),
  • simulation support (for testing).
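The rewriting step described in the quote can be pictured with a small sketch. It is written in Python, not Flow or C++, and the names `Future`, `on_ready`, and `AddTwoActor` are hypothetical illustrations rather than Flow keywords or Flow compiler output. It shows, by hand, the kind of object a compiler might generate from a sequential function that waits on two values: each wait becomes a separate callback method, so nothing ever blocks.

```python
class Future:
    """A minimal single-assignment value that notifies callbacks when set."""

    def __init__(self):
        self._done = False
        self._value = None
        self._callbacks = []

    def on_ready(self, callback):
        # Run immediately if the value is already there, otherwise queue.
        if self._done:
            callback(self._value)
        else:
            self._callbacks.append(callback)

    def set(self, value):
        self._done = True
        self._value = value
        callbacks, self._callbacks = self._callbacks, []
        for callback in callbacks:
            callback(value)


class AddTwoActor:
    """Hand-written callback form of: result = (wait a) + (wait b)."""

    def __init__(self, a, b):
        self.result = Future()
        self._b = b
        a.on_ready(self._after_a)          # step 1: register and return

    def _after_a(self, value_a):
        self._value_a = value_a            # state kept on the object
        self._b.on_ready(self._after_b)    # step 2: wait for b

    def _after_b(self, value_b):
        self.result.set(self._value_a + value_b)  # step 3: finish


a, b = Future(), Future()
actor = AddTwoActor(a, b)
seen = []
actor.result.on_ready(seen.append)

b.set(2)    # b arrives first; the actor is still waiting on a
a.set(40)   # now both steps run and the result is delivered
print(seen)  # [42]
```

The local variables of the original function become fields on the object, which is why the sequential version is so much easier to read and maintain than its callback form.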

Flow Availability

Flow is not currently available outside of FoundationDB, but we’d like to open-source it in the future. If you’d like to stay in the loop with our progress subscribe below.

Are you going to be ready when Flow is released separate from FoundationDB?

Hot Cloud Swap: Migrating a Database Cluster with Zero Downtime

Tuesday, December 23rd, 2014

Hot Cloud Swap: Migrating a Database Cluster with Zero Downtime by Jennifer Rullmann.

By now, you may have heard about, seen, or even tried your hand against the fault tolerance of our database. The Key-Value Store, and the layers that turn it into a multi-model database, handle a wide variety of disasters with ease. In this real-time demo video, we show off the ability to migrate a cluster to a new set of machines with zero downtime.

We’re calling this feature ‘hot cloud swap’, because although you can use it on your own machines, it’s particularly interesting to those who run their database in the cloud and may want to switch providers. And that’s exactly what I do in the video. Watch me migrate a database cluster from Digital Ocean to Amazon Web Services in under 7 minutes, real-time!

It’s been years but I can remember as a sysadmin switching out “hot swappable” drives. Never lost any data but there was always that moment of doubt during the rebuild.

Personally I would have more than one complete and tested backup, to the extent that is possible, before trying a “hot cloud swap.” That may be overly cautious but better cautious than crossing into the “Sony Zone.”

At one point Jennifer says:

“…a little bit of hesitation but it worked it out.”

Difficult to capture but if you look at time marker 06.52.85 on the clock below the left hand window, writes start failing.

It recovers but it is not the case that the application never stops. At least in the sense of writes. Depends on your definition of “stops” I suppose.

I am sure that the fault tolerance built into FoundationDB made this less scary but the “hot swap” part should be doable with any clustering solution. Yes?

That is, you add “new” machines to the cluster, then exclude the “old” machines from the cluster, which results in a complete transfer of data to the “new” machines, at which point you create new coordinators and eventually shut down the “old” machines. Is there something unique about that process to FoundationDB?
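To make the generic sequence concrete, here is a toy model in Python. Everything in it is an assumption for illustration: `ToyCluster` and its methods are hypothetical, the placement rule (least-loaded machine) is invented, and the coordinator step is not modeled at all. It is not how FoundationDB, or any real clustering product, moves data. It only shows the add, exclude, drain, remove ordering.

```python
class ToyCluster:
    """Toy model: each key lives on exactly one non-excluded machine."""

    def __init__(self, machines):
        self.data = {m: set() for m in machines}
        self.excluded = set()

    def _targets(self):
        return sorted(m for m in self.data if m not in self.excluded)

    def write(self, key):
        # Place the key on the least-loaded eligible machine.
        target = min(self._targets(), key=lambda m: len(self.data[m]))
        self.data[target].add(key)

    def add(self, machine):
        self.data[machine] = set()

    def exclude(self, machine):
        # Stop placing data here and move existing keys elsewhere.
        self.excluded.add(machine)
        keys, self.data[machine] = self.data[machine], set()
        for key in keys:
            self.write(key)

    def remove(self, machine):
        assert not self.data[machine], "machine still holds data"
        del self.data[machine]
        self.excluded.discard(machine)

    def keys(self):
        return set().union(*self.data.values())


cluster = ToyCluster(["old-1", "old-2"])
for i in range(10):
    cluster.write(f"key{i}")
before = cluster.keys()

for machine in ("new-1", "new-2"):
    cluster.add(machine)
for machine in ("old-1", "old-2"):
    cluster.exclude(machine)   # drains the machine
    cluster.remove(machine)    # safe only once it is empty

print(sorted(cluster.data))       # ['new-1', 'new-2']
print(cluster.keys() == before)   # True
```

The toy never has a moment where writes fail. What a real system does while data is in flight is exactly the part this model leaves out.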

Don’t get me wrong, I am hoping to learn a great deal more about FoundationDB in the new year but I intensely dislike distinctions between software packages that have no basis in fact.

FoundationDB 3.0

Wednesday, December 10th, 2014

Failing at Scaling by Dave Rosenthal.

Dave writes a great post but you want to cut to what screams “Try FoundationDB!”

Without further ado:

[Image: FoundationDB performance]

I hope you agree that this is an incredible result. And it’s made even more impressive because we are hitting this number on a fully-ordered, fully-transactional database with 100% multi-key cross-node transactions. We haven’t heard of a database that even comes close to these performance numbers with those guarantees. Oh, and in the public cloud, with all its usual communications and noisy-neighbor challenges.

Let’s put 14.4 Mhz in context:

It’s gratifying for the whole team here to hit our ambitious initial goal after five hard years of theory, simulation, and engineering!

Yep, that is 14,400,000 random writes per second. (I know, Dave calls that number 14.4 Mhz. Control of abuse of language isn’t my department.)

I’m sure you have other questions so see Dave’s post and while you are there, grab a copy of FoundationDB 3.0!

No Query Language Needed: Using Python with an Ordered Key-Value Store

Friday, October 10th, 2014

No Query Language Needed: Using Python with an Ordered Key-Value Store by Stephen Pimentel.

From the post:

FoundationDB is a complex and powerful database, designed to handle sharding, replication, network hiccups, and server failures gracefully and automatically. However, when we designed our Python API, we wanted most of that complexity to be hidden from the developer. By utilizing familiar features, such as generators, itertools, and comprehensions, we tried to make FoundationDB’s API as easy to use as a Python dictionary.

In the video below, I show how FoundationDB lets you query data directly using Python language features, rather than a separate query language.

Most applications have back-end data stores that developers need to query. This talk presents an approach to storing and querying data that directly employs Python language features. Using the Key-Value Store, we can make our data persistent with an interface similar to a Python dictionary. Python then gives us a number of tools “out of the box” that we can use to form queries:

  • generators for memory-efficient data retrieval;
  • itertools to filter and group data;
  • comprehensions to assemble the query results.

Taken together, these features give us a query capability using straight Python. The talk walks through a number of example queries using the Enron email dataset.
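A minimal sketch of the three tools working together, assuming an in-memory stand-in for the ordered store. `ToyOrderedStore`, `get_range`, the key layout, and the sample messages are all hypothetical. This is not the FoundationDB Python API and the data is not from the Enron dataset.

```python
from bisect import bisect_left
from itertools import groupby


class ToyOrderedStore:
    """In-memory stand-in for an ordered key-value store (keys are tuples)."""

    def __init__(self):
        self._keys = []     # kept sorted
        self._values = {}

    def __setitem__(self, key, value):
        if key not in self._values:
            self._keys.insert(bisect_left(self._keys, key), key)
        self._values[key] = value

    def get_range(self, prefix):
        """Generator: lazily yield (key, value) pairs whose key starts with prefix."""
        i = bisect_left(self._keys, prefix)
        while i < len(self._keys) and self._keys[i][:len(prefix)] == prefix:
            yield self._keys[i], self._values[self._keys[i]]
            i += 1


db = ToyOrderedStore()
# Assumed key layout: ("email", sender, recipient, message_id) -> subject
db[("email", "alice", "bob", 1)] = "budget"
db[("email", "alice", "carol", 2)] = "lunch"
db[("email", "bob", "alice", 3)] = "re: budget"
db[("email", "alice", "bob", 4)] = "forecast"

# Generator + itertools.groupby + dict comprehension:
# count messages per (sender, recipient) pair. The range read is already
# sorted by key, which is what groupby needs.
counts = {
    pair: sum(1 for _ in group)
    for pair, group in groupby(db.get_range(("email",)), key=lambda kv: kv[0][1:3])
}
print(counts[("alice", "bob")])  # 2

# List comprehension over a narrower range: subjects sent by alice.
alice_subjects = [subject for _, subject in db.get_range(("email", "alice"))]
print(alice_subjects)  # ['budget', 'forecast', 'lunch']
```

The "query" here is nothing more than a prefix range read plus ordinary Python, which is the point of the talk's title.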

For the code and details, see https://github.com/stephenpiment/object-store

More motivation to take a look at FoundationDB!

I do wonder about the “no query language needed.” Users, despite their poor results, appear to be committed to querying and query languages.

Whether it is the illusion of “empowerment” of users, the current inability to measure the cost of ineffectual searching, or acceptance of poor search results, search and search operators continue to be the preferred means of interaction. Plan accordingly.

I first saw this in a tweet by Hari Kishan.

FoundationDB: Developer Recipes

Thursday, April 24th, 2014

FoundationDB: Developer Recipes

From the webpage:

Learn how to build new data models, indexes, and more on top of the FoundationDB key-value store API.

I was musing the other day about how to denormalize a data structure for indexing.

This is the reverse of that process but still should be instructive.

Graphistas should note that FoundationDB also implements the Blueprints API (blueprints-foundationdb-graph).

Data Modeling – FoundationDB

Saturday, February 15th, 2014

Data Modeling – FoundationDB

From the webpage:

FoundationDB’s core provides a simple data model coupled with powerful transactions. This combination allows building richer data models and libraries that inherit the scalability, performance, and integrity of the database. The goal of data modeling is to design a mapping of data to keys and values that enables effective storage and retrieval. Good decisions will yield an extensible, efficient abstraction. This document covers the fundamentals of data modeling with FoundationDB.
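As a rough illustration of “a mapping of data to keys and values,” the sketch below stores one record as several key-value pairs, one per field, under tuple keys. The function names, the `("student", id, field)` key layout, and the sample record are assumptions made for this example. A plain dict stands in for the database, and nothing here comes from the FoundationDB documentation.

```python
def encode_student(student_id, record):
    """Map one record to several (key, value) pairs, one per field."""
    return {("student", student_id, field): value for field, value in record.items()}


def decode_student(pairs, student_id):
    """Rebuild a record from every pair that shares the student's key prefix."""
    prefix = ("student", student_id)
    return {key[2]: value for key, value in sorted(pairs.items()) if key[:2] == prefix}


store = {}
store.update(encode_student(7, {"name": "Ada", "major": "math"}))
store.update(encode_student(8, {"name": "Bob", "major": "physics"}))

# Updating a single field touches a single small key, not the whole record.
store[("student", 7, "major")] = "computer science"

print(decode_student(store, 7))  # {'major': 'computer science', 'name': 'Ada'}
```

Because all of a student's fields share a prefix, an ordered store could fetch them with one range read. The linear scan in `decode_student` is only a stand-in for that.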

Great preparation for these tutorials using the tuple layer of FoundationDB:

The Class Scheduling tutorial introduces the fundamental concepts needed to design and build a simple application using FoundationDB, beginning with basic interaction with the database and walking through a few simple data modeling techniques.

The Enron Email Corpus tutorial introduces approaches to loading data in FoundationDB and further illustrates data modeling techniques using a well-known, publicly available data set.

The Managing Large Values and Blobs tutorial discusses approaches to working with large data objects in FoundationDB. It introduces the blob layer and illustrates its use to build a simple file library.

The Lightweight Query Language tutorial discusses a layer that allows Datalog to be used as an interactive query language for FoundationDB. It describes both the FoundationDB binding and the use of the query language itself.

Enjoy!

FoundationDB 2.0 is OUT!

Thursday, February 6th, 2014

Version 2.0 is here: PHP, Golang, Directory layer, TLS Security, and More! by David Rosenthal.

From the post:

We’re very excited to introduce FoundationDB 2.0. FoundationDB combines the power of ACID transactions with the scalability, fault tolerance, and operational elegance of distributed NoSQL databases. This release was driven by specific customer feedback for increased language support, network security, and higher-level tools for managing data within FoundationDB.

FoundationDB 2.0 adds Go and PHP to the list of languages with native FoundationDB support. There also are two new layers available in all languages: The Subspace layer provides an easy way to define and manage subspaces of keys via key prefixes. The Directory layer manages the efficient allocation and management of virtual “directories” of keys and values within a database. They work together as the recommended way to efficiently organize different kinds of data within a single FoundationDB database.
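The idea of carving one keyspace into subspaces by key prefix can be sketched in a few lines of Python. `ToySubspace` is a hypothetical class written for this example, with tuple keys and a plain dict as the store. It is not the Subspace layer's API, and the Directory layer is not modeled.

```python
class ToySubspace:
    """Keys are tuples; a subspace is just a fixed tuple prefix."""

    def __init__(self, prefix=()):
        self.prefix = tuple(prefix)

    def __getitem__(self, name):
        """Nested subspace: extend the prefix by one element."""
        return ToySubspace(self.prefix + (name,))

    def pack(self, key=()):
        """Full key for `key` inside this subspace."""
        return self.prefix + tuple(key)

    def contains(self, full_key):
        return full_key[:len(self.prefix)] == self.prefix

    def unpack(self, full_key):
        """Strip the prefix, returning the key relative to this subspace."""
        if not self.contains(full_key):
            raise ValueError("key is not in this subspace")
        return full_key[len(self.prefix):]


app = ToySubspace(("myapp",))
users, orders = app["users"], app["orders"]

store = {
    users.pack(("alice",)): "alice@example.com",
    orders.pack((1001,)): "widget",
    orders.pack((1002,)): "gadget",
}

order_ids = sorted(orders.unpack(k)[0] for k in store if orders.contains(k))
print(order_ids)  # [1001, 1002]
```

Different kinds of data share one database without colliding, because every key carries the prefix of the subspace that owns it.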

Along with the additional language and layer support, 2.0 also ships with full Transport Layer Security which encrypts all FoundationDB network traffic, enabling security and authentication between both servers and clients via a public/private key infrastructure. This allows FoundationDB to safely run on an untrusted LAN or WAN. (emphasis added)

If you know of a trusted LAN or WAN, please leave a comment below. 😉

After commenting, download a copy of FoundationDB 2.0 and see what you think of the key management features.

FoundationDB Developer Guide & API Reference

Thursday, January 16th, 2014

FoundationDB Developer Guide & API Reference

From the webpage:

FoundationDB’s scalability and performance make it an ideal back end for supporting the operation of critical applications. FoundationDB provides a simple data model coupled with powerful transactional integrity. This document gives an overview of application development using FoundationDB, including use of the API, working with transactions, and performance considerations.

When I saw a tweet from FoundationDB that read:

More into theory or practice? Either way, check out the FoundationDB Developer Guide & API Reference

I just had to go look! 😉

Enjoy!

Class Scheduling [Tutorial FoundationDB]

Saturday, December 21st, 2013

Class Scheduling

From the post:

This tutorial provides a walkthrough of designing and building a simple application in Python using FoundationDB. In this tutorial, we use a few simple data modeling techniques. For a more in-depth discussion of data modeling in FoundationDB, see Data Modeling.

The concepts in this tutorial are applicable to all the languages supported by FoundationDB. If you prefer, you can see a version of this tutorial in:

The offering of the same tutorial in different languages looks like a clever idea.

Like using a polyglot edition of the Bible with parallel original text and translations.

In a polyglot, the associations between words in different languages are implied rather than explicit.

Benchmarking Honesty

Tuesday, December 3rd, 2013

Benchmarking Honesty by David Rosenthal.

From the post:

Recently, someone brought to my attention a blog post that benchmarks FoundationDB and another responding to the benchmark itself. I’ll weigh in: I think this benchmark is unfair because it gives people too good an impression of FoundationDB’s performance. In the benchmark, 100,000 items are loaded into each database/storage engine in both sequential and random patterns. In the case of FoundationDB and other sophisticated systems like SQL Server, you can see that the performance of random and sequential writes are virtually the same; this points to the problem. In the case of FoundationDB, an “absorption” mechanism is able to cope with bursts of writes (on the order of a minute or two, usually) without actually updating the real data structures holding the data (i.e. only persisting a log to disk, and making changes available to read from RAM). Hence, the published test results are giving FoundationDB an unfair advantage. I think that you will find that if you sustain this workload for a longer time, like in real-world usages, FoundationDB might be significantly slower.

If you don’t recognize the name, David Rosenthal is the co-founder and CEO of FoundationDB.

What?

A CEO saying a benchmark favorable to his product is “unfair?”

Odd as it may sound, I think there is an honest CEO on the loose.

Statistically speaking, it had to happen eventually. 😉

Seriously, high marks to David Rosenthal. We need more CEOs, engineers and presenters with a sense of honesty.

How to: Scaling FoundationDB

Saturday, October 5th, 2013

How to: Scaling FoundationDB by Ben Collins.

From the post:

We put together this screencast to walk through how to install and scale your FoundationDB cluster beyond the default single-machine installation. In it, we explain how to add two more machines and configure the cluster to take advantage of the high performance and fault tolerance properties of FoundationDB.

I need to get some medium sized boxes for experimental purposes.

Winter is approaching and they would reduce the amount of time I have to run the heater. 😉

FoundationDB: Version 1.0 and Pricing Announced!

Tuesday, August 20th, 2013

FoundationDB: Version 1.0 and Pricing Announced!

From the post:

After a successful 18-month Alpha and Beta testing program involving more than 2,000 participants, we’re very excited to announce that we’ve released version 1.0 of FoundationDB and general availability pricing!

Built on a distributed shared-nothing architecture, FoundationDB is a unique database technology that combines the time-proven power of ACID transactions with the scalability, fault tolerance, and operational elegance of distributed NoSQL databases.

You can download FoundationDB and use it under our Community License today and run as many server processes as you’d like to in non-production use, and use up to six processes in production for free! You don’t even have to sign up – just go to our download page for instant access. You’ll get all the technical goodness of FoundationDB – exceptional fault tolerance, high performance distributed ACID transactions, and access to our growing catalog of open source layers – regardless of whether you’re a community user or a paying customer.

Have a big application that needs more than six processes in production, or want your FoundationDB cluster supported? We’re also offering commercial licensing and support priced starting at $99 per server process per month. Check out our commercial license and support plans on our pricing page.

I don’t know if FoundationDB will meet your requirements but I can say their business model should set the standard for software offerings.

High quality software with aggressive pricing and no registration required for the community edition.

I am downloading the community version now.

When are you going to grab a copy?

FoundationDB Beta 2 [NSA Scale?]

Monday, June 10th, 2013

Beta 2 is here – with 100X increased capacity!

From the post:

We’re happy to announce that we’ve released FoundationDB Beta 2!

Most of our testing and tuning in the past has focused on data sets ranging up to 1TB, but our users have told us that they’re excited to begin applying FoundationDB’s transactional processing to data sets larger than 1 TB, so we made that our major focus for Beta 2.

[Image: db scale]

Beta 2 significantly reduces memory and CPU usage while increasing server robustness when working with larger data sets. FoundationDB now supports data sets up to 100 TB of aggregate key-value size. Though if you are planning on going above 10 TB you might want to talk to us at support@foundationdb.com for some configuration recommendations—we’re always happy to help.

Also new in Beta 2 is support for Node 0.10 and Ruby on Windows. Of course, there are a whole lot of behind-the-scenes improvements to both the core and our APIs, some of which are documented in the release notes.

New Website!

We also recently rolled out a cool new website to explain the transformative effect that ACID transactions have on NoSQL technology. Be sure to check it out, along with our community site where you can share your insights and get questions answered.

Do you think “web scale” is rather passé nowadays?

Really should be talking about NSA scale.

Yes?

Why FoundationDB Might Be All Its Cracked Up To Be

Friday, March 8th, 2013

Why FoundationDB Might Be All Its Cracked Up To Be by Doug Turnbull.

From the post:

When I first heard about FoundationDB, I couldn’t imagine how it could be anything but vaporware. Seemed like Unicorns crapping happy rainbows to solve all your problems. As I’m learning more about it though, I realize it could actually be something ground breaking.

NoSQL: Lets Review…

So, I need to step back and explain one reason NoSQL databases have been revolutionary. In the days of yore, we used to normalize all our data across multiple tables on a single database living on a single machine. Unfortunately, Moore’s law eventually crapped out and maybe more importantly hard drive space stopped increasing massively. Our data and demands on it only kept growing. We needed to start trying to distribute our database across multiple machines.

Turns out, it’s hard to maintain transactionality in a distributed, heavily normalized SQL database. As such, a lot of NoSQL systems have emerged with simpler features, many promoting a model based around some kind of single row/document/value that can be looked up/inserted with a key. Transactionality for these systems is limited to a single key-value entry (“row” in Cassandra/HBase or “document” in Mongo/Couch; we’ll just call them rows here). Rows are easily stored in a single node, although we can replicate this row to multiple nodes. Despite being replicated, it turns out transactionally working with single rows in distributed NoSQL is easier than guaranteeing transactionality of an SQL query visiting potentially many SQL tables in a distributed system.

There are deep design ramifications/limitations to the transactional nature of rows. First, you always try to cram a lot of data related to the row’s key into a single row, ending up with massive rows of hierarchical or flat data that all relates to the row key. This lets you cover as much data as possible under the row-based transactionality guarantee. Second, as you only have a single key to use from the system, you must choose very wisely what your key will be. You may need to think hard about how your data will be looked up through its whole life; it can be hard to go back. Additionally, if you need to look up on a secondary value, you better hope that your database is friendly enough to have a secondary key feature, or otherwise you’ll need to maintain a secondary row for storing the relationship. Then you have the problem of working across two rows, which doesn’t fit in the transactionality guarantee. Third, you might lose the ability to perform a join across multiple rows. In most NoSQL data stores, joining is discouraged and denormalization into large rows is the encouraged best practice.

FoundationDB Is Different

FoundationDB is a distributed, sorted key-value store with support for arbitrary transactions across multiple key-values — multiple “rows” — in the database.
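The “secondary row” problem from the quote is the easiest place to see why multi-key transactions matter. The sketch below is a single-process toy under loud assumptions: `ToyTransactionalStore`, `transact`, and `set_email` are hypothetical names, and copying the whole dict per transaction is only a way to get all-or-nothing behavior in a few lines. It says nothing about how FoundationDB implements transactions.

```python
class ToyTransactionalStore:
    """All writes made by one transaction body are published together or not at all."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def transact(self, body):
        staged = dict(self._data)   # private working copy
        body(staged)                # if this raises, nothing is published
        self._data = staged         # publish every write at once


def set_email(store, user, email):
    """Update the primary row and its secondary-index row in one transaction."""
    def body(tr):
        old = tr.get(("user", user))
        if old is not None:
            del tr[("by_email", old)]    # drop the stale index entry
        tr[("user", user)] = email       # primary row
        tr[("by_email", email)] = user   # secondary-index row
    store.transact(body)


store = ToyTransactionalStore()
set_email(store, "alice", "alice@old.example")
set_email(store, "alice", "alice@new.example")
print(store.get(("by_email", "alice@new.example")))  # alice
print(store.get(("by_email", "alice@old.example")))  # None


def failing(tr):
    tr[("user", "bob")] = "bob@example.com"
    raise RuntimeError("crash between the two writes")


try:
    store.transact(failing)
except RuntimeError:
    pass
print(store.get(("user", "bob")))  # None: the half-finished update never appeared
```

With only single-row guarantees, the crash in `failing` would leave a user row with no matching index row. With a multi-key transaction, the index can never disagree with the data it indexes.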

As Doug points out, there is much left to be known.

Still, exciting to have something new to investigate.

FoundationDB

Monday, March 4th, 2013

FoundationDB

FoundationDB Beta 1 is now available!

It will take a while to sort out all of its features, etc.

I should mention that it is refreshing that the documentation contains Known Limitations.

All software has limitations but few ever acknowledge them up front.

You have to encounter one before one of the technical folks says: “…yes, we have been meaning to work on that.”

I would rather know up front what the limitations are.

Whether FoundationDB meets your requirements or not, it is good to see that kind of transparency.