Archive for the ‘Consistency’ Category

Consistency through semantics

Saturday, November 24th, 2012

Consistency through semantics by Oliver Kennedy.

From the post:

When designing a distributed systems, one of the first questions anyone asks is what kind of consistency model to use. This is a fairly nuanced question, as there isn’t really one right answer. Do you enforce strong consistency and accept the resulting latency and communication overhead? Do you use locking, and accept the resulting throughput limitations? Or do you just give up and use eventual consistency and accept that sometimes you’ll end up with results that are just a little bit out of sync.

It’s this last bit that I’d like to chat about today, because it’s actually quite common in a large number of applications. This model is present in everything from user-facing applications like Dropbox to SVN/GIT, to back-end infrastructure systems like Amazon’s Dynamo and Yahoo’s PNUTs. Often, especially in non-critical applications latency and throughput are more important than dealing with the possibility that two simultaneous updates will conflict.

So what happens when this dreadful possibility does come to pass? Clearly the system can’t grind to a halt, and often just randomly discarding one of these updates is the wrong thing to do. So what happens? The answer is common across most of these systems: They punt to the user.

Intuitively, this is the right thing to do. The user sees the big picture. The user knows best how to combine these operations. The user knows what to do, so on those rare occurrences where the system can’t handle it, the user can.

But why is this the right thing to do? What does the user have that the infrastructure doesn’t?

Take the time to read the rest of Oliver’s post.

He distinguishes rather nicely between applications and users.

Eventually-Consistent Data Structures

Tuesday, November 13th, 2012

Eventually-Consistent Data Structures by Sean Cribbs

Summary:

Sean Cribbs discusses Convergent Replicated Data Types, data structures that tolerate eventual consistency.

Covers a number of eventually consistent data types.

Materials you may want to cover before you watch the presentation:

Safety/Liveness – from Proving the Correctness of Multiprocess Programs – Leslie Lamport (March 1977) (As a bonus, a link to all Leslie Lamport’s papers.)

Safety and liveness: Eventual consistency is not safe by Peter Ballis.

Logic and Lattices for Distributed Programming by Neil Conway, William Marczak, Peter Alvaro, Joseph M. Hellerstein, and David Maier.

A comprehensive study of Convergent and Commutative Replicated Data Types by Marc Shapiro, Nuno Preguiça, Carlos Baquero, and Marek Zawirski.

Strong Eventual Consistency and Conflict-free Replicated Data Types by Marc Shapiro (video).

I first saw this in a tweet by Sean T. Allen.

Convergent and Commutative Replicated Data Types [Warning: Heavy Sledding Ahead]

Thursday, October 11th, 2012

A comprehensive study of Convergent and Commutative Replicated Data Types (PDF file) Marc Shapiro, Nuno M. Preguiça, Carlos Baquero, Marek Zawirski.

Abstract:

Eventual consistency aims to ensure that replicas of some mutable shared object converge without foreground synchronisation. Previous approaches to eventual consistency are ad-hoc and error-prone. We study a principled approach: to base the design of shared data types on some simple formal conditions that are sufficient to guarantee eventual consistency. We call these types Convergent or Commutative Replicated Data Types (CRDTs). This paper formalises asynchronous object replication, either state based or operation based, and provides a sufficient condition appropriate for each case. It describes several useful CRDTs, including container data types supporting both add and remove operations with clean semantics, and more complex types such as graphs, montonic DAGs, and sequences. It discusses some properties needed to implement non-trivial CRDTs.

I found this following a link in the readme for riak dt which said:

WHAT?

Currently under initial development, riak_dt is a platform for convergent data types. It’s built on riak core and deployed with riak. All of our current work is around supporting fast, replicated, eventually consistent counters (though more data types are in the repo, and on the way.) This work is based on the paper – A Comprehensive study of Convergent and Commutative Replicated Data Types – which you may find an interesting read.

WHY?

Riak’s current model for handling concurrent writes is to store sibling values and present them to the client for resolution on read. The client must encode the logic to merge these into a single, meaningful value, and then inform Riak by doing a further write. Convergent data types remove this burden from the client, as their structure guarantees they will deterministically converge to a single value. The simplest of these data types is a counter.

I haven’t thought of merging of subject representatives as a quest for “consistency” but that is one way to think about it.

The paper is forty-seven pages long and has forty-four references, most of which I suspect are necessary to fully appreciate the work.

Having said that, I suspect it will be well worth the effort.

The Cost of Strict Global Consistency [Or Rules for Eventual Consistency]

Sunday, September 23rd, 2012

What if all transactions required strict global consistency? by Matthew Aslett.

Matthew quotes Basho CTO Justin Sheehy on eventual consistency and traditional accounting:

“Traditional accounting is done in an eventually-consistent way and if you send me a payment from your bank to mine then that transaction will be resolved in an eventually consistent way. That is, your bank account and mine will not have a jointly-atomic change in value, but instead yours will have a debit and mine will have a credit, each of which will be applied to our respective accounts.”

And Matthew comments:

The suggestion that bank transactions are not immediately consistent appears counter-intuitive. Comparing what happens in a transaction with a jointly atomic change in value, like buying a house, with what happens in normal transactions, like buying your groceries, we can see that for normal transactions this statement is true.

We don’t need to wait for the funds to be transferred from our accounts to a retailer before we can walk out the store. If we did we’d all waste a lot of time waiting around.

This highlights a couple of things that are true for both database transactions and financial transactions:

  • that eventual consistency doesn’t mean a lack of consistency
  • that different transactions have different consistency requirements
  • that if all transactions required strict global consistency we’d spend a lot of time waiting for those transactions to complete.

All of which is very true but misses an important point about financial transctions.

Financial transactions (involving banks, etc.) are eventually consistent according to the same rules.

That’s no accident. It didn’t just happen that banks adopted ad hoc rules that resulted in a uniform eventual consistency.

It didn’t happen over night but the current set of rules for “uniform eventual consistency” of banking transactions are spelled out by the Uniform Commercial Code. (And other laws, regulations but that is a major part of it.)

Dare we say a uniform semantic for financial transactions was hammered out without the use of formal ontologies or web addresses? And that it supports billions of transactions on a daily basis? To become eventually consistent?

Think about the transparency (to you) of your next credit card transaction. Standards and eventual consistency make that possible.

On Distributed Consistency — Part 1 (MongoDB)

Sunday, February 20th, 2011

On Distributed Consistency — Part 1 (MongoDB)

The first of a six part series on consistency in distributed databases.

From the website:

See also:

  • Part 2 – Eventual Consistency
  • Part 3 – Network Partitions
  • Part 4 – Multi Data Center
  • Part 5 – Multi Writer Eventual Consistency
  • Part 6 – Consistency Chart

For distributed databases, consistency models are a topic of huge importance. We’d like to delve a bit deeper on this topic with a series of articles, discussing subjects such as what model is right for a particular use case. Please jump in and help us in the comments.

Consistency is an issue that will confront distributed topic maps so best to start learning the options now.