A Comprehensive study of Convergent and Commutative Replicated Data Types reviewed by Adrian Colyer.
From the post:
…
This paper introduces the concept of a CRDT, a “simple, theoretically sound approach to eventual consistency.” Let’s adddress one of the pressing distributed systems questions of our time right here: “what does CRDT stand for?” We’ve seen over the last couple of weeks that there are two fundamental approaches to replication: you can execute operations at a primary and replicate the resulting state, or you can replicate the operations themselves. If you’re replicating state, then given some convergence rules for state, you can create Convergent Replicated Data Types. If you’re replicating operations, then given operations carefully designed to commute , you can create Commutative Replicated Data Types. Conveniently both ‘convergent’ and ‘commutative’ begin with C, so we can call both of these CRDTs. In both cases, the higher order goal is to avoid the need for coordination by ensuring that actions taken independently can’t conflict with each other (and thus can be composed at a later point in time). Thus we might also call them Conflict-free Replicated Data Types.
Think of it a bit like this: early on languages gave us standard data type implementations for set, list, map, and so on. Then we saw the introduction of concurrent versions of collections and related data types. With CRDTs, we are seeing the birth of distributed collections and related data types. Eventually any self-respecting language/framework will come with a distributed collections library – Riak already supports CRDTs and Jonas has an Akka CRDT library in github at least. As you read through the paper, it’s tempting to think “oh, these are pretty straightforward to implement,” but pay attention to the section on garbage collection – a bit like we saw with Edelweiss, making production implementations with state that doesn’t grow unbounded makes things more difficult.
…
If you haven’t read A comprehensive study of Convergent and Commutative Replicated Data Types by Marc Shapiro, Nuno Pregui¸ca, Carlos Baquero, and Marek Zawirski, this is a very useful and approachable introduction.
Enjoy!