Archive for the ‘Distributed Consistency’ Category

Data Integrity and Problems of Scope

Wednesday, October 22nd, 2014

Data Integrity and Problems of Scope by Peter Baillis.

From the post:

Mutable state in distributed systems can cause all sorts of headaches, including data loss, corruption, and unavailability. Fortunately, there are a range of techniques—including exploiting commutativity and immutability—that can help reduce the incidence of these events without requiring much overhead. However, these techniques are only useful when applied correctly. When applied incorrectly, applications are still subject to data loss and corruption. In my experience, (the unfortunately common) incorrect application of these techniques is often due to problems of scope. What do I mean by scope? Let’s look at two examples:

Having the right ideas is not enough, you must implement them correctly as well.

Peter’s examples will sharpen your thinking about data integrity.


A Distributed Systems Reading List

Friday, May 16th, 2014

A Distributed Systems Reading List by

From the introduction:

I often argue that the toughest thing about distributed systems is changing the way you think. The below is a collection of material I’ve found useful for motivating these changes.

Categories include:

  • Thought Provokers
  • Amazon
  • Google
  • eBay
  • Consistency Models
  • Theory
  • Languages and Tools
  • Infrastructure
  • Storage
  • Paxos Consensus
  • Other Consensus Papers
  • Gossip Protocols (Epidemic Behaviors)
  • P2P

Unless you think the knowledge in your domain is small enough to fit into a single system, I suggest you start reading about distributed systems this weekend.


I first saw this in a tweet by FoundationDB.

Tombstones in Topic Map Future?

Wednesday, January 16th, 2013

Watching the What’s New in Cassandra 1.2 (Notes) webcast and encountered an unfamiliar term: “tombstones.”

If you are already familiar with the concept, skip to another post.

If you’re not, the concept is used in distributed systems that maintain “eventual” consistency by the nodes replicating their content. Which works if all nodes are available but what if you delete data and a node is unavailable? When it comes back, the other nodes are “missing” data that needs to be replicated.

From the description at the Cassandra wiki, DistributedDeletes, not an easy problem to solve.

So, Cassandra turns it into a solvable problem.

Deletes are implemented with a special value known as a tombstone. The tombstone is propogated to nodes that missed the initial delete.

Since you will eventually want to delete the tombstones as well, a grace period can be set, which is slightly longer than the period needed to replace a non-responding node.

Distributed topic maps will face the same issue.

Complicated by imperative programming models of merging that make changes in properties that alter merging difficult to manage.

Perhaps functional models of merging, as with other forms of distributed processing, will carry the day.

Reading List for Distributed Systems

Wednesday, November 2nd, 2011

Reading List for Distributed Systems

From the post:

I quite often get asked by friends, colleagues who are interested in learning about distributed systems saying “Please tell me what are the top papers and books we need to read to learn more about distributed systems”. I used to write one off emails giving a few pointers. Now that, I’ve asked enough I thought it is a worthwhile exercise to put these together in a single post.

Please feel free to comment if you think there are more posts that needs to be added.

Reading list that ranges from Paxos to MapReduce and places in between. Looks like a very good list.

On Distributed Consistency — Part 1 (MongoDB)

Sunday, February 20th, 2011

On Distributed Consistency — Part 1 (MongoDB)

The first of a six part series on consistency in distributed databases.

From the website:

See also:

  • Part 2 – Eventual Consistency
  • Part 3 – Network Partitions
  • Part 4 – Multi Data Center
  • Part 5 – Multi Writer Eventual Consistency
  • Part 6 – Consistency Chart

For distributed databases, consistency models are a topic of huge importance. We’d like to delve a bit deeper on this topic with a series of articles, discussing subjects such as what model is right for a particular use case. Please jump in and help us in the comments.

Consistency is an issue that will confront distributed topic maps so best to start learning the options now.