Archive for the ‘Immutable’ Category

Scala DataTable

Monday, February 9th, 2015

Scala DataTable by Martin Cooper.

From the webpage:


Scala DataTable is a lightweight, in-memory table structure written in Scala. The implementation is entirely immutable. Modifying any part of the table, adding or removing columns, rows, or individual field values will create and return a new structure, leaving the old one completely untouched. This is quite efficient due to structural sharing.

Features:

  • Fully immutable implementation.
  • All changes use structural sharing for performance.
  • Table columns can be added, inserted, updated and removed.
  • Rows can be added, inserted, updated and removed.
  • Individual cell values can be updated.
  • Any inserts, updates or deletes keep the original structure and data completely unchanged.
  • Internal type checks and bounds checks to ensure data integrity.
  • RowData object allowing typed or untyped data access.
  • Full filtering and searching on row data.
  • Single and multi column quick sorting.
  • DataViews to store sets of filtered / sorted data.
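
To make "structural sharing" concrete, here is a minimal sketch in Python (not Scala, and not the library's actual code). The names Column, Table and update_cell are hypothetical and are not Scala DataTable's API. The sketch shares whole columns only, which is coarser than whatever the library does internally.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Column:
    name: str
    values: Tuple[object, ...]  # tuples cannot be modified in place


@dataclass(frozen=True)
class Table:
    columns: Tuple[Column, ...]

    def update_cell(self, column_name: str, row: int, value: object) -> "Table":
        """Return a new Table; the receiver is left untouched."""
        if column_name not in {c.name for c in self.columns}:
            raise KeyError(column_name)
        new_columns = []
        for c in self.columns:
            if c.name != column_name:
                new_columns.append(c)  # untouched column: reused, not copied
                continue
            if not 0 <= row < len(c.values):
                raise IndexError(row)  # bounds check
            # Only the touched column is rebuilt.
            rebuilt = c.values[:row] + (value,) + c.values[row + 1:]
            new_columns.append(Column(c.name, rebuilt))
        return Table(tuple(new_columns))


old = Table((Column("id", (1, 2, 3)), Column("name", ("a", "b", "c"))))
new = old.update_cell("name", 1, "B")
```

After the update, old still holds "b" in its name column, and new.columns[0] is the very same object as old.columns[0]. The unchanged column is shared between the two tables rather than copied.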

If you are curious about immutable data structures and want to start with something familiar, this is your day!

See the GitHub page for example code and other details.

Using Clojure To Generate Java To Reimplement Clojure

Thursday, November 13th, 2014

Using Clojure To Generate Java To Reimplement Clojure by Zach Tellman.

From the post:

Most data structures are designed to hold arbitrary amounts of data. When we talk about their complexity in time and space, we use big O notation, which is only concerned with performance characteristics as n grows arbitrarily large. Understanding how to cast an O(n) problem as O(log n) or even O(1) is certainly valuable, and necessary for much of the work we do at Factual. And yet, most instances of data structures used in non-numerical software are very small. Most lists are tuples of a few entries, and most maps are a few keys representing different facets of related data. These may be elements in a much larger collection, but this still means that the majority of operations we perform are on small instances.

But except in special cases, like 2 or 3-vectors that represent coordinates, it’s rarely practical to specify that a particular tuple or map will always have a certain number of entries. And so our data structures have to straddle both cases, behaving efficiently at all possible sizes. Clojure, however, uses immutable data structures, which means it can do an end run on this problem. Each operation returns a new collection, which means that if we add an element to a small collection, it can return something more suited to hold a large collection.
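
The last point of the quote can be sketched in a few lines. This is a Python illustration of the idea only, not Tellman's generated Java. Vec2 and GeneralVec are hypothetical names, and the cutoff of two elements is an assumption made to keep the example short.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class GeneralVec:
    """General-purpose immutable vector for any number of elements."""
    items: Tuple[object, ...]

    def conj(self, x: object) -> "GeneralVec":
        return GeneralVec(self.items + (x,))

    def nth(self, i: int) -> object:
        if not 0 <= i < len(self.items):
            raise IndexError(i)
        return self.items[i]


@dataclass(frozen=True)
class Vec2:
    """Vector specialized for exactly two elements: plain fields, no array."""
    a: object
    b: object

    def conj(self, x: object) -> GeneralVec:
        # Adding an element returns a *different* type, one better suited
        # to larger sizes. The receiver is immutable and stays a Vec2.
        return GeneralVec((self.a, self.b, x))

    def nth(self, i: int) -> object:
        if i == 0:
            return self.a
        if i == 1:
            return self.b
        raise IndexError(i)


small = Vec2(1, 2)
big = small.conj(3)
```

Because conj returns a new value instead of mutating the receiver, the caller never needs to know which representation it holds. A mutable collection would have to change its own layout in place to get the same effect.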

Tellman describes this problem and his solution in Predictably Fast Clojure. (The URL is to a time mark but I think the entire video is worth your time.)

If that weren’t cool enough, Tellman details the creation of 1000 lines of Clojure that generate 5500 lines of Java so his proposal can be rolled into Clojure.

What other data structures can be different when immutability is a feature?

Data Integrity and Problems of Scope

Wednesday, October 22nd, 2014

Data Integrity and Problems of Scope by Peter Bailis.

From the post:

Mutable state in distributed systems can cause all sorts of headaches, including data loss, corruption, and unavailability. Fortunately, there are a range of techniques—including exploiting commutativity and immutability—that can help reduce the incidence of these events without requiring much overhead. However, these techniques are only useful when applied correctly. When applied incorrectly, applications are still subject to data loss and corruption. In my experience, (the unfortunately common) incorrect application of these techniques is often due to problems of scope. What do I mean by scope? Let’s look at two examples:

Having the right ideas is not enough; you must implement them correctly as well.

Peter’s examples will sharpen your thinking about data integrity.


When Should Identifications Be Immutable?

Thursday, September 8th, 2011

After watching a presentation on Clojure and its immutable data structures, I began to wonder: when should identifications be immutable?

Note that I said when should identifications… which means I am not advocating a universal position for all identifiers but rather a choice that may vary from situation to situation.

We may change our minds about an identification, but the fact remains that at some point (dare I say state?) a particular identification was made.

For example, you make an intimate gesture at a party only to discover your spouse wasn’t the recipient of the gesture. But at the time you made the gesture, at least I am willing to believe, you thought it was your spouse. New facts are now apparent. But it is also a new identification. As your spouse will remind you, you did make a prior, incorrect identification.

As I recall, topics (and other information items) are immutable for purposes of merging. (TMDM, 6.2 and following.) That is, merging results in a new topic or other new information item. On the other hand, merging also results in updating information items other than the one subject to merging. So those information items are not being treated as immutable.

But since the references are being updated, I don’t think it would be inconsistent with the TMDM to create new information items to be the carriers of the new identifiers and thus treat the information items as immutable.

Whether to do so would be application/requirement specific, but for accounting, banking, securities and similar applications, it may be important for identifications to be immutable, so that we can “unroll” a topic map, as it were, to any arbitrary prior identification or state.
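
As a rough sketch of what "unrolling" could look like, assume identifications are kept in an append-only log. This is a hypothetical illustration in Python, not anything defined by the TMDM. Identification, IdentificationLog, record and as_of are names invented for the example.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Identification:
    subject: str     # what was being identified (hypothetical field)
    identifier: str  # the identification made at that point


@dataclass(frozen=True)
class IdentificationLog:
    entries: Tuple[Identification, ...] = ()

    def record(self, subject: str, identifier: str) -> "IdentificationLog":
        # A changed mind is a *new* entry; earlier entries are never edited.
        return IdentificationLog(
            self.entries + (Identification(subject, identifier),)
        )

    def as_of(self, version: int, subject: str) -> Optional[str]:
        """Identification of `subject` after the first `version` entries."""
        if not 0 <= version <= len(self.entries):
            raise ValueError(version)
        for entry in reversed(self.entries[:version]):
            if entry.subject == subject:
                return entry.identifier
        return None


log = (
    IdentificationLog()
    .record("person at party", "spouse")
    .record("person at party", "not spouse")
)
```

Here log.as_of(1, "person at party") still answers "spouse", the prior, incorrect identification, while log.as_of(2, "person at party") gives the corrected one. Nothing was overwritten to get from one to the other.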

The Need For Immutability

Thursday, June 23rd, 2011

The Need For Immutability by Andrew Binstock.

From the post:

It makes data items ideal for sharing between threads

Andrew recites a short history of immutability.

Immutability also supports stable mappings between subject representatives.