Archive for the ‘Datomic’ Category

Excision [Forgetting But Remembering You Forgot (Datomic)]

Tuesday, June 4th, 2013

Excision

From the post:

It is a key value proposition of Datomic that you can tell not only what you know, but how you came to know it. When you add a fact:

conn.transact(list(":db/add", 42, ":firstName", "John"));

Datomic does more than merely record that 42’s first name is “John”. Each datom is also associated with a transaction entity, which records the moment (:db/txInstant) the datom was recorded.

(…)

Given this information model, it is easy to see that Datomic can support queries that tell you:

  • what you know now
  • what you knew at some point in the past
  • how and when you came to know any particular datom
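The three kinds of query in that list can be sketched in plain Java. This is only a toy model under stated assumptions, not the Datomic API: the class `AsOfSketch`, its method names, and the use of an increasing transaction id in place of :db/txInstant are all hypothetical, and each attribute is treated as holding a single value.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory model of a datom log; this is not the Datomic API.
final class AsOfSketch {

    // One fact plus the transaction that recorded it.
    static final class Datom {
        final long entity;
        final String attribute;
        final Object value;
        final long tx;       // transaction id; stands in for :db/txInstant
        final boolean added; // true = assertion, false = retraction

        Datom(long entity, String attribute, Object value, long tx, boolean added) {
            this.entity = entity;
            this.attribute = attribute;
            this.value = value;
            this.tx = tx;
            this.added = added;
        }
    }

    private final List<Datom> log = new ArrayList<>(); // append-only, tx ascending
    private long nextTx = 1000;

    // Appends an assertion and returns the id of its transaction.
    long assertFact(long entity, String attribute, Object value) {
        long tx = nextTx++;
        log.add(new Datom(entity, attribute, value, tx, true));
        return tx;
    }

    // Appends a retraction and returns the id of its transaction.
    long retractFact(long entity, String attribute, Object value) {
        long tx = nextTx++;
        log.add(new Datom(entity, attribute, value, tx, false));
        return tx;
    }

    // What was known about (entity, attribute) as of transaction asOfTx; null if nothing.
    Object valueAsOf(long entity, String attribute, long asOfTx) {
        Object current = null;
        for (Datom d : log) {
            if (d.tx > asOfTx) break; // later transactions are invisible to this view
            if (d.entity != entity || !d.attribute.equals(attribute)) continue;
            if (d.added) current = d.value;
            else if (d.value.equals(current)) current = null;
        }
        return current;
    }

    // What you know now: the view as of the latest transaction.
    Object valueNow(long entity, String attribute) {
        return valueAsOf(entity, attribute, Long.MAX_VALUE);
    }

    // How you came to know it: the first transaction that asserted this value, or -1.
    long txThatAsserted(long entity, String attribute, Object value) {
        for (Datom d : log) {
            if (d.added && d.entity == entity
                    && d.attribute.equals(attribute) && d.value.equals(value)) {
                return d.tx;
            }
        }
        return -1;
    }
}
```

Nothing is ever overwritten in this model, which is why the past views stay answerable: every question is a scan over the same append-only log with a different cutoff.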

So far so good, but there is a fly in the ointment. In certain situations you may be forced to excise data, pulling it out root and branch and forgetting that you ever knew it. This may happen if you store data that must comply with privacy or IP laws, or you may have a regulatory requirement to keep records for seven years and then “shred” them. For these scenarios, Datomic provides excision.
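As a rough illustration of "forgetting but remembering you forgot", the following plain-Java sketch deletes every datom about an entity from the whole history and then records that the deletion happened. It is an assumption-laden toy, not Datomic's excision mechanism: `ExcisionSketch`, its methods, and the marker attribute `:sketch/excised` are hypothetical names invented for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy model; Datomic's real excision works differently in detail.
final class ExcisionSketch {

    static final class Datom {
        final long entity;
        final String attribute;
        final Object value;
        final long tx;

        Datom(long entity, String attribute, Object value, long tx) {
            this.entity = entity;
            this.attribute = attribute;
            this.value = value;
            this.tx = tx;
        }
    }

    private final List<Datom> log = new ArrayList<>();
    private long nextTx = 1000;

    // Appends a fact and returns the id of its transaction.
    long assertFact(long entity, String attribute, Object value) {
        long tx = nextTx++;
        log.add(new Datom(entity, attribute, value, tx));
        return tx;
    }

    // Removes every datom about the entity, past and present, then records
    // a marker on a new transaction entity. Returns how many datoms were removed.
    int excise(long entity) {
        int before = log.size();
        log.removeIf(d -> d.entity == entity);
        int removed = before - log.size();
        long tx = nextTx++;
        log.add(new Datom(tx, ":sketch/excised", entity, tx)); // hypothetical marker
        return removed;
    }

    // True if any datom about the entity survives.
    boolean knowsAnythingAbout(long entity) {
        return log.stream().anyMatch(d -> d.entity == entity);
    }

    // True if the log still records that the entity was excised.
    boolean remembersExcising(long entity) {
        return log.stream().anyMatch(
                d -> d.attribute.equals(":sketch/excised") && d.value.equals(entity));
    }
}
```

Unlike a retraction, which only adds a new datom saying a fact no longer holds, the excision here rewrites the log itself, so no view of the past can recover what was removed.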

One approach to the unanswered question: what does it mean to delete something from a topic map?

Especially interesting because you can play with the answer that Datomic provides.

Doesn’t address the issue of what it means to delete a topic that has caused other topics to merge.

I first saw this in Christophe Lalanne’s A bag of tweets / May 2013.

Should Business Data Have An Audit Trail?

Thursday, March 21st, 2013

The “second slide” I would lead with from Stuart Halloway’s Datomic, and How We Built It would be:

Should Business Data Have An Audit Trail?

Actually Stuart’s slide #65 but who’s counting? ;-)

Stuart points out the irony of git, saying:

developer data is important enough to have an audit trail, but business data is not

Whether business data should always have an audit trail would attract shouts of yes and no, depending on the audience.

Regulators, prosecutors, good government types, etc., mostly shouting yes.

Regulated businesses, security brokers, elected officials, etc., mostly shouting no.

Some in between.

Datomic, which has some common characteristics with topic maps, gives you the ability to answer these questions:

  • Do you want auditable business data or not?
  • If yes to auditable business data, to what degree?

Rather different than just assuming it isn’t possible.

Abstract:

Datomic is a database of flexible, time-based facts, supporting queries and joins, with elastic scalability and ACID transactions. Datomic queries run in your application process, giving you both declarative and navigational access to your data. Datomic facts (“datoms”) are time-aware and distributed to all system peers, enabling OLTP, analytics, and detailed auditing in real time from a single system.

In this talk, I will begin with an overview of Datomic, covering the problems that it is intended to solve and how its data model, transaction model, query model, and deployment model work together to solve those problems. I will then use Datomic to illustrate more general points about designing and implementing production software, and where I believe our industry is headed. Key points include:

  • the pragmatic adoption of functional programming
  • how dynamic languages fare in mission- and performance- critical settings
  • the importance of data, and the perils of OO
  • the irony of git, or why developers give themselves better databases than they give their customers
  • perception, coordination, and reducing the barriers to scale

Resources

  • Video from CME Group Technology Conference 2012
  • Slides from CME Group Technology Conference 2012

Davy Suvee on FluxGraph – Towards a time aware graph built on Datomic

Saturday, February 2nd, 2013

Davy Suvee on FluxGraph – Towards a time aware graph built on Datomic by René Pickhardt.

From the post:

Davy really nicely introduced the problem of looking at a snapshot of a data base. This problem obviously exists for any data base technology. You have a lot of timestamped records but running a query as if you fired it a couple of months ago is always a difficult challenge.

With FluxGraph a solution to this is introduced.

As I understood him in the talk, he introduces a new version of a vertex or an edge every time it gets updated, added or removed. So far I am wondering about scaling and runtime. This approach seems like a lot of overhead to me. Later during Q & A I began to have the feeling that he has a more efficient way of storing this information so I really have to get in touch with Davy to rediscuss the internals.
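The "new version on every update" reading described above can be sketched as follows. This is a guess at the naive approach the quote worries about, not FluxGraph's API or storage layout: `VersionedVertex` and its methods are hypothetical, timestamps are plain longs, and writes are assumed to arrive in time order.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch; not FluxGraph's API or internal representation.
final class VersionedVertex {

    // timestamp -> full copy of the properties as written at that time
    private final TreeMap<Long, Map<String, Object>> versions = new TreeMap<>();

    // Writes a new version that copies the previous one and changes one property.
    void setProperty(long time, String key, Object value) {
        Map<String, Object> next = new HashMap<>(propertiesAt(time));
        next.put(key, value);
        versions.put(time, next);
    }

    // The properties as they were at the given time: the latest version at or before it.
    Map<String, Object> propertiesAt(long time) {
        Map.Entry<Long, Map<String, Object>> entry = versions.floorEntry(time);
        if (entry == null) {
            return Collections.<String, Object>emptyMap();
        }
        return Collections.unmodifiableMap(entry.getValue());
    }

    // How many full copies are being stored.
    int versionCount() {
        return versions.size();
    }
}
```

Every update stores a full copy of the vertex, which is exactly the overhead the quote questions. A more efficient layout would store only the changed property per version.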

FluxGraph anyway provides a very clean API to access this temporal information.

FluxGraph at GitHub.

Time is an obvious issue in any business or medical context.

But also important when the news hounds ask: “Who knew what when?”

And there you may have personal relationships, meetings, communications, etc.

Clojure/Datomic creator Rich Hickey on Deconstructing the Database

Saturday, August 25th, 2012

Clojure/Datomic creator Rich Hickey on Deconstructing the Database

From the description:

Rich Hickey, author of Clojure, and designer of Datomic presents a new way to look at database architectures in this talk from JaxConf 2012. What happens when you deconstruct the traditional monolithic database – separating transaction processing, storage and query into independent cooperating services? Coupled with a data model based around atomic facts and awareness of time, you get a significantly different set of capabilities and tradeoffs. This talk will discuss how these ideas play out in the design and architecture of Datomic, a new database for the JVM.

I truly appreciate the description of database updates as “a miracle occurs.”

There is much to enjoy and consider here.

Linked Lists in Datomic [Herein of tolog and Neo4j]

Tuesday, August 21st, 2012

Linked Lists in Datomic by Joachim Hofer.

From the post:

As my last contact with Prolog was over ten years ago, I think it’s time for some fun with Datomic and Datalog. In order to learn to know Datomic better, I will attempt to implement linked lists as a Datomic data structure.

First, I need a database “schema”, which in Datomic means that I have to define a few attributes. I’ll define one :content/name (as a string) for naming my list items, and also the attributes for the list data structure itself, namely :linkedList/head and :linkedList/tail (both are refs):
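This is not the schema from Hofer's post and does not use the Datomic API. It is a plain-Java sketch of how a list built from those three attributes might be walked, assuming a cons-cell reading that is my own interpretation: :linkedList/head refers to the item held by a cell, and :linkedList/tail refers to the next cell. The class `LinkedListFacts` and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: facts held in a map, refs stored as Long entity ids.
final class LinkedListFacts {

    // entity id -> (attribute -> value)
    private final Map<Long, Map<String, Object>> facts = new HashMap<>();

    // Records one attribute value for an entity.
    void add(long entity, String attribute, Object value) {
        facts.computeIfAbsent(entity, k -> new HashMap<>()).put(attribute, value);
    }

    // Looks up one attribute value, or null if it was never recorded.
    Object get(long entity, String attribute) {
        Map<String, Object> attrs = facts.get(entity);
        return attrs == null ? null : attrs.get(attribute);
    }

    // Follows :linkedList/tail refs from the first cell, collecting item names.
    // Assumes the list is not cyclic.
    List<String> names(long firstCell) {
        List<String> out = new ArrayList<>();
        Long cell = firstCell;
        while (cell != null) {
            Long item = (Long) get(cell, ":linkedList/head");
            if (item != null) {
                out.add((String) get(item, ":content/name"));
            }
            cell = (Long) get(cell, ":linkedList/tail");
        }
        return out;
    }
}
```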

You may or may not know that tolog, a topic map query language, was inspired in part by Datalog. Understanding Datalog could lead to new insights into tolog.

The other reason to mention this post is that Neo4j uses linked lists as part of its internal data structure.

If I am reading slide 9 (Neo4J Internals (update)) correctly, relationships are hard coded to have start/end nodes (singletons).

Not going to squeeze hyperedges out of that data structure.

What if you replaced the start/end node values with a key/value pair as the criterion for membership in the hyperedge?

Even if most relationships have only start/end nodes meeting a membership criterion, it would free you up to have hyperedges when needed.
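To make the idea concrete, here is a minimal Java sketch in which an edge is defined by a key/value membership test instead of two fixed node slots. It is speculation in code form, not how Neo4j stores relationships. `HyperedgeSketch`, `Node` and `Hyperedge` are hypothetical names, and start/end direction is deliberately left out.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; this is not how Neo4j stores relationships.
final class HyperedgeSketch {

    static final class Node {
        final String id;
        final Map<String, String> properties = new HashMap<>();

        Node(String id) {
            this.id = id;
        }

        // Sets a property and returns this node, for chaining.
        Node with(String key, String value) {
            properties.put(key, value);
            return this;
        }
    }

    // An edge defined by a membership test rather than by two node slots.
    static final class Hyperedge {
        final String key;
        final String value;

        Hyperedge(String key, String value) {
            this.key = key;
            this.value = value;
        }

        // A node belongs to the edge if it carries the key/value pair.
        boolean admits(Node node) {
            return value.equals(node.properties.get(key));
        }

        // Ids of all member nodes, in the order the nodes were given.
        List<String> members(List<Node> nodes) {
            List<String> ids = new ArrayList<>();
            for (Node n : nodes) {
                if (admits(n)) {
                    ids.add(n.id);
                }
            }
            return ids;
        }
    }
}
```

An ordinary relationship is then just the case where exactly two nodes satisfy the test. The cost is that membership has to be computed or indexed rather than read from two fixed pointers.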

Will have to look at the implementation details on hyperedges/nodes to see. Suspect others have found better solutions.

Datomic Free Edition

Tuesday, July 24th, 2012

Datomic Free Edition

From the post:

We’re happy to announce today the release of Datomic Free Edition. This edition is oriented around making Datomic easier to get, and use, for open source and smaller production deployments.

  • Datomic Free Edition is … free!
  • The system supports transactor-local storage
  • The peer library includes a memory database and Datomic Datalog
  • The Free transactor and peers are freely redistributable
  • The transactor supports 2 simultaneous peers

Of particular note here is that Datomic Free Edition comes with a redistributable license, and does not require a personal/business-specific license from us. That means you can download Datomic Free, build e.g. an open source application with it, and ship/include Datomic Free binaries with your software. You can also put the Datomic Free bits into public repositories and package managers (as long as you retain the licenses and copyright notices).

There is a ton of capability included in the Free Edition, including the Datomic in-process memory database (great for testing), and the Datomic datalog engine, which works on both Datomic databases and in-memory collections. That’s right, free datalog for everyone.

You can use Datomic Free Edition in production, and you can use it in commercial applications.

Get Datomic!

I first saw this at Alex Popescu’s myNoSQL.

Thinking in Datomic: Your data is not square

Tuesday, July 10th, 2012

Thinking in Datomic: Your data is not square by Pelle Braendgaard.

From the post:

Datomic is so different than regular databases that your average developer will probably choose to ignore it. But for the developer and startup who takes the time to understand it properly I think it can be a real unfair advantage as a choice for a data layer in your application.

In this article I will deal with the core fundamental definition of how data is stored in Datomic. This is very different from all other databases so before we even deal with querying and transactions I think it’s a good idea to look at it.

Yawn, “your data is not square.” ;-) Just teasing.

But we have all heard the criticism of relational tables. I think writers can assume that much, at least in technical forums.

The lasting value of the NoSQL movement (in addition to whichever software packages survive) will be its emphasis on analysis of your data. Your data may fit perfectly well into a square but you need to decide that after looking at your data, not before.

The same can be said about the various NoSQL offerings. Your data may or may not be suited for a particular NoSQL option. The data analysis “cat being out of the bag,” it should be applied to NoSQL options as well. True, almost any option will work, but your question should be: why is option X the best option for my data/use case?

Distributed Temporal Graph Database Using Datomic

Saturday, April 21st, 2012

Distributed Temporal Graph Database Using Datomic

Post by Alex Popescu calling out construction of a “distributed temporal graph database.”

Temporal used in the sense of timestamping entries in the database.

Beyond such uses, beware, there be dragons.

Temporal modeling isn’t for the faint of heart.

Datomic

Wednesday, March 7th, 2012

Alex Popescu (myNoSQL) has a couple of posts on resources for Datomic.

Intro Videos to Datomic and Datomic Datalog

and,

Datomic: Distributed Database Designed to Enable Scalable, Flexible and Intelligent Applications, Running on Next-Generation Cloud Architectures

I commend the materials you will find there but the white paper in particular, which has the following section:

ATOMIC DATA – THE DATOM

Once you are storing facts, it becomes imperative to choose an appropriate granularity for facts. If you want to record the fact that Sally likes pizza, how best to do so? Most databases require you to update either the Sally record or document, or the set of foods liked by Sally, or the set of likers of pizza. These kinds of representational issues complicate and rigidify applications using relational and document models. This can be avoided by recording facts as independent atoms of information. Datomic calls such atomic facts ‘datoms’. A datom consists of an entity, attribute, value and transaction (time). In this way, any of those sets can be discovered via query, without embedding them into a structural storage model that must be known by applications.

In some views of granularity, the datom “atom” looks like a four-atom molecule to me. ;-) Not to mention that entities/attributes and values can have relationships that don’t involve each other.
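The white paper's Sally example can be sketched in plain Java to show why neither "the foods liked by Sally" nor "the likers of pizza" needs to be stored as a set. This is an illustration only and not the Datomic API: `DatomSketch`, its `Datom` class, the helper methods, and the attribute name `:likes` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the four-part datom; not the Datomic API.
final class DatomSketch {

    static final class Datom {
        final String entity;
        final String attribute;
        final String value;
        final long tx; // the transaction (time) part of the datom

        Datom(String entity, String attribute, String value, long tx) {
            this.entity = entity;
            this.attribute = attribute;
            this.value = value;
            this.tx = tx;
        }
    }

    // All values of an attribute for one entity, e.g. the foods Sally likes.
    static List<String> valuesOf(List<Datom> db, String entity, String attribute) {
        List<String> out = new ArrayList<>();
        for (Datom d : db) {
            if (d.entity.equals(entity) && d.attribute.equals(attribute)) {
                out.add(d.value);
            }
        }
        return out;
    }

    // All entities holding a given attribute value, e.g. the likers of pizza.
    static List<String> entitiesWith(List<Datom> db, String attribute, String value) {
        List<String> out = new ArrayList<>();
        for (Datom d : db) {
            if (d.attribute.equals(attribute) && d.value.equals(value)) {
                out.add(d.entity);
            }
        }
        return out;
    }
}
```

Both sets come out of the same flat list of facts. Which one you get depends only on which positions of the datom the query fixes.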