Markov Chains in Neo4j by Nicole White.
From the post:
My new favorite thing lately is Neo4j, a graph database. It’s simple yet powerful: a graph database contains nodes and relationships, each which have properties. I recently made this submission to Neo4j’s GraphGist Challenge, which I did pretty well in.
After discovering Neo4j and graph databases a little over a month and a half ago, I’ve become subject to this weird syndrome where I think to myself, “Could I put that into a graph database?” with literally everything I encounter. The answer is usually yes.
Markov Chains
I realized the other day that nodes can have relationships with themselves, and for some reason, this immediately reminded me of Markov chains. The term Markov chain sounds intimidating at first (it did to me when I first saw the term on a syllabus), but they’re actually pretty simple: Markov chains consist of states and probabilities. The number of possible states is finite, and the Markov chain is a stochastic process that transitions, with certain probabilities, from one state to another over what I like to call time-steps.
The most important property of a Markov chain is that it is memoryless; that is, the probability of entering the next state depends only on the current state. We don’t care about where the process has been, only about where it is now.
If you wander over to the Wikipedia page on Markov chains, you’ll see pretty quickly why they are an obvious candidate for a graph database. The main profile picture for the page shows a Markov chain in graph form, where the states are nodes and the probabilities of transitioning from one state to another are the relationships between those nodes. The reason my realization mentioned earlier was important is that there is often a non-zero probability, given a Markov chain is in state A, that it will ‘enter’ state A in the next time-step. This is represented by a node that has a relationship with itself.
Interesting use of Neo4j to create a transition model.
Curious what you think of Nicole’s use of queries to avoid matrix multiplication?
It works but how often do you want to know the probability of one element in one state of a system?
Or would you extend the one element probability query to query more elements in a particular state?
Hey Patrick,
I agree that the functionality is limited with my current approach in Cypher. I am going to have to implement my idea in Java to add more realistic use-cases.
Glad you thought it was neat.
Comment by Nicole W — October 8, 2013 @ 9:21 am