Streaming REST API – Interview with Michael Hunger [Neo4j]
Andreas Kollegger writes:
Recently, Michael Hunger blogged about his lab work on streaming in Neo4j’s REST interface. On lab days, everyone on the Neo4j team gets to bump the priority of any engineering work that has been lingering in a background thread. I chatted with Michael about his work with streaming.
ABK: What inspired you to focus on streaming for Neo4j?
MH: Because performance is a major concern for Neo4j, especially with so many languages and stacks connecting via the REST API. The existing approach is several orders of magnitude slower than embedded [note: Neo4j is embeddable on the JVM], not just one order of magnitude as was originally envisioned.
ABK: What do you mean by “streaming” in this context? Is this HTTP streaming?
MH: Yes, it is HTTP streaming combined with JSON streaming, with the internal calls to Neo4j generating lazy results (Iterables) instead of pulling all results from the db in one go. So writing to the stream advances the database operations (or their “cursors”). This applies to indexing, Cypher, and traversals.
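To make the idea concrete, here is a minimal Python sketch of the general technique Michael describes: a lazy result that is only advanced as the response is written. It is not Neo4j's implementation. The names `lazy_rows`, `stream_json_array`, and `buffered_json_array` are hypothetical, and the "database" is just a generator that logs each fetch.

```python
import json
from typing import Iterable, Iterator


def lazy_rows(n: int, log: list) -> Iterator[dict]:
    """Hypothetical stand-in for a lazy database result (a "cursor")."""
    for i in range(n):
        # Record the moment the "database" is actually advanced.
        log.append(f"fetch {i}")
        yield {"id": i, "name": f"node-{i}"}


def stream_json_array(rows: Iterable[dict]) -> Iterator[str]:
    """Yield a JSON array chunk by chunk, pulling one row at a time."""
    yield "["
    first = True
    for row in rows:
        if not first:
            yield ","
        yield json.dumps(row)
        first = False
    yield "]"


def buffered_json_array(rows: Iterable[dict]) -> str:
    """Non-streaming variant: materialise every row before serialising."""
    return json.dumps(list(rows))
```

Each chunk yielded by `stream_json_array` could be written straight to a chunked HTTP response, so only one row needs to be held in memory at a time, and no row is fetched until the writer asks for the next chunk. `buffered_json_array` produces the same JSON but has to hold the whole result set first, which is the kind of memory pressure the non-streaming approach suffers from.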
The difference in approaches:
Streaming took 10 seconds to return the complete result, transferring 130 MB of data at between 8 and 15 MB/s. The normal non-streaming request took 1 minute, 8 seconds to provide the same result, with a heap of 2 GB.
The interview and Michael’s post should be on your reading list for this week!