Streaming Histograms for Clojure and Java

Streaming Histograms for Clojure and Java

From the post:

We’re happy to announce that we’ve open-sourced our “fancy” streaming histograms. We’ve talked about them before, but now the project has been tidied up and is ready to share.

PDF & CDF for a 32-bin histogram approximating a multimodal distribution.

The histograms are a handy way to compress streams of numeric data. When you want to summarize a stream using limited memory there are two general options. You can either store a sample of data in hopes that it is representative of the whole (such as a reservoir sample) or you can construct some summary statistics, updating as data arrives. The histogram library provides a tool for the latter approach.

The project is a Clojure/Java library. Since we use a lot of Clojure at BigML, the readme’s examples are all Clojure oriented. However, Java developers can still find documentation for the histogram’s public methods.

A tool for visualizing/exploring large amounts of numeric data.

Leave a Reply

You must be logged in to post a comment.