Streaming Histograms for Clojure and Java
From the post:
We’re happy to announce that we’ve open-sourced our “fancy” streaming histograms. We’ve talked about them before, but now the project has been tidied up and is ready to share.
PDF & CDF for a 32-bin histogram approximating a multimodal distribution.
The histograms are a handy way to compress streams of numeric data. When you want to summarize a stream using limited memory there are two general options. You can either store a sample of data in hopes that it is representative of the whole (such as a reservoir sample) or you can construct some summary statistics, updating as data arrives. The histogram library provides a tool for the latter approach.
The project is a Clojure/Java library. Since we use a lot of Clojure at BigML, the readme’s examples are all Clojure oriented. However, Java developers can still find documentation for the histogram’s public methods.
A tool for visualizing/exploring large amounts of numeric data.