Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

December 18, 2012

A Python Compiler for Big Data

Filed under: Arrays,Python — Patrick Durusau @ 6:41 am

A Python Compiler for Big Data by Stephen Diehl.

From the post:

Blaze is the next generation of NumPy, Python’s extremely popular array library. At Continuum Analytics we aim to tackle some of the hardest problems in large data analytics with our Python stack of Numba and Blaze, which together will form the basis of distributed computation and storage system which is simultaneously able to generate optimized machine code specialized to the data being operated on.

Blaze aims to extend the structural properties of NumPy arrays to to a wider variety of table and array-like structures that support commonly requested features such missing values, type heterogeneity, and labeled arrays.

(images omitted)

Unlike NumPy, Blaze is designed to handle out-of-core computations on large datasets that exceed the system memory capacity, as well as on distributed and streaming data. Blaze is able to operate on datasets transparently as if they behaved like in-memory NumPy arrays.

We aim to allow analysts and scientists to productively write robust and efficient code, without getting bogged down in the details of how to distribute computation, or worse, how to transport and convert data between databases, formats, proprietary data warehouses, and other silos.

Just a thumbnail sketch but enough to get you interested in learning more.

No Comments

No comments yet.

RSS feed for comments on this post.

Sorry, the comment form is closed at this time.

Powered by WordPress