Novel Storage Technique Speeds Big Data Processing by Tiffany Trader.
From the post:
Between the data deluge and the proliferation of uber-connected devices, the amount of data that must be stored and processed has exploded to a mind-boggling degree. One commonly cited statistic from Google Chairman Eric Schmidt holds that every two days humankind creates as much information as it did from the dawn of civilization up until 2003.
“Big data” technologies have evolved to get a handle on this information overload, but in order to be useful, the data must be stored in such a way that it is easily retrieved when needed. Until now, high-capacity, low-latency storage architectures have only been available on very high-end systems, but recently a group of MIT scientists has proposed an alternative approach, a novel high-performance storage architecture they call BlueDBM (Blue Database Machine) that aims to accelerate the processing of very large datasets.
The researchers from MIT’s Department of Electrical Engineering and Computer Science have written about their work in a paper titled Scalable Multi-Access Flash Store for Big Data Analytics.
….
See the paper for a low-level view and Tiffany’s post for a high-level one.
BTW, the result of this research, BlueDBM, will be demonstrated at the International Symposium on Field-Programmable Gate Arrays in Monterey, California.
A good time to start thinking about how data structures have been influenced by storage speed.
Is normalization a useful optimization with < 1 billion records? Maybe today, but what about six months from now?
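As a concrete illustration of the trade-off (a minimal sketch, not taken from the paper; the schema, table names, and data are invented): a normalized design stores each fact once and pays for a join on every read, while a denormalized design duplicates data to allow single-pass reads. How painful each side is depends on how fast the underlying storage can serve those extra lookups, which is exactly why faster storage shifts the answer.

```python
# Hypothetical sketch: the same orders data stored normalized vs. denormalized.
# The normalized read needs a join (extra lookups on slow storage); the
# denormalized read is a single scan but stores customer fields redundantly.
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Normalized: customer attributes stored once, referenced by id.
cur.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, region TEXT)")
cur.execute("CREATE TABLE orders_norm (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)")

# Denormalized: customer attributes copied into every order row.
cur.execute("CREATE TABLE orders_denorm (id INTEGER PRIMARY KEY, customer_name TEXT, "
            "customer_region TEXT, amount REAL)")

cur.execute("INSERT INTO customers VALUES (1, 'Acme', 'EU'), (2, 'Globex', 'US')")
cur.executemany("INSERT INTO orders_norm VALUES (?, ?, ?)",
                [(1, 1, 10.0), (2, 1, 25.0), (3, 2, 7.5)])
cur.executemany("INSERT INTO orders_denorm VALUES (?, ?, ?, ?)",
                [(1, 'Acme', 'EU', 10.0), (2, 'Acme', 'EU', 25.0), (3, 'Globex', 'US', 7.5)])

# Normalized read: join per query, extra random reads against storage.
print(cur.execute("""
    SELECT c.region, SUM(o.amount)
    FROM orders_norm o JOIN customers c ON o.customer_id = c.id
    GROUP BY c.region
""").fetchall())

# Denormalized read: one sequential scan, more bytes on disk.
print(cur.execute("""
    SELECT customer_region, SUM(amount)
    FROM orders_denorm
    GROUP BY customer_region
""").fetchall())

conn.close()
```

With slow disks the duplicated bytes of the denormalized table are often the cheaper price to pay; with storage like the flash architecture described above, the join-heavy normalized design becomes less of a penalty. That is the sense in which storage speed quietly shapes data structure choices.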
I first saw this in a tweet by Stefano Bertolo.