Cloudera’s Impala tool binds Hadoop with business intelligence apps by Christina Farr.
From the post:
In traditional circles, Hadoop is viewed as a bright but unruly problem child.
Indeed, it is still in the nascent stages of development. However the scores of “big data” startups that leverage Hadoop will tell you that it is here to stay.
Cloudera, the venture-backed startup that ushered the mainstream deployment of Hadoop, has unveiled a new technology at the Hadoop World, the data-focused conference in New York.
Its new product, known as “Impala”, addresses many of the concerns that large enterprises still have about Hadoop, namely that it does not integrate well with traditional business intelligence applications.
“We have heard this criticism,” said Charles Zedlewski, Cloudera’s VP of Product in a phone interview with VentureBeat. “That’s why we decided to do something about it,” he said.
Impala enables its users to store vast volumes of unwieldy data and run queries in HBase, Hadoop’s NoSQL database. What’s interesting is that it is built to maximise speed: it runs on top of Hadoop storage, but speaks to SQL and works with pre-existing drivers.
Legacy data is a well-known concept.
Are we approaching the point of legacy applications? Applications that are too widely/deeply embedded in IT infrastructure to be replaced?
Or at least not replaced quickly?
The semantics of legacy data are known to be fair game for topic maps. Do the semantics of legacy applications offer the same possibilities?
Mapping the semantics of “legacy” applications, their ancestors and descendants, data, legacy and otherwise, results in a semantic mosh pit.
Some strategies for a semantic “mosh pit”:
- Prohibit it (we know the success rate on that option)
- Ignore it (costly but more “successful” than prohibiting it)
- Create an app on top of the legacy app (an error repeated isn’t an error, it’s following precedent)
- Sample it (but what are you missing?)
- Map it (being mindful of cost/benefit)
Which one are you going to choose?