HCatalog, tables and metadata for Hadoop
HCatalog is described on its Apache site as:
Apache HCatalog is a table and storage management service for data created using Apache Hadoop.
This includes:
- Providing a shared schema and data type mechanism.
- Providing a table abstraction so that users need not be concerned with where or how their data is stored.
- Providing interoperability across data processing tools such as Pig, Map Reduce, Streaming, and Hive.
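To make the shared schema and table abstraction ideas concrete, here is a toy sketch in Python. It is not HCatalog's API: the `ToyCatalog` and `TableInfo` names, the `web_logs` table, and the `/data/web_logs` path are all hypothetical, chosen only to show how a tool could look up a table by name and get back its schema and storage details without hard-coding them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TableInfo:
    """Metadata a catalog might keep per table (hypothetical fields)."""
    columns: dict          # column name -> type name, the shared schema
    location: str          # where the data lives; hidden from table users
    storage_format: str    # how the data is stored; also hidden from users


class ToyCatalog:
    """In-memory stand-in for a metadata service; not the real HCatalog."""

    def __init__(self):
        self._tables = {}

    def register(self, name, info):
        # Record the table's metadata under its name.
        self._tables[name] = info

    def describe(self, name):
        # Any tool asks by table name only and receives the same metadata.
        return self._tables[name]


catalog = ToyCatalog()
catalog.register(
    "web_logs",  # hypothetical table name
    TableInfo(
        columns={"user_id": "string", "ts": "bigint"},
        location="/data/web_logs",  # hypothetical path
        storage_format="text",
    ),
)

info = catalog.describe("web_logs")
print(info.columns)
```

In this sketch, two different tools calling `describe("web_logs")` would see the same column names and types, which is the interoperability point in the list above; if the data moved, only the catalog entry would change.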
From the post:
Last month the HCatalog project (formerly known as Howl) was accepted into the Apache Incubator. We have already branched for a 0.1 release, which we hope to push in the next few weeks. Given all this activity, I thought it would be a good time to write a post on the motivation behind HCatalog, what features it will provide, and who is working on it.