HCatalog, tables and metadata for Hadoop

HCatalog is described on its Apache site as:

Apache HCatalog is a table and storage management service for data created using Apache Hadoop.

This includes:

  • Providing a shared schema and data type mechanism.
  • Providing a table abstraction so that users need not be concerned with where or how their data is stored.
  • Providing interoperability across data processing tools such as Pig, Map Reduce, Streaming, and Hive.
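To make the first two points concrete, here is a toy sketch in Python of what a shared table catalog does. This is not the HCatalog API: the names `TableInfo`, `ToyCatalog`, `register`, `schema`, and `locate`, along with the sample table, path, and format, are all hypothetical and chosen only for illustration. The idea is that tools ask the catalog for a table by name and get back its schema and storage details, rather than hard-coding paths and formats themselves.

```python
from dataclasses import dataclass


@dataclass
class TableInfo:
    """Metadata kept for one table (hypothetical structure, not HCatalog's)."""
    columns: dict          # column name -> type name, shared by every tool
    location: str          # where the data lives
    storage_format: str    # how the data is stored


class ToyCatalog:
    """A minimal in-memory stand-in for a shared metadata service."""

    def __init__(self):
        self._tables = {}

    def register(self, name, info):
        # Record the table's metadata once, under a logical name.
        self._tables[name] = info

    def schema(self, name):
        # Every tool sees the same column names and types.
        return dict(self._tables[name].columns)

    def locate(self, name):
        # Only the catalog knows where and how the data is stored.
        table = self._tables[name]
        return table.location, table.storage_format


# Illustrative values only.
catalog = ToyCatalog()
catalog.register(
    "web_logs",
    TableInfo({"user": "string", "ts": "bigint"}, "/data/web_logs", "rcfile"),
)

print(catalog.schema("web_logs"))
print(catalog.locate("web_logs"))
```

If the data were later moved or stored in a different format, only the catalog entry would change; code that refers to `web_logs` by name would stay the same.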

From the post:

Last month the HCatalog project (formerly known as Howl) was accepted into the Apache Incubator. We have already branched for a 0.1 release, which we hope to push in the next few weeks. Given all this activity, I thought it would be a good time to write a post on the motivation behind HCatalog, what features it will provide, and who is working on it.
