Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

March 22, 2012

Secondary Indices Have Arrived! (Hypertable)

Filed under: Hypertable,Indexing — Patrick Durusau @ 7:41 pm


From the post:

Until now, SELECT queries in Hypertable had to include a row key, row prefix or row interval specification in order to be fast. Searching for rows by specifying a cell value or a column qualifier involved a full table scan which resulted in poor performance and scaled badly because queries took longer as the dataset grew. With 0.9.5.6, we’ve implemented secondary indices that will make such SELECT queries lightning fast!

Hypertable supports two kinds of indices: a cell value index and a column qualifier index. This blog post explains what they are, how they work and how to use them.

I am glad to hear about the new indexing features, but how do “cell value indexes” and “column qualifier indexes” differ from secondary indexes, which the PostgreSQL 9.1 documentation describes as follows:

All indexes in PostgreSQL are what are known technically as secondary indexes; that is, the index is physically separate from the table file that it describes. Each index is stored as its own physical relation and so is described by an entry in the pg_class catalog. The contents of an index are entirely under the control of its index access method. In practice, all index access methods divide indexes into standard-size pages so that they can use the regular storage manager and buffer manager to access the index contents.

It would be helpful in evaluating new features to know when (if?) they are substantially the same as features known in other contexts.
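To make the two index kinds in the quoted post concrete, here is a minimal Python sketch. It is a toy model under stated assumptions, not Hypertable code: the cell layout `(row_key, column_family, column_qualifier, value)`, the sample data, and every function name are hypothetical. It contrasts a full scan with a lookup through a value index and a qualifier index.

```python
from collections import defaultdict

# Toy cells: (row_key, column_family, column_qualifier, value).
# This layout is an assumption for illustration, not Hypertable's storage format.
CELLS = [
    ("row1", "tag", "color", "red"),
    ("row2", "tag", "color", "blue"),
    ("row3", "tag", "size", "large"),
    ("row4", "tag", "color", "red"),
]


def full_scan_by_value(cells, family, value):
    """No index: every cell is inspected, so cost grows with the table."""
    return sorted({row for row, fam, _, val in cells
                   if fam == family and val == value})


def build_value_index(cells, family):
    """Toy cell value index: maps a cell value to the rows that hold it."""
    index = defaultdict(set)
    for row, fam, _, val in cells:
        if fam == family:
            index[val].add(row)
    return index


def build_qualifier_index(cells, family):
    """Toy column qualifier index: maps a qualifier to the rows that use it."""
    index = defaultdict(set)
    for row, fam, qual, _ in cells:
        if fam == family:
            index[qual].add(row)
    return index


value_index = build_value_index(CELLS, "tag")
qualifier_index = build_qualifier_index(CELLS, "tag")

rows_red = sorted(value_index["red"])            # rows whose value is "red"
rows_with_size = sorted(qualifier_index["size"])  # rows that have a "size" qualifier
```

Both lookups return the same rows a scan would, but they touch only the index entries for the requested value or qualifier rather than every cell in the table.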

2 Comments

  1. Hi Patrick, Conceptually our secondary indices are very similar to what you would find in a relational database like Postgres. In essence, they speed up queries that match on a particular column. However, since Hypertable is not a relational database, it is a bit of an apples-to-oranges comparison. For instance, in a relational database there is no such thing as a column qualifier, so no comparison can be made. Also, the excerpt from the PostgreSQL 9.1 documentation “… all index access methods divide indexes into standard-size pages …” refers to implementation details of Postgres, and Hypertable has a completely different implementation, so no direct comparison can be made here either. However, here are a few points that hopefully will help in making the comparison:

    1. Hypertable’s cell value index is roughly equivalent to the secondary index you find in a relational database.

    2. Hypertable does not support compound indices. There can be only one column in each index.

    3. Hypertable currently does not support data types, so exact string match and string prefix matches on a specific column are the only kinds of queries that leverage the index.

    4. Hypertable implements each secondary index as a regular table in Hypertable with a specially formatted row key. These secondary index tables are just like any other table in Hypertable and use the same scaling mechanics. This has the benefit of allowing a secondary index to scale well, potentially across thousands of machines.

    – Doug

    Comment by nuggetwheat — March 23, 2012 @ 1:36 pm

  2. Doug,

    Thanks for the clarification!

    To some degree it is apples and oranges, but if you know what an apple is, it may be easier to explain oranges. And they do have things in common: seeds, trees, fruit, etc.

    The point is to extend the reader’s knowledge of one to include the other, and to make the differences understandable.

    BTW, impressive numbers on processing triples! (Not the solution for everyone but no solution is.)

    Hope you are having a great weekend!

    Patrick

    Comment by Patrick Durusau — March 25, 2012 @ 10:20 am
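Doug's points 3 and 4 above (an index stored as an ordinary sorted table with a specially formatted row key, serving exact and prefix string matches) can be sketched in Python. This is only an illustration under assumptions: the comment does not give the real key format, so the tab separator, the `value + separator + row key` layout, and all function names here are hypothetical.

```python
import bisect

SEP = "\t"  # hypothetical separator; the actual key format is not given in the comment


def index_key(value, row_key):
    """Compose an index row key from the indexed value and the primary row key."""
    return f"{value}{SEP}{row_key}"


def build_index_table(column):
    """column: dict of primary row key -> cell value. Returns sorted index keys.

    A sorted list stands in for a table that is ordered by row key.
    """
    return sorted(index_key(value, row) for row, value in column.items())


def scan_prefix(index_table, key_prefix):
    """Seek to the first key >= key_prefix, then read while keys still match."""
    rows = []
    i = bisect.bisect_left(index_table, key_prefix)
    while i < len(index_table) and index_table[i].startswith(key_prefix):
        rows.append(index_table[i].split(SEP, 1)[1])  # recover the primary row key
        i += 1
    return rows


def exact_match(index_table, value):
    """Exact match: the value must be followed immediately by the separator."""
    return scan_prefix(index_table, value + SEP)


def prefix_match(index_table, prefix):
    """Prefix match: any value beginning with `prefix` (assumes no SEP in values)."""
    return scan_prefix(index_table, prefix)


column = {"row1": "red", "row2": "blue", "row3": "redwood", "row4": "red"}
index_table = build_index_table(column)

exact_rows = exact_match(index_table, "red")
prefix_rows = prefix_match(index_table, "red")
```

Because the index is just another sorted table, both query kinds reduce to a seek plus a short sequential read, which is the same access pattern as a row-prefix query on a primary table. That is one way to read the claim that the index inherits the table's scaling mechanics.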
