Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

August 11, 2012

HBase FuzzyRowFilter: Alternative to Secondary Indexes

Filed under: HBase — Patrick Durusau @ 3:43 pm

HBase FuzzyRowFilter: Alternative to Secondary Indexes by Alex Baranau.

From the post:

In this post we’ll explain the usage of FuzzyRowFilter which can help in many situations where secondary indexes solutions seems to be the only choice to avoid full table scans.

Background

When it comes to HBase the way you design your row key affects everything. It is a common pattern to have composite row key which consists of several parts, e.g. userId_actionId_timestamp. This allows for fast fetching of rows (or single row) based on start/stop row keys which have to be a prefix of the row keys you want to select. E.g. one may select last time of userX logged in by specifying row key prefix “userX_login_”. Or last action of userX by fetching the first row with prefix “userX_”. These partial row key scans work very fast and does not require scanning the whole table: HBase storage is optimized to make them fast.

Problem

However, there are cases when you need to fetch data based on key parts which happen to be in the middle of the row key. In the example above you may want to find last logged in users. When you don’t know the first parts of the key partial row key scan turns into full table scan which might be very slow and resource intensive.
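To make the contrast concrete, here is a minimal sketch in plain Python. It is not HBase code and not Alex's implementation: the sorted list stands in for HBase's sorted row keys, the fixed-width key layout (4-character userId, 5-character actionId, 4-character timestamp) and the names `prefix_scan`, `fuzzy_match` and `fuzzy_scan` are all hypothetical. The mask convention (0 means the position must match, 1 means any value is accepted) follows my understanding of how FuzzyRowFilter describes its mask. The sketch only illustrates the matching rule; it does not model how the real filter avoids reading non-matching rows.

```python
from bisect import bisect_left

# Hypothetical fixed-width composite keys: userId_actionId_timestamp.
# A sorted list stands in for HBase's lexicographically sorted row keys.
ROWS = sorted([
    "u001_login_0005",
    "u001_login_0009",
    "u001_click_0007",
    "u002_login_0003",
    "u002_click_0008",
    "u003_login_0004",
])


def prefix_scan(sorted_keys, prefix):
    """Return keys starting with prefix, touching only the matching range."""
    start = bisect_left(sorted_keys, prefix)  # jump straight to the first candidate
    matches = []
    for key in sorted_keys[start:]:
        if not key.startswith(prefix):
            break  # keys are sorted, so nothing later can match
        matches.append(key)
    return matches


def fuzzy_match(key, pattern, mask):
    """mask[i] == 0: position i must equal pattern[i]; mask[i] == 1: anything goes."""
    if not (len(key) == len(pattern) == len(mask)):
        return False
    return all(m == 1 or k == p for k, p, m in zip(key, pattern, mask))


def fuzzy_scan(sorted_keys, pattern, mask):
    """Naive pass over every key; shows the matching rule only."""
    return [key for key in sorted_keys if fuzzy_match(key, pattern, mask)]


# Known leading part of the key: a cheap prefix scan.
user_logins = prefix_scan(ROWS, "u001_login_")

# Unknown userId and timestamp, known action in the middle of the key.
pattern = "????_login_????"
mask = [1] * 4 + [0] * 7 + [1] * 4
all_logins = fuzzy_scan(ROWS, pattern, mask)

print(user_logins)
print(all_logins)
```

The prefix scan stops as soon as it leaves the matching range, which is why knowing the leading part of the key is so valuable. The fuzzy pattern only works here because every key part has a fixed width, so the position of the action inside the key is known in advance.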

Although Alex notes the solution he presents is no “silver bullet,” it illustrates:

  • The impact of key design on later usage.
  • The importance of knowing all your options for query performance.

I would capture the availability of the “FuzzyRowFilter,” the key structure, and the cardinality of the data using a topic map. That saves the next HBase administrator time and effort.

True, they can always work out the details for themselves, but then they may not have your analytical skills.
