Phoenix: Incubating at Apache!

Sunday, January 12th, 2014

From the webpage:

Phoenix is a SQL skin over HBase delivered as a client-embedded JDBC driver targeting low latency queries over HBase data. Phoenix takes your SQL query, compiles it into a series of HBase scans, and orchestrates the running of those scans to produce regular JDBC result sets. The table metadata is stored in an HBase table and versioned, such that snapshot queries over prior versions will automatically use the correct schema. Direct use of the HBase API, along with coprocessors and custom filters, results in performance on the order of milliseconds for small queries, or seconds for tens of millions of rows.

Tired of reading already and just want to get started? Take a look at our FAQs, listen to the Phoenix talks from Hadoop Summit 2013 and HBaseCon 2013, and jump over to our quick start guide.

To see what's supported, go to our language reference. It includes all typical SQL query statement clauses, including SELECT, FROM, WHERE, GROUP BY, HAVING, ORDER BY, etc. It also supports a full set of DML commands as well as table creation and versioned incremental alterations through our DDL commands. We try to follow the SQL standards wherever possible.

Incubating at Apache is no guarantee of success, but it does mean sane licensing and a merit-based organization and process.

If you are interested in non-NSA-corrupted software, consider supporting the Apache Software Foundation.

Phoenix in 15 Minutes or Less

Sunday, April 7th, 2013

Phoenix in 15 Minutes or Less by Justin Kestelyn.

An amusing FAQ by “James Taylor of Salesforce, which recently open-sourced its Phoenix client-embedded JDBC driver for low-latency queries over HBase.”

From the post:

What is this new Phoenix thing I’ve been hearing about?
Phoenix is an open source SQL skin for HBase. You use the standard JDBC APIs instead of the regular HBase client APIs to create tables, insert data, and query your HBase data.
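
For a feel of what "standard JDBC APIs" means in practice, here is a minimal sketch of a create, write, and read round trip. It is not taken from the post: the table name demo_events, its columns, and the class and method names are hypothetical, and it assumes the Phoenix driver is on the classpath and that writes use Phoenix's UPSERT statement.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class PhoenixJdbcSketch {
    // Hypothetical table and columns, used only for illustration.
    static final String CREATE_SQL =
        "CREATE TABLE IF NOT EXISTS demo_events (id BIGINT NOT NULL PRIMARY KEY, note VARCHAR)";
    // Assumption: Phoenix writes rows with UPSERT rather than INSERT.
    static final String UPSERT_SQL = "UPSERT INTO demo_events (id, note) VALUES (?, ?)";
    static final String QUERY_SQL = "SELECT note FROM demo_events WHERE id = ?";

    /** Creates the table if needed, writes one row, and reads it back. */
    static String roundTrip(Connection conn, long id, String note) throws SQLException {
        try (Statement ddl = conn.createStatement()) {
            ddl.execute(CREATE_SQL);
        }
        try (PreparedStatement upsert = conn.prepareStatement(UPSERT_SQL)) {
            upsert.setLong(1, id);
            upsert.setString(2, note);
            upsert.executeUpdate();
        }
        // Commit explicitly when the connection is not in auto-commit mode.
        if (!conn.getAutoCommit()) {
            conn.commit();
        }
        try (PreparedStatement query = conn.prepareStatement(QUERY_SQL)) {
            query.setLong(1, id);
            try (ResultSet rs = query.executeQuery()) {
                return rs.next() ? rs.getString(1) : null;
            }
        }
    }
}
```

To try it, pass in a connection obtained from DriverManager.getConnection with a Phoenix URL of the form jdbc:phoenix:<zookeeper quorum>. Nothing in the method touches the HBase client API directly, which is the point of the FAQ answer.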

Doesn’t putting an extra layer between my application and HBase just slow things down?
Actually, no. Phoenix achieves performance as good as, or likely better than, what you would get if you hand-coded it yourself (not to mention with a heck of a lot less code) by:

  • compiling your SQL queries to native HBase scans
  • determining the optimal start and stop for your scan key
  • orchestrating the parallel execution of your scans
  • bringing the computation to the data by
    • pushing the predicates in your WHERE clause to a server-side filter
    • executing aggregate queries through server-side hooks (called co-processors)
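
The "start and stop for your scan key" item is easier to picture with a small example. The sketch below shows the general technique for a prefix match on the row key, not Phoenix's actual compiler code: the start key is the prefix itself, and the exclusive stop key is the prefix with its last byte incremented. The class and method names are hypothetical.

```java
import java.util.Arrays;

public class ScanBoundsSketch {
    /** Inclusive start key: the first possible row key that begins with the prefix. */
    static byte[] startKey(byte[] prefix) {
        return prefix.clone();
    }

    /**
     * Exclusive stop key: the smallest key that sorts after every key with the prefix,
     * comparing bytes as unsigned values. An empty array means "no upper bound".
     */
    static byte[] stopKey(byte[] prefix) {
        int end = prefix.length;
        // Trailing 0xFF bytes cannot be incremented, so drop them first.
        while (end > 0 && prefix[end - 1] == (byte) 0xFF) {
            end--;
        }
        if (end == 0) {
            return new byte[0];
        }
        byte[] stop = Arrays.copyOf(prefix, end);
        stop[end - 1]++; // bump the last remaining byte
        return stop;
    }
}
```

With the prefix "abc", the scan would run from "abc" up to but not including "abd", so rows outside that range are never read at all.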

In addition to these items, we’ve got some interesting enhancements in the works to further optimize performance:

  • secondary indexes to improve performance for queries on non row key columns
  • stats gathering to improve parallelization and guide choices between optimizations
  • skip scan filter to optimize IN, LIKE, and OR queries
  • optional salting of row keys to evenly distribute write load
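
Salting is the least obvious item on that list, so here is a rough sketch of the idea: prepend one byte, derived from a hash of the row key, so that sequential keys land in different regions. The hash used here (Arrays.hashCode) and the names are my own stand-ins for illustration and are not Phoenix's implementation.

```java
import java.util.Arrays;

public class SaltedKeySketch {
    /**
     * Returns the row key with a one-byte bucket number prepended.
     * The bucket is a hash of the key modulo the bucket count, so the same
     * key always maps to the same bucket.
     */
    static byte[] salt(byte[] rowKey, int buckets) {
        if (buckets < 1 || buckets > 256) {
            throw new IllegalArgumentException("buckets must be between 1 and 256");
        }
        // floorMod keeps the result non-negative even for negative hash codes.
        int bucket = Math.floorMod(Arrays.hashCode(rowKey), buckets);
        byte[] salted = new byte[rowKey.length + 1];
        salted[0] = (byte) bucket;
        System.arraycopy(rowKey, 0, salted, 1, rowKey.length);
        return salted;
    }
}
```

The trade-off is that a range scan over the original keys now has to fan out over every bucket, which is presumably why the salting is described as optional.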


Sounds authentic to me!