Shard-Query

Shard-Query: Open Source MPP database engine

From the webpage:

What is Shard-Query

Shard-Query is a high performance MySQL query engine which offers increased parallelism compared to stand-alone MySQL. This increased parallelism is achieved by taking advantage of MySQL partitioning, sharding, common query features, or some combination thereof (see more below).

The primary goal of Shard-Query is to enable low-latency query access to extremely large volumes of data utilizing commodity hardware and open source database software. Shard-Query is a federated query engine which is designed to perform as much work in parallel as possible.

What kind of queries are supported?

  • You can run just about all SQL queries over your dataset:
  • For SELECT queries:
    • All aggregate functions are supported.
      • SUM, COUNT, MIN, MAX, and AVG are the fastest aggregate operations
      • SUM/COUNT(DISTINCT ..) are supported, but are slower
      • STD, VAR, and similar functions are supported, but aggregation is not pushed down at all (slowest)
      • Custom aggregate functions are now also supported.
        • PERCENTILE(expr, N) – take a percentile, for example percentile(score,90)
  • JOINs are supported (no self joins, or joins of tables sharded by different keys)
  • ORDER BY, GROUP BY, HAVING, WITH ROLLUP, and LIMIT are supported
  • Also supports INSERT, UPDATE, DELETE
  • Also supports DDL such as CREATE TABLE, ALTER TABLE and DROP TABLE
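
Why would SUM, COUNT, MIN, MAX and AVG be the fastest? My reading is that each shard can compute a small partial result locally and the coordinator only has to merge those partials. The sketch below illustrates that general idea in Python. It is not Shard-Query's code: the names Partial, shard_partial and combine are hypothetical, the shards are plain in-memory lists, and every shard is assumed to hold at least one row.

```python
from dataclasses import dataclass


@dataclass
class Partial:
    """Per-shard partial aggregate (hypothetical structure, for illustration)."""
    total: float
    count: int
    low: float
    high: float


def shard_partial(rows):
    """The work a single shard could do locally, i.e. the 'pushed down' part.

    Assumes rows is a non-empty list of numbers.
    """
    return Partial(sum(rows), len(rows), min(rows), max(rows))


def combine(partials):
    """Merge per-shard partials into the final aggregates.

    AVG is rebuilt from the global SUM and COUNT; averaging the per-shard
    averages would be wrong when shards hold different numbers of rows.
    """
    total = sum(p.total for p in partials)
    count = sum(p.count for p in partials)
    return {
        "sum": total,
        "count": count,
        "min": min(p.low for p in partials),
        "max": max(p.high for p in partials),
        "avg": total / count,
    }


# Three made-up shards of a single numeric column.
shards = [[10, 20, 30], [40, 50], [60]]
result = combine([shard_partial(rows) for rows in shards])
print(result)
```

Only four numbers per shard travel back to the coordinator here, however many rows each shard holds. An aggregate like COUNT(DISTINCT ..) cannot be merged from per-shard counts alone, since the same value may appear on several shards, which would fit with the webpage listing it as slower.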

The numbers on a 24-core server are impressive. Worth a closer look.

I first saw this in Justin Swanhart’s webinar announcement: Building a highly scaleable distributed… [Webinar, MySQL/Shard-Query]
