Shard-Query: Open Source MPP database engine
From the webpage:
What is Shard-Query
Shard-Query is a high performance MySQL query engine which offers increased parallelism compared to stand-alone MySQL. This increased parallelism is achieved by taking advantage of MySQL partitioning, sharding, common query features, or some combination thereof (see more below).
The primary goal of Shard-Query is to enable low-latency query access to extremely large volumes of data utilizing commodity hardware and open source database software. Shard-Query is a federated query engine which is designed to perform as much work in parallel as possible.
…
What kind of queries are supported?
- You can run just about all SQL queries over your dataset:
  - For SELECT queries:
    - All aggregate functions are supported.
      - SUM, COUNT, MIN, MAX and AVG are the fastest aggregate operations
      - SUM/COUNT(DISTINCT ..) are supported, but are slower
      - STD/VAR/etc are supported but aggregation is not pushed down at all (slowest)
    - Custom aggregate functions are now also supported.
      - PERCENTILE(expr, N) – take a percentile, for example percentile(score,90)
    - JOINs are supported (no self joins, or joins of tables sharded by different keys)
    - ORDER BY, GROUP BY, HAVING, WITH ROLLUP, and LIMIT are supported
  - Also supports INSERT, UPDATE, DELETE
  - Also supports DDL such as CREATE TABLE, ALTER TABLE and DROP TABLE
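To make the "fastest aggregates" point concrete, here is a minimal Python sketch of aggregate pushdown in general. It is not Shard-Query's code: the in-memory SHARDS lists and the function names partial_aggregate, combine and sharded_aggregate are hypothetical stand-ins for real MySQL shards and a coordinator. Each shard returns a small partial result in parallel, and the coordinator merges them; AVG is rebuilt from SUM and COUNT rather than averaged across shards.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical data: each inner list stands in for a "score" column on one shard.
SHARDS = [
    [10, 20, 30],
    [40, 50],
    [60, 70, 80, 90],
]


def partial_aggregate(rows):
    """Work done on one shard: return small partial results, not the rows."""
    return {
        "sum": sum(rows),
        "count": len(rows),
        "min": min(rows),
        "max": max(rows),
    }


def combine(partials):
    """Work done by the coordinator: merge the per-shard partial results."""
    total = sum(p["sum"] for p in partials)
    count = sum(p["count"] for p in partials)
    return {
        "sum": total,
        "count": count,
        "min": min(p["min"] for p in partials),
        "max": max(p["max"] for p in partials),
        # An average of per-shard averages would be wrong when shards differ
        # in size, so AVG is derived from the combined SUM and COUNT.
        "avg": total / count,
    }


def sharded_aggregate(shards):
    """Run the per-shard step in parallel, then merge on the coordinator."""
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        partials = list(pool.map(partial_aggregate, shards))
    return combine(partials)


result = sharded_aggregate(SHARDS)
print(result)
```

In this sketch only four numbers per shard travel to the coordinator. An aggregate such as COUNT(DISTINCT ..) would instead need the distinct values from every shard, which is one plausible reason the list above marks it as slower.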
The numbers reported on a 24-core server are impressive. Worth a closer look.
I first saw this at Justin Swanhart’s webinar announcement: Building a highly scaleable distributed… [Webinar, MySQL/Shard-Query]