MRQL – a SQL on Hadoop Miracle

Tuesday, April 23rd, 2013

MRQL – a SQL on Hadoop Miracle by Edward J. Yoon.

From the post:

Recently, the Apache Incubator accepted a new query engine for Hadoop and Hama, called MRQL (pronounced miracle), which was initially developed in 2011 by Leonidas Fegaras.

MRQL (MapReduce Query Language) is a query processing and optimization system for large-scale, distributed data analysis, built on top of Apache Hadoop and Hama. MRQL has some overlapping functionality with Hive, Impala and Drill, but one major difference is that it can capture many complex data analysis algorithms that can not be done easily in those systems in declarative form. So, complex data analysis tasks, such as PageRank, k-means clustering, and matrix multiplication and factorization, can be expressed as short SQL-like queries, while the MRQL system is able to evaluate these queries efficiently.

Another difference from these systems is that the MRQL system can run these queries in BSP (Bulk Synchronous Parallel) mode, in addition to the MapReduce mode. With BSP mode, it achieves lower latency and higher speed. According to MRQL team, “In near future, MRQL will also be able to process very large data effectively fast without memory limitation and significant performance degradation in the BSP mode”.
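For readers who have not met BSP before, here is a minimal Python sketch of my own. It is not MRQL syntax and is not taken from the post. It runs PageRank, one of the algorithms named above, as a series of supersteps: every vertex computes and sends messages, then a barrier lets all vertices update at once. The function name `pagerank_bsp`, the dict-of-lists graph layout, and the default damping factor are assumptions made for illustration, and everything runs in a single process rather than across a cluster.

```python
def pagerank_bsp(graph, damping=0.85, supersteps=20):
    """Single-process sketch of PageRank organised as BSP-style supersteps.

    graph: dict mapping each node to a list of its out-neighbours.
           Every node must appear as a key (hypothetical layout).
    """
    n = len(graph)
    rank = {v: 1.0 / n for v in graph}

    for _ in range(supersteps):
        # Compute + communicate phase: each vertex splits its current rank
        # evenly among its out-neighbours and "sends" it to their inboxes.
        inbox = {v: 0.0 for v in graph}
        dangling = 0.0
        for v, outs in graph.items():
            if outs:
                share = rank[v] / len(outs)
                for w in outs:
                    inbox[w] += share
            else:
                # A vertex with no out-links spreads its rank over all vertices.
                dangling += rank[v]

        # Barrier: only after every message is delivered do the vertices
        # update, so all of them read values from the same superstep.
        rank = {
            v: (1 - damping) / n + damping * (inbox[v] + dangling / n)
            for v in graph
        }
    return rank


if __name__ == "__main__":
    toy = {"a": ["b"], "b": ["c"], "c": ["a"]}
    print(pagerank_bsp(toy))
```

In a real BSP engine such as Hama, the inner loop would be spread over many workers and the barrier would be a cluster-wide synchronisation point; the sketch only shows the shape of the computation.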

Maybe I should turn my back on the newsfeed more often. 😉

I suspect the announcement and my distraction were unrelated.

This looks very important.

I can feel another Apache list subscription in the offing.