Having a ChuQL at XML on the Cloud by Shahan Khatchadourian, Mariano P. Consens, and Jérôme Siméon.
MapReduce/Hadoop has gained acceptance as a framework to process, transform, integrate, and analyze massive amounts of Web data on the Cloud. The MapReduce model (simple, fault tolerant, data parallelism on elastic clouds of commodity servers) is also attractive for processing enterprise and scientific data. Despite XML ubiquity, there is yet little support for XML processing on top of MapReduce.
In this paper, we describe ChuQL, a MapReduce extension to XQuery, with its corresponding Hadoop implementation. The ChuQL language incorporates records to support the key/value data model of MapReduce, leverages higher-order functions to provide clean semantics, and exploits side-effects to fully expose to XQuery developers the Hadoop framework. The ChuQL implementation distributes computation to multiple XQuery engines, providing developers with an expressive language to describe tasks over big data.
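For readers new to the model, here is a rough sketch of the key/value, higher-order-function idea the abstract describes. It is plain Python, not ChuQL syntax and not the Hadoop API: `map_reduce`, `sales_by_region`, `total`, and the sample `<sale>` documents are all names and data made up for illustration, and everything runs in memory on one machine rather than being distributed.

```python
from collections import defaultdict
import xml.etree.ElementTree as ET


def map_reduce(records, map_fn, reduce_fn):
    """Single-process stand-in for a MapReduce job (illustrative only).

    records   -- iterable of (key, value) pairs
    map_fn    -- function (key, value) -> iterable of (out_key, out_value)
    reduce_fn -- function (out_key, list_of_values) -> result
    """
    groups = defaultdict(list)
    # Map phase: each input record may emit any number of key/value pairs.
    for key, value in records:
        for out_key, out_value in map_fn(key, value):
            groups[out_key].append(out_value)
    # The grouping above plays the role of shuffle; reduce runs once per key.
    return {k: reduce_fn(k, vs) for k, vs in sorted(groups.items())}


def sales_by_region(_filename, xml_text):
    """Hypothetical map function: emit (region, amount) for each <sale>."""
    doc = ET.fromstring(xml_text)
    for sale in doc.iter("sale"):
        yield sale.get("region"), int(sale.get("amount"))


def total(_region, amounts):
    """Hypothetical reduce function: aggregate the amounts for one region."""
    return sum(amounts)


# Made-up input: (file name, XML document) records.
records = [
    ("a.xml", '<sales><sale region="east" amount="10"/>'
              '<sale region="west" amount="5"/></sales>'),
    ("b.xml", '<sales><sale region="east" amount="7"/></sales>'),
]

result = map_reduce(records, sales_by_region, total)
print(result)  # {'east': 17, 'west': 5}
```

The point is only that the job is parameterized by two functions over key/value records, so an aggregation like this one amounts to choosing a map function and a reduce function.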
The aggregation and co-grouping were the most interesting examples for me.
The description of ChuQL was a bit thin. Pointers to more resources would be appreciated.