Archive for the ‘Spring Hadoop’ Category

Spring for Hadoop …

Sunday, March 3rd, 2013

Spring for Hadoop simplifies application development

From the post:

After almost exactly a year of development, SpringSource has released Spring for Hadoop 1.0 with the goal of making the development of Hadoop applications easier for users of the distributed application framework. VMware engineer Costin Leau said in the release announcement that the company has often seen developers use the out-of-the-box tools that come with Hadoop in ways that lead to a “poorly structured collection of command line utilities, scripts and pieces of code stitched together.” Spring for Hadoop aims to change this by applying the Template API design pattern from Spring to Hadoop.

This application gives helper classes such as HBaseTemplate, HiveTemplate and PigTemplate which interface with the different parts of the Hadoop ecosystem. Java-centric APIs such as Cascading can also be used with or without additional configuration. The software enables Spring functionality such as thread-safe access to lower level resources and lightweight object mapping in Hadoop applications. Leau also says that Spring for Hadoop is designed to allow projects to grow organically. To do this, users can mix and match various runner classes for scripts and, as the complexity of the application increases, developers can migrate to Spring Batch and manage these processes through a REST-based API.

Spring for Hadoop 1.0 is available from the SpringSource web site under the Apache 2.0 License. The developers say they are testing the software daily against various Hadoop 1.x distributions such as Apache Hadoop and Greenplum HD, as well as Cloudera CDH3 and CDH4. Greenplum HD already includes Spring for Hadoop in its distribution. Support for Hadoop 2.x is expected “in the near future”.

I’m going to leave characterization of present methods of working with Hadoop for others. 😉
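
For readers who have not met the Template pattern the announcement mentions, here is a minimal sketch in plain Java. It is not the Spring for Hadoop API: FakeTable and FakeTableTemplate are hypothetical names invented for illustration, and no Hadoop or Spring library is involved. The point is only the shape of the idea: the template owns the open/close boilerplate and the caller supplies just the work to be done.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

public class TemplateSketch {

    // Hypothetical low-level resource; stands in for something like a table handle.
    static final class FakeTable implements AutoCloseable {
        private final List<String> events;

        FakeTable(List<String> events) {
            this.events = events;
            events.add("open");
        }

        String get(String rowKey) {
            events.add("get " + rowKey);
            return "value-of-" + rowKey;
        }

        @Override
        public void close() {
            events.add("close");
        }
    }

    // Hypothetical template: acquires and releases the resource around the callback.
    static final class FakeTableTemplate {
        private final List<String> events;

        FakeTableTemplate(List<String> events) {
            this.events = events;
        }

        <T> T execute(Function<FakeTable, T> action) {
            // try-with-resources closes the table even if the callback throws.
            try (FakeTable table = new FakeTable(events)) {
                return action.apply(table);
            }
        }
    }

    public static void main(String[] args) {
        List<String> events = new ArrayList<>();
        FakeTableTemplate template = new FakeTableTemplate(events);

        // The caller never opens or closes anything directly.
        String value = template.execute(table -> table.get("row1"));

        System.out.println(value);   // prints "value-of-row1"
        System.out.println(events);  // prints "[open, get row1, close]"
    }
}
```

The recorded events show the resource being opened and closed around the callback, which is the kind of housekeeping the announcement says the real helper classes take off the developer's hands.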

Introducing Spring Hadoop

Monday, March 12th, 2012

Introducing Spring Hadoop by Costin Leau.

From the post:

I am happy to announce that the first milestone release (1.0.0.M1) for the Spring Hadoop project is available, and to talk about some of the work we have been doing over the last few months. Part of the Spring Data umbrella, Spring Hadoop provides support for developing applications based on Hadoop technologies by leveraging the capabilities of the Spring ecosystem. Whether one is writing stand-alone, vanilla MapReduce applications, interacting with data from multiple data stores across the enterprise, or coordinating a complex workflow of HDFS, Pig, or Hive jobs, or anything in between, Spring Hadoop stays true to the Spring philosophy, offering a simplified programming model and addressing the "accidental complexity" caused by the infrastructure. Spring Hadoop provides a powerful tool in the developer arsenal for dealing with big data volumes.

I rather like that, “accidental complexity.” 😉

Still, if you are learning Hadoop, Spring Hadoop may ease the learning curve, not to mention make application development easier. Your mileage may vary, but it is worth a long look.