An Overview of Scalding by Dean Wampler.
From the description:
Dean Wampler, Ph.D., is Principal Consultant at Think Big Analytics. In this video he covers Scalding. Its benefits over the Java API include a dramatic reduction in the source code required, reflecting several Scala improvements over Java; full access to “functional programming” constructs that are ideal for data problems; and a Matrix library addition to support machine learning and other algorithms. He also demonstrates the benefits of Scalding using examples and explains just enough Scala syntax so you can follow along. Dean’s philosophy is that there is no better way to write general-purpose Hadoop MapReduce programs when specialized tools like Hive and Pig aren’t quite what you need. This presentation was given on February 12th at the Nokia offices in Chicago, IL.
During this period of rapid innovation around “big data,” what interests me is the development of tools to fit problems, as opposed to fitting problems to fixed data models and tools.
Both require a great deal of skill, but they are different skill sets.
I first saw this at Alex Popescu’s myNoSQL.