Apache Hadoop Patterns of Use: Refine, Enrich and Explore by Jim Walter.
From the post:
“OK, Hadoop is pretty cool, but exactly where does it fit and how are other people using it?” Here at Hortonworks, this has got to be the most common question we get from the community… well that and “what is the airspeed velocity of an unladen swallow?”
We think about this (where Hadoop fits) a lot and have gathered a fair amount of expertise on the topic. The core team at Hortonworks includes the original architects, developers and operators of Apache Hadoop and its use at Yahoo, and through this experience and working within the larger community they have been privileged to see Hadoop emerge as the technological underpinning for so many big data projects. That has allowed us to observe certain patterns that we’ve found greatly simplify the concepts associated with Hadoop, and our aim is to share some of those patterns here.
As an organization laser focused on developing, distributing and supporting Apache Hadoop for enterprise customers, we have been fortunate to have a unique vantage point.
With that, we’re delighted to share with you our new whitepaper ‘Apache Hadoop Patterns of Use’. The patterns discussed in the whitepaper are:
Refine: Collect data and apply a known algorithm to it in a trusted operational process.
Enrich: Collect data, analyze and present salient results for online apps.
Explore: Collect data and perform iterative investigation for value.
You can download it here, and we hope you enjoy it.
If you are looking for detailed patterns of use, you will be disappointed.
The whitepaper runs about nine (9) pages and stays in very high-level summary mode.
What remains to be written (to my knowledge) is a collection of use patterns with a realistic amount of detail from a cross-section of Hadoop users.
That would truly be a compelling resource for the community.