Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

February 18, 2013

Real World Hadoop – Implementing a Left Outer Join in Map Reduce

Filed under: Hadoop,MapReduce — Patrick Durusau @ 6:25 am

Real World Hadoop – Implementing a Left Outer Join in Map Reduce by Matthew Rathbone.

From the post:

This article is part of my guide to map reduce frameworks, in which I implement a solution to a real-world problem in each of the most popular hadoop frameworks.

If you’re impatient, you can find the code for the map-reduce implementation on my github, otherwise, read on!

The Problem
Let me quickly restate the problem from my original article.

I have two datasets:

  1. User information (id, email, language, location)
  2. Transaction information (transaction-id, product-id, user-id, purchase-amount, item-description)

Given these datasets, I want to find the number of unique locations in which each product has been sold.

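Not as easy a problem as it appears: location lives in the user dataset and product in the transaction dataset, so the two must be joined on user-id before anything can be counted. I suspect it is a common problem in practice.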
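To make the shape of the problem concrete, here is a minimal in-memory sketch in Python. It is not Matthew Rathbone's Hadoop code. The sample records, the function name unique_locations_per_product, and the decision to leave transactions with no matching user out of the location count are all my own assumptions for illustration. It mimics the two stages a map-reduce job would need: group both datasets by user-id (the left outer join), then collect distinct locations per product.

```python
from collections import defaultdict

# Hypothetical sample records, using the field order given in the post.
users = [
    (1, "a@example.com", "EN", "US"),
    (2, "b@example.com", "EN", "GB"),
    (3, "c@example.com", "FR", "FR"),
]
transactions = [
    (1, "product-1", 1, 300, "a jumper"),
    (2, "product-1", 2, 300, "a jumper"),
    (3, "product-1", 2, 300, "a jumper"),
    (4, "product-2", 3, 100, "a rubber chicken"),
    (5, "product-1", 9, 300, "a jumper"),  # user 9 has no user record
]


def unique_locations_per_product(users, transactions):
    # Stage 1: group both datasets by user id, much as a shuffle phase would.
    by_user = defaultdict(lambda: {"location": None, "products": []})
    for user_id, _email, _language, location in users:
        by_user[user_id]["location"] = location
    for _tx_id, product_id, user_id, _amount, _description in transactions:
        by_user[user_id]["products"].append(product_id)

    # Left outer join: every transaction survives the join; a transaction
    # whose user is unknown is paired with a location of None.
    joined = [
        (product_id, group["location"])
        for group in by_user.values()
        for product_id in group["products"]
    ]

    # Stage 2: collect the distinct known locations for each product.
    # Assumption: unknown (None) locations are not counted as a location.
    locations = defaultdict(set)
    for product_id, location in joined:
        found = locations[product_id]  # registers the product even if empty
        if location is not None:
            found.add(location)
    return {product_id: len(found) for product_id, found in locations.items()}


if __name__ == "__main__":
    print(unique_locations_per_product(users, transactions))
```

The join is a left outer join rather than an inner join so that a product sold only to users missing from the user dataset still appears in the result, with a count of zero, instead of silently disappearing.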
