…Spatial Analytics with Hive and Hadoop

How To Perform Spatial Analytics with Hive and Hadoop by Carter Shanklin.

From the post:

One of the big opportunities that Hadoop provides is the processing power to unlock value in big datasets of varying types from the ‘old’ such as web clickstream and server logs, to the new such as sensor data and geolocation data.

The explosion of smart phones in the consumer space (and smart devices of all kinds more generally) has continued to accelerate the next generation of apps such as Foursquare and Uber which depend on the processing of and insight from huge volumes of incoming data.

In the slides below we look at a sample, anonymized data set from Uber that is available on Infochimps. We step through basics of analyzing the data in Hive and learn how to use spatial analysis to decide whether a new product offering is viable or not.

Great tutorial and slides!

My only reservation is the use of geo-location data to make a judgement about the potential for a new ride service.

Geo-location data is only one way to determine the potential for a ride service. Surveying potential riders would be another.

Or to put it another way, having data to crunch doesn’t mean that crunching it will lead to the best answer.
