Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

October 24, 2014

Data Science Challenge 3

Filed under: Challenges,Contest,Data Science — Patrick Durusau @ 6:39 pm


From the post:

Challenge Period

The Fall 2014 Data Science Challenge runs October 11, 2014 through January 21, 2015.

Challenge Prerequisite

You must pass Data Science Essentials (DS-200) prior to registering for the Challenge.

Challenge Description

The Fall 2014 Data Science Challenge incorporates three independent problems derived from real-world scenarios and data sets. Each problem has its own data, can be solved independently, and should take you no longer than eight hours to complete. The Fall 2014 Challenge includes problems dealing with online travel services, digital advertising, and social networks.

Problem 1: SmartFly
You have been contacted by a new online travel service called SmartFly. SmartFly provides its customers with timely travel information and notifications about flights, hotels, destination weather, and airport traffic, with the goal of making your travel experience smoother. SmartFly’s product team has come up with the idea of using the flight data that it has been collecting to predict whether customers’ flights will be delayed in order to respond proactively. The team has now contacted you to help test out the viability of the idea. You will be given SmartFly’s data set from January 1 to September 30, 2014 and be asked to return a list of upcoming flights sorted from the most likely to the least likely to be delayed.

Problem 2: Almost Famous
Congratulations! You have just published your first book on data science, advanced analytics, and predictive modeling. You’ve also decided to use your skills as a data scientist to build and optimize a website that promotes your book, and you have started several ad campaigns on a popular search engine in order to drive traffic to your site. Using your skills in data munging and statistical analysis, you will be asked to evaluate the performance of a series of campaigns directed towards site visitors using the log data in Hadoop as your source of truth.

Problem 3: WINKLR
WINKLR is a curiously popular social network for fans of the 1970s sitcom Happy Days. Users can post photos, write messages, and, most importantly, follow each other’s posts. This helps members keep up with new content from their favorite users. To help its users discover new people to follow on the site, WINKLR is building a new machine learning system called The Fonz to predict who a given user might like to follow. Phase One of The Fonz project is underway. The engineers can export the entire user graph as tuples. You have joined the Fonz project to implement Phase Two, which improves on this result. Given the user graph and the list of frequent-click tuples, you are being asked to select a 70,000 tuple subset in “user1,user2” format, where you believe user1 is most likely to want to follow user2. These will result in emails to the users, inviting them to follow the recommended user.
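An aside that is not part of the challenge text: to get a feel for what Problem 3 is asking, here is a rough Python sketch of one very simple baseline, scoring each candidate pair by how many accounts user1 already follows that in turn follow user2. Everything in it is an assumption made for illustration: the function name `recommend_follows`, the reading of a tuple `(a, b)` as "a follows b", and the sample users. It ignores the frequent-click tuples entirely and is not the method the challenge expects.

```python
from collections import defaultdict


def recommend_follows(edges, top_n):
    """Return up to top_n "user1,user2" strings, best candidates first.

    edges: iterable of (follower, followed) tuples (assumed direction).
    A pair (u, c) scores one point for every account that u follows
    and that itself follows c.
    """
    follows = defaultdict(set)
    for follower, followed in edges:
        follows[follower].add(followed)

    scores = defaultdict(int)
    for user, followed in follows.items():
        for middle in followed:
            # .get() avoids adding new keys while iterating over follows
            for candidate in follows.get(middle, ()):
                # skip self-recommendations and accounts already followed
                if candidate != user and candidate not in followed:
                    scores[(user, candidate)] += 1

    # highest score first; ties broken alphabetically for repeatable output
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [f"{u},{c}" for (u, c), _ in ranked[:top_n]]


# Hypothetical sample graph
edges = [
    ("richie", "potsie"),
    ("richie", "ralph"),
    ("potsie", "fonzie"),
    ("ralph", "fonzie"),
    ("ralph", "joanie"),
]
print(recommend_follows(edges, 2))
```

For the challenge itself, `top_n` would be 70,000, and a serious entry would presumably weigh the click data as well.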

Prize for success: CCP: Data Scientist status

Great way to start 2015!

I first saw this in a tweet by Sarah.
