Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

February 3, 2013

Case study: million songs dataset

Filed under: Data,Dataset,GraphChi,Graphs,Machine Learning — Patrick Durusau @ 6:58 pm

Case study: million songs dataset by Danny Bickson.

From the post:

A couple of days ago I wrote about the million songs dataset. Our man in London, Clive Cox from Rummble Labs, suggested we should implement rankings based on item similarity.

Thanks to Clive’s suggestion, we now have an implementation of Fabio Aiolli’s cost function, as explained in the paper A Preliminary Study for a Recommender System for the Million Songs Dataset, which is the winning method in this contest.

Following are detailed instructions on how to use the GraphChi CF toolkit on the million songs dataset to compute user ratings from item similarities.

Just in case you need some data for practice with your GraphChi installation. 😉

Seriously, nice way to gain familiarity with the data set.

What value you extract from it is up to you.
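
If ranking by item similarity is new to you, here is a minimal sketch of the general idea in Python. It is not GraphChi's code and does not use its API. It assumes an asymmetric cosine similarity between songs, with hypothetical parameters `alpha` and `q`, which is only my reading of the kind of scoring the paper describes. The function names and the toy listening data are invented for illustration.

```python
from collections import defaultdict


def item_users(listens):
    """Map each song to the set of users who played it."""
    users_of = defaultdict(set)
    for user, song in listens:
        users_of[song].add(user)
    return users_of


def similarity(users_i, users_j, alpha=0.5):
    """Asymmetric cosine: |Ui & Uj| / (|Ui|^alpha * |Uj|^(1 - alpha)).

    alpha is an assumed tuning parameter; 0.5 gives the ordinary cosine.
    """
    common = len(users_i & users_j)
    if common == 0:
        return 0.0
    return common / (len(users_i) ** alpha * len(users_j) ** (1 - alpha))


def recommend(listens, user, top_n=3, alpha=0.5, q=1.0):
    """Rank songs the user has not played by summed similarity to their history.

    q is an assumed exponent that sharpens (q > 1) or flattens (q < 1)
    the contribution of each similarity.
    """
    users_of = item_users(listens)
    history = {song for u, song in listens if u == user}
    scores = {}
    for candidate, cand_users in users_of.items():
        if candidate in history:
            continue
        score = sum(
            similarity(cand_users, users_of[played], alpha) ** q
            for played in history
        )
        if score > 0:
            scores[candidate] = score
    # Highest score first; ties broken by song id for a stable order.
    return sorted(scores, key=lambda s: (-scores[s], s))[:top_n]


# Toy (user, song) pairs, invented for this example.
listens = [
    ("ann", "s1"), ("ann", "s2"),
    ("bob", "s1"), ("bob", "s2"), ("bob", "s3"),
    ("cat", "s2"), ("cat", "s3"), ("cat", "s4"),
]

print(recommend(listens, "ann"))  # ['s3', 's4']
```

Song s3 ranks first for ann because it shares listeners with both songs in her history, while s4 shares a listener with only one of them. The real data set is far too large for this brute-force loop, which is where a toolkit like GraphChi earns its keep.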
