Musicbrainz in Neo4j – Part 1 by Paul Tremberth.
From the post:
What is MusicBrainz?
Quoting Wikipedia, MusicBrainz is an “open content music database [that] was founded in response to the restrictions placed on the CDDB. (…) MusicBrainz captures information about artists, their recorded works, and the relationships between them.”
Anyone can browse the database at http://musicbrainz.org/. If you create an account with them, you can contribute new data, fix details of existing records such as track lengths, send in cover art scans of your favorite albums, etc. Edits are peer reviewed, and any member can vote them up or down. There are a lot of similarities with Wikipedia.
With this first post, we want to show you how to import the MusicBrainz data into Neo4j, so that we can analyze it further with Cypher in the second post. See below for what we will end up with:
…
MusicBrainz data
MusicBrainz currently has around 1000 active users, nearly 800,000 artists, 75,000 record labels, around 1,200,000 releases, more than 12,000,000 tracks, and just under 2,000,000 URLs for these entities (Wikipedia pages, official homepages, YouTube channels, etc.). Daily fixes by the community make their data probably the freshest and most accurate on the web.
You can check the current numbers here and here.
This rocks!
Interesting data, a walk-through of how to load the data into Neo4j, and the promise of more interesting activities to follow.
However, I urge caution on showing this to family members. 😉
You may wind up scripting daily data updates and teaching Cypher to family members and no doubt their friends.
Up to you.
I first saw this in a tweet by Peter Neubauer.