While we wait for maid service robots, there is news that computers can be trained as human mimics for labeling multimedia resources. Game-powered machine learning reports success with game-based training for music labeling.
The authors, Luke Barrington, Douglas Turnbull, and Gert Lanckriet, neatly summarize music labeling as a problem of volume:
…Pandora, a popular Internet radio service, employs musicologists to annotate songs with a fixed vocabulary of about five hundred tags. Pandora then creates personalized music playlists by finding songs that share a large number of tags with a user-specified seed song. After 10 y of effort by up to 50 full time musicologists, less than 1 million songs have been manually annotated (5), representing less than 5% of the current iTunes catalog.
The same problem of volume extends to other media: “…7 billion images are uploaded to Facebook each month (1), YouTube users upload 24 h of video content per minute….”
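The tag-matching step behind Pandora's playlists, as described in the first quote, can be sketched in a few lines of Python. This is only an illustration of "songs that share a large number of tags with a seed song": the song names, the tags, and the `rank_by_shared_tags` helper are all made up here, and Pandora's actual ranking is not described in the paper excerpt.

```python
def rank_by_shared_tags(seed, catalog):
    """Return the other songs, ordered by how many tags they share with the seed."""
    seed_tags = catalog[seed]
    scores = {
        song: len(seed_tags & tags)  # size of the tag overlap
        for song, tags in catalog.items()
        if song != seed
    }
    # Most shared tags first; ties broken alphabetically for a stable order.
    return sorted(scores, key=lambda song: (-scores[song], song))


# Hypothetical catalog: every song carries a set of tags from a fixed vocabulary.
catalog = {
    "song_a": {"acoustic", "mellow", "female vocals"},
    "song_b": {"acoustic", "mellow", "male vocals"},
    "song_c": {"distorted guitar", "aggressive"},
    "song_d": {"acoustic", "female vocals", "mellow", "folk"},
}

playlist = rank_by_shared_tags("song_a", catalog)
```

The point of the sketch is that the matching itself is cheap. The expensive part is filling in `catalog` by hand, which is exactly the bottleneck the authors are after.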
The authors created www.HerdIt.org to:
… investigate and answer two important questions. First, we demonstrate that the collective wisdom of Herd It’s crowd of nonexperts can train machine learning algorithms as well as expert annotations by paid musicologists. In addition, our approach offers distinct advantages over training based on static expert annotations: it is cost-effective, scalable, and has the flexibility to model demographic and temporal changes in the semantics of music. Second, we show that integrating Herd It in an active learning loop trains accurate tag models more effectively; i.e., with less human effort, compared to a passive approach.
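To make the "active learning loop" in the quote concrete, here is a toy sketch in Python. It is not the authors' method: I assume each song is reduced to a single made-up feature between 0 and 1, that a tag applies above some hidden boundary, and that the crowd always answers correctly. The names `fit_threshold`, `run_loop`, and `crowd_says_tag_applies` are hypothetical. The only idea it is meant to show is the difference between asking the crowd about the song the model is least sure of (active) and asking about a random song (passive).

```python
import random


def fit_threshold(labeled):
    """Toy model: midpoint between the highest negative and lowest positive song."""
    negatives = [x for x, applies in labeled if not applies]
    positives = [x for x, applies in labeled if applies]
    return (max(negatives) + min(positives)) / 2


def run_loop(pool, oracle, seed_set, budget, active):
    """Spend `budget` crowd questions, chosen actively or at random."""
    labeled = [(x, oracle(x)) for x in seed_set]
    unlabeled = [x for x in pool if x not in seed_set]
    rng = random.Random(0)  # fixed seed so the passive run is repeatable
    for _ in range(budget):
        threshold = fit_threshold(labeled)
        if active:
            # Uncertainty sampling: ask about the song nearest the current boundary.
            pick = min(unlabeled, key=lambda x: abs(x - threshold))
        else:
            pick = rng.choice(unlabeled)
        unlabeled.remove(pick)
        labeled.append((pick, oracle(pick)))
    return fit_threshold(labeled)


TRUE_BOUNDARY = 0.62  # hidden from the model; assumed only for this toy example


def crowd_says_tag_applies(x):
    """Stand-in for the game: a perfectly reliable crowd answer."""
    return x > TRUE_BOUNDARY


pool = [i / 100 for i in range(101)]  # 101 hypothetical songs
seed_set = [0.0, 1.0]                 # one known negative, one known positive

active_estimate = run_loop(pool, crowd_says_tag_applies, seed_set, budget=8, active=True)
passive_estimate = run_loop(pool, crowd_says_tag_applies, seed_set, budget=8, active=False)
```

In this setup the active loop behaves like a binary search and pins the boundary down to the spacing of the pool within its eight questions, while the passive loop depends on which songs happen to be drawn. Real crowds are noisy and real tag models are far richer, so treat this as intuition only.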
The approach promises an augmentation (not replacement) of human judgement in the classification of music, one that would enable human judgement to reach further across the musical corpus than ever before:
…while a human-only approach requires the same labeling effort for the first song as for the millionth, our game-powered machine learning solution needs only a small, reliable training set before all future examples can be labeled automatically, improving efficiency and cost by orders of magnitude. Tagging a new song takes 4 s on a modern CPU: in just a week, eight parallel processors could tag 1 million songs or annotate Pandora’s complete song collection, which required a decade of effort from dozens of trained musicologists.
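The "in just a week" figure is easy to check. The sketch below uses only the numbers from the quote (4 s per song, eight processors, 1 million songs) plus one assumption of mine: that the work splits perfectly across processors with no overhead.

```python
SECONDS_PER_SONG = 4      # from the quote
SONGS = 1_000_000         # from the quote
PROCESSORS = 8            # from the quote
SECONDS_PER_DAY = 24 * 60 * 60

# Assumption: perfect parallel speedup, no I/O or scheduling overhead.
total_cpu_seconds = SECONDS_PER_SONG * SONGS
wall_clock_days = total_cpu_seconds / PROCESSORS / SECONDS_PER_DAY

print(f"{wall_clock_days:.1f} days")
```

That comes to a little under six days, so the claim holds with some room to spare.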
This is a promising technique for information retrieval (IR) over multimedia resources.
What I wonder about is extending the technique to games designed to train machine learning for:
- e-discovery in legal proceedings
- “tagging” (or indexing, if you will) text resources
- vocabulary expansion for searching
- contexts for semantic matching
- etc.
A first person shooter game that annotates the New York Times archives would be really cool!