Panos Ipeirotis writes:
TREC 2012 Crowdsourcing Track – Call for Participation
June 2012 – November 2012
https://sites.google.com/site/treccrowd/
Goals
As part of the National Institute of Standards and Technology (NIST)'s annual Text REtrieval Conference (TREC), the Crowdsourcing track investigates emerging crowd-based methods for search evaluation and for developing hybrid automation/crowd search systems.
This year, our goal is to evaluate approaches to crowdsourcing high-quality relevance judgments for two different types of media:
- textual documents
- images
For each of the two tasks, participants will be expected to crowdsource relevance labels for approximately 20k topic-document pairs (i.e., 40k labels when taking part in both tasks). In the first task, the documents will be from an English news text corpus, while in the second task the documents will be images from Flickr and from a European news agency.
Participants may use any crowdsourcing methods and platforms, including home-grown systems. Submissions will be evaluated against a gold standard set of labels and against consensus labels over all participating teams.
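To make the evaluation setup concrete, here is a minimal sketch of how consensus labels might be derived by majority vote across teams and then scored against a gold standard. The data layout, function names, and topic/document identifiers are my own assumptions for illustration, not the track's official evaluation procedure.

```python
from collections import Counter

def consensus_labels(team_labels):
    """Majority-vote consensus over per-team relevance labels.

    team_labels: dict mapping (topic_id, doc_id) -> list of 0/1 labels,
    one per participating team (hypothetical layout).
    """
    return {pair: Counter(labels).most_common(1)[0][0]
            for pair, labels in team_labels.items()}

def accuracy_against_gold(labels, gold):
    """Fraction of topic-document pairs whose label matches the gold standard."""
    scored = [pair for pair in labels if pair in gold]
    if not scored:
        return 0.0
    return sum(labels[p] == gold[p] for p in scored) / len(scored)

# Toy example: three teams label two topic-document pairs.
team_labels = {
    ("topic-401", "doc-17"): [1, 1, 0],
    ("topic-401", "doc-42"): [0, 0, 1],
}
gold = {("topic-401", "doc-17"): 1, ("topic-401", "doc-42"): 0}

consensus = consensus_labels(team_labels)
print(accuracy_against_gold(consensus, gold))  # 1.0
```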
Tentative Schedule
- Jun 1: Document corpora, training topics (for image task) and task guidelines available
- Jul 1: Training labels for the image task
- Aug 1: Test data released
- Sep 15: Submissions due
- Oct 1: Preliminary results released
- Oct 15: Conference notebook papers due
- Nov 6-9: TREC 2012 conference at NIST, Gaithersburg, MD, USA
- Nov 15: Final results released
- Jan 15, 2013: Final papers due
As you know, I am interested in crowdsourcing paths through data and the assignment of semantics.
Although I am puzzled: why do we continue to put the emphasis on post-creation assignment of semantics?
After data is created, we look around, surprised that the data has no explicit semantics.
Like realizing you are on Main Street without your pants.
Why don’t we look to the data creation process to assign explicit semantics?
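As a toy illustration of what "semantics at creation" could look like, here is a hedged sketch in which the code that writes a record also attaches an explicit vocabulary reference, so later readers never have to guess what the fields mean. The field names and vocabulary URIs are invented for the example.

```python
import json
from datetime import datetime, timezone

def create_observation(station_id, temperature_c):
    """Write a data record that carries its own semantics at creation time.

    The "@context" block maps each field to an explicit definition
    (hypothetical vocabulary URIs), so no post-hoc labeling pass is
    needed to recover what the values mean.
    """
    return {
        "@context": {
            "station_id": "http://example.org/vocab#stationIdentifier",
            "temperature_c": "http://example.org/vocab#airTemperatureCelsius",
            "recorded_at": "http://example.org/vocab#observationTime",
        },
        "station_id": station_id,
        "temperature_c": temperature_c,
        "recorded_at": datetime.now(timezone.utc).isoformat(),
    }

print(json.dumps(create_observation("KBWI", 21.4), indent=2))
```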
Thoughts?