Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

September 25, 2011

Tang and Lease (2011) Semi-Supervised Consensus Labeling for Crowdsourcing

Filed under: Crowd Sourcing,LingPipe — Patrick Durusau @ 7:49 pm

From the post:

I came across this paper, which, among other things, describes the data collection being used for the 2011 TREC Crowdsourcing Track:

But that’s not why we’re here today. I want to talk about their modeling decisions.

Tang and Lease apply a Dawid-and-Skene-style model to crowdsourced binary relevance judgments for highly-ranked system responses from a previous TREC information retrieval evaluation. The workers judge document/query pairs as highly relevant, relevant, or irrelevant (though highly relevant and relevant are collapsed in the paper).

The Dawid and Skene model was relatively unsupervised, imputing all of the categories for items being classified as well as the response distribution for each annotator for each category of input (thus characterizing both bias and accuracy of each annotator).
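To make the shape of that model concrete, here is a minimal sketch of Dawid-and-Skene-style EM in plain Python. It is not the semi-supervised variant from the Tang and Lease paper, and it is not LingPipe's code. The function name `dawid_skene`, the `smoothing` parameter, the worker and document names, and the toy votes are all assumptions made up for illustration. It alternates between estimating each annotator's response distribution per true category (a confusion matrix) and re-imputing the category of each item.

```python
from collections import defaultdict


def dawid_skene(labels, n_classes, n_iter=50, smoothing=0.01):
    """Hypothetical helper: EM over (item, annotator, label) triples.

    Labels are integers in range(n_classes). Returns item posteriors,
    class prevalence, and one confusion matrix per annotator.
    """
    by_item = defaultdict(list)
    annotators = set()
    for item, annotator, label in labels:
        by_item[item].append((annotator, label))
        annotators.add(annotator)

    # Initialise item posteriors from vote proportions (soft majority vote).
    post = {}
    for item, votes in by_item.items():
        counts = [0.0] * n_classes
        for _, label in votes:
            counts[label] += 1.0
        post[item] = [c / len(votes) for c in counts]

    for _ in range(n_iter):
        # M-step: class prevalence from the current posteriors.
        prior = [sum(p[k] for p in post.values()) / len(post)
                 for k in range(n_classes)]

        # M-step: confusion[a][k][l] estimates P(annotator a says l | true class k).
        # The small smoothing count (an assumption) keeps every entry positive.
        confusion = {a: [[smoothing] * n_classes for _ in range(n_classes)]
                     for a in annotators}
        for item, votes in by_item.items():
            for annotator, label in votes:
                for k in range(n_classes):
                    confusion[annotator][k][label] += post[item][k]
        for a in annotators:
            for k in range(n_classes):
                row_total = sum(confusion[a][k])
                confusion[a][k] = [v / row_total for v in confusion[a][k]]

        # E-step: re-impute each item's category given prevalence and confusions.
        for item, votes in by_item.items():
            weights = list(prior)
            for annotator, label in votes:
                for k in range(n_classes):
                    weights[k] *= confusion[annotator][k][label]
            total = sum(weights)
            post[item] = [w / total for w in weights]

    return post, prior, confusion


# Toy data (made up): 1 = relevant, 0 = irrelevant.
# Workers w1 and w2 follow the intended label; w3 always flips it.
truth = {"d1": 1, "d2": 1, "d3": 1, "d4": 0, "d5": 0, "d6": 0}
votes = []
for doc, rel in truth.items():
    votes.append((doc, "w1", rel))
    votes.append((doc, "w2", rel))
    votes.append((doc, "w3", 1 - rel))

post, prior, confusion = dawid_skene(votes, n_classes=2)
```

No gold labels go into the fit; `truth` is only used to build the toy votes. The per-annotator matrices are what capture both bias and accuracy: on this toy data the row of `confusion["w3"]` for true class 1 puts most of its mass on label 0, so the model treats w3 as systematically inverted rather than merely noisy.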

I post this partly for its review of the model in question, and partly as a warning that competent people really do read research papers in their areas. Yes, on the WWW you can publish anything you want, of whatever quality. But others in your field will notice. Is that what you want?
