Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

May 12, 2011

Lessons of History? Crowdsourcing

Filed under: Crowd Sourcing — Patrick Durusau @ 8:00 am

The post by Panos Ipeirotis, Crowdsourcing: Lessons from Henry Ford, on his presentation (and slides), reminded me of Will and Ariel Durant’s The Lessons of History observation (paraphrasing):

If you could select them, 10% of the population produces as much as the other 90% combined. History does exactly that.

So Panos saying that “A few workers contribute the majority of the work…” is no surprise.

If you don’t think asking people for their opinions is all that weird, you may enjoy his presentation.*

His summary:

The main points that I wanted to make:

  • It is common to consider crowdsourcing as the “assembly line for knowledge work” and think of the workers as simple cogs in a big machine. It is almost a knee-jerk reaction to think negatively about the concept. However, it was the proper use of the assembly line (together with the proper automation) by Henry Ford that led to the first significant improvement in the level of living for the masses.
  • Crowdsourcing suffers a lot due to significant worker turnover: Everyone who experimented with large tasks on MTurk knows that the participation distribution is very skewed. A few workers contribute the majority of the work, while a large number of workers contribute only minimally. Dealing with these hit-and-run workers is a pain, as we cannot apply any statistically meaningful mechanism for quality control.
  • We ignore the fact that workers give back what they are given. Pay peanuts, get monkeys. Pay well, and get good workers. Needless to say, reputation and other quality signaling mechanisms are of fundamental importance for this task.
  • Keeping the same workers around can give significant improvements in quality. Today on MTurk we have a tremendous turnover of workers, wasting significant effort and efficiencies. Whoever builds a strong base of a few good workers can pay the workers much better and, at the same time, generate a better product for lower cost than relying on an army of inexperienced, noisy workers.
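The skewed participation distribution Panos describes can be illustrated with a small simulation. This is a hypothetical sketch, not data from his talk: it draws per-worker task counts from a heavy-tailed Pareto distribution (the shape parameter 1.2 is an assumption) and reports what share of the total work the most prolific 10% of workers account for.

```python
import random

random.seed(42)

# Hypothetical illustration: simulate a skewed participation
# distribution like the one described for MTurk, where a few
# workers contribute most of the completed tasks.
NUM_WORKERS = 1000

# Pareto-distributed task counts: the heavy tail yields a few
# very prolific workers and many minimal contributors.
contributions = sorted(
    (int(random.paretovariate(1.2)) for _ in range(NUM_WORKERS)),
    reverse=True,
)

total = sum(contributions)
top_10_percent = contributions[: NUM_WORKERS // 10]
share = sum(top_10_percent) / total

print(f"Top 10% of workers did {share:.0%} of the work")
```

With a heavy-tailed draw like this, the top decile typically accounts for well over a third of all completed tasks, which is the "hit-and-run worker" pattern the bullet points describe.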

Yes, in the end, crowdsourcing is not about the crowd. It is about the individuals in the crowd. And we can now search for these valuable individuals very effectively. Crowdsourcing is crowdsearching.


*It isn’t that people are the best judges of semantics. They are the only judges of semantics.

Automated systems for searching, indexing, sorting, etc., are critical to modern information infrastructures. What they are not doing, appearances to the contrary notwithstanding, is judging semantics.
