Flock: Hybrid Crowd-Machine Learning Classifiers by Justin Cheng and Michael S. Bernstein.


We present hybrid crowd-machine learning classifiers: classification models that start with a written description of a learning goal, use the crowd to suggest predictive features and label data, and then weigh these features using machine learning to produce models that are accurate and use human-understandable features. These hybrid classifiers enable fast prototyping of machine learning models that can improve on both algorithm performance and human judgment, and accomplish tasks where automated feature extraction is not yet feasible. Flock, an interactive machine learning platform, instantiates this approach. To generate informative features, Flock asks the crowd to compare paired examples, an approach inspired by analogical encoding. The crowd’s efforts can be focused on specific subsets of the input space where machine-extracted features are not predictive, or instead used to partition the input space and improve algorithm performance in subregions of the space. An evaluation on six prediction tasks, ranging from detecting deception to differentiating impressionist artists, demonstrated that aggregating crowd features improves upon both asking the crowd for a direct prediction and off-the-shelf machine learning features by over 10%. Further, hybrid systems that use both crowd-nominated and machine-extracted features can outperform those that use either in isolation.
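To make "weigh these features using machine learning" concrete, here is a minimal sketch of my own, not Flock's code. It assumes each crowd-nominated feature has already been reduced to the fraction of workers who answered "yes" for an example, and it uses plain logistic regression as a stand-in learner, since the abstract does not say which model Flock uses. The feature names and numbers are invented for illustration.

```python
import math

# Hypothetical crowd-nominated features for a deception task.
# Each value is the fraction of workers who answered "yes" for that example.
FEATURES = ["uses_hedging_words", "gives_specific_details", "mentions_weather"]

ROWS = [
    [0.9, 0.1, 0.5],
    [0.8, 0.2, 0.4],
    [0.7, 0.3, 0.6],
    [0.2, 0.9, 0.5],
    [0.1, 0.8, 0.4],
    [0.3, 0.7, 0.6],
]
LABELS = [1, 1, 1, 0, 0, 0]  # 1 = deceptive, 0 = truthful (made-up labels)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(rows, labels, epochs=500, lr=0.5):
    """Fit logistic-regression weights by batch gradient descent."""
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            p = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
            err = p - y  # gradient of the log loss with respect to z
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(rows)
    return weights, bias


def predict(weights, bias, x):
    """Return the modeled probability that example x is deceptive."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


weights, bias = train_logistic(ROWS, LABELS)

# The learned weights show which crowd features the model leans on.
for name, w in zip(FEATURES, weights):
    print(f"{name}: {w:+.2f}")
```

The appeal is that the learned weights stay readable: each one attaches to a feature a person proposed in plain language, so you can see which suggestions carried the prediction and which were noise.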

Let’s see: suggesting predictive features (subject identifiers in the non-topic map technical sense) and labeling data (identifying instances of a subject) sounds a lot easier than some of the tedium I have seen in authoring a topic map.

I particularly like the “inducing” of features, by asking the crowd to compare paired examples, versus relying on a crowd to suggest identifying features outright. I suspect that would work well in a topic map authoring context, sans the machine learning aspects.

This paper is being presented this week at CSCW 2015, so you aren’t too far behind. 😉

How would you structure an inducement mechanism for authoring a topic map?

