Boosting (in Machine Learning) as a Metaphor for Diverse Teams [A Quibble]

Boosting (in Machine Learning) as a Metaphor for Diverse Teams by Renee Teate.

Renee’s summary:

tl;dr: Boosting ensemble algorithms in Machine Learning use an approach that is similar to assembling a diverse team with a variety of strengths and experiences. If machines make better decisions by combining a bunch of “less qualified opinions” vs “asking one expert”, then maybe people would, too.

Very much worth your while to read at length but to setup my quibble:

What a Random Forest does is build up a whole bunch of “dumb” decision trees by only analyzing a subset of the data at a time. A limited set of features (columns) from a portion of the overall records (rows) is used to generate each decision tree, and the “depth” of the tree (and/or size of the “leaves”, the number of examples that fall into each final bin) is limited as well. So the trees in the model are “trained” with only a portion of the available data and therefore don’t individually generate very accurate classifications.

However, it turns out that when you combine the results of a bunch of these “dumb” trees (also known as “weak learners”), the combined result is usually even better than the most finely-tuned single full decision tree. (So you can see how the algorithm got its name – a whole bunch of small trees, somewhat randomly generated, but used in combination is a random forest!)

All true but “weak learners” in machine learning are easily reconfigured, combined with different groups of other “weak learners,” or even discarded.

None of which is true for people who are hired to be part of a diverse team.

I don’t mean to discount Renee’s metaphor because I think it has much to recommend it, but diverse “weak learners” make poor decisions too.

Don’t take my word for it, watch the 2016 congressional election results.

Be sure to follow Renee on @BecomingDataSci. I’m interested to see how she develops this metaphor and where it leads.


Comments are closed.