Another Word For It
Patrick Durusau on Topic Maps and Semantic Diversity

December 17, 2012

The Rewards of Ignoring Data

Filed under: Boosting, Machine Learning, Random Forests — Patrick Durusau @ 2:55 pm

The Rewards of Ignoring Data by Charles Parker.

From the post:

Can you make smarter decisions by ignoring data? It certainly runs counter to our mission, and sounds a little like an Orwellian dystopia. But as we’re going to see, ignoring some of your data some of the time can be a very useful thing to do.

Charles does an excellent job of introducing the use of multiple models of the same data and includes pointers to deeper material:

There are fairly deep mathematical reasons for this, and ML scientist par excellence Robert Schapire lays out one of the most important arguments in the landmark paper “The Strength of Weak Learnability,” in which he proves that a machine learning algorithm that performs only slightly better than random guessing can be “boosted” into a classifier that learns to an arbitrary degree of accuracy. For this incredible contribution (and for the later paper that gave us the AdaBoost algorithm), he and his colleague Yoav Freund earned the Gödel Prize for computer science theory, the only time the award has been given for a machine learning paper.
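Schapire's result is easy to try for yourself. Here is a minimal sketch (my own illustration using scikit-learn, not the tooling from Charles's post) that boosts depth-1 decision "stumps," each only modestly better than chance on its own, into a noticeably stronger classifier:

```python
# A minimal sketch of boosting: many weak stumps combined into a strong
# classifier. Uses scikit-learn; not the tooling from the original post.
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# A synthetic two-class problem.
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# A single depth-1 stump: the "weak learner."
stump = DecisionTreeClassifier(max_depth=1).fit(X_train, y_train)
print("one stump:", stump.score(X_test, y_test))

# Boosting: each round reweights the examples the earlier stumps got
# wrong, so later stumps concentrate on the hard cases.
boosted = AdaBoostClassifier(n_estimators=200, random_state=0)
boosted.fit(X_train, y_train)
print("boosted stumps:", boosted.score(X_test, y_test))
```

Comparing the two printed scores shows the effect: the boosted ensemble of the very same weak stumps should clearly outperform a single stump.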

Not content to stop there, Charles demonstrates how you can create a random decision forest from your own data.

Which you can do without reading the deeper material.
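For a sense of what that demonstration involves, here is a hedged stand-in using scikit-learn's RandomForestClassifier (Charles works in BigML; this is not his workflow). The “ignoring data” happens in two places: each tree trains on a bootstrap resample of the rows, and each split considers only a random subset of the columns:

```python
# A minimal random decision forest sketch (scikit-learn, not the BigML
# workflow from the post). Each tree deliberately ignores data: it trains
# on a bootstrap resample of the rows, and each split examines only a
# random subset of the features.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

forest = RandomForestClassifier(
    n_estimators=100,     # number of trees in the forest
    bootstrap=True,       # each tree sees a resample of the rows
    max_features="sqrt",  # each split sees a random subset of columns
    random_state=0,
)
forest.fit(X_train, y_train)
print("forest accuracy:", forest.score(X_test, y_test))
```

Averaging the votes of many such deliberately data-starved trees is exactly the trade the post describes: each individual tree is weaker, but the ensemble is more accurate and more robust.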
