Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

June 24, 2012

Predictive Analytics: Evaluate Model Performance

Filed under: Predictive Analytics,Statistics — Patrick Durusau @ 4:19 pm

Predictive Analytics: Evaluate Model Performance by Ricky Ho.

Ricky finishes his multi-part series on models for machine learning with the one question left hanging:

OK, so which model should I use?

In previous posts, we have discussed various machine learning techniques including linear regression with regularization, SVM, neural networks, nearest neighbor, Naive Bayes, decision trees and ensemble models. How do we pick which model is the best? Or even whether the model we pick is better than a random guess? In this post, we will cover how we evaluate the performance of the model and what we can do next to improve the performance.

Best guess with no model

First of all, we need to understand the goal of our evaluation. Are we trying to pick the best model? Are we trying to quantify the improvement of each model? Regardless of our goal, I found it is always useful to think about what the baseline should be. Usually the baseline is your best guess if you don’t have a model.

For a classification problem, one approach is to guess at random (with uniform probability), but a better approach is to guess the output class that has the largest proportion in the training samples. For a regression problem, the best guess will be the mean of the output of the training samples.
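
The two baselines in the quoted passage can be sketched roughly as follows. This is not code from Ricky's post; it is a minimal Python illustration, and the function names and toy data are hypothetical, chosen only to show the idea.

```python
from collections import Counter
from statistics import mean


def majority_class_baseline(train_labels):
    """Return the most frequent class label in the training labels."""
    return Counter(train_labels).most_common(1)[0][0]


def mean_baseline(train_targets):
    """Return the mean of the training targets (regression baseline)."""
    return mean(train_targets)


def accuracy(predicted, actual):
    """Fraction of predictions that match the actual labels."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


def rmse(predicted, actual):
    """Root mean squared error between predictions and actual values."""
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return (sum(squared) / len(actual)) ** 0.5


# Toy data, invented purely for illustration.
train_labels = ["spam", "ham", "ham", "ham", "spam"]
test_labels = ["ham", "spam", "ham", "ham"]

# Classification baseline: always guess the majority training class.
guess = majority_class_baseline(train_labels)
baseline_accuracy = accuracy([guess] * len(test_labels), test_labels)

train_targets = [2.0, 4.0, 6.0]
test_targets = [3.0, 5.0]

# Regression baseline: always guess the training mean.
mean_guess = mean_baseline(train_targets)
baseline_rmse = rmse([mean_guess] * len(test_targets), test_targets)

print(guess, baseline_accuracy)
print(mean_guess, baseline_rmse)
```

A candidate model is worth keeping only if it beats these numbers on data it was not trained on; otherwise the no-model guess is doing just as well.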

Ricky walks you through the steps and code to make an evaluation of each model.

It is always better to have evidence that your choices were better than a coin flip.

Although, I am mindful of the wealth adviser story in “Thinking, Fast and Slow” by Daniel Kahneman, where he was given eight years of investment outcomes for some 25 wealth advisers. The year-to-year correlations between adviser rankings averaged roughly zero, so there was no evidence of persistent “skill” behind the outcomes. Luck, not skill, was being rewarded with bonuses.

The results were ignored by both management and advisers as inconsistent with their “…personal impressions from experience.” (pp. 215-216)

Do you think the same can be said of search results? Just curious.
