Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

March 19, 2012

Big-data Naive Bayes and Classification Trees with R and Netezza

Filed under: Bayesian Data Analysis,Classification Trees,Netezza,R — Patrick Durusau @ 6:54 pm

From the post:

The IBM Netezza analytics appliances combine high-capacity storage for Big Data with a massively-parallel processing platform for high-performance computing. With the addition of Revolution R Enterprise for IBM Netezza, you can use the power of the R language to build predictive models on Big Data.

In the demonstration below, Revolution Analytics’ Derek Norton analyzes loan approval data stored on the IBM appliance. You’ll see the R code used to:

  • Explore the raw data (with summary statistics and charts)
  • Prepare the data for statistical analysis, and create training and test sets
  • Create predictive models using classification trees and Naïve Bayes
  • Predict using the models, and evaluate model performance using confusion matrices
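The steps in this list (split the data, fit a model, predict, tabulate a confusion matrix) can be sketched in a few lines. This is not the R code from the demonstration, which is not reproduced here. It is a minimal, standard-library Python sketch of a categorical Naive Bayes classifier. The toy records, the feature names and every function name are invented for illustration.

```python
import math
import random
from collections import Counter, defaultdict

# Hypothetical toy records: ((income_band, prior_default), approved).
# These values are invented for illustration; they are NOT the demo's loan data.
ROWS = [
    (("high", "no"), "yes"), (("high", "no"), "yes"), (("high", "yes"), "no"),
    (("mid", "no"), "yes"), (("mid", "no"), "yes"), (("mid", "yes"), "no"),
    (("low", "no"), "no"), (("low", "yes"), "no"), (("low", "no"), "yes"),
    (("high", "no"), "yes"), (("mid", "yes"), "no"), (("low", "yes"), "no"),
]

def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of the rows and split off a held-out test set."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

def fit_naive_bayes(rows):
    """Count class frequencies and per-feature value frequencies per class."""
    class_counts = Counter(label for _, label in rows)
    value_counts = defaultdict(Counter)  # (feature index, label) -> value counts
    vocab = defaultdict(set)             # feature index -> values seen in training
    for features, label in rows:
        for i, value in enumerate(features):
            value_counts[(i, label)][value] += 1
            vocab[i].add(value)
    return class_counts, value_counts, vocab

def predict(model, features):
    """Return the label with the highest log-posterior (Laplace smoothing)."""
    class_counts, value_counts, vocab = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n in class_counts.items():
        score = math.log(n / total)
        for i, value in enumerate(features):
            count = value_counts[(i, label)][value]
            score += math.log((count + 1) / (n + len(vocab[i])))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

def confusion_matrix(model, rows):
    """Map (actual, predicted) label pairs to counts."""
    return Counter((label, predict(model, features)) for features, label in rows)

train, test = train_test_split(ROWS)
model = fit_naive_bayes(train)
matrix = confusion_matrix(model, test)
accuracy = sum(c for (actual, pred), c in matrix.items() if actual == pred) / len(test)
```

The diagonal entries of `matrix` (where actual equals predicted) are the correct predictions. Everything off the diagonal is a misclassification, which is what the confusion matrices in the demonstration are used to inspect.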

[embedded presentation omitted]

Note that while R code is being run on Derek’s laptop, the raw data is never moved from the appliance, and the analytic computations take place “in-database” within the appliance itself (where the Revolution R Enterprise engine is also running on each parallel core).
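The "in-database" pattern described here can be sketched with any SQL engine: the client sends a query, the engine does the aggregation where the data lives, and only a small summary comes back. The sketch below uses Python's built-in `sqlite3` purely as a stand-in. It is not Netezza or Revolution R Enterprise, an in-memory SQLite database runs in the client process rather than on a separate appliance, and the `loans` table with its columns and rows is invented for illustration.

```python
import sqlite3

# In-memory SQLite stands in for the appliance; the table, columns and rows
# are hypothetical and exist only to make the sketch runnable.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE loans (amount REAL, approved INTEGER)")
conn.executemany(
    "INSERT INTO loans VALUES (?, ?)",
    [(1000.0, 1), (2500.0, 0), (4000.0, 1), (1500.0, 0), (3000.0, 1)],
)

# The aggregation runs inside the database engine. Only one summary row per
# group is returned to the caller, not the raw records.
summary = conn.execute(
    "SELECT approved, COUNT(*), AVG(amount) FROM loans "
    "GROUP BY approved ORDER BY approved"
).fetchall()
conn.close()
```

Here `summary` holds one `(approved, count, average amount)` tuple per group, so what the client receives grows with the number of groups rather than the number of rows.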

Another incentive for you to be learning R.

Does it sound to you like “Derek’s computer” is a terminal entering instructions that are executed elsewhere? 😉 (If the computing fabric develops fast enough, we may lose the distinction of a “personal” computer. There will simply be computing.)

Meant to mention this the other day. Enjoy!
