Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

February 17, 2015

Data Mining: Spring 2013 (CMU)

Filed under: Data Mining,R — Patrick Durusau @ 2:33 pm

Data Mining: Spring 2013 (CMU) by Ryan Tibshirani.

Overview and Objectives [from syllabus]

Data mining is the science of discovering structure and making predictions in data sets (typically, large ones). Applications of data mining are happening all around you, and if they are done well, they may sometimes even go unnoticed. How does Google web search work? How does Shazam recognize a song playing in the background? How does Netflix recommend movies to each of its users? How could we predict whether or not a person will develop breast cancer based on genetic information? How could we search for possible subgroups among breast cancer patients, suggesting different variants of the disease? An expert’s answer to any one of these questions may very well contain enough material to fill its own course, but basic answers stem from the principles of data mining.

Data mining spans the fields of statistics and computer science. Since this is a course in statistics, we will adopt a statistical perspective the majority of the course. Data mining also involves a good deal of both applied work (programming, problem solving, data analysis) and theoretical work (learning, understanding, and evaluating methodologies). We will try to maintain a balance between the two.

Upon completing this course, you should be able to tackle new data mining problems, by: (1) selecting the appropriate methods and justifying your choices; (2) implementing these methods programmatically (using, say, the R programming language) and evaluating your results; (3) explaining your results to a researcher outside of statistics or computer science.

Lecture notes, R files, what more could you want? 😉

Enjoy!
