Essential Elements of Data Mining by Keith McCormick
From the post:
This is my attempt to clarify what Data Mining is and what it isn’t. According to Wikipedia, “In philosophy, essentialism is the view that, for any specific kind of entity, there is a set of characteristics or properties all of which any entity of that kind must possess.” I do not seek the Platonic form of Data Mining, but I do seek clarity where it is often lacking. There is much confusion surrounding how Data Mining is distinct from related areas like Statistics and Business Intelligence. My primary goal is to clarify the characteristics that a project must have to be a Data Mining project. By implication, Statistical Analysis (hypothesis testing), Business Intelligence reporting, Exploratory Data Analysis, etc., do not have all of these defining properties. They are highly valuable, but have their own unique characteristics. I have come up with ten. It is quite appropriate to emphasize the first and the last. They are the bookends of the list, and they capture the heart of the matter.
Comments? Characteristics you would add or take away?
How important is it to have a definition? Recall that creeds are created to separate sheep from goats, wheat from chaff. Are “essential characteristics” any different from a creed? If so, how?