Summary of a paper on data mining algorithms. Winners of the ACM KDD Innovation Award and the IEEE ICDM Research Contributions Award nominated and voted on candidates to come up with a top 10 list.
I was curious about how the entries on the list from 2007 have fared.
I searched CiteSeerX, limiting the publication year to 2010.
The results, algorithm followed by citation count, were as follows:
- C4.5 – 41
- The k-Means algorithm – 86
- Support Vector Machines – 64
- The Apriori algorithm – 46
- Expectation-Maximization – 41
- PageRank – 19
- AdaBoost – 11
- k-Nearest Neighbor Classification – 36*
- Naive Bayes – 25
- CART (Classification and Regression Trees) – 11
*Searched as “k-Nearest Neighbor”.
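For a quick look at the spread, the counts above can be ranked with a few lines of Python. This is only a sketch: the numbers are copied from the list above, and the helper name `rank_by_citations` is made up for illustration.

```python
# Citation counts copied from the list above
# (CiteSeerX search, publication year limited to 2010).
counts = {
    "C4.5": 41,
    "The k-Means algorithm": 86,
    "Support Vector Machines": 64,
    "The Apriori algorithm": 46,
    "Expectation-Maximization": 41,
    "PageRank": 19,
    "AdaBoost": 11,
    "k-Nearest Neighbor Classification": 36,  # searched as "k-Nearest Neighbor"
    "Naive Bayes": 25,
    "CART (Classification and Regression Trees)": 11,
}


def rank_by_citations(counts):
    """Return (algorithm, count) pairs, highest count first.

    Python's sort is stable, so tied entries keep their original list order.
    """
    return sorted(counts.items(), key=lambda item: item[1], reverse=True)


ranked = rank_by_citations(counts)
total = sum(counts.values())

for name, n in ranked:
    # Share of all hits across the ten searches, as a rough comparison only.
    print(f"{name:45s} {n:3d}  {n / total:6.1%}")
```

The percentages are shares of the ten search totals added together, not of all 2010 publications, so they only show relative spread.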
Not a scientific study, but there is enough variation to make me curious about:
- A broader survey of algorithm citations.
- Which articles cite more than one algorithm?
- Are there any groupings by subject of study?
Not a high-priority item, but something I want to return to and examine more closely.
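When I do return to it, the "more than one algorithm" question could be approached roughly as follows. This is a minimal sketch that assumes the search results have already been collected into a mapping from article to the set of algorithms it cites. The function names and the `example` data are hypothetical placeholders, not real search results.

```python
from collections import Counter
from itertools import combinations


def multi_algorithm_articles(citations):
    """Keep only the articles that cite more than one algorithm."""
    return {article: algos for article, algos in citations.items() if len(algos) > 1}


def pair_counts(citations):
    """Count how often each pair of algorithms is cited by the same article."""
    pairs = Counter()
    for algos in citations.values():
        # Sort so that each pair is always stored in the same order.
        for pair in combinations(sorted(algos), 2):
            pairs[pair] += 1
    return pairs


# Placeholder input for illustration only; these are NOT real articles.
example = {
    "article-A": {"C4.5", "CART"},
    "article-B": {"PageRank"},
    "article-C": {"C4.5", "CART", "AdaBoost"},
}

for article, algos in sorted(multi_algorithm_articles(example).items()):
    print(article, sorted(algos))

for pair, n in pair_counts(example).most_common():
    print(pair, n)
```

Grouping by subject of study would need extra metadata for each article (venue or keywords, for example), which this sketch does not attempt.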