Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

March 23, 2015

Association Rule Mining – Not Your Typical Data Science Algorithm

Filed under: Association Rule Mining,Hadoop,MapReduce — Patrick Durusau @ 7:00 pm

Association Rule Mining – Not Your Typical Data Science Algorithm by Dr. Kirk Borne.

From the post:

Many machine learning algorithms that are used for data mining and data science work with numeric data. And many algorithms tend to be very mathematical (such as Support Vector Machines, which we previously discussed). But, association rule mining is perfect for categorical (non-numeric) data and it involves little more than simple counting! That’s the kind of algorithm that MapReduce is really good at, and it can also lead to some really interesting discoveries.

Association rule mining is primarily focused on finding frequent co-occurring associations among a collection of items. It is sometimes referred to as “Market Basket Analysis”, since that was the original application area of association mining. The goal is to find associations of items that occur together more often than you would expect from a random sampling of all possibilities. The classic example of this is the famous Beer and Diapers association that is often mentioned in data mining books. The story goes like this: men who go to the store to buy diapers will also tend to buy beer at the same time. Let us illustrate this with a simple example. Suppose that a store’s retail transactions database includes the following information:

If you aren’t familiar with association rule mining, I think you will find Dr. Borne’s post an entertaining introduction.
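For a rough feel of the "simple counting" the excerpt describes, here is a minimal Python sketch. The transactions are invented for illustration and are not the data from Dr. Borne's post. The function name `pair_rules` is likewise hypothetical. It computes the three usual measures for a rule such as diapers → beer: support (how often the pair occurs), confidence (how often beer appears given diapers), and lift (confidence relative to how often beer appears at all, so values above 1 suggest the pair occurs more often than chance would predict).

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactions, invented for illustration only.
transactions = [
    {"diapers", "beer", "chips"},
    {"diapers", "beer"},
    {"diapers", "milk"},
    {"beer", "chips"},
    {"milk", "bread"},
]


def pair_rules(transactions):
    """Return {(lhs, rhs): (support, confidence, lift)} for every item pair."""
    n = len(transactions)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in transactions:
        # Count each item once per basket, and each unordered pair once.
        item_counts.update(basket)
        pair_counts.update(combinations(sorted(basket), 2))

    rules = {}
    for (a, b), count in pair_counts.items():
        support = count / n
        # Each pair yields two directed rules: a -> b and b -> a.
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]
            lift = confidence / (item_counts[rhs] / n)
            rules[(lhs, rhs)] = (support, confidence, lift)
    return rules


rules = pair_rules(transactions)
support, confidence, lift = rules[("diapers", "beer")]
print(f"diapers -> beer: support={support:.2f}, "
      f"confidence={confidence:.2f}, lift={lift:.2f}")
```

With these made-up baskets the lift for diapers → beer comes out slightly above 1, which is the kind of signal a retailer would then want to test rather than explain.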

I would not go quite as far as Dr. Borne with “explanations” for the pop-tart purchases before hurricanes. For retail purposes, so long as we spot the pattern, they could be building dikes out of them. The same is the case for other purchases. Take advantage of the patterns and try to avoid second-guessing consumers. You can read more about testing patterns in Selling Blue Elephants.

Enjoy!

March 1, 2013

Incremental association rule mining: a survey

Filed under: Association Rule Mining,Machine Learning — Patrick Durusau @ 5:33 pm

Incremental association rule mining: a survey by B. Nath, D. K. Bhattacharyya, A. Ghosh. (WIREs Data Mining Knowl Discov 2013. doi: 10.1002/widm.1086)

Abstract:

Association rule mining is a computationally expensive task. Despite the huge processing cost, it has gained tremendous popularity due to the usefulness of association rules. Several efficient algorithms can be found in the literature. This paper provides a comprehensive survey on the state-of-the-art algorithms for association rule mining, specially when the data sets used for rule mining are not static. Addition of new data to a data set may lead to additional rules or to the modification of existing rules. Finding the association rules from the whole data set may lead to significant waste of time if the process has started from the scratch. Several algorithms have been evolved to attend this important issue of the association rule mining problem. This paper analyzes some of them to tackle the incremental association rule mining problem.

I am not suggesting that it is always a good idea to model association rules as “associations” in the topic map sense, but it is an important area of data mining.
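To make the incremental problem in the abstract concrete, here is a naive Python sketch that keeps running itemset counts and folds in each new batch instead of rescanning the whole data set. It is a baseline under my own assumptions, not one of the algorithms the survey covers: the class name `IncrementalItemsetCounter`, the `max_size` limit, and the sample baskets are all hypothetical, and storing a count for every candidate itemset would not scale the way a real incremental algorithm must.

```python
from collections import Counter
from itertools import combinations


class IncrementalItemsetCounter:
    """Keeps running counts of itemsets (up to max_size items) across batches."""

    def __init__(self, max_size=2):
        self.max_size = max_size
        self.n_transactions = 0
        self.counts = Counter()

    def add_batch(self, transactions):
        """Fold a new batch into the existing counts without a full rescan."""
        for basket in transactions:
            self.n_transactions += 1
            items = sorted(set(basket))
            for size in range(1, self.max_size + 1):
                self.counts.update(combinations(items, size))

    def frequent(self, min_support):
        """Return itemsets whose support is at least min_support (0 to 1)."""
        threshold = min_support * self.n_transactions
        return {s: c for s, c in self.counts.items() if c >= threshold}


counter = IncrementalItemsetCounter()
counter.add_batch([{"beer", "diapers"}, {"beer", "chips"}])
first = counter.frequent(1.0)

# New data arrives: only the new baskets are scanned.
counter.add_batch([{"diapers", "milk"}, {"beer", "diapers"}])
second = counter.frequent(0.5)
print(first)
print(second)
```

Note how the set of frequent itemsets changes after the second batch, which is the effect the abstract describes: new data can add rules or modify existing ones.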

The paper provides:

  • a taxonomy on the existing frequent itemset generation techniques and an analysis of their pros and cons,
  • a comprehensive review on the existing static and incremental rule generation techniques and their pros and cons, and
  • identification of several important issues and research challenges.

Some thirteen (13) pages and sixty-six (66) citations to the literature, so it is a good starting point for research in this area.

If you need a more basic starting point, consider: Association rule learning (Wikipedia).
