# Association Analysis in Data Mining

Data Mining | Association Analysis: In this tutorial, we will learn about association rule mining, also called association analysis, in data mining.
By **Palkesh Jain** Last updated : April 17, 2023

## What is Association Analysis?

Association analysis is widely used to discover hidden patterns in large data sets. The relationships it uncovers can be represented as **association rules** or as **sets of frequent items**. In other words, association analysis is the task of identifying interesting relationships in large databases. These relationships come in two forms: frequent itemsets and association rules. Frequent itemsets are collections of items that frequently occur together. Association rules express these relationships as implications, indicating that a strong relationship exists between two or more items.

Fig: Market Basket Analysis

## Market Basket Analysis

In transactional data, each case (transaction) is associated with a set of items. In principle, that set could include any item in the collection. In practice, only a small subset of all possible items appears in any given transaction: a market basket contains only a small fraction of the items available for sale in the shop.

Market basket analysis is a common example of frequent pattern (itemset) mining for association rules. It analyzes the purchasing patterns of consumers by identifying associations among the items that customers place in their shopping baskets.

An example of an association rule: {Milk} → {Bread}

If a shopkeeper sells milk, there is a good chance of also selling bread, because a customer who buys milk may also buy bread. This suggests that milk and bread are associated with one another.
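The milk-and-bread observation amounts to counting how often items appear together in the same basket. The following is a minimal sketch of that counting step, not a full mining algorithm. It uses the five transactions from the table of market basket transactions given later in this tutorial; the threshold `MIN_COUNT` and the variable names are assumptions chosen for illustration.

```python
from collections import Counter
from itertools import combinations

# Transactions copied from the market basket table in this tutorial.
transactions = [
    {"Bread", "Milk"},
    {"Bread", "Diaper", "Beer", "Eggs"},
    {"Milk", "Diaper", "Beer", "Coke"},
    {"Bread", "Milk", "Diaper", "Beer"},
    {"Bread", "Milk", "Diaper", "Coke"},
]

MIN_COUNT = 3  # assumed threshold, chosen only for illustration

pair_counts = Counter()
for basket in transactions:
    # sorted() gives each pair one canonical order, e.g. ("Bread", "Milk")
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

# Keep only the pairs that occur together in at least MIN_COUNT baskets.
frequent_pairs = {pair: n for pair, n in pair_counts.items() if n >= MIN_COUNT}

for pair, n in sorted(frequent_pairs.items()):
    print(pair, n)
```

With this threshold, Bread and Milk appear together in three of the five baskets, so the pair counts as frequent. Enumerating every pair is only practical for small item collections.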

## Association Rule

An association rule is an implication expression of the form $X \rightarrow Y$, where $X$ and $Y$ are disjoint itemsets ($X \cap Y = \emptyset$).

The strength of an association rule can be measured in terms of its support and confidence. A rule that has very low support may occur purely by chance. Confidence measures the reliability of the inference made by a rule.

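**Support** of an association rule $X \rightarrow Y$ is the fraction of transactions that contain both $X$ and $Y$:

$$s(X \rightarrow Y) = \frac{\sigma(X \cup Y)}{N}$$

where $\sigma(X)$ is the support count of $X$ (the number of transactions that contain $X$) and $N$ is the number of transactions in the transaction set $T$.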
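The support formula can be sketched in a few lines of Python. This is an illustrative sketch, not a reference implementation: the function names `support_count` and `support` are assumptions, and the data is the table of market basket transactions given later in this tutorial.

```python
# Transactions copied from the market basket table in this tutorial.
transactions = [
    {"Bread", "Milk"},
    {"Bread", "Diaper", "Beer", "Eggs"},
    {"Milk", "Diaper", "Beer", "Coke"},
    {"Bread", "Milk", "Diaper", "Beer"},
    {"Bread", "Milk", "Diaper", "Coke"},
]

def support_count(itemset, transactions):
    """sigma(itemset): number of transactions containing every item in itemset."""
    return sum(1 for t in transactions if itemset <= t)

def support(x, y, transactions):
    """s(X -> Y) = sigma(X ∪ Y) / N."""
    return support_count(x | y, transactions) / len(transactions)

print(support({"Bread"}, {"Milk"}, transactions))  # 0.6
print(support({"Eggs"}, {"Coke"}, transactions))   # 0.0
```

The subset test `itemset <= t` checks that a transaction contains all items of the itemset, which is what the support count requires.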

**Confidence** of an association rule $X \rightarrow Y$ measures how often the items in $Y$ appear in transactions that contain $X$:

$$\mathrm{conf}(X \rightarrow Y) = \frac{\sigma(X \cup Y)}{\sigma(X)}$$

where $\sigma(X)$ is the support count of $X$.
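A short sketch may help show that confidence depends on the direction of the rule. The function names below are assumptions made for illustration, and the data is again the table of market basket transactions given later in this tutorial.

```python
# Transactions copied from the market basket table in this tutorial.
transactions = [
    {"Bread", "Milk"},
    {"Bread", "Diaper", "Beer", "Eggs"},
    {"Milk", "Diaper", "Beer", "Coke"},
    {"Bread", "Milk", "Diaper", "Beer"},
    {"Bread", "Milk", "Diaper", "Coke"},
]

def support_count(itemset, transactions):
    """Number of transactions that contain every item in itemset."""
    return sum(1 for t in transactions if itemset <= t)

def confidence(x, y, transactions):
    """conf(X -> Y) = sigma(X ∪ Y) / sigma(X)."""
    sigma_x = support_count(x, transactions)
    if sigma_x == 0:
        raise ValueError("X never occurs, so confidence is undefined")
    return support_count(x | y, transactions) / sigma_x

print(confidence({"Diaper"}, {"Beer"}, transactions))  # 0.75
print(confidence({"Beer"}, {"Diaper"}, transactions))  # 1.0
```

Every basket with Beer also contains Diaper, but only three of the four Diaper baskets contain Beer, so the two directions give different confidence values.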

**Interest** of an association rule $X \rightarrow Y$:

$$I(X \rightarrow Y) = \frac{P(X, Y)}{P(X) \times P(Y)}$$

where $P(Y) = s(Y)$ is the support of $Y$ (the fraction of baskets that contain $Y$), and likewise for $P(X)$ and $P(X, Y)$.

If the interest of a rule is close to 1, then the rule is uninteresting:

- $I(X \rightarrow Y) = 1$: $X$ and $Y$ are independent
- $I(X \rightarrow Y) > 1$: $X$ and $Y$ are positively correlated
- $I(X \rightarrow Y) < 1$: $X$ and $Y$ are negatively correlated
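The three cases can be sketched as a small Python helper. The names `interest` and `interpret` are assumptions made for illustration. The first call uses fractions read off the table of market basket transactions given later in this tutorial (Bread in 4 of 5 baskets, Milk in 4 of 5, both in 3 of 5); the second call uses made-up probabilities to show the independent case.

```python
from fractions import Fraction

def interest(p_xy, p_x, p_y):
    """I(X -> Y) = P(X, Y) / (P(X) * P(Y))."""
    return p_xy / (p_x * p_y)

def interpret(value):
    """Map an interest value to the three cases from the text."""
    if value > 1:
        return "positively correlated"
    if value < 1:
        return "negatively correlated"
    return "independent"

# {Bread} -> {Milk}, with probabilities taken from the tutorial's table.
i = interest(Fraction(3, 5), Fraction(4, 5), Fraction(4, 5))
print(i, interpret(i))  # 15/16 negatively correlated

# Hypothetical probabilities where P(X, Y) equals P(X) * P(Y).
j = interest(Fraction(1, 4), Fraction(1, 2), Fraction(1, 2))
print(j, interpret(j))  # 1 independent
```

Exact fractions make the comparison with 1 reliable here. On real data an interest of exactly 1 is rare, so a tolerance around 1 would likely be a more practical test for "close to 1".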

For example, given a table of market basket transactions:

TID | Items |
---|---|
1 | {Bread, Milk} |
2 | {Bread, Diaper, Beer, Eggs} |
3 | {Milk, Diaper, Beer, Coke} |
4 | {Bread, Milk, Diaper, Beer} |
5 | {Bread, Milk, Diaper, Coke} |

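From this table, {Milk, Diaper} appears in three transactions, {Beer} appears in three transactions, and {Milk, Diaper, Beer} appears in two transactions. We can therefore compute:

$$s(\{\text{Milk}, \text{Diaper}\} \rightarrow \{\text{Beer}\}) = \frac{2}{5} = 0.4$$

$$\mathrm{conf}(\{\text{Milk}, \text{Diaper}\} \rightarrow \{\text{Beer}\}) = \frac{2}{3} \approx 0.67$$

$$I(\{\text{Milk}, \text{Diaper}\} \rightarrow \{\text{Beer}\}) = \frac{2/5}{3/5 \times 3/5} = \frac{10}{9} \approx 1.11$$

Because the interest is greater than 1, {Milk, Diaper} and {Beer} are positively correlated in this data.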
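The three values above can be checked with a short script. This is a sketch for verification only; the helper name `sigma` and the variable names are assumptions, and `fractions.Fraction` is used so that the results stay exact.

```python
from fractions import Fraction

# The five transactions from the table above.
transactions = [
    {"Bread", "Milk"},
    {"Bread", "Diaper", "Beer", "Eggs"},
    {"Milk", "Diaper", "Beer", "Coke"},
    {"Bread", "Milk", "Diaper", "Beer"},
    {"Bread", "Milk", "Diaper", "Coke"},
]

def sigma(itemset):
    """Support count: transactions that contain every item in itemset."""
    return sum(1 for t in transactions if itemset <= t)

N = len(transactions)
X = {"Milk", "Diaper"}
Y = {"Beer"}

s = Fraction(sigma(X | Y), N)                    # support
conf = Fraction(sigma(X | Y), sigma(X))          # confidence
interest = s / (Fraction(sigma(X), N) * Fraction(sigma(Y), N))

print(s, round(float(s), 2))                # 2/5 0.4
print(conf, round(float(conf), 2))          # 2/3 0.67
print(interest, round(float(interest), 2))  # 10/9 1.11
```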

**Reference:** Market Basket Analysis
