[Abstract]

Tightness-based Clustering of Association Rules: An Aid to Chance Discovery in Data Mining

Rajesh Natarajan (1) and Balasubramaniam Shekar (2)
(1) Information Technology & Systems Group, Indian Institute of Management Lucknow, India
(2) Quantitative Methods and Information Systems Area, Indian Institute of Management Bangalore, India



In this paper, we present an approach to mitigating the 'rule immensity' problem and the resulting 'understandability' problem in association rule (AR) mining. Clustering 'similar' rules facilitates the exploration of connections among rules and the discovery of their underlying structure. This, in turn, promotes the proactive discovery of chance events in a retail market. We first introduce the notion of the 'tightness' of an AR, which reveals the strength of binding among the items present in the rule, and we elaborate on its usefulness in the retail market-basket context. After providing the intuition, we develop a distance function based on 'tightness'. This distance function forms the basis for clustering ARs. The average linkage method is used to cluster ARs obtained from a small artificial dataset. The clusters thus obtained are compared with those obtained by running a standard method (from recent data mining literature) on the same dataset.
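As a rough illustration of the clustering step only, the following minimal sketch applies average linkage to a precomputed distance matrix. The function name `average_linkage` and the four-rule matrix `DIST` are hypothetical. The distance values are made up for the example and are not derived from the tightness measure, which the abstract does not define.

```python
from itertools import combinations


def average_linkage(dist, n_clusters):
    """Agglomerative clustering with average linkage.

    dist is a symmetric matrix of pairwise distances between rules;
    merging stops once n_clusters clusters remain.
    """
    # Start with every rule in its own cluster.
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > n_clusters:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            # Average linkage: mean distance over all cross-cluster pairs.
            pairs = [(i, j) for i in clusters[a] for j in clusters[b]]
            avg = sum(dist[i][j] for i, j in pairs) / len(pairs)
            if best is None or avg < best[0]:
                best = (avg, a, b)
        _, a, b = best
        # Merge the closest pair; b > a, so deleting b leaves index a valid.
        clusters[a] = sorted(clusters[a] + clusters[b])
        del clusters[b]
    return clusters


# Hypothetical distances between four rules (illustrative values only).
DIST = [
    [0.0, 0.1, 0.8, 0.9],
    [0.1, 0.0, 0.7, 0.8],
    [0.8, 0.7, 0.0, 0.2],
    [0.9, 0.8, 0.2, 0.0],
]

print(average_linkage(DIST, 2))
```

Any distance function that yields a symmetric matrix could be substituted for `DIST`; the sketch is agnostic to how the distances are computed.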