A Kernel Density Estimation Based Interestingness Measure for Association Rule Mining

Abstract:

Association rules provide a concise statement of potentially useful information and have been widely used in real applications. However, the usefulness of association rules depends heavily on the interestingness measure used to select interesting rules from millions of candidates. In this study, a probability analysis of association rules is conducted, and an interestingness measure based on discrete kernel density estimation is proposed accordingly. The proposed measure makes the most of the information contained in the data set and obtains a much lower false discovery rate than existing interestingness measures. Experimental results show the effectiveness of the proposed measure.
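
The abstract does not give the form of the estimator, so the following is only a minimal sketch of what a discrete kernel density estimate over binary transaction data can look like. It assumes the classical Aitchison and Aitken kernel for binary vectors, which may differ from the measure proposed in the article. The function name `aitchison_aitken_density`, the smoothing parameter `lam`, and the toy data are all hypothetical.

```python
from itertools import product


def aitchison_aitken_density(x, data, lam=0.9):
    """Kernel estimate of P(x) for a binary vector x (illustrative only).

    Assumption: Aitchison-Aitken kernel. Each observed row contributes
    lam ** (d - m) * (1 - lam) ** m, where d is the vector length and m is
    the number of positions in which the row differs from x.
    """
    if not 0.5 <= lam <= 1.0:
        raise ValueError("lam must lie in [0.5, 1.0]")
    d = len(x)
    total = 0.0
    for row in data:
        mismatches = sum(a != b for a, b in zip(x, row))
        total += lam ** (d - mismatches) * (1.0 - lam) ** mismatches
    return total / len(data)


# Hypothetical toy data: each tuple marks presence (1) or absence (0) of three items.
data = [(1, 0, 1), (1, 1, 1), (0, 0, 1), (1, 0, 1)]

# lam = 1.0 gives the plain empirical frequency; lam < 1.0 moves some
# probability mass to nearby item patterns, including ones never observed.
empirical = aitchison_aitken_density((1, 0, 1), data, lam=1.0)
smoothed_unseen = aitchison_aitken_density((0, 1, 0), data, lam=0.9)

# The estimate is a proper distribution: it sums to 1 over all 2**d patterns.
total_mass = sum(aitchison_aitken_density(x, data) for x in product((0, 1), repeat=3))
```

With `lam` below 1, patterns that never occur in the data still receive a small nonzero probability. This is one way a measure can draw on more of the data set than raw support counts do.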

Info:

Pages: 389-394

Online since: January 2010

Copyright: © 2010 Trans Tech Publications Ltd. All Rights Reserved
