A Research Using Correlation Coefficient to Make Bayesian Classification Data Mining

Abstract:

In traditional Bayesian classification data mining methods, predictions can be unreliable because the selected predictors are only weakly related, or not related at all, to the target factor. This paper uses the correlation coefficient to analyze the correlation between the predictors and the target factor within the Bayesian classification model, and combines this analysis with the Hadoop Distributed File System and parallel programming models to develop an improved algorithm. The experiments show that this method not only makes predictions more reliable but also saves resources and greatly improves the efficiency of the algorithm. It is also suitable for massive data processing.
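The idea in the abstract, screening predictors by their correlation with the target before Bayesian classification, can be illustrated with a minimal single-machine sketch. It is not the paper's algorithm: the preview does not give the actual procedure, so the Pearson coefficient, the cutoff `threshold=0.3`, the binary features, Laplace smoothing, and all function names here are assumptions chosen for illustration. The Hadoop parallelization is not shown.

```python
import math
from collections import Counter, defaultdict


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0  # a constant column carries no linear information
    return cov / (sx * sy)


def select_predictors(rows, labels, threshold=0.3):
    """Keep feature indices whose |r| with the target reaches the cutoff.

    The threshold value is a hypothetical choice, not taken from the paper.
    """
    keep = []
    for j in range(len(rows[0])):
        column = [row[j] for row in rows]
        if abs(pearson(column, labels)) >= threshold:
            keep.append(j)
    return keep


def train_naive_bayes(rows, labels, keep):
    """Count class frequencies and per-class feature values (kept features only)."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (class, feature index) -> value counts
    for row, y in zip(rows, labels):
        for j in keep:
            value_counts[(y, j)][row[j]] += 1
    return class_counts, value_counts


def predict(model, row, keep, n_values=2):
    """Return the most probable class, using Laplace-smoothed log-probabilities."""
    class_counts, value_counts = model
    total = sum(class_counts.values())
    best_class, best_score = None, -math.inf
    for y, count_y in class_counts.items():
        score = math.log(count_y / total)
        for j in keep:
            seen = value_counts[(y, j)][row[j]]
            score += math.log((seen + 1) / (count_y + n_values))
        if score > best_score:
            best_class, best_score = y, score
    return best_class


# Toy data: feature 0 equals the label, feature 1 is uncorrelated noise.
rows = [[1, 0], [1, 1], [0, 0], [0, 1], [1, 0], [0, 1], [1, 1], [0, 0]]
labels = [1, 1, 0, 0, 1, 0, 1, 0]

keep = select_predictors(rows, labels)          # only feature 0 survives
model = train_naive_bayes(rows, labels, keep)
prediction = predict(model, [1, 0], keep)
```

On this toy data the noise column is dropped before training, so the classifier counts values for only one feature. That is the kind of saving the abstract attributes to the screening step, although the size of the effect on real data would depend on the dataset and the cutoff.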

Info:

Pages: 18-22

Online since: September 2014

Copyright: © 2014 Trans Tech Publications Ltd. All Rights Reserved
