The Nearest Neighbor Algorithm of Filling Missing Data Based on Cluster Analysis

Abstract:

Missing data are common across research fields and degrade both the performance and the results of computation. To improve the accuracy of filling in missing data, this paper proposes a nearest-neighbor filling algorithm based on cluster analysis. After clustering the data, the algorithm assigns weights according to the resulting categories and improves on the calculation formula and the fill-value computation of the MGNN (Mahalanobis-Gray and Nearest Neighbor) algorithm. The experimental results show that the method fills missing values more accurately than both the traditional KNN algorithm and the MGNN algorithm.
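The cluster-then-nearest-neighbor idea described in the abstract can be sketched roughly as follows. This is not the paper's method: the abstract gives neither the MGNN distance nor the weighting formula, so plain Euclidean distance over the observed attributes stands in for the Mahalanobis-gray measure, and the names `kmeans`, `cluster_knn_fill`, and `same_cluster_weight` are hypothetical choices made only for illustration.

```python
import numpy as np


def kmeans(X, n_clusters, n_iter=20, seed=0):
    """Plain Lloyd's k-means on complete rows; returns (centroids, labels)."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=n_clusters, replace=False)].copy()
    for _ in range(n_iter):
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for c in range(n_clusters):
            if np.any(labels == c):  # leave an empty cluster's centroid as is
                centroids[c] = X[labels == c].mean(axis=0)
    # Final assignment against the final centroids.
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return centroids, dists.argmin(axis=1)


def cluster_knn_fill(X, n_clusters=2, k=3, same_cluster_weight=2.0, seed=0):
    """Fill NaNs with a weighted mean of the k nearest complete rows.

    Assumption (not from the paper): neighbors in the same cluster as the
    incomplete row have their inverse-distance weight multiplied by
    `same_cluster_weight`.
    """
    X = np.asarray(X, dtype=float)
    filled = X.copy()
    complete = ~np.isnan(X).any(axis=1)
    donors = X[complete]  # rows with no missing values
    centroids, donor_labels = kmeans(donors, n_clusters, seed=seed)

    for i in np.flatnonzero(~complete):
        obs = ~np.isnan(X[i])  # attributes observed in this row
        # Assign the incomplete row to a cluster using observed attributes only.
        row_cluster = np.linalg.norm(centroids[:, obs] - X[i, obs], axis=1).argmin()
        # Euclidean distance to every donor row on the observed attributes.
        d = np.linalg.norm(donors[:, obs] - X[i, obs], axis=1)
        nearest = np.argsort(d)[:k]
        w = 1.0 / (d[nearest] + 1e-9)  # inverse-distance weights
        w = np.where(donor_labels[nearest] == row_cluster, w * same_cluster_weight, w)
        filled[i, ~obs] = (w[:, None] * donors[nearest][:, ~obs]).sum(axis=0) / w.sum()
    return filled


# Two well-separated groups; the last two rows each miss their second attribute.
data = np.array([
    [1.0, 1.0], [1.2, 0.9], [0.9, 1.1],
    [10.0, 10.0], [10.2, 9.8], [9.9, 10.1],
    [1.1, np.nan], [10.1, np.nan],
])
filled = cluster_knn_fill(data, n_clusters=2, k=3)
```

In this toy data each incomplete row draws its three neighbors from its own group, so the filled value stays within that group's range. A closer reproduction of the paper would replace the Euclidean distance and the weighting rule with the formulas given in the full text.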

Info:

Pages:

2324-2328

Online since:

August 2013

Copyright:

© 2013 Trans Tech Publications Ltd. All Rights Reserved
