A Method of Automatic Feature Selection Based on Mutual Information Grouping and Clustering


Abstract:

In massive data sets, large numbers of irrelevant and redundant features can degrade the performance of data classification. To address this problem, a method of automatic feature selection based on mutual information and a fuzzy clustering algorithm is proposed. The method proceeds in two stages. First, feature correlations are computed from mutual information, and features are grouped according to maximum correlation. Second, within each group, the fuzzy c-means clustering algorithm automatically determines the optimal number of features and compresses the feature dimensionality. Theoretical analysis and experiments indicate that the method achieves higher efficiency in data classification.
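The two-stage idea in the abstract can be sketched in code. The following is a minimal illustration under stated assumptions, not the authors' implementation: pairwise mutual information is estimated by equal-width discretization, group seeds are the features most relevant to the class label, and a simple maximum-relevance pick stands in for the paper's fuzzy c-means stage. All function names and parameters are illustrative.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.metrics import mutual_info_score
from sklearn.feature_selection import mutual_info_classif

def pairwise_mi(X, n_bins=10):
    """Pairwise mutual information between features, estimated by
    discretizing each feature into equal-width bins."""
    n_features = X.shape[1]
    binned = np.stack(
        [np.digitize(X[:, j],
                     np.histogram_bin_edges(X[:, j], bins=n_bins)[1:-1])
         for j in range(n_features)], axis=1)
    mi = np.zeros((n_features, n_features))
    for i in range(n_features):
        for j in range(n_features):
            mi[i, j] = mutual_info_score(binned[:, i], binned[:, j])
    return mi

def select_features(X, y, n_groups=2, seed=0):
    """Stage 1: group features by maximum mutual information with
    group seeds. Stage 2 (stand-in): keep the most class-relevant
    feature per group; the paper uses fuzzy c-means here instead."""
    relevance = mutual_info_classif(X, y, random_state=seed)
    mi = pairwise_mi(X)
    # Seeds: the n_groups features most relevant to the class label.
    seeds = np.argsort(relevance)[::-1][:n_groups]
    # Assign every feature to the seed it shares the most MI with.
    groups = {s: [] for s in seeds}
    for j in range(X.shape[1]):
        best = seeds[np.argmax(mi[j, seeds])]
        groups[best].append(j)
    # One representative per group: maximum relevance (FCM stand-in).
    return sorted(max(g, key=lambda j: relevance[j])
                  for g in groups.values())

X, y = load_iris(return_X_y=True)
print(select_features(X, y, n_groups=2))
```

On the Iris data this keeps one representative per group, so the four original features are compressed to two, mirroring the dimensionality reduction the paper attributes to within-group clustering.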


Pages:

1613-1618

Online since:

March 2014

Copyright:

© 2014 Trans Tech Publications Ltd. All Rights Reserved

