Research on an Improved CHI Feature Selection Method

Abstract:

By analyzing the relevance between features and classes, three adjustment parameters are added to the traditional CHI-square feature selection method so that the selected features are concentrated in a particular class, distributed as evenly as possible within that class, and appear in that class as often as possible. Experiments comparing the traditional CHI-square feature selection method with the improved one show that the variance-based Var-CHI statistic method noticeably improves precision and recall.
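
For orientation, the traditional CHI-square score that the abstract takes as its baseline can be sketched as follows. This is a minimal illustration of the standard statistic for one term and one class, not the paper's Var-CHI method: the abstract does not define the three adjustment parameters, so they are not shown. The function name `chi_square` and the argument names `a`, `b`, `c`, `d` are assumptions chosen for this example.

```python
def chi_square(a: int, b: int, c: int, d: int) -> float:
    """Traditional chi-square score of a term for a class (illustrative sketch).

    a: documents in the class that contain the term
    b: documents outside the class that contain the term
    c: documents in the class that do not contain the term
    d: documents outside the class that do not contain the term
    """
    n = a + b + c + d  # total number of documents
    denominator = (a + c) * (b + d) * (a + b) * (c + d)
    if denominator == 0:
        # A degenerate table (for example, a term that never occurs)
        # carries no evidence, so this sketch scores it as zero.
        return 0.0
    return n * (a * d - c * b) ** 2 / denominator
```

A term that occurs only inside the class and in every document of that class receives the highest score for a given corpus size, while a term spread equally across both groups scores zero.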

Info:

Pages:

2841-2844

Online since:

December 2012

Copyright:

© 2013 Trans Tech Publications Ltd. All Rights Reserved
