Text Classification Combined an Improved CHI and Category Relevance Factor

Pei Ying Zhang

doi:10.4028/www.scientific.net/AMR.524-527.3866


Abstract:

Text classification is the task of assigning natural-language documents to predefined categories based on their content. This paper aims to improve the accuracy of a text classification system by combining an improved CHI method with a category relevance factor. First, an improved CHI method selects features from the raw feature set in order to reduce its dimensionality. Second, the TF-CRF method computes the feature weights; it takes into account that features are distributed differently across categories. Finally, a series of experiments compares the approach with other methods using the F1-measure. The experimental results show that the new method yields a notable improvement in all categories.
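The abstract does not give the formulas for the improved CHI method or for TF-CRF, so the following Python sketch only illustrates the standard building blocks the pipeline rests on: the classic chi-square (CHI) term-category score, a log-ratio category relevance factor, and the F1-measure. The function names, the contingency-table arguments `a`, `b`, `c`, `d`, and the add-one smoothing are assumptions made for illustration, not the paper's definitions.

```python
import math


def chi_square(a, b, c, d):
    """Classic CHI score of a term for one category (not the paper's improved variant).

    a: documents in the category that contain the term
    b: documents outside the category that contain the term
    c: documents in the category that lack the term
    d: documents outside the category that lack the term
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        return 0.0  # degenerate table: the term or the category is empty or universal
    return n * (a * d - c * b) ** 2 / denom


def category_relevance(a, b, c, d):
    """Assumed log-ratio relevance factor: how much more often the term occurs
    inside the category than outside it. Add-one smoothing (an assumption)
    keeps the ratio finite when a count is zero."""
    in_rate = (a + 1) / (a + c + 2)
    out_rate = (b + 1) / (b + d + 2)
    return math.log(in_rate / out_rate)


def tf_crf_weight(term_frequency, a, b, c, d):
    """Hypothetical TF-CRF style weight: term frequency scaled by the
    category relevance factor defined above."""
    return term_frequency * category_relevance(a, b, c, d)


def f1_measure(tp, fp, fn):
    """F1-measure: harmonic mean of precision and recall for one category."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # A term that appears in every in-category document and nowhere else.
    print(chi_square(10, 0, 0, 10))
    print(tf_crf_weight(3, 10, 0, 0, 10))
    print(f1_measure(8, 2, 2))
```

A term whose presence is independent of the category scores zero under both `chi_square` and `category_relevance`, which is why such terms are dropped during selection and contribute no weight afterwards.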


Info:

Periodical:

Advanced Materials Research (Volumes 524-527)

Pages:

3866-3869

DOI:

https://doi.org/10.4028/www.scientific.net/AMR.524-527.3866


Online since:

May 2012

Authors:

Pei Ying Zhang

Keywords:

An Improved CHI Method, Category Relevance Factor, Feature Selection Method, Support Vector Machine (SVM), Text Classification

