Hierarchical Classification Methods of Chinese Scientific Papers Based on Extracting Key Words

Abstract:

In recent years, automatic text classification has been studied extensively and has progressed rapidly; it is one of the hotspots and key techniques in information retrieval and data mining. Feature extraction and the classification algorithm are the crucial technologies for this problem. This paper first proposes a feature extraction algorithm based on key words: the algorithm selects a key word set from specific parts of scientific papers and employs mutual information to extract features. It then proposes an improved hierarchical classification method and realizes hierarchical classification of Chinese scientific papers.
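As an illustration of the mutual-information step mentioned in the abstract, the following is a minimal sketch, not the authors' algorithm. It assumes each paper has already been reduced to a list of key words plus a class label, and it uses the standard document-count estimate MI(t, c) = log(P(t, c) / (P(t) P(c))). The function names `mutual_information` and `select_features` and the toy data are hypothetical.

```python
import math
from collections import Counter


def mutual_information(docs):
    """Score each (term, label) pair seen in docs.

    docs is a list of (keywords, label) pairs; this input format is an
    assumption for the sketch. Probabilities are estimated from document
    counts, so MI = log(joint * N / (term_count * class_count)).
    """
    n_docs = len(docs)
    class_count = Counter(label for _, label in docs)
    term_count = Counter()
    joint_count = Counter()
    for keywords, label in docs:
        for term in set(keywords):  # count each term once per document
            term_count[term] += 1
            joint_count[(term, label)] += 1
    return {
        (term, label): math.log(
            joint * n_docs / (term_count[term] * class_count[label])
        )
        for (term, label), joint in joint_count.items()
    }


def select_features(docs, top_k):
    """Keep the top_k terms ranked by their best MI score over all classes."""
    best = {}
    for (term, _), score in mutual_information(docs).items():
        best[term] = max(best.get(term, float("-inf")), score)
    # Ties are broken alphabetically so the result is deterministic.
    return sorted(best, key=lambda t: (-best[t], t))[:top_k]


# Toy data (hypothetical): key words per paper and the paper's class.
docs = [
    (["neural", "network", "training"], "computer science"),
    (["network", "protocol"], "computer science"),
    (["protein", "folding"], "biology"),
    (["protein", "network"], "biology"),
]
features = select_features(docs, top_k=5)
```

In this toy data, "network" appears in both classes and therefore scores lower than the class-specific terms, so it is the one term left out when five features are kept.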

Info:

Periodical:

Pages:

1006-1011

Online since:

November 2010

Authors:

Copyright:

© 2011 Trans Tech Publications Ltd. All Rights Reserved
