A Fast Algorithm for Chinese Text Categorization Based on Key Tree

Xin Liu; Ren Ren Liu; Wen Jing He

doi:10.4028/www.scientific.net/AMM.58-60.1106


Abstract:

A fast algorithm is proposed for Chinese text categorization. The algorithm first builds a dictionary of weighted keywords, stored as a key tree. It then uses a hash function together with a longest-match-first principle to map the strings in a document onto the dictionary. Next, it sums the weights of the keywords that were matched successfully, and finally takes the category with the maximum sum as the classification result. The algorithm thereby avoids the difficulty of Chinese word segmentation and its effect on the accuracy of the result. Theoretical analysis and experimental results indicate that the algorithm achieves relatively high accuracy and time efficiency, and that its overall performance reaches the level of current mainstream techniques.
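The steps named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the dictionary layout (keyword mapped to per-category weights), the names `KeyTreeNode`, `build_key_tree`, and `classify`, the toy keywords and weights, and the use of a Python `dict` as a stand-in for the hash function are all assumptions made for illustration. The abstract also does not specify how unmatched characters are handled; this sketch simply skips them.

```python
from collections import defaultdict


class KeyTreeNode:
    """One node of a key tree (trie) over characters."""

    def __init__(self):
        self.children = {}   # assumption: dict hashing stands in for the paper's hash function
        self.weights = None  # {category: weight} if a keyword ends at this node


def build_key_tree(dictionary):
    """Build a key tree from a hypothetical {keyword: {category: weight}} mapping."""
    root = KeyTreeNode()
    for keyword, weights in dictionary.items():
        node = root
        for ch in keyword:
            node = node.children.setdefault(ch, KeyTreeNode())
        node.weights = weights
    return root


def classify(text, root):
    """Scan the raw text, keep the longest keyword at each position,
    sum matched weights per category, and return the top category."""
    scores = defaultdict(float)
    i = 0
    while i < len(text):
        node, j = root, i
        best_end, best_weights = None, None
        # Walk as deep as the text allows, remembering the longest keyword seen.
        while j < len(text) and text[j] in node.children:
            node = node.children[text[j]]
            j += 1
            if node.weights is not None:
                best_end, best_weights = j, node.weights
        if best_weights is None:
            i += 1  # no keyword starts here; skip one character (assumption)
        else:
            for category, weight in best_weights.items():
                scores[category] += weight
            i = best_end  # resume after the matched keyword
    return max(scores, key=scores.get) if scores else None


# Toy data, invented for this example only.
toy_dictionary = {
    "足球": {"sports": 1.0},
    "足球比赛": {"sports": 2.0},
    "比赛": {"sports": 0.5},
    "股票": {"finance": 1.5},
}
root = build_key_tree(toy_dictionary)
print(classify("昨天的足球比赛很精彩", root))  # prints: sports
```

Because matching runs directly over the character stream, no separate word segmentation pass is needed; in the example, "足球比赛" is matched as a whole rather than as "足球" followed by "比赛", which is the longest-match-first behaviour the abstract describes.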

Info:

Periodical:

Applied Mechanics and Materials (Volumes 58-60)

Pages:

1106-1112

DOI:

https://doi.org/10.4028/www.scientific.net/AMM.58-60.1106

Online since:

June 2011

Authors:

Xin Liu, Ren Ren Liu, Wen Jing He

Keywords:

Gain Weight, Hash Function, Key Tree, Text Categorization
