The Application of Semantic Similarity in Text Classification

Pei Ying Zhang

doi:10.4028/www.scientific.net/AMM.346.141

Abstract:

Text classification is a challenging problem that aims to automatically assign unlabeled documents to one or more predefined classes according to their contents. A major difficulty in text classification is the high dimensionality of the feature space. This paper proposes an approach based on the semantic similarity between title vectors and category vectors, using the tf*rf weighting method. Experiments show that a text classifier based on semantic similarity helps dimension-sensitive learning algorithms such as KNN to eliminate the “curse of dimensionality” and, as a result, yields an important improvement in all categories.
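The abstract does not spell out how the title and category vectors are built or compared, so the following is only a minimal sketch under stated assumptions, not the paper's implementation. It assumes the relevance-frequency factor as it is usually defined for tf*rf, rf = log2(2 + a / max(1, c)), where a and c count the in-category and out-of-category training documents containing the term. It also swaps in plain cosine similarity for the paper's semantic similarity measure. The toy training titles and all function names are hypothetical.

```python
import math
from collections import Counter

def rf(term, category, docs):
    """Relevance frequency: log2(2 + a / max(1, c)).
    a = in-category docs containing the term, c = out-of-category docs."""
    a = sum(1 for words, label in docs if label == category and term in words)
    c = sum(1 for words, label in docs if label != category and term in words)
    return math.log2(2 + a / max(1, c))

def tf_rf_vector(words, category, docs):
    """Sparse tf*rf vector of a title, weighted relative to one category."""
    tf = Counter(words)
    return {t: n * rf(t, category, docs) for t, n in tf.items()}

def category_vector(category, docs):
    """Assumption: a category vector is the sum of its titles' tf*rf vectors."""
    total = Counter()
    for words, label in docs:
        if label == category:
            for t, w in tf_rf_vector(words, category, docs).items():
                total[t] += w
    return dict(total)

def cosine(u, v):
    """Cosine similarity of two sparse vectors (stand-in for semantic similarity)."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def classify(title_words, docs):
    """Assign the category whose vector is most similar to the title vector."""
    categories = sorted({label for _, label in docs})
    scores = {
        c: cosine(tf_rf_vector(title_words, c, docs), category_vector(c, docs))
        for c in categories
    }
    return max(scores, key=scores.get), scores

# Hypothetical toy training set: (title tokens, category label).
TRAIN = [
    (["stock", "market", "falls"], "finance"),
    (["bank", "raises", "interest", "rates"], "finance"),
    (["team", "wins", "final", "match"], "sports"),
    (["coach", "praises", "team"], "sports"),
]

label, scores = classify(["market", "interest", "rises"], TRAIN)
print(label, scores)
```

Because each title is compared with one vector per category rather than with every training document in the full term space, the comparison happens in a space whose size is the number of categories, which is one plausible reading of how such a scheme sidesteps the high-dimensional feature space that hurts KNN.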

Info:

Periodical: Applied Mechanics and Materials (Volume 346)

Pages: 141-144

DOI: https://doi.org/10.4028/www.scientific.net/AMM.346.141

Online since: August 2013

Authors: Pei Ying Zhang

Keywords: K-Nearest Neighbor (KNN) Algorithm, Semantic Similarity, Semantic-Based, Text Classification
