Incorporate Syntactic Information for Short Text Classification

Abstract:

As the volume of short text documents on the Internet grows rapidly, organizing them well becomes increasingly urgent. However, traditional feature selection methods are not well suited to short texts. In this paper, we propose a method that incorporates syntactic information into short text classification. It emphasizes features that have more dependency relations with other words. Our experiments use an SVM classifier and the Weka machine learning environment. The experimental results show that incorporating syntactic information yields more powerful features than traditional feature selection methods such as document frequency (DF) and the chi-square statistic (CHI). The precision of short text classification improved from 86.2% to 90.8%.
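As an illustration only, the idea of emphasizing words that take part in more dependency relations can be sketched as below. The abstract does not give the paper's actual weighting formula, so the scheme used here (term count plus the number of dependency arcs touching each token) is an assumption, as are the function name `dependency_weighted_features` and the hand-written dependency arcs, which stand in for the output of a real parser.

```python
from collections import Counter


def dependency_weighted_features(tokens, arcs):
    """Weight each term by 1 plus the number of dependency arcs it takes part in.

    tokens: list of words in one short text.
    arcs:   list of (head_index, dependent_index) pairs (hypothetical parser output).
    This weighting is an illustrative assumption, not the formula from the paper.
    """
    # Count how many arcs touch each token position, as head or as dependent.
    degree = Counter()
    for head, dependent in arcs:
        degree[head] += 1
        degree[dependent] += 1

    # Accumulate a weight per lower-cased term: base count of 1 plus its degree.
    weights = Counter()
    for index, token in enumerate(tokens):
        weights[token.lower()] += 1 + degree[index]
    return dict(weights)


# Hand-written parse of "cheap flights to Paris":
# flights -> cheap, flights -> Paris, Paris -> to
tokens = ["cheap", "flights", "to", "Paris"]
arcs = [(1, 0), (1, 3), (3, 2)]
features = dependency_weighted_features(tokens, arcs)
print(features)
```

In this toy parse, "flights" and "Paris" each take part in two relations and so receive a higher weight than "cheap" and "to". Such weights could then replace raw term counts in the feature vector passed to a classifier.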

Info:

Periodical:

Advanced Materials Research (Volumes 268-270)

Pages:

697-700

Online since:

July 2011

Copyright:

© 2011 Trans Tech Publications Ltd. All Rights Reserved
