A Text Categorization Method Based on SVM and Improved K-Means

Abstract:

Traditional supervised classification methods such as the support vector machine (SVM) can achieve high performance in text categorization. However, the training samples must be hand-labeled before classification, which is a time-consuming task. Unsupervised methods such as k-means can also be applied to text categorization, but traditional k-means is easily affected by isolated observations. In this paper, we propose a new text categorization method. First, we improve the traditional k-means clustering algorithm and use the improved version to cluster the vectors in our vector space model. We then use an SVM to categorize the vectors preprocessed by the improved k-means. Experiments show that our algorithm outperforms the traditional SVM text categorization method.
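
The abstract does not say how k-means is improved or how its output is passed to the SVM, so the following is only a minimal sketch of one possible reading. It assumes that isolated observations are handled by skipping points farther than `trim` times the median member distance during the centroid update, and that the cluster assignments of the remaining points serve as labels for a binary linear SVM. The names `trimmed_kmeans`, `train_linear_svm`, `predict`, the `trim` parameter, and the toy 2-D points (standing in for document vectors) are all hypothetical and not taken from the paper.

```python
import math
from statistics import median

def dist(a, b):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def trimmed_kmeans(points, init_centroids, trim=3.0, iters=10):
    """k-means whose centroid update skips isolated points (assumed variant).

    A point is treated as isolated when its distance to its centroid exceeds
    trim * (median distance of that cluster's members). Use trim >= 1.
    Returns (centroids, labels); label -1 marks an isolated point.
    """
    centroids = [list(c) for c in init_centroids]
    k = len(centroids)
    labels = []
    for _ in range(iters):
        # Assignment step: nearest centroid for every point.
        labels = [min(range(k), key=lambda j: dist(p, centroids[j]))
                  for p in points]
        for j in range(k):
            members = [i for i, lab in enumerate(labels) if lab == j]
            if not members:
                continue
            d = {i: dist(points[i], centroids[j]) for i in members}
            cutoff = trim * median(d.values())
            kept = [i for i in members if d[i] <= cutoff]
            for i in members:
                if d[i] > cutoff:
                    labels[i] = -1  # excluded from the update and from training
            # Update step: mean of the non-isolated members only.
            centroids[j] = [sum(col) / len(kept)
                            for col in zip(*(points[i] for i in kept))]
    return centroids, labels

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=50):
    """Binary linear SVM fitted by subgradient descent on the hinge loss."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            w = [wj * (1 - lr * lam) for wj in w]  # L2 regularization shrink
            if margin < 1:                         # hinge loss is active
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
    return w, b

def predict(w, b, x):
    """Return +1 or -1 for vector x."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1

# Toy data: two tight groups plus one isolated observation at (9, 8).
docs = [(-2, -2), (-2, -1), (-1, -2), (2, 2), (2, 1), (1, 2), (9, 8)]
centroids, labels = trimmed_kmeans(docs, init_centroids=[docs[0], docs[3]])

# Train the SVM only on points that were not flagged as isolated.
X = [p for p, lab in zip(docs, labels) if lab != -1]
y = [1 if lab == 1 else -1 for lab in labels if lab != -1]
w, b = train_linear_svm(X, y)
```

In this sketch the isolated point receives the label -1 and influences neither the centroids nor the SVM. A multi-class setting would need, for example, one SVM per cluster in a one-vs-rest arrangement, which is not shown here.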

Pages: 2449-2453

Online since: September 2013

Copyright: © 2013 Trans Tech Publications Ltd. All Rights Reserved
