Realization of Text Categorization for Small-Scaled Dataset

Hua Liu

doi:10.4028/www.scientific.net/AMR.532-533.1239

Abstract:

Text categorization tests and comparison tests are carried out on a small-scale dataset. When no training set is available, documents are categorized without training by matching their indexed keywords against the experts' subject terms, and the broad-category accuracy reaches 0.82. When only a small training set is available, the feature vectors obtained from training are added to the experts' subject terms before categorization; the broad-category accuracy then reaches 0.94 and the level-3 accuracy reaches 0.73, which is a satisfactory result.
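The two settings described in the abstract can be illustrated with a minimal vector space model sketch. This is not the paper's implementation: the abstract does not give the term weighting or the similarity measure, so raw term counts and cosine similarity are assumed here, and the function names (`cosine`, `categorize`, `augment`), the category names, and the toy terms are all hypothetical.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(w * b.get(t, 0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def categorize(keywords, profiles):
    """Return the category whose term profile is most similar to the
    document's indexed keywords, or None if no profile shares a term."""
    doc = Counter(keywords)
    best, best_score = None, 0.0
    for category, profile in profiles.items():
        score = cosine(doc, profile)
        if score > best_score:
            best, best_score = category, score
    return best


def augment(profiles, labeled_docs):
    """Return new profiles with the keywords of labeled training
    documents added to the matching category's subject terms."""
    out = {c: Counter(p) for c, p in profiles.items()}
    for keywords, category in labeled_docs:
        out.setdefault(category, Counter()).update(keywords)
    return out


# Hypothetical expert subject terms (toy data, not from the paper).
subject_terms = {
    "computing": Counter(["algorithm", "software", "network"]),
    "medicine": Counter(["disease", "therapy", "clinical"]),
}

# Setting 1: no training set, keywords matched directly against subject terms.
untrained = categorize(["network", "algorithm", "routing"], subject_terms)

# Setting 2: a small training set extends the subject terms first.
training_set = [(["dosage", "patient", "therapy"], "medicine")]
trained_profiles = augment(subject_terms, training_set)
trained = categorize(["dosage", "patient"], trained_profiles)

print(untrained, trained)
```

In this sketch a document whose keywords match no subject term stays uncategorized, which is why adding even a few training-derived terms to a category profile can raise coverage on a small dataset.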

Info:

Periodical:

Advanced Materials Research (Volumes 532-533)

Pages:

1239-1242

DOI:

https://doi.org/10.4028/www.scientific.net/AMR.532-533.1239

Online since:

June 2012

Authors:

Hua Liu

Keywords:

Small-Scaled Dataset, Text Categorization, Vector Space Model
