Research and Implementation of Text Classification Algorithm

Jian Si Ren

doi:10.4028/www.scientific.net/AMM.644-650.2395

Abstract:

The development of the Internet and digital libraries has prompted many text categorization methods. Finding desired information accurately and promptly is increasingly important, and automatic text categorization can help achieve this goal. In general, text classifiers are implemented with traditional classification methods such as Naive Bayes (NB). ARC-BC (Associative Rule-based Classifier by Category) can be used for text categorization by dividing the text documents into subsets in which all documents belong to the same category and generating associative classification rules for each subset. This classifier differs from previous methods in that it consists of association rules between words and categories discovered from the training set. To train and test this classifier, we constructed training and testing data by selecting documents from Yahoo. The experimental results show that ARC-BC based text categorization is efficient and effective, and that its performance is comparable to text categorization based on the Naive Bayes algorithm.
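
The per-category rule mining described in the abstract can be illustrated with a small Python sketch. It is not the paper's implementation: the function names `mine_rules` and `classify`, the thresholds, the toy documents, and the scoring scheme (summing rule confidences per category) are all assumptions made for illustration. Word sets are enumerated by brute force up to two words instead of with a level-wise Apriori search, and confidence is assumed to be the share of all training documents containing the word set that belong to the rule's category.

```python
from collections import defaultdict
from itertools import combinations


def mine_rules(docs, labels, min_support=0.6, max_len=2):
    """Mine rules (word set => category) separately for each category subset."""
    all_docs = [frozenset(d) for d in docs]
    by_cat = defaultdict(list)
    for doc, label in zip(all_docs, labels):
        by_cat[label].append(doc)

    rules = []  # entries are (antecedent word set, category, confidence)
    for cat, subset in by_cat.items():
        # Count word sets of size 1..max_len inside this category only.
        counts = defaultdict(int)
        for doc in subset:
            for k in range(1, max_len + 1):
                for combo in combinations(sorted(doc), k):
                    counts[frozenset(combo)] += 1
        for itemset, n in counts.items():
            # Support is local to the category subset.
            if n / len(subset) >= min_support:
                # Assumed confidence: in-category matches / matches overall.
                covered = sum(1 for d in all_docs if itemset <= d)
                rules.append((itemset, cat, n / covered))
    return rules


def classify(rules, words, min_conf=0.6):
    """Pick the category whose matching rules have the largest total confidence."""
    doc = frozenset(words)
    scores = defaultdict(float)
    for antecedent, cat, conf in rules:
        if conf >= min_conf and antecedent <= doc:
            scores[cat] += conf
    return max(scores, key=scores.get) if scores else None


# Hypothetical toy corpus: each document is a list of already tokenized words.
train_docs = [
    ["stock", "market", "shares"],
    ["market", "shares", "profit"],
    ["match", "team", "goal"],
    ["team", "goal", "season"],
]
train_labels = ["finance", "finance", "sport", "sport"]

rules = mine_rules(train_docs, train_labels)
prediction = classify(rules, ["shares", "market", "today"])
print(prediction)
```

Because each category is mined on its own subset, the support threshold applies within the category, so a small category can still produce rules. A document that matches no rule is left unclassified (`None`) in this sketch.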

Info:

Periodical:

Applied Mechanics and Materials (Volumes 644-650)

Pages:

2395-2398

DOI:

https://doi.org/10.4028/www.scientific.net/AMM.644-650.2395

Online since:

September 2014

Authors:

Jian Si Ren*

Keywords:

ARC-BC Algorithm, Naive Bayes, Support Vector Machine (SVM)

* - Corresponding Author
