Algorithm of Text Categorization Based on Cloud Computing

Li Qin Huang; Li Qun Lin; Yan Huang Liu

doi:10.4028/www.scientific.net/AMM.311.158


Abstract:

The MapReduce framework of cloud computing offers an effective way to categorize massive text collections. This paper designs a distributed parallel text training algorithm for cloud computing environments, based on multi-class Support Vector Machines (SVM). Map tasks distribute the samples of the various classes, and Reduce tasks carry out the actual SVM training. Experimental results show that the execution time of text training decreases as the number of Reduce tasks increases. A parallel text classification algorithm based on cloud computing is also designed and implemented to classify texts of unknown type. Experimental results show that classification speed increases as the number of Map tasks increases.
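The division of work described in the abstract can be illustrated with a small in-memory sketch. It is not the authors' implementation: the abstract does not say how the multi-class problem is decomposed, so the sketch assumes a one-vs-rest scheme, replaces the SVM solver with a simple subgradient-descent linear SVM, and simulates the shuffle step with a dictionary instead of running on Hadoop. All function names (`map_train`, `reduce_train`, `map_classify`) and the toy term-frequency vectors are hypothetical.

```python
from collections import defaultdict

def map_train(sample, classes):
    """Map: emit one (class, (features, +1/-1)) pair per class (assumed one-vs-rest)."""
    features, label = sample
    for c in classes:
        yield c, (features, 1 if label == c else -1)

def shuffle(pairs):
    """Group mapped values by key, as the framework does between Map and Reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_train(values, epochs=200, lam=0.01, lr=0.1):
    """Reduce: train one binary linear SVM by subgradient descent on the hinge loss."""
    dim = len(values[0][0])
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        for x, y in values:
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            # The L2 regularization term shrinks w on every step.
            w = [wi * (1 - lr * lam) for wi in w]
            if margin < 1:  # sample violates the margin: move toward it
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b

def train(samples):
    """Run the simulated Map -> shuffle -> Reduce training pipeline."""
    classes = sorted({label for _, label in samples})
    pairs = (kv for s in samples for kv in map_train(s, classes))
    # Each key could be handled by a separate Reduce task.
    return {c: reduce_train(vals) for c, vals in shuffle(pairs).items()}

def map_classify(doc_id, features, model):
    """Map: score one unlabeled text against every classifier, emit the best class."""
    def score(c):
        w, b = model[c]
        return sum(wi * xi for wi, xi in zip(w, features)) + b
    return doc_id, max(model, key=score)

# Hypothetical toy term-frequency vectors with three classes.
samples = [
    ([3.0, 0.0, 0.0], "sports"), ([2.0, 1.0, 0.0], "sports"),
    ([0.0, 3.0, 0.0], "finance"), ([0.0, 2.0, 1.0], "finance"),
    ([0.0, 0.0, 3.0], "tech"), ([1.0, 0.0, 2.0], "tech"),
]
model = train(samples)

unknown = {"d1": [4.0, 0.0, 0.0], "d2": [0.0, 4.0, 0.0], "d3": [0.0, 0.0, 4.0]}
predictions = dict(map_classify(i, x, model) for i, x in unknown.items())
```

Under these assumptions each class key is trained independently, which is consistent with training time falling as Reduce tasks are added. Each unknown text is likewise scored independently, which is consistent with classification throughput rising with the number of Map tasks.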

Info:

Periodical:

Applied Mechanics and Materials (Volume 311)

Pages:

158-163

DOI:

https://doi.org/10.4028/www.scientific.net/AMM.311.158

Online since:

February 2013

Authors:

Li Qin Huang, Li Qun Lin, Yan Huang Liu

Keywords:

Cloud Computing, MapReduce, Parallelization, Text Categorization
