E-Mail Classification Based Learning Algorithm Using Support Vector Machine

Abstract:

With the spread of personal computers and the internet, E-mail has become one of the most widely used means of communication. However, a massive amount of spam mail pollutes mailboxes every day, exploiting the ability to send mail to any number of random recipients over the internet. In this paper we introduce an efficient method of classifying E-mails using the Support Vector Machine (SVM) learning algorithm, which has recently shown high performance in document classification. Word features are extracted from the E-mail documents, and classification performance is compared and examined as the document frequency (DF) value, which is used to reduce the feature space at the learning stage, is varied. To assess its performance, the SVM is compared with the Naïve Bayes classifier (a probabilistic method) and a vector model classifier, in order to verify that the SVM learning algorithm performs better.
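The DF-based reduction of the feature space mentioned in the abstract can be illustrated with a minimal sketch. It assumes that DF means document frequency (the number of documents containing a term) and that terms below a chosen cutoff are dropped before training. The function names, the `min_df` parameter, the binary weighting, and the toy corpus are all hypothetical and are not taken from the paper.

```python
from collections import Counter


def document_frequency(docs):
    """Count, for each term, the number of documents that contain it."""
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))  # set(): count each term once per document
    return df


def select_vocabulary(docs, min_df):
    """Keep only terms whose document frequency is at least min_df."""
    df = document_frequency(docs)
    return sorted(term for term, count in df.items() if count >= min_df)


def vectorize(tokens, vocabulary):
    """Binary bag-of-words vector over the reduced vocabulary."""
    present = set(tokens)
    return [1 if term in present else 0 for term in vocabulary]


# Toy corpus, made up for illustration and already tokenized.
docs = [
    "win free prize now".split(),
    "free offer win cash".split(),
    "meeting agenda for monday".split(),
    "monday project meeting notes".split(),
]

# Raising the cutoff shrinks the feature space the classifier has to learn.
for min_df in (1, 2):
    vocab = select_vocabulary(docs, min_df)
    print(min_df, len(vocab), vocab)
```

The vectors produced by `vectorize` could then be passed to any SVM, Naïve Bayes, or vector model classifier, so that the same reduced feature space is used when comparing them at each DF cutoff.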

Info:

Pages: 1844-1848

Online since: December 2012

Copyright: © 2013 Trans Tech Publications Ltd. All Rights Reserved
