E-Mail Filtration and Classification Based on Variable Weights of the Bayesian Algorithm

Zong Jie Wang; Yi Liu; Zhong Jian Wang

doi:10.4028/www.scientific.net/AMM.513-517.2111

Abstract:

Co-occurrence words capture the relationships between words, so using them can offset the shortcoming that stems from the independence assumption of the naive Bayesian algorithm. To build the Token Dictionary, the Information Gain algorithm is used to select Tokens, and a Synonyms Dictionary is used to acquire additional Tokens. Through extensive training, the matching scores of the Tokens are counted; the Tokens that prove valuable according to their matching rate are selected, and the Token Dictionary is established. The proposed method is applied in an e-mail classification experiment, and the results show that the accuracy of the spam filter improves.
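To illustrate the token-selection step, here is a minimal Python sketch of Information Gain scoring for a binary "token present in the message" feature. The function names, the set-of-tokens message representation, and the top-k cutoff are assumptions made for illustration only; the paper's own selection procedure, synonym expansion, matching-rate filtering, and variable weights are not reproduced here.

```python
import math


def entropy(pos, neg):
    """Shannon entropy in bits of a two-class count pair."""
    total = pos + neg
    h = 0.0
    for count in (pos, neg):
        if count:  # skip empty classes, since 0 * log(0) is taken as 0
            p = count / total
            h -= p * math.log2(p)
    return h


def information_gain(docs, labels, token):
    """Information Gain of the feature 'token occurs in the message'.

    docs   -- list of sets of tokens, one set per message (assumed format)
    labels -- list of bools, True for spam and False for legitimate mail
    """
    n = len(docs)
    spam = sum(labels)
    with_spam = sum(1 for d, y in zip(docs, labels) if token in d and y)
    with_ham = sum(1 for d, y in zip(docs, labels) if token in d and not y)
    without_spam = spam - with_spam
    without_ham = (n - spam) - with_ham
    n_with = with_spam + with_ham
    n_without = n - n_with
    # Class entropy that remains after splitting on token presence.
    conditional = (n_with / n) * entropy(with_spam, with_ham) \
        + (n_without / n) * entropy(without_spam, without_ham)
    return entropy(spam, n - spam) - conditional


def select_tokens(docs, labels, k):
    """Return the k tokens with the highest Information Gain (ties: A-Z)."""
    vocab = set().union(*docs)
    ranked = sorted(vocab, key=lambda t: (-information_gain(docs, labels, t), t))
    return ranked[:k]


if __name__ == "__main__":
    # Tiny hypothetical corpus: two spam and two legitimate messages.
    docs = [{"free", "money"}, {"free", "offer"},
            {"meeting", "money"}, {"meeting", "notes"}]
    labels = [True, True, False, False]
    print(select_tokens(docs, labels, 2))
```

In this toy corpus, "free" and "meeting" each separate the two classes perfectly, whereas "money" appears once in each class and carries no information, so a dictionary built this way would keep the first two and drop the third.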

Info:

Periodical: Applied Mechanics and Materials (Volumes 513-517)

Pages: 2111-2114

DOI: https://doi.org/10.4028/www.scientific.net/AMM.513-517.2111

Online since: February 2014

Authors: Zong Jie Wang*, Yi Liu, Zhong Jian Wang

Keywords: Co-Occurrence Word, Information Gain, Legitimate E-Mail, Naive Bayesian, Spam

* - Corresponding Author
