An Improved Naive Bayesian Classification Algorithm for Sentiment Classification of Microblogs

Abstract:

In attribute-weighted naive Bayesian classification algorithms, the choice of weights directly affects the classification results. With this in mind, the drawbacks of TF-IDF feature selection for sentiment classification of microblogs are analyzed, and an improved algorithm named TF-D(t)-CHI is proposed. It applies statistical calculation to obtain the degree of correlation between feature words and classes, and it represents the distribution of feature items in the classes by variance. This addresses the problem that short texts contain few feature words while high-frequency feature words receive excessively high weights. Experimental results indicate that naive Bayesian classification using TF-D(t)-CHI for feature selection and weight calculation achieves better results in sentiment classification of microblogs.
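
The abstract does not give the TF-D(t)-CHI formula, so the following is only a minimal sketch of the two standard ingredients it names: a chi-square statistic measuring how strongly a feature word is associated with a class, and a variance measuring how unevenly the word is distributed. The function names `chi_square` and `class_variance` are hypothetical, the variance is assumed here to be taken over the word's per-class rates, and how the paper combines these quantities with term frequency is not stated and is not reproduced.

```python
from statistics import pvariance


def chi_square(a, b, c, d):
    """Chi-square statistic for one term and one class (2x2 contingency table).

    a: documents in the class that contain the term
    b: documents outside the class that contain the term
    c: documents in the class that do not contain the term
    d: documents outside the class that do not contain the term
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        # Degenerate table (an empty row or column): no evidence of association.
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


def class_variance(per_class_rates):
    """Population variance of a term's rate of occurrence over the classes.

    Assumption: one rate per class, e.g. the fraction of documents in that
    class containing the term. A larger value means a more uneven spread.
    """
    return pvariance(per_class_rates)


# A term found only in one class is strongly associated with it.
print(chi_square(10, 0, 0, 10))    # 20.0
# A term spread evenly over both classes carries no class information.
print(chi_square(5, 5, 5, 5))      # 0.0
print(class_variance([1.0, 0.0]))  # 0.25
```

A word that scores high on both measures separates the classes well, which is the kind of evidence plain term frequency cannot supply for very short texts.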

Info:

Pages:

3614-3620

Online since:

March 2014

Copyright:

© 2014 Trans Tech Publications Ltd. All Rights Reserved
