Sentiment Analysis of Chinese Micro Blog Using Machine Learning and an Improved Feature Selection Method

Abstract:

With the rapid development of the Internet and the emergence of social media services, many users have become creators of social information. However, manual processing cannot cope with the large volume of subjective messages. As a new kind of social media service, the micro blog has been widely adopted and can be used for sentiment analysis. This paper compares the performance of three machine learning methods on sentiment analysis of Chinese micro blogs. We also propose an improved feature selection method that increases classification accuracy. Experimental results show that SVM performs close to Naïve Bayes, and that both outperform logistic regression in most cases.
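
As a rough illustration of the kind of pipeline the abstract describes (select features, then train a classifier), the following is a minimal sketch only. It assumes chi-square scoring as a generic baseline feature selector; this is not the paper's improved method, which the abstract does not specify. Of the three classifiers compared, only a multinomial Naive Bayes is shown. The corpus `DOCS`, its English tokens (standing in for segmented Chinese text), and all function names are hypothetical.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy corpus: pre-tokenized posts, label 1 = positive, 0 = negative.
DOCS = [
    (["good", "phone", "love"], 1),
    (["love", "this", "movie"], 1),
    (["good", "service", "happy"], 1),
    (["bad", "phone", "hate"], 0),
    (["hate", "this", "movie"], 0),
    (["bad", "service", "angry"], 0),
]


def chi_square_scores(docs):
    """Score each term by the chi-square statistic of its 2x2 term/class table."""
    n = len(docs)
    n_pos = sum(1 for _, y in docs if y == 1)
    df_pos, df_all = Counter(), Counter()
    for tokens, y in docs:
        for t in set(tokens):  # document frequency, not term frequency
            df_all[t] += 1
            if y == 1:
                df_pos[t] += 1
    scores = {}
    for t in df_all:
        a = df_pos[t]            # positive docs containing t
        b = df_all[t] - a        # negative docs containing t
        c = n_pos - a            # positive docs without t
        d = (n - n_pos) - b      # negative docs without t
        denom = (a + b) * (c + d) * (a + c) * (b + d)
        scores[t] = n * (a * d - b * c) ** 2 / denom if denom else 0.0
    return scores


def select_features(docs, k):
    """Keep the k highest-scoring terms (ties broken alphabetically)."""
    scores = chi_square_scores(docs)
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return set(ranked[:k])


def train_nb(docs, vocab):
    """Multinomial Naive Bayes with add-one smoothing over the selected vocab."""
    class_docs = Counter(y for _, y in docs)
    counts = defaultdict(Counter)
    for tokens, y in docs:
        counts[y].update(t for t in tokens if t in vocab)
    model = {}
    for y in class_docs:
        total = sum(counts[y].values())
        log_prior = math.log(class_docs[y] / len(docs))
        log_like = {
            t: math.log((counts[y][t] + 1) / (total + len(vocab))) for t in vocab
        }
        model[y] = (log_prior, log_like)
    return model


def predict(model, tokens):
    """Return the class with the highest log-posterior; unknown tokens are ignored."""
    def score(y):
        log_prior, log_like = model[y]
        return log_prior + sum(log_like[t] for t in tokens if t in log_like)
    return max(model, key=score)


vocab = select_features(DOCS, k=4)
model = train_nb(DOCS, vocab)
print(sorted(vocab))
print(predict(model, ["good", "movie"]))
```

In a comparison like the one the abstract reports, the same selected vocabulary would be fed to each classifier so that differences in accuracy reflect the learning method rather than the features.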

Pages: 1219-1223

Online since: September 2014

Copyright: © 2014 Trans Tech Publications Ltd. All Rights Reserved
