Chinese Sentiment Classifier Machine Learning Based on Optimized Information Gain Feature Selection

Abstract:

Machine learning is an important approach to Chinese text sentiment classification, and text feature selection is critical to classification performance. Classical feature selection methods work well across the categories as a whole, but they miss many feature words that are representative of individual categories. This paper presents an improved information gain method that integrates word frequency and the sentiment degree of feature words into the traditional information gain method. Experiments show that a classifier built with this method achieves better classification performance.
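The abstract does not give the improved formula, so the following is only a minimal sketch of the general idea. `information_gain` computes the classical information gain of a term's presence with respect to the class label. `weighted_score` is a hypothetical combination, assumed here purely for illustration and not taken from the paper, that scales the gain by a log term frequency and by a sentiment degree looked up in an assumed lexicon `sentiment_degree`.

```python
import math
from collections import Counter


def entropy(counts):
    """Shannon entropy in bits of a list of class counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def information_gain(docs, labels, term):
    """Classical information gain of a term's presence/absence.

    docs is a list of token lists; labels is the parallel list of classes.
    """
    n = len(docs)
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without_term = [y for d, y in zip(docs, labels) if term not in d]
    h_class = entropy(list(Counter(labels).values()))
    h_cond = sum(
        (len(part) / n) * entropy(list(Counter(part).values()))
        for part in (with_term, without_term)
        if part
    )
    return h_class - h_cond


def weighted_score(docs, labels, term, sentiment_degree):
    """Hypothetical weighting (an assumption, not the paper's formula):
    IG * (1 + ln(term frequency)) * (1 + sentiment degree)."""
    tf = sum(d.count(term) for d in docs)
    if tf == 0:
        return 0.0
    degree = sentiment_degree.get(term, 0.0)  # assumed lexicon, values in [0, 1]
    return information_gain(docs, labels, term) * (1 + math.log(tf)) * (1 + degree)


# Toy corpus of pre-segmented reviews (illustrative data only).
docs = [["good", "film"], ["good", "good", "plot"], ["bad", "film"], ["bad", "plot"]]
labels = ["pos", "pos", "neg", "neg"]
lexicon = {"good": 0.8, "bad": 0.8}  # hypothetical sentiment degrees

ranked = sorted(
    {t for d in docs for t in d},
    key=lambda t: weighted_score(docs, labels, t, lexicon),
    reverse=True,
)
print(ranked)
```

In this toy corpus, "good" and "bad" separate the two classes perfectly while "film" and "plot" carry no class information, so the sentiment words rank first. A real pipeline would first segment the Chinese text into words and would take the sentiment degrees from a sentiment lexicon.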

Pages: 511-516

Online since: July 2014

Copyright: © 2014 Trans Tech Publications Ltd. All Rights Reserved
