A Survey of Using Weakly Supervised and Semi-Supervised for Cross-Domain Sentiment Classification

Abstract:

Supervised machine learning techniques can analyze sentiment very effectively, but they need a large corpus of labeled training data, and in many languages little appropriate data is available for training sentiment classifiers. This paper presents weakly supervised techniques that use a large collection of unlabeled text to determine sentiment. The performance of these techniques may depend less on the domain, topic, and time period represented by the test data. In addition, semi-supervised classification using a sentiment-sensitive thesaurus is discussed. It is applicable when no labeled data are available for a target domain but some labeled data exist for several other domains, designated as the source domains. The results show that the weakly supervised techniques are suitable for applications requiring sentiment classification across some domains, and that the semi-supervised techniques can learn efficiently from multiple source domains.

Info:

Pages:

637-641

Online since:

April 2014

Copyright:

© 2014 Trans Tech Publications Ltd. All Rights Reserved
