Semi-Supervised Method for News Summarization in Microblog

Abstract:

With the development of the Internet, a growing volume of user-generated content provides valuable information to the public. Microblogs are a new platform where people discuss all kinds of topics, and they give researchers a good opportunity to explore online public opinion. News collection and summarization have attracted considerable research. However, manual labeling is impractical because the task is time-consuming. In this paper, we focus on news summarization with few labeled samples and propose a semi-supervised learning method to tackle the problem. We employ the Co-Training method to extract news information, treating the posts and replies of a microblog as two independent views for training a classification model. The entity, time, place, and incident of each news item are identified as well. Experimental results on different datasets show that the proposed method outperforms the baseline methods.
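The two-view idea in the abstract can be illustrated with a minimal co-training sketch. The abstract does not specify the classifier, the features, or the rule for selecting pseudo-labeled samples, so everything below is an assumption made for illustration: a small bag-of-words Naive Bayes classifier (`TinyNB`), a simplified selection rule in which each view labels its single most confident unlabeled item per round, and invented toy posts and replies. It is not the authors' implementation.

```python
import math
from collections import Counter


class TinyNB:
    """Multinomial Naive Bayes over whitespace tokens with add-one smoothing."""

    def fit(self, texts, labels):
        self.classes = sorted(set(labels))
        self.prior = Counter(labels)
        self.counts = {c: Counter() for c in self.classes}
        for text, y in zip(texts, labels):
            self.counts[y].update(text.lower().split())
        self.vocab = {w for c in self.classes for w in self.counts[c]}
        return self

    def predict_proba(self, text):
        n = sum(self.prior.values())
        logp = {}
        for c in self.classes:
            total = sum(self.counts[c].values()) + len(self.vocab)
            lp = math.log(self.prior[c] / n)
            for w in text.lower().split():
                if w in self.vocab:  # words never seen in training are ignored
                    lp += math.log((self.counts[c][w] + 1) / total)
            logp[c] = lp
        m = max(logp.values())
        z = sum(math.exp(v - m) for v in logp.values())
        return {c: math.exp(v - m) / z for c, v in logp.items()}


def co_train(labeled, unlabeled, rounds=5, per_round=1):
    """labeled: [((post, replies), label)]; unlabeled: [(post, replies)].

    View 0 is the post text, view 1 is the reply text. In each round, one
    model per view is trained, and each model pseudo-labels the `per_round`
    unlabeled items it is most confident about (a simplified selection rule).
    """
    labeled, pool = list(labeled), list(unlabeled)
    for _ in range(rounds):
        if not pool:
            break
        y = [label for _, label in labeled]
        models = [TinyNB().fit([x[v] for x, _ in labeled], y) for v in (0, 1)]
        picked = {}
        for v, model in enumerate(models):
            scored = []
            for i, x in enumerate(pool):
                proba = model.predict_proba(x[v])
                label = max(proba, key=proba.get)
                scored.append((proba[label], i, label))
            for _, i, label in sorted(scored, reverse=True)[:per_round]:
                picked.setdefault(i, label)  # first view to pick an item wins
        labeled += [(pool[i], label) for i, label in picked.items()]
        pool = [x for i, x in enumerate(pool) if i not in picked]
    y = [label for _, label in labeled]
    return [TinyNB().fit([x[v] for x, _ in labeled], y) for v in (0, 1)]


def predict(models, post, replies):
    """Combine both views by multiplying their class probabilities."""
    scores = {}
    for model, text in zip(models, (post, replies)):
        for c, p in model.predict_proba(text).items():
            scores[c] = scores.get(c, 1.0) * p
    return max(scores, key=scores.get)


# Invented toy data: two labeled microblog threads and four unlabeled ones.
labeled = [
    (("earthquake hits city center", "hope everyone is safe"), "news"),
    (("my lunch was great today", "looks tasty"), "other"),
]
unlabeled = [
    ("earthquake damages bridge in city", "is everyone safe there"),
    ("great coffee today", "looks tasty indeed"),
    ("flood hits city suburbs", "hope rescue arrives"),
    ("lunch with friends", "so tasty"),
]
models = co_train(labeled, unlabeled)
print(predict(models, "earthquake hits bridge", "hope everyone safe"))
```

The sketch keeps the two views strictly separate during training, so a confident prediction from the reply view can supply a training label for the post view and vice versa, which is the property co-training relies on when labeled samples are scarce.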

Pages: 5918-5921

Online since: May 2014

Copyright: © 2014 Trans Tech Publications Ltd. All Rights Reserved

[1] Sharifi, Beaux, M-A. Hutton, and Jugal K. Kalita. Experiments in microblog summarization [C]// 2010 IEEE Second International Conference on Social Computing (SocialCom).

DOI: 10.1109/socialcom.2010.17

[2] Luhn, Hans P. The automatic creation of literature abstracts [J]// IBM Journal of Research and Development 2.2 (1958): 159-165.

[3] Blum, Avrim, and Tom Mitchell. Combining labeled and unlabeled data with co-training [C]// Proceedings of the Eleventh Annual Conference on Computational Learning Theory. ACM, (1998).

DOI: 10.1145/279943.279962

[4] Hofmann, Thomas. Probabilistic latent semantic indexing [C]// Proceedings of the 22nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. ACM, (1999).

DOI: 10.1145/312624.312649

[5] Blei, David M., Andrew Y. Ng, and Michael I. Jordan. Latent Dirichlet allocation [J]// Journal of Machine Learning Research 3 (2003): 993-1022.
