Extracting Hot Topics from Microblogging Based on Keywords Detection and Text Clustering

Abstract:

After news sites and forums, microblogging has become the third-largest source of Internet public opinion, so research on microblog topic discovery is needed. First, we detect hot topics with a keyword detection algorithm. Second, we select the most popular posts from the massive volume of microblog text by combining keyword weight with textual information entropy. Finally, a dynamic clustering algorithm groups the selected posts into distinct news topics. Experiments on real microblog data show that the method can effectively detect hot topics in a large body of text.
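The three stages named in the abstract can be illustrated with a minimal sketch. The abstract gives no formulas, so everything below is an assumption made for illustration and not the authors' method: keyword weight is approximated by document frequency, the combined score is the product of summed keyword weight and Shannon entropy, and the "dynamic clustering" step is replaced by a simple single-pass clustering with Jaccard similarity. All function names and the `threshold` parameter are hypothetical, and posts are assumed to be already segmented into word lists.

```python
import math
from collections import Counter


def keyword_weights(docs):
    """Assumed stand-in for keyword detection: weight of a term is the
    fraction of posts that contain it."""
    df = Counter(term for doc in docs for term in set(doc))
    return {term: count / len(docs) for term, count in df.items()}


def entropy(doc):
    """Shannon entropy (in bits) of the term distribution inside one post."""
    counts = Counter(doc)
    total = len(doc)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def score(doc, weights):
    """Hypothetical combination: summed keyword weight times entropy."""
    return sum(weights.get(term, 0.0) for term in set(doc)) * entropy(doc)


def jaccard(a, b):
    """Jaccard similarity between two term sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def single_pass_cluster(docs, threshold=0.3):
    """Assumed stand-in for dynamic clustering: put each post into the first
    cluster whose seed post is similar enough, otherwise start a new cluster."""
    clusters = []  # list of (seed term set, member indices)
    for i, doc in enumerate(docs):
        terms = set(doc)
        for seed, members in clusters:
            if jaccard(terms, seed) >= threshold:
                members.append(i)
                break
        else:
            clusters.append((terms, [i]))
    return [members for _, members in clusters]


posts = [
    ["quake", "city", "rescue"],
    ["quake", "rescue", "team"],
    ["match", "goal", "final"],
    ["goal", "final", "score"],
]
weights = keyword_weights(posts)
ranked = sorted(range(len(posts)), key=lambda i: score(posts[i], weights), reverse=True)
topics = single_pass_cluster(posts)
```

In this toy input the first two posts share two of four distinct terms with each other, as do the last two, so a threshold of 0.3 separates them into two topics. A real system would need word segmentation for Chinese text and a tuned similarity threshold.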

Info:

Pages:

2289-2293

Online since:

February 2013

Copyright:

© 2013 Trans Tech Publications Ltd. All Rights Reserved
