A Semantic-Based Algorithm for Microblogs Clustering

Abstract:

Microblogging has become a major tool for people not only to share information but also to discuss current affairs, and the analysis of microblog content has attracted considerable interest from companies and researchers. We focus on clustering microblogs, whose representations are high-dimensional and highly sparse, and propose a new algorithm based on k-means and frequent itemsets. In addition, to support the new clustering algorithm, we develop a method for capturing contextual knowledge from long-term mutual information in microblogs and design an algorithm for measuring the similarity of conversations. Experimental results show that the proposed clustering algorithm achieves higher accuracy than both standard k-means and bisecting k-means, and that it scales well to large, highly sparse microblog collections.
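The abstract does not spell out the proposed algorithm, so the sketch below illustrates only the standard k-means baseline it is compared against, applied to sparse bag-of-words vectors with cosine similarity. It is not the authors' method. The function names (`vectorize`, `cosine`, `centroid`, `kmeans`), the whitespace tokenisation, the fixed seed indices, and the sample posts are all assumptions made for illustration.

```python
from collections import Counter
from math import sqrt


def vectorize(text):
    """Turn a post into a sparse, unit-length term-count vector (a dict)."""
    counts = Counter(text.lower().split())
    norm = sqrt(sum(c * c for c in counts.values()))
    return {term: c / norm for term, c in counts.items()} if norm else {}


def cosine(a, b):
    """Cosine similarity of two unit-length sparse vectors."""
    if len(a) > len(b):
        a, b = b, a  # iterate over the shorter vector
    return sum(w * b.get(term, 0.0) for term, w in a.items())


def centroid(vectors):
    """Sum of sparse vectors rescaled to unit length (direction of the mean)."""
    total = Counter()
    for v in vectors:
        for term, w in v.items():
            total[term] += w
    norm = sqrt(sum(w * w for w in total.values()))
    return {term: w / norm for term, w in total.items()} if norm else {}


def kmeans(posts, seeds, max_iter=20):
    """Standard k-means with cosine similarity.

    `seeds` holds the indices of the posts used as initial centres
    (a hypothetical, deterministic initialisation chosen for this sketch).
    """
    vectors = [vectorize(p) for p in posts]
    centres = [vectors[i] for i in seeds]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each post goes to its most similar centre.
        new_labels = [
            max(range(len(centres)), key=lambda k: cosine(v, centres[k]))
            for v in vectors
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: recompute each non-empty cluster's centre.
        for k in range(len(centres)):
            members = [v for v, lab in zip(vectors, labels) if lab == k]
            if members:
                centres[k] = centroid(members)
    return labels


# Made-up sample posts, two per topic.
posts = [
    "rain storm flood warning",
    "flood warning issued after storm",
    "football match final score",
    "final score of the football match",
]
labels = kmeans(posts, seeds=[0, 2])
print(labels)
```

Storing each post as a dictionary of its non-zero terms keeps the similarity computation proportional to the length of the shorter post rather than to the vocabulary size, which matters when the vectors are as sparse as short microblog posts tend to be.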

Pages: 1174-1177

Online since: January 2013

Copyright: © 2013 Trans Tech Publications Ltd. All Rights Reserved
