Design and Implementation of Distributed Crawler System for Opinion Mining

Abstract:

With the development of the Internet, online public opinion has come to play an important role in reflecting public opinion at large. Because the Internet hosts a large number of websites and forums, opinion mining requires a powerful crawler system. However, common crawler systems focus mainly on ranking and recommendation algorithms, which matter less in opinion mining. In this article, we present the design and implementation of a distributed crawler system for opinion mining. We also introduce extra parameters, such as keyword count and publication time, into the ranking and refreshing strategies. Experimental results demonstrate that the system supports different sites well and that the improved strategies greatly enhance crawling and monitoring efficiency.
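The abstract does not give the formulas behind these strategies. As a rough illustration only, one way keyword count and publication time could feed a ranking score and a refresh interval is sketched below. The function names, the logarithmic keyword weighting, the 24-hour half-life, and the interval bounds are all assumptions made for this sketch, not the authors' method.

```python
import math
from datetime import datetime, timedelta


def priority_score(keyword_count, published, now, half_life_hours=24.0):
    """Hypothetical ranking score: more keyword hits and fresher pages score higher."""
    # Age of the page in hours, clamped so future timestamps count as age 0.
    age_hours = max((now - published).total_seconds() / 3600.0, 0.0)
    # Assumed exponential decay: the weight halves every half_life_hours.
    recency = 0.5 ** (age_hours / half_life_hours)
    # Assumed damping: log1p keeps very long pages from dominating the ranking.
    return math.log1p(keyword_count) * recency


def refresh_interval_minutes(keyword_count, published, now,
                             min_minutes=10.0, max_minutes=1440.0):
    """Hypothetical refresh strategy: high-scoring pages are revisited sooner."""
    score = priority_score(keyword_count, published, now)
    interval = max_minutes / (1.0 + 10.0 * score)
    # Keep the interval inside the assumed [min_minutes, max_minutes] range.
    return max(min_minutes, min(max_minutes, interval))


if __name__ == "__main__":
    now = datetime(2013, 8, 1, 12, 0)
    fresh_post = priority_score(5, now - timedelta(hours=1), now)
    old_post = priority_score(5, now - timedelta(hours=48), now)
    print(fresh_post > old_post)
```

In this sketch a page with no keyword hits scores zero and is revisited at the longest interval, while a recent page with many hits is pushed toward the front of the queue and refreshed more often.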

Info:

Pages: 2506-2510

Online since: August 2013

Copyright: © 2013 Trans Tech Publications Ltd. All Rights Reserved
