The Design and Implementation of Micro-Blog User Interest Search Engine Base on Cloud Computing Technology

Bei Zhan Wang; Kang Chen; Wei Long Ye; Xu Wang

doi:10.4028/www.scientific.net/AMM.543-547.3294

Abstract:

With the rapid development of the Internet and the explosive growth of online information, massive data processing has received increasing attention. Micro-blogging, an important representative of future Internet development, has become an essential tool for communication and marketing. Processing and using the massive data generated by micro-blog activity has become a hot topic. In this paper, we propose a method for designing and implementing the User Interest Based Search Engine, a search engine for finding micro-blog users who share the same interests. We first crawl massive micro-blog data from micro-blog websites and store it in HBase. We then process the data and build indices using MapReduce. Finally, we build a search engine website based on Solr and propose a ranking algorithm for search. With this User Interest Based Search Engine, a user can accurately find other users with the same interests.
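The abstract does not give the details of the indexing job or the ranking algorithm, so the following is only a minimal single-process sketch of the general idea, not the authors' implementation. It assumes that a user's interests can be approximated by term frequencies over their posts and that users are ranked by cosine similarity between those profiles. The sample data and the names `POSTS`, `map_phase`, `reduce_phase`, `cosine`, and `similar_users` are hypothetical; in the described system the map and reduce steps would run as a MapReduce job over data stored in HBase, and queries would be served by Solr.

```python
from collections import Counter, defaultdict
from math import sqrt

# Hypothetical sample posts as (user_id, text); real input would come from the crawler.
POSTS = [
    ("alice", "hadoop mapreduce cluster"),
    ("alice", "hadoop hbase"),
    ("bob", "hbase hadoop storage"),
    ("carol", "football match tonight"),
]


def map_phase(posts):
    """Map step: emit ((user, term), 1) for every whitespace-separated term."""
    for user, text in posts:
        for term in text.lower().split():
            yield (user, term), 1


def reduce_phase(pairs):
    """Reduce step: sum the counts into one term-frequency profile per user."""
    profiles = defaultdict(Counter)
    for (user, term), count in pairs:
        profiles[user][term] += count
    return profiles


def cosine(a, b):
    """Cosine similarity between two term-frequency profiles (assumed ranking score)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def similar_users(profiles, user):
    """Rank all other users by similarity to the given user, highest first."""
    target = profiles[user]
    scores = [(other, cosine(target, vec))
              for other, vec in profiles.items() if other != user]
    return sorted(scores, key=lambda s: s[1], reverse=True)


profiles = reduce_phase(map_phase(POSTS))
ranking = similar_users(profiles, "alice")
```

In this toy data set, `ranking` places `bob` ahead of `carol`, because `bob` shares the terms `hadoop` and `hbase` with `alice` while `carol` shares none. Chinese micro-blog text would need a proper word segmenter in place of `split()`.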

Info:

Periodical:

Applied Mechanics and Materials (Volumes 543-547)

Pages:

3294-3299

DOI:

https://doi.org/10.4028/www.scientific.net/AMM.543-547.3294

Online since:

March 2014

Authors:

Bei Zhan Wang*, Kang Chen, Wei Long Ye, Xu Wang

Keywords:

Cloud Computing, Information Retrieval, MapReduce, User Interest Search

* - Corresponding Author
