Research and Implementation of Improved Real-Time Crawler Modeling

Xiang Lin Zuo; Wen Bo Wang; Ying Wang; Wan Li Zuo

doi:10.4028/www.scientific.net/AMM.312.791

Abstract:

The past decade has witnessed the rapid development of search engines, which have become an indispensable part of everyday life. However, people are no longer satisfied with access to ordinary information and increasingly pay attention to fresh information. This demand challenges traditional search engines, which are concerned more with the relevance and importance of web pages. A search engine comprises three modules: crawler, indexer, and searcher. All three need to change to improve a search engine's freshness. This paper investigates the first module, the crawler: we analyze the requirements for a real-time crawler and propose a novel real-time crawler based on a more accurate estimation of refresh time. Experimental results demonstrate that the proposed real-time crawler can help a search engine improve its freshness.
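The abstract does not say how refresh time is estimated, so the following is only an illustrative sketch and not the authors' method. It assumes that page changes follow a Poisson process, that the crawler polls a page at a fixed interval, and that each visit records only whether the page changed since the previous visit. The function names, the 0.5 smoothing term, and the 0.5 probability threshold are all hypothetical choices made for this example.

```python
import math


def estimate_change_rate(n_visits: int, n_changes: int, interval_hours: float) -> float:
    """Estimate a page's change rate (changes per hour) from regular polling.

    Assumption: changes follow a Poisson process with rate lam, so the chance
    that a page is unchanged after one polling interval I is exp(-lam * I).
    Solving for lam with the observed unchanged fraction gives the estimate.
    """
    if n_visits <= 0 or interval_hours <= 0:
        raise ValueError("n_visits and interval_hours must be positive")
    if not 0 <= n_changes <= n_visits:
        raise ValueError("n_changes must lie between 0 and n_visits")
    # Assumed smoothing: adding 0.5 keeps the estimate finite even when
    # every single visit detected a change.
    unchanged_fraction = (n_visits - n_changes + 0.5) / (n_visits + 0.5)
    return -math.log(unchanged_fraction) / interval_hours


def next_revisit_delay(rate: float, change_probability: float = 0.5) -> float:
    """Hours to wait until the page has changed with the given probability.

    Under the Poisson assumption, P(change within t) = 1 - exp(-rate * t).
    """
    if not 0.0 < change_probability < 1.0:
        raise ValueError("change_probability must be strictly between 0 and 1")
    if rate <= 0.0:
        return math.inf  # no change ever observed: no deadline from this model
    return -math.log(1.0 - change_probability) / rate


if __name__ == "__main__":
    # Hypothetical page: polled hourly 10 times, 5 visits detected a change.
    rate = estimate_change_rate(n_visits=10, n_changes=5, interval_hours=1.0)
    print(f"estimated rate: {rate:.3f} changes/hour")
    print(f"revisit in: {next_revisit_delay(rate):.2f} hours")
```

Under these assumptions, a page that changes more often receives a higher estimated rate and therefore a shorter revisit delay, which is the general intuition behind scheduling crawls by estimated refresh time.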

Info:

Periodical:

Applied Mechanics and Materials (Volume 312)

Pages:

791-795

DOI:

https://doi.org/10.4028/www.scientific.net/AMM.312.791

Online since:

February 2013

Authors:

Xiang Lin Zuo, Wen Bo Wang, Ying Wang, Wan Li Zuo

Keywords:

Crawler, Real-Time, Search Engine
