DOM-Based Multi-Factor Web Information Extraction Study

Shun Zhang; Xing Shu Chen; Jun Tan

doi:10.4028/www.scientific.net/KEM.467-469.1267


Abstract:

As the Internet develops, the web page remains the main form of network information transmission. The number of web pages is growing at a rate of 10 million a day, and web information itself is becoming more complex; together, these make it difficult for topic-specific search engines to find information rapidly and accurately. This places higher demands on web information extraction. In this paper, a DOM-based multi-factor web information extraction algorithm (DMWE) is proposed, which can extract topic information rapidly and accurately.
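The abstract does not say which factors DMWE combines, so the following is only an illustrative sketch of DOM-based, multi-factor noise pruning in general, not the authors' algorithm. The three factors used here (link density, text length, punctuation count), the thresholds, the tag sets, and every function name are assumptions made for illustration.

```python
from html.parser import HTMLParser

# All constants below are illustrative assumptions, not values from the paper.
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}
SKIP_TAGS = {"script", "style"}
BLOCK_TAGS = {"div", "td", "ul", "table", "p"}
MAX_LINK_DENSITY = 0.5   # factor 1: share of text that sits inside <a>
MIN_TEXT_LEN = 25        # factor 2: very short blocks are suspicious
PUNCTUATION = ",.;:!?"   # factor 3: running prose tends to be punctuated


class Node:
    def __init__(self, tag, parent=None, text=""):
        self.tag, self.parent, self.text = tag, parent, text
        self.children = []


class TreeBuilder(HTMLParser):
    """Builds a simplified DOM tree; assumes reasonably well-nested HTML."""

    def __init__(self):
        super().__init__()
        self.root = self.current = Node("#root")

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        node = Node(tag, self.current)
        self.current.children.append(node)
        self.current = node

    def handle_endtag(self, tag):
        # Close the nearest open element with this tag, if there is one.
        node = self.current
        while node is not self.root and node.tag != tag:
            node = node.parent
        if node is not self.root:
            self.current = node.parent

    def handle_data(self, data):
        text = " ".join(data.split())
        if text:
            self.current.children.append(Node("#text", self.current, text))


def factors(node, in_link=False):
    """Return (text_len, link_text_len, punctuation_count) for a subtree."""
    if node.tag in SKIP_TAGS:
        return 0, 0, 0
    if node.tag == "#text":
        n = len(node.text)
        punct = sum(node.text.count(c) for c in PUNCTUATION)
        return n, (n if in_link else 0), punct
    in_link = in_link or node.tag == "a"
    totals = [0, 0, 0]
    for child in node.children:
        for i, value in enumerate(factors(child, in_link)):
            totals[i] += value
    return tuple(totals)


def is_noise(node):
    """Combine the three factors into a keep/drop decision for a block."""
    text_len, link_len, punct = factors(node)
    if text_len == 0:
        return True
    if link_len / text_len > MAX_LINK_DENSITY:
        return True
    return text_len < MIN_TEXT_LEN and punct == 0


def collect(node, out):
    if node.tag in SKIP_TAGS:
        return
    if node.tag == "#text":
        out.append(node.text)
        return
    if node.tag in BLOCK_TAGS and is_noise(node):
        return  # prune the whole subtree
    for child in node.children:
        collect(child, out)


def extract_topic_text(html):
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    out = []
    collect(builder.root, out)
    return " ".join(out)


SAMPLE = """
<html><body>
<div id="nav"><a href="/">Home</a> <a href="/news">News</a> <a href="/about">About us</a></div>
<div id="main"><p>Web pages remain the main carrier of online information, and their number keeps growing.</p>
<p>Extracting the topic text from each page, rather than menus or adverts, helps a search engine index it.</p></div>
<div id="footer">Copyright notice</div>
</body></html>
"""

if __name__ == "__main__":
    print(extract_topic_text(SAMPLE))
```

On the hypothetical `SAMPLE` page, the navigation block is pruned because all of its text is link text, and the footer is pruned because it is short and unpunctuated, leaving only the two body paragraphs. A real extractor would need to tune the thresholds and cope with malformed markup, which this sketch does not attempt.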


Periodical: Key Engineering Materials (Volumes 467-469)

Pages: 1267-1272

DOI: https://doi.org/10.4028/www.scientific.net/KEM.467-469.1267

Online since: February 2011

Authors: Shun Zhang, Xing Shu Chen, Jun Tan

Keywords: DOM Tree, Information Extraction, Search Engine, Topic Information

