Research of Information Retrieval Based on Web Page Segmentation

Yang Xin Yu

doi:10.4028/www.scientific.net/AMM.204-208.4928

Abstract:

A Web information retrieval algorithm based on Web page segmentation is presented. Because Web pages are semi-structured, the key idea is to segment each page into distinct topic areas, or segments, according to its HTML tags and content. The algorithm first builds an HTML tag tree and then merges nodes in the tree according to content similarity and visual similarity. During retrieval and ranking, it uses the segmentation information to order the relevant pages. Experimental results show that the method significantly improves search precision, and it is also a useful reference for the design of future search engines.
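The abstract gives no implementation details, so the following is only a minimal sketch of the general idea, not the paper's algorithm. It assumes that block-level tags mark candidate segments, that content similarity is cosine similarity over bag-of-words vectors, and that sharing the same enclosing tag is a crude stand-in for visual similarity. The names `BlockCollector`, `segment`, `score_page`, the tag set, and the threshold value are all hypothetical.

```python
import math
import re
from collections import Counter
from html.parser import HTMLParser

# Assumption: these tags delimit candidate segments.
BLOCK_TAGS = {"div", "p", "td", "li", "section", "article", "h1", "h2", "h3"}


class BlockCollector(HTMLParser):
    """Walks the tag tree and records (enclosing block tag, text) pairs in document order."""

    def __init__(self):
        super().__init__()
        self.stack = []   # currently open block tags (path in the tag tree)
        self.blocks = []  # collected (tag, text) leaves

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK_TAGS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag in BLOCK_TAGS and tag in self.stack:
            # Pop up to and including the matching tag (tolerates unclosed tags).
            while self.stack and self.stack.pop() != tag:
                pass

    def handle_data(self, data):
        text = data.strip()
        if text and self.stack:
            self.blocks.append((self.stack[-1], text))


def term_vector(text):
    """Bag-of-words term frequencies for a piece of text."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def segment(html, threshold=0.2):
    """Merge adjacent blocks that share a tag (visual proxy) and have similar content."""
    parser = BlockCollector()
    parser.feed(html)
    segments = []
    for tag, text in parser.blocks:
        if segments:
            last_tag, last_text = segments[-1]
            similar = cosine(term_vector(last_text), term_vector(text)) >= threshold
            if tag == last_tag and similar:
                segments[-1] = (last_tag, last_text + " " + text)
                continue
        segments.append((tag, text))
    return [text for _, text in segments]


def score_page(query, segments):
    """Assumed ranking rule: score a page by its best-matching segment."""
    q = term_vector(query)
    return max((cosine(q, term_vector(s)) for s in segments), default=0.0)
```

For instance, `segment("<div><p>cheap flights to paris</p><p>paris flights and hotels</p><p>contact our sales team</p></div>")` merges the first two paragraphs into one segment and keeps the third separate. Scoring by the best segment rather than the whole page is one plausible way to use segmentation in ranking; the paper may weight segments differently.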

Info:

Periodical: Applied Mechanics and Materials (Volumes 204-208)
Pages: 4928-4931
DOI: https://doi.org/10.4028/www.scientific.net/AMM.204-208.4928
Online since: October 2012
Authors: Yang Xin Yu
Keywords: HTML Tag, Information Retrieval, Page Segment, Similarity

