Study on Web Information Intelligent Extraction for Agricultural Product Quantity Security System

Abstract:

Web information is the main data source for the agricultural product quantity security system, which draws on large amounts of basic data to provide comprehensive analysis and early warning for national agriculture. This paper presents a Web information extraction architecture and a novel approach to wrapper construction. The wrapper is intelligent in that it distinguishes and extracts both intensive and sparse data in Web pages in a single pass. During wrapper construction, hierarchical clustering is used to determine the key information nodes, and DOM techniques and heuristic rules are applied to generate extraction expressions according to the type of data. Experiments on a large number of Web pages from different Web sites indicate that the extraction method is feasible and efficient, with high recall and precision.
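As a rough illustration of the idea of separating intensive (repeated, table-like) data from sparse (one-off) data by DOM position, the following minimal Python sketch groups text nodes by their tag path. It is not the authors' wrapper: grouping by identical tag path stands in for the hierarchical clustering step, and the names `PathCollector`, `classify`, `min_repeat`, and the sample page are all hypothetical.

```python
from collections import defaultdict
from html.parser import HTMLParser

# Elements that never have a closing tag; they must not stay on the stack.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class PathCollector(HTMLParser):
    """Record each non-empty text node together with its DOM tag path."""

    def __init__(self):
        super().__init__()
        self.stack = []    # tags that are currently open
        self.records = []  # (tag_path, text) pairs in document order

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag, tolerating unclosed children.
        if tag in self.stack:
            while self.stack and self.stack.pop() != tag:
                pass

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.records.append(("/".join(self.stack), text))


def classify(html, min_repeat=3):
    """Split text nodes into intensive and sparse groups by tag path.

    A path shared by at least `min_repeat` text nodes is treated as
    intensive data; any other path is treated as sparse data.
    The threshold is an assumption made for this sketch.
    """
    parser = PathCollector()
    parser.feed(html)
    parser.close()
    groups = defaultdict(list)
    for path, text in parser.records:
        groups[path].append(text)
    intensive = {p: t for p, t in groups.items() if len(t) >= min_repeat}
    sparse = {p: t for p, t in groups.items() if len(t) < min_repeat}
    return intensive, sparse


# Made-up sample page, for illustration only.
SAMPLE = """
<html><body>
<h1>Wheat prices</h1>
<table>
<tr><td>Beijing</td><td>2.10</td></tr>
<tr><td>Jinan</td><td>2.05</td></tr>
<tr><td>Wuhan</td><td>2.20</td></tr>
</table>
<p>Updated weekly</p>
</body></html>
"""

if __name__ == "__main__":
    intensive, sparse = classify(SAMPLE)
    print("intensive:", intensive)
    print("sparse:", sparse)
```

In this sketch the table cells share one tag path and are grouped as intensive data, while the heading and the note each occur once and are treated as sparse. A real wrapper would likely need a similarity measure over paths and attributes rather than exact path equality.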

Info:

Periodical:

Advanced Materials Research (Volumes 108-111)

Edited by:

Yanwen Wu

Pages:

222-227

DOI:

10.4028/www.scientific.net/AMR.108-111.222

Citation:

S. D. Zhang et al., "Study on Web Information Intelligent Extraction for Agricultural Product Quantity Security System", Advanced Materials Research, Vols. 108-111, pp. 222-227, 2010

Online since:

May 2010
