Extracting Data Records Based on Global Schema
With the rapid growth of web data, the deep web has become the fastest-growing carrier of web data. Research on the deep web, especially on extracting data records from result pages, has therefore become an urgent task. We present a data record extraction method based on a Global Schema, which automatically extracts query result records from web pages. The method first analyzes the query interface and result record instances to build a Global Schema using an ontology. The Global Schema is then used to extract data records from result pages and to store them in a table. Experimental results indicate that the method extracts data records accurately and stores them in a table that conforms to the Global Schema.
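The two stages described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the schema contents, the label variants, the helper names (`map_label`, `align_record`, `store_records`), and the assumption that result pages have already been segmented into (label, value) pairs are all hypothetical. A plain dictionary of label variants stands in for the ontology-built Global Schema, and an SQLite table stands in for the output table.

```python
import sqlite3

# Hypothetical global schema: canonical attribute -> label variants observed
# in the query interface and in sample result records (assumed values).
GLOBAL_SCHEMA = {
    "title": {"title", "book title", "name"},
    "author": {"author", "by", "writer"},
    "price": {"price", "our price", "cost"},
}


def map_label(label):
    """Return the canonical attribute for a page label, or None if unknown."""
    key = label.strip().rstrip(":").lower()
    for attribute, variants in GLOBAL_SCHEMA.items():
        if key in variants:
            return attribute
    return None


def align_record(raw_record):
    """Align one raw record, given as (label, value) pairs, to the schema."""
    row = dict.fromkeys(GLOBAL_SCHEMA)  # attributes missing on the page stay None
    for label, value in raw_record:
        attribute = map_label(label)
        if attribute is not None:
            row[attribute] = value.strip()
    return row


def store_records(raw_records, connection):
    """Store aligned records in a table whose columns follow the schema."""
    columns = list(GLOBAL_SCHEMA)
    column_defs = ", ".join(f"{name} TEXT" for name in columns)
    connection.execute(f"CREATE TABLE IF NOT EXISTS records ({column_defs})")
    aligned = [align_record(record) for record in raw_records]
    rows = [tuple(row[name] for name in columns) for row in aligned]
    placeholders = ", ".join("?" for _ in columns)
    connection.executemany(f"INSERT INTO records VALUES ({placeholders})", rows)
    connection.commit()
    return rows
```

In this sketch, labels that match no schema attribute are dropped and attributes absent from a record are stored as NULL, so every stored row has the same columns regardless of which site the record came from.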
K. R. Chen et al., "Extracting Data Records Based on Global Schema", Applied Mechanics and Materials, Vols. 20-23, pp. 553-558, 2010