Web Data Extraction Research Based on Wrapper and XPath Technology
To satisfy users' varied needs, some websites consist of pages that are dynamically generated from a common template populated with data from the Web, such as product description pages on e-commerce sites. This paper combines wrapper technology with XPath to form a dependable, robust process for web data extraction. Experiments validating the method show that it extracts data from list pages with high efficiency.
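As a rough illustration of the general idea, and not the authors' implementation, the sketch below treats a wrapper as a small set of XPath rules: one rule locates each record on a template-generated list page, and further rules locate the fields inside a record. The sample page, the `WRAPPER` dictionary, and the `extract` function are all hypothetical names chosen for this example. It assumes the page is well-formed XML so that Python's standard `xml.etree.ElementTree` module, which supports only a subset of XPath, can parse it; real pages would usually need a lenient HTML parser first.

```python
import xml.etree.ElementTree as ET

# Hypothetical list page generated from a common template.
PAGE = """
<html><body>
<ul id="products">
  <li class="item"><a href="/p/1">Red Kettle</a><span class="price">19.90</span></li>
  <li class="item"><a href="/p/2">Blue Teapot</a><span class="price">24.50</span></li>
</ul>
</body></html>
"""

# Hypothetical wrapper: one XPath rule for the repeated record node,
# plus one relative XPath rule per field inside a record.
WRAPPER = {
    "record": ".//ul[@id='products']/li[@class='item']",
    "fields": {
        "name": "./a",
        "price": "./span[@class='price']",
    },
}


def extract(page, wrapper):
    """Apply the wrapper's XPath rules to a page and return one dict per record."""
    root = ET.fromstring(page)
    records = []
    for node in root.findall(wrapper["record"]):
        row = {}
        for field, path in wrapper["fields"].items():
            hit = node.find(path)
            # A missing or empty field is recorded as None instead of failing.
            row[field] = hit.text.strip() if hit is not None and hit.text else None
        records.append(row)
    return records


rows = extract(PAGE, WRAPPER)
for row in rows:
    print(row)
```

Keeping the rules in data rather than in code means that a different template needs only a different `WRAPPER` dictionary, while the extraction routine stays unchanged.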
H. Liu and Y. X. Ma, "Web Data Extraction Research Based on Wrapper and XPath Technology", Advanced Materials Research, Vols. 271-273, pp. 706-712, 2011