The Ranking of Deep Web Sources Based on Data Quality

Hu Yin; Yun Fei Lv; Wei Wei Wang

doi:10.4028/www.scientific.net/AMM.303-306.2437

Abstract:

Deep Web technology makes a large amount of useful information hidden behind interfaces easier for users to find. However, as the number of data sources grows, quickly finding a suitable result among many sources becomes increasingly important. This paper starts from data quality: we define six quality standards for a data source and give a method for calculating each. We then derive the weight vector for the quality standards from users' perceptions and, based on these standards and the weight vector, compute an overall score for any given data source. The paper also discusses sampling theory and proposes a reasonable sampling method for the experiment. The experimental results show that evaluating and scoring the data quality of a data source through sampling analysis is accurate and practical.
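The scoring idea in the abstract can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not name the six quality standards, their formulas, or the sampling scheme, so the criterion names, the completeness estimator, the simple random sampling, and the weighted sum below are all assumptions chosen for illustration.

```python
import random

# Hypothetical criterion names: the abstract states that there are six
# quality standards but does not name them.
CRITERIA = [
    "completeness", "accuracy", "timeliness",
    "consistency", "uniqueness", "availability",
]


def estimate_completeness(records, sample_size, rng):
    """Estimate the share of non-empty fields from a simple random sample.

    Simple random sampling is an assumption; the paper proposes its own
    sampling method, which the abstract does not describe.
    """
    sample = rng.sample(records, min(sample_size, len(records)))
    total = sum(len(rec) for rec in sample)
    filled = sum(1 for rec in sample for v in rec.values() if v not in (None, ""))
    return filled / total if total else 0.0


def overall_score(quality, weights):
    """Weighted sum of per-criterion scores, with weights normalized to sum to 1."""
    weight_sum = sum(weights[c] for c in CRITERIA)
    return sum(quality[c] * weights[c] for c in CRITERIA) / weight_sum


def rank_sources(sources, weights):
    """Return source names ordered from highest to lowest overall score."""
    return sorted(
        sources,
        key=lambda name: overall_score(sources[name], weights),
        reverse=True,
    )


if __name__ == "__main__":
    # Illustrative weights and per-criterion scores in [0, 1]; not data from the paper.
    weights = {c: 1.0 for c in CRITERIA}
    sources = {
        "source_a": {c: 0.5 for c in CRITERIA},
        "source_b": {c: 0.8 for c in CRITERIA},
    }
    print(rank_sources(sources, weights))
```

In this sketch the weight vector is supplied directly; in the paper it is derived from users' perceptions, and each per-criterion score would come from the paper's own calculation applied to sampled records.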

Info:

Periodical:

Applied Mechanics and Materials (Volumes 303-306)

Pages:

2437-2444

DOI:

https://doi.org/10.4028/www.scientific.net/AMM.303-306.2437

Online since:

February 2013

Authors:

Hu Yin, Yun Fei Lv, Wei Wei Wang

Keywords:

Data Quality, Deep Web Ranking, Quality Vector, Sampling Estimates
