A Histogram Based Analytical Approximate Query Processing for Massive Data

Abstract:

In this paper, we study the characteristics of analytical query processing and propose a histogram-based approximate method for query processing over massive data. We implemented this approach in the Hive system and evaluated it against Hive and BlinkDB clusters. The experimental results show that our method is significantly faster than these existing techniques.
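
The abstract does not describe how the histogram is built or how it is integrated into Hive, so the following is only a minimal sketch of the general idea behind histogram-based approximate aggregation, not the authors' method. It assumes an equi-width histogram that stores a count and a sum per bucket, and it assumes values are spread evenly inside each bucket. The function names `build_histogram` and `approx_count_sum` are hypothetical.

```python
def build_histogram(values, lo, hi, num_buckets):
    """Build an equi-width histogram with a count and a sum per bucket.

    Assumes every value lies in the closed range [lo, hi].
    """
    width = (hi - lo) / num_buckets
    counts = [0] * num_buckets
    sums = [0.0] * num_buckets
    for v in values:
        # Clamp the index so that v == hi lands in the last bucket.
        i = min(int((v - lo) / width), num_buckets - 1)
        counts[i] += 1
        sums[i] += v
    return {"lo": lo, "width": width, "counts": counts, "sums": sums}


def approx_count_sum(hist, a, b):
    """Estimate COUNT(*) and SUM(x) for rows with a <= x < b.

    Each bucket contributes its count and sum scaled by the fraction of
    the bucket that overlaps [a, b). This is exact for fully covered
    buckets and an approximation for partially covered ones.
    """
    est_count = 0.0
    est_sum = 0.0
    for i, (c, s) in enumerate(zip(hist["counts"], hist["sums"])):
        left = hist["lo"] + i * hist["width"]
        right = left + hist["width"]
        overlap = max(0.0, min(b, right) - max(a, left))
        frac = overlap / hist["width"]
        est_count += c * frac
        est_sum += s * frac
    return est_count, est_sum


# Usage: answer a range query from the summary instead of scanning the data.
hist = build_histogram(list(range(100)), lo=0, hi=100, num_buckets=10)
count_estimate, sum_estimate = approx_count_sum(hist, 0, 50)
```

The query touches only `num_buckets` summary entries rather than every row, which is where the speedup of this family of techniques comes from; the price is an estimation error in buckets that the query range covers only partially.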

Pages: 362-365

Online since: September 2013

Copyright: © 2013 Trans Tech Publications Ltd. All Rights Reserved

[1] Sameer Agarwal, Aurojit Panda, Barzan Mozafari, Anand P. Iyer, Samuel Madden, Ion Stoica. Blink and It's Done: Interactive Queries on Very Large Data. In PVLDB 5(12): 1902-1905 (2012).

DOI: 10.14778/2367502.2367533

[2] Sameer Agarwal, Barzan Mozafari, Aurojit Panda, et al. BlinkDB: Queries with Bounded Errors and Bounded Response Times on Very Large Data. In ACM EuroSys (2013).

DOI: 10.1145/2465351.2465355

[3] Feng Kaiping, Zhang Hua, Feng Chaoying, Chen Heng. Application of Histogram Method on Cost Estimate in Query Optimization. Computer & Digital Engineering (2010).

[4] S. Acharya, P. B. Gibbons, V. Poosala. Congressional samples for approximate answering of group-by queries. In ACM SIGMOD (May 2000).

DOI: 10.1145/335191.335450

[5] S. Agarwal, S. Kandula, N. Bruno, et al. Re-optimizing Data Parallel Computing. In NSDI (2012).

[6] G. Cormode. Sketch techniques for massive data. In Synopses for Massive Data: Samples, Histograms, Wavelets and Sketches (2011).

DOI: 10.1561/9781601985170