Clustering Columns of the Wide-Table in Cloud Computing

Abstract:

Data-centric web applications are an increasingly important trend in the information society. Cloud computing systems currently adopt column-oriented wide tables to represent the heterogeneous structured data of these applications. The wide table reduces wasted storage space but slows down queries. The paper implements hybrid partition on access frequency (HPAF), which partitions a wide table both horizontally and vertically. It uses a variant of consistent hashing to dynamically partition a wide table horizontally across multiple storage nodes according to each node's performance. It uses entropy to represent the reduction in the number of data blocks accessed when reading from a table with N columns rather than from N single-column tables. Drawing on the second law of thermodynamics, the paper designs an entropy-increasing clustering algorithm to classify the columns of a wide table. The algorithm finds the clustering, consisting of multiple classes, that saves the most access time. The paper also implements an algorithm for structured queries across multiple materialized views. Finally, the paper compares the query performance and storage efficiency of this strategy with single-column storage.
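The abstract does not specify which variant of consistent hashing the paper uses. As a rough illustration of the general idea only, the sketch below places rows on a hash ring in which each storage node receives a number of virtual points proportional to a performance weight. The class name `WeightedHashRing`, the node names, the weights, and the `points_per_weight` parameter are all assumptions made for this example and do not come from the paper.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a 64-bit integer position on the ring."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class WeightedHashRing:
    """Generic consistent-hash ring (hypothetical, not the paper's variant).

    Nodes with a higher performance weight get more virtual points,
    so they tend to own a larger share of the row keys.
    """

    def __init__(self, node_weights: dict[str, int], points_per_weight: int = 50):
        # node_weights must be non-empty with positive integer weights.
        self._ring: list[tuple[int, str]] = []
        for node, weight in node_weights.items():
            for i in range(weight * points_per_weight):
                self._ring.append((_hash(f"{node}#{i}"), node))
        self._ring.sort()
        self._positions = [pos for pos, _ in self._ring]

    def node_for(self, row_key: str) -> str:
        """Return the node owning the first ring point at or after the key's hash."""
        idx = bisect.bisect_left(self._positions, _hash(row_key))
        # Wrap around to the first point when the key hashes past the last one.
        return self._ring[idx % len(self._ring)][1]


# Assumed weights: node-c is treated as four times as capable as node-a.
ring = WeightedHashRing({"node-a": 1, "node-b": 2, "node-c": 4})
counts = {"node-a": 0, "node-b": 0, "node-c": 0}
for i in range(10_000):
    counts[ring.node_for(f"row-{i}")] += 1
```

Under these assumptions, `counts` should roughly follow the 1:2:4 weighting. Adding or removing a node moves only the keys adjacent to that node's points on the ring, which is the property that makes consistent hashing attractive for dynamic horizontal partitioning.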

Info:

Periodical:

Advanced Materials Research (Volumes 433-440)

Pages:

5129-5135

DOI:

https://doi.org/10.4028/www.scientific.net/AMR.433-440.5129

Online since:

January 2012

Authors:

Bin Huang, Yu Xing Peng

Keywords:

Clustering Algorithm, Entropy, Partition, Wide Table

Copyright:

© 2012 Trans Tech Publications Ltd. All Rights Reserved
