A Data Localization Algorithm for Distributing Column Storage System of Big Data

Jia Man Ding; Ying Jiang; Qing Xin Wang; Ying Li Liu; Meng Juan Li

doi:10.4028/www.scientific.net/AMR.756-759.3089



Abstract:

Distributed column storage is one technique for improving the efficiency of big data access in a cloud computing environment. To achieve this aim and reduce the frequency of data access over the network, the paper establishes a data localization strategy and designs a multi-threaded algorithm. First, the data is segmented horizontally, and each data table is then divided vertically into data columns, ensuring that column data belonging to the same horizontal segment is localized on the same node in the cluster. Second, the paper designs and implements the data localization algorithm under the Hadoop distributed cloud computing framework. Finally, experiments show that using the data localization algorithm markedly reduces network access and improves data access efficiency.
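The placement idea in the abstract can be illustrated with a minimal in-memory sketch. This is not the authors' Hadoop implementation: the function names (`place_columns`, `read_rows`), the round-robin choice of node, and the dictionary that stands in for cluster storage are all assumptions made for illustration, and the multi-threaded part of the algorithm is omitted.

```python
from collections import defaultdict


def place_columns(rows, columns, segment_size, nodes):
    """Segment rows horizontally, split each segment into columns, and
    put every column chunk of a segment on the same node.

    Hypothetical helper: node choice is round-robin, which is an
    assumption and not a policy taken from the paper.
    """
    placement = defaultdict(dict)  # node -> {(segment_id, column): values}
    for start in range(0, len(rows), segment_size):
        segment_id = start // segment_size
        segment = rows[start:start + segment_size]
        node = nodes[segment_id % len(nodes)]
        for idx, column in enumerate(columns):
            # Vertical split: one chunk per column of this segment.
            placement[node][(segment_id, column)] = [row[idx] for row in segment]
    return dict(placement)


def read_rows(placement, node, segment_id, columns):
    """Rebuild (partial) rows of one segment using chunks from one node only."""
    chunks = [placement[node][(segment_id, column)] for column in columns]
    return list(zip(*chunks))


rows = [(1, "a", 10.0), (2, "b", 20.0), (3, "c", 30.0),
        (4, "d", 40.0), (5, "e", 50.0)]
columns = ["id", "name", "value"]
placement = place_columns(rows, columns, segment_size=2, nodes=["node0", "node1"])
print(read_rows(placement, "node1", 1, ["id", "value"]))
```

Because all column chunks of a segment sit on one node in this sketch, rebuilding rows for that segment touches no other node, which is the kind of network access the abstract says the strategy reduces.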


Info:

Periodical:

Advanced Materials Research (Volumes 756-759)

Pages:

3089-3093

DOI:

https://doi.org/10.4028/www.scientific.net/AMR.756-759.3089


Online since:

September 2013

Authors:

Jia Man Ding, Ying Jiang, Qing Xin Wang, Ying Li Liu, Meng Juan Li

Keywords:

Column Storage, Data Localization, Distribution, Hadoop

