ARRA: An Associated Replica Replacement Algorithm Based on Apriori Approach for Data Intensive Jobs in Data Grid
Creating many replicas is an efficient strategy for processing data-intensive jobs in a data grid, and replica replacement is a crucial step in this strategy. Economic, popularity, and hybrid models, among others, have been proposed to address replica replacement through analysis and prediction based on individual data files. However, these models neglect the association relationships among different data files. To uncover the association relationships hidden in data-intensive jobs, the Apriori algorithm from the data mining field is adopted to analyze the behavior of each data-intensive job. This paper proposes an associated replica replacement algorithm (ARRA) based on the Apriori approach for data grids. The algorithm has two major steps: 1) associated behavior analysis and classification of the data files in each node; 2) generation and application of replica replacement rules. The proposed algorithm is simulated in OptorSim and compared with the LFU (Least Frequently Used) algorithm. The experiments show that it has a relative advantage over LFU in terms of mean job time of all jobs, number of remote file accesses, and effective network usage.
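As an illustration of the association analysis described above, the following is a minimal Python sketch of generic Apriori mining over per-job file-access sets. It is not the authors' implementation: the abstract does not specify how files are classified or how replacement rules are built, so the function names (`apriori`, `association_rules`), the sample job log, and the thresholds are all hypothetical choices made only for this example.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset(files): support_count} for every frequent file set.

    `transactions` is a list of sets, one per job, holding the files the job
    accessed. `min_support` is an absolute count (an assumption of this sketch).
    """
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count how many jobs touch each individual file.
    counts = {}
    for t in transactions:
        for f in t:
            key = frozenset([f])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    frequent = dict(current)

    k = 2
    while current:
        # Join step: merge frequent (k-1)-sets into k-set candidates and
        # prune any candidate that has an infrequent (k-1)-subset.
        candidates = set()
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) == k and all(
                frozenset(s) in current for s in combinations(union, k - 1)
            ):
                candidates.add(union)

        # Count step: keep candidates that appear in enough jobs.
        current = {}
        for c in candidates:
            n = sum(1 for t in transactions if c <= t)
            if n >= min_support:
                current[c] = n
        frequent.update(current)
        k += 1
    return frequent


def association_rules(frequent, min_confidence):
    """Return (antecedent, consequent, confidence) triples from frequent sets."""
    rules = []
    for itemset, count in frequent.items():
        if len(itemset) < 2:
            continue
        for r in range(1, len(itemset)):
            for ante in combinations(sorted(itemset), r):
                ante = frozenset(ante)
                # Every subset of a frequent set is frequent, so the lookup is safe.
                confidence = count / frequent[ante]
                if confidence >= min_confidence:
                    rules.append((ante, itemset - ante, confidence))
    return rules


# Hypothetical job log: each set lists the files one job accessed.
jobs = [
    {"f1", "f2", "f3"},
    {"f1", "f2"},
    {"f1", "f2", "f4"},
    {"f3", "f4"},
    {"f1", "f2", "f3"},
]
frequent = apriori(jobs, min_support=3)
found_rules = association_rules(frequent, min_confidence=0.8)
```

In this toy log, `f1` and `f2` are accessed together in four of the five jobs, so they form a frequent pair and each implies the other. A replacement policy that respects such associations might, for instance, avoid evicting one file of a strongly associated pair while the other remains in use; how ARRA actually turns the mined associations into replacement rules is not stated in this abstract.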
J. H. Jiang et al., "ARRA: An Associated Replica Replacement Algorithm Based on Apriori Approach for Data Intensive Jobs in Data Grid", Key Engineering Materials, Vols. 439-440, pp. 1409-1414, 2010