Exploring the Reference Management in Parallel De-Duplication

Abstract:

With the explosion of digital information on networks, data de-duplication has been widely adopted in backup systems to improve space efficiency. When only unique data segments are stored and shared among backup files, the reference information between files and their data segments becomes increasingly important for tracking data usage and reclaiming freed space. However, on multi-core and many-core processors, the general Group Mark-and-Sweep method performs poorly under concurrent reference updates due to synchronization overhead. To alleviate this challenge, a Parallel Mark-and-Sweep mechanism has been developed, building on research into DHT and similarity methods. In our experiments with real-world datasets, it shows better performance than the sequential method.
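The full mechanism is not shown in this preview, so the following Python is only an illustrative sketch of the idea the abstract describes, not the authors' implementation: hash-partition the fingerprint space DHT-style so that each worker marks a disjoint shard of reference records without synchronization, then sweep unmarked segments. All names here (shard_of, mark_shard, NUM_SHARDS) and the data layout are assumptions, and Python threads merely stand in for real parallel workers.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor

NUM_SHARDS = 4  # hypothetical shard count; a real system would match core count

def shard_of(fp: bytes) -> int:
    # DHT-style placement: each fingerprint deterministically maps to exactly
    # one shard, so workers marking different shards never share state.
    return int.from_bytes(hashlib.sha1(fp).digest()[:4], "big") % NUM_SHARDS

def mark_shard(shard_id: int, live_files: dict) -> set:
    # Mark phase for one shard: record every segment fingerprint referenced
    # by a live backup file that hashes into this shard. No locks needed.
    marked = set()
    for segment_fps in live_files.values():
        for fp in segment_fps:
            if shard_of(fp) == shard_id:
                marked.add(fp)
    return marked

def parallel_mark_and_sweep(live_files: dict, stored_segments: set) -> set:
    # Run one marker per shard in parallel, then sweep: any stored segment
    # left unmarked is unreferenced and its space can be reclaimed.
    with ThreadPoolExecutor(max_workers=NUM_SHARDS) as pool:
        marks = pool.map(mark_shard, range(NUM_SHARDS),
                         [live_files] * NUM_SHARDS)
        marked = set().union(*marks)
    return stored_segments - marked

# Example: segment c3 is stored but no longer referenced, so it is swept.
live = {"backup1": [b"c1", b"c2"], "backup2": [b"c2"]}
store = {b"c1", b"c2", b"c3"}
print(parallel_mark_and_sweep(live, store))  # -> {b'c3'}
```

Because shard_of routes each fingerprint to exactly one worker, the mark phase needs no locks or atomic reference counters; that lock-free partitioning is what lets a parallel version avoid the synchronization overhead the abstract attributes to the general Group Mark-and-Sweep method.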

Info:

Pages:

236-239

Online since:

September 2013

Copyright:

© 2013 Trans Tech Publications Ltd. All Rights Reserved

References:

[1] The Digital Universe Decade – Are You Ready?, An IDC Analysis Report, http://www.emc.com/collateral/demos/microsites/idc-digital-universe/iview.html.

[2] NetApp Deduplication (ASIS). http://www.netapp.com/us/products/platform-os/dedup.html.

[3] S. Quinlan and S. Dorward. Venti: A New Approach to Archival Storage. In Proceedings of the 1st USENIX Conference on File and Storage Technologies (FAST'02), Monterey, CA, January 2002.

[4] S. Rhea, R. Cox, and A. Pesterev. Fast, Inexpensive Content-Addressed Storage in Foundation. In Proceedings of the USENIX Annual Technical Conference (USENIX ATC'08), Boston, MA, June 2008.

[5] U. Manber. Finding Similar Files in a Large File System. Technical Report TR 93-33, Department of Computer Science, University of Arizona, October 1993; also in Proceedings of the USENIX Winter 1994 Technical Conference, 17-21, 1994.

[6] F. Guo and P. Efstathopoulos. Building a High-Performance Deduplication System. In Proceedings of the 2011 USENIX Annual Technical Conference (USENIX ATC'11), 2011.

[7] D. Bhagwat, K. Eshghi, D. D. E. Long, and M. Lillibridge. Extreme Binning: Scalable, Parallel Deduplication for Chunk-Based File Backup. In Proceedings of the 17th IEEE/ACM International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems (MASCOTS'09), London: IEEE Press, 2009, pp. 1-9. DOI: 10.1109/MASCOT.2009.5366623.