Research on Parallel DBSCAN Algorithm Design Based on MapReduce
Data clustering has received considerable attention in many applications, such as data mining, document retrieval, image segmentation, and pattern classification. The growing volume of information produced by advances in technology makes clustering very large datasets a challenging task. To address this problem, a growing number of researchers are trying to design efficient parallel clustering algorithms. In this paper, we propose a parallel DBSCAN clustering algorithm based on Hadoop, a simple yet powerful parallel programming platform. The experimental results demonstrate that the proposed algorithm scales well and efficiently processes large datasets on commodity hardware.
Riza Esa and Yanwen Wu
Y. X. Fu et al., "Research on Parallel DBSCAN Algorithm Design Based on MapReduce", Advanced Materials Research, Vols. 301-303, pp. 1133-1138, 2011