Feedback Clustering Algorithm for Detecting Approximately Duplicate Records
Abstract:
Detecting and merging approximately duplicate records is not a new problem in the field of data cleansing. Most methods for detecting duplicate records are based on the "sort-merge" approach. Clustering methods have also been applied to data cleaning, but as the number of records grows, a large number of non-duplicate records remain in the clusters after analysis. In response to this shortcoming, this paper presents a data cleansing method based on a Clustering Feedback Pattern. The comparison results from clustering are fed back into the clustering process so that recall and precision improve.
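For orientation, the "sort-merge" approach mentioned in the abstract is often realized as a sorted-neighborhood pass: records are sorted on a key, and each record is compared only with its nearest neighbors in that order. The sketch below illustrates that baseline only. It is not the paper's feedback clustering method, and the function names, the `window` size, the `threshold` value, and the sample records are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a ratio in [0, 1]; 1.0 means the strings are identical."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def sorted_neighborhood(records, key, window=3, threshold=0.85):
    """Return index pairs of records judged approximately duplicate.

    `key`, `window`, and `threshold` are illustrative parameters,
    not values taken from the paper.
    """
    # Sort step: order record indices by the sorting key.
    order = sorted(range(len(records)), key=lambda i: key(records[i]))
    pairs = set()
    # Merge step: compare each record only with the neighbours
    # that fall inside the sliding window.
    for pos, i in enumerate(order):
        for j in order[pos + 1 : pos + window]:
            if similarity(records[i], records[j]) >= threshold:
                pairs.add((min(i, j), max(i, j)))
    return pairs


# Hypothetical sample data: records 0 and 2 differ by one character.
records = [
    "John Smith, 12 Main Street",
    "Mary Jones, 4 Oak Avenue",
    "Jon Smith, 12 Main Street",
]
duplicates = sorted_neighborhood(records, key=str.lower)
```

Because only records that sort close together are compared, a poor choice of key can place true duplicates outside the window. That limitation is one reason clustering-based alternatives are explored.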
Pages:
2138-2141
Online since:
August 2014
Copyright:
© 2014 Trans Tech Publications Ltd. All Rights Reserved