A Fuzzy C-Means Approach for Incomplete Data Sets Based on Nearest-Neighbor Intervals

Dan Li; Chong Quan Zhong; Shi Qiang Wang

doi:10.4028/www.scientific.net/AMM.411-414.1108

Applied Mechanics and Materials, Vols. 411-414, p. 1108

Abstract:

Partially missing data sets are a prevailing problem in pattern recognition. This paper considers the problem of clustering incomplete data sets, in which missing attribute values are imputed by the centers of their corresponding nearest-neighbor intervals. First, the algorithm estimates the nearest-neighbor interval of each missing attribute value, making full use of the attribute distribution information of the data set. Second, each missing attribute value is imputed by the center of its interval so that the incomplete data set can be clustered. The proposed algorithm introduces nearest-neighbor information into incomplete data clustering, and comparisons of experimental results on two UCI data sets demonstrate the capability of the proposed algorithm.
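
The two steps described in the abstract can be illustrated with a minimal Python sketch. The abstract does not say how neighbors are found or how an interval is built, so everything below is an assumption made for illustration: neighbors are ranked by a rescaled partial distance over the attributes both samples have observed, the interval is taken as the [min, max] range of the attribute over the `q` nearest neighbors, and the completed data are clustered with a textbook fuzzy c-means loop using a simple deterministic initialization. The names `partial_distance`, `impute_interval_centers`, `fuzzy_c_means`, and the parameter `q` are hypothetical and are not taken from the paper.

```python
import math

def partial_distance(a, b):
    """Squared Euclidean distance over attributes observed in both samples,
    rescaled by (number of attributes / number of shared attributes).
    None marks a missing value. (Assumed distance, not from the abstract.)"""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return math.inf
    return len(a) / len(shared) * sum((x - y) ** 2 for x, y in shared)

def impute_interval_centers(data, q=3):
    """Replace each missing value with the midpoint of the [min, max] interval
    spanned by that attribute over the q nearest neighbors that observe it."""
    completed = [list(row) for row in data]
    for i, row in enumerate(data):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Rank the other samples that have attribute j observed.
            candidates = sorted(
                (partial_distance(row, other), other[j])
                for k, other in enumerate(data)
                if k != i and other[j] is not None
            )
            neighbor_values = [v for _, v in candidates[:q]]
            if not neighbor_values:
                raise ValueError(f"attribute {j} is never observed")
            completed[i][j] = (min(neighbor_values) + max(neighbor_values)) / 2
    return completed

def fuzzy_c_means(points, c=2, m=2.0, iters=100, tol=1e-6):
    """Textbook fuzzy c-means on complete data; returns (centers, memberships)."""
    n, dim = len(points), len(points[0])
    # Deterministic initialization (an assumption): evenly spaced samples.
    centers = [list(points[round(k * (n - 1) / max(c - 1, 1))]) for k in range(c)]
    memberships = []
    for _ in range(iters):
        # Membership update from squared distances to the current centers.
        memberships = []
        for p in points:
            d2 = [max(sum((x - v) ** 2 for x, v in zip(p, ctr)), 1e-12)
                  for ctr in centers]
            inv = [d ** (-1.0 / (m - 1)) for d in d2]
            total = sum(inv)
            memberships.append([w / total for w in inv])
        # Center update: mean of the points weighted by membership ** m.
        new_centers = []
        for k in range(c):
            weights = [u[k] ** m for u in memberships]
            wsum = sum(weights)
            new_centers.append(
                [sum(w * p[j] for w, p in zip(weights, points)) / wsum
                 for j in range(dim)]
            )
        shift = max(abs(a - b)
                    for nc, oc in zip(new_centers, centers)
                    for a, b in zip(nc, oc))
        centers = new_centers
        if shift < tol:
            break
    return centers, memberships

# Toy data: two well-separated groups, each with one missing value.
data = [
    [1.0, 1.0], [1.2, 0.8], [0.9, None], [1.1, 1.2],
    [5.0, 5.0], [5.2, None], [4.8, 5.1], [5.1, 4.9],
]
completed = impute_interval_centers(data, q=3)
centers, memberships = fuzzy_c_means(completed, c=2)
```

On this toy data each missing value is filled from neighbors in its own group, after which standard fuzzy c-means runs on the completed data. Other definitions of the interval or of the neighbor distance would change the imputed values, so this sketch should not be read as a reproduction of the paper's experiments.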