Incomplete Big Data Distributed Clustering

Yong Lin Leng

doi:10.4028/www.scientific.net/AMM.687-691.1496

Abstract:

Partially missing or blurred attribute values leave data incomplete at collection time. Incomplete data is usually handled before clustering, either by imputation or by discarding the affected records. This paper proposes a new similarity metric algorithm based on an incomplete information system. The algorithm first divides the data set into a complete data set and an incomplete data set. The complete data set is then clustered with the affinity propagation (AP) clustering algorithm, and each incomplete record is assigned to the corresponding cluster according to the proposed similarity metric. To improve efficiency, a distributed version of the clustering algorithm is designed on top of cloud computing technology. Experiments show that the proposed algorithm can cluster incomplete big data directly and effectively improve clustering accuracy.
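The split-and-assign flow described in the abstract can be sketched as follows. This is only an illustration under stated assumptions, not the paper's method: the abstract does not define its similarity metric, so the sketch substitutes a partial-distance measure (Euclidean distance over the observed attributes, rescaled to the full dimension), a strategy known from the incomplete-data fuzzy c-means literature. The names `split_records`, `partial_distance`, and `assign_incomplete` are hypothetical, missing values are assumed to be encoded as `None`, and the exemplars are hard-coded stand-ins for what affinity propagation would return on the complete subset.

```python
import math


def split_records(records):
    """Separate fully observed records from records with a missing value.

    Assumption: a missing attribute value is encoded as None.
    """
    complete = [r for r in records if all(v is not None for v in r)]
    incomplete = [r for r in records if any(v is None for v in r)]
    return complete, incomplete


def partial_distance(record, exemplar):
    """Euclidean distance over observed attributes, rescaled to full dimension.

    Stand-in for the paper's similarity metric, which the abstract
    does not specify.
    """
    observed = [(x, c) for x, c in zip(record, exemplar) if x is not None]
    if not observed:
        raise ValueError("record has no observed attributes")
    squared = sum((x - c) ** 2 for x, c in observed)
    return math.sqrt(squared * len(record) / len(observed))


def assign_incomplete(incomplete, exemplars):
    """Return, for each incomplete record, the index of its nearest exemplar."""
    return [
        min(range(len(exemplars)),
            key=lambda k: partial_distance(r, exemplars[k]))
        for r in incomplete
    ]


# Toy data: four complete records and two with one missing attribute each.
records = [
    (1.0, 1.1), (0.9, 1.0), (5.0, 5.2), (5.1, 4.9),
    (1.0, None), (None, 5.0),
]
complete, incomplete = split_records(records)

# Hypothetical exemplars; in the described pipeline these would come from
# running affinity propagation on `complete`.
exemplars = [(1.0, 1.1), (5.0, 5.2)]

labels = assign_incomplete(incomplete, exemplars)
print(labels)  # [0, 1]
```

Each incomplete record is assigned independently of the others, so under these assumptions the assignment step could be run as a map task over partitions of the incomplete subset once the exemplars are known.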

Info:

Periodical:

Applied Mechanics and Materials (Volumes 687-691)

Pages:

1496-1499

DOI:

https://doi.org/10.4028/www.scientific.net/AMM.687-691.1496

Online since:

November 2014

Authors:

Yong Lin Leng*

Keywords:

AP Clustering, Cloud Computing, Incomplete Big Data

* - Corresponding Author

References

[1] B. J. Frey and D. Dueck, Clustering by passing messages between data points, Science, vol. 315, no. 5814, pp.972-976, (2007).

DOI: 10.1126/science.1136800

[2] R. J. Hathaway and J. C. Bezdek, Fuzzy c-means clustering of incomplete data, IEEE Transactions on Systems, Man and Cybernetics, vol. 31, no. 5, pp.735-744, (2001).

DOI: 10.1109/3477.956035

[3] R. J. Hathaway and J. C. Bezdek, Clustering incomplete relational data using the non-Euclidean relational fuzzy c-means algorithm, Pattern Recognition Letters, vol. 23, no. 1, pp.151-160, (2002).

DOI: 10.1016/s0167-8655(01)00115-5

[4] D. Li, H. Gu, L. Zhang, A hybrid genetic algorithm-fuzzy c-means approach for incomplete data clustering based on nearest-neighbor intervals, Soft Computing, vol. 17, no. 10, pp.1787-1796, (2013).

DOI: 10.1007/s00500-013-0997-7

[5] K. Chen, D. Yang, C. Zhang, Novel algorithm for filling incomplete data of internet of things based on attribute reduction, Computer Engineering and Design, vol. 34, no. 2, pp.418-422, (2013).

[6] M. Armbrust, A. Fox, R. Griffith, A. D. Joseph, R. Katz, A. Konwinski, G. Lee, D. Patterson, A. Rabkin, and I. Stoica, A view of cloud computing, Communications of the ACM, vol. 53, no. 4, pp.50-58, (2010).

DOI: 10.1145/1721654.1721672

[7] J. Dean and S. Ghemawat, MapReduce: simplified data processing on large clusters, Communications of the ACM, vol. 51, no. 1, pp.107-113, (2008).

DOI: 10.1145/1327452.1327492
