Authors: Wei Jin, Xiao Rong Zhao
Abstract: Clustering analysis plays an important role in scientific research and commercial applications. The K-means algorithm is a widely used partitioning method in clustering. In this method, the number of clusters is predefined, and the technique is highly dependent on the initial identification of elements that represent the clusters well. As the scale of datasets increases rapidly, it becomes difficult to apply K-means to massive data. To address this problem, an algorithm for refining the initial points is provided. It can reduce execution time and improve solutions for large data by refining the initial conditions. The experiments demonstrate that sample-based K-means is more stable and more accurate.
472
Authors: Xiao Hai Wang, Xian Xian Pan, Jiang Wan, Jun Qi
Abstract: An accurate model of a wind farm is the basis for analyzing the grid integration of wind power. This paper proposes an equivalent P-v (active power versus wind speed) model built with the K-means algorithm from measured operating data. An actual wind farm is used to verify the effectiveness of the model. A comparative error analysis shows that the proposed method greatly improves the accuracy of the model compared with the traditional model.
233
Authors: He Wei Zhang, Lei Sun, Hong Zhang
Abstract: The K-means algorithm is a classical algorithm for clustering problems in data mining; when the sample data meet certain conditions, the clustering results are good. However, the algorithm is sensitive to the initial cluster centers, and the clustering results change with the choice and number of initial cluster centers. To address this shortcoming, this paper proposes a new algorithm based on Prim's algorithm to select the initial cluster centers, details the basic idea of the algorithm together with the improved methods and implementation steps, and finally presents a test for comparative analysis. The results show that the improved K-means clustering algorithm does not need the initial cluster centers to be specified in advance, is not sensitive to abnormal values, and, through its use of a greedy strategy, achieves better clustering than the usual algorithms.
2063
Authors: Li Ying Cao, He Long Yu, Gui Fen Chen, Ting Ting Yang
Abstract: In precision agriculture, soil fertility evaluation is the foundation of variable rate fertilization. In traditional evaluation methods, the initial cluster centers of the K-means algorithm for soil fertility levels are generated randomly from the data set, so the clustering result is not stable. This paper proposes an improved K-means algorithm that uses a density-based method to optimize the selection of the initial cluster centers: the K points in high-density regions that are farthest from each other are taken as the initial cluster centers. Experiments show that the improved K-means algorithm can eliminate the dependence on the initial cluster centers and that the clustering result is greatly improved.
2047
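The initialization described in the abstract above (high-density points that are mutually far apart) can be illustrated with a minimal Python sketch. The abstract does not define density or the high-density threshold, so the function name `density_farthest_init`, the `radius` parameter, the neighbor-count density, and the mean-density cutoff are all assumptions made here for illustration, not the authors' method.

```python
import math

def density_farthest_init(points, k, radius):
    """Pick up to k initial centers that lie in dense regions and are far apart."""
    n = len(points)
    # Density of a point: number of other points within `radius` (assumed definition).
    density = [
        sum(1 for j in range(n)
            if j != i and math.dist(points[i], points[j]) <= radius)
        for i in range(n)
    ]
    mean_density = sum(density) / n
    # High-density candidates: density at or above the mean (assumed threshold).
    candidates = [i for i in range(n) if density[i] >= mean_density]
    # Seed with the densest candidate, then greedily add the candidate whose
    # nearest already-chosen center is farthest away.
    chosen = [max(candidates, key=lambda i: density[i])]
    while len(chosen) < k:
        remaining = [i for i in candidates if i not in chosen]
        if not remaining:
            break  # fewer than k dense points are available
        chosen.append(max(
            remaining,
            key=lambda i: min(math.dist(points[i], points[c]) for c in chosen),
        ))
    return [points[i] for i in chosen]
```

Because isolated points have low density, they are excluded from the candidate set and cannot become initial centers, which is one plausible reading of why such a scheme is less sensitive to outliers than random seeding.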
Authors: Hong Bo Zhou, Jun Tao Gao
Abstract: In the K-means algorithm, the clustering result is easily influenced by the initial cluster centers, so an improved algorithm for selecting the initial cluster centers is presented. The algorithm first finds the maximum Euclidean distance within a cluster, then splits the cluster by using the two data objects with the maximum distance as new cluster centers, and repeats these steps until the specified number of cluster centers is obtained. Compared with the original algorithm, the improved algorithm resolves the instability of the clustering result caused by randomness, and its time complexity is also reduced.
337
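The splitting procedure in the abstract above can be sketched in Python as follows. This is a rough illustration under stated assumptions: the names `farthest_pair` and `split_init` are hypothetical, the cluster chosen for splitting is taken to be the one with the largest internal pairwise distance, and the final centers are taken to be the cluster means. The abstract specifies none of these details. The brute-force pairwise search is used only for clarity and says nothing about the time complexity the authors report.

```python
import math
from itertools import combinations

def farthest_pair(cluster):
    """Return (distance, a, b) for the two most distant points in a cluster."""
    if len(cluster) < 2:
        return (0.0, None, None)
    return max(
        ((math.dist(a, b), a, b) for a, b in combinations(cluster, 2)),
        key=lambda t: t[0],
    )

def split_init(points, k):
    """Split clusters along their farthest pair until k clusters exist."""
    clusters = [list(points)]
    while len(clusters) < k:
        # Choose the cluster with the largest internal distance (assumption).
        idx = max(range(len(clusters)),
                  key=lambda i: farthest_pair(clusters[i])[0])
        dist, a, b = farthest_pair(clusters[idx])
        if dist == 0.0:
            break  # no cluster can be split further
        target = clusters.pop(idx)
        # The two farthest objects act as temporary centers for the split.
        near_a = [p for p in target if math.dist(p, a) <= math.dist(p, b)]
        near_b = [p for p in target if math.dist(p, a) > math.dist(p, b)]
        clusters += [near_a, near_b]
    # Use each cluster's mean as an initial center (assumption).
    return [tuple(sum(c) / len(cl) for c in zip(*cl)) for cl in clusters]
```

The procedure is deterministic for a given input, which is consistent with the abstract's claim that it removes the instability caused by random seeding.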
Authors: Zhe Yuan Ding, Ming Ke He, Ming Ze Gao, Fang Fang Li
Abstract: The K-means algorithm is commonly used in text clustering. The traditional K-means algorithm is sensitive to the initial centers: the clustering result depends excessively on them, and the output fluctuates considerably for different inputs. The proposed K-means algorithm combines a feature dictionary with density-based outlier detection to detect outliers in text data. In the first stage, a density parameter is assigned to every data object using a custom distance function. In the second stage, K-means is used to cluster based on the density distribution; K data objects are chosen as the initial cluster centers because they lie in high-density areas and are farthest from each other. In the third stage, the exceptional text sets are identified from the clusters by the outlier detection algorithm. Experimental results show that the proposed approach can efficiently detect outliers in a data set.
2233
Authors: Er Jing Xu, Zhen Hong Jia, Lie Jun Wang, Ying Jie Hu, Jie Yang
Abstract: Given the characteristics of remote sensing images, we propose a novel method based on the K-means algorithm combined with an improved multi-phase level set model. Compared with the classical multi-phase C-V model, the improved model considers region area information, gradient information, and edge detection. Proper use of gradient information can overcome inaccurate edge localization in image segmentation. Edge detection is used to better preserve boundary information during the evolution process. To speed up the convergence of the contour and avoid traps, four stages are constructed. First, median filtering is applied to smooth the original image and reduce part of the noise. Second, the K-means algorithm is used to make the differences between characteristics more pronounced. Next, the gradient is reconstructed using the Sobel operator. Finally, the segmentation result is obtained using the improved multi-phase level set image segmentation method. Experimental results show that the proposed approach is fast and efficient for remote sensing image segmentation.
457
Authors: Li Li Dong, Zong Shuai Ma, Wei Dong, Xiang Zhang
Abstract: This paper analyzes the MMPI psychological data of a company's employees. To address the sensitivity of the traditional K-means algorithm to the initial cluster centers, the hierarchical clustering algorithm CURE is used to mitigate the problem. Finally, CUDA is used to run the clustering several times in order to improve the execution efficiency of the algorithm. Experiments verify that the improved K-means algorithm performs well in both execution efficiency and clustering results.
1664
Authors: Hong Bo Zhou, Jun Tao Gao
Abstract: The K-means clustering algorithm clusters datasets according to a given number of clusters k. However, k cannot be determined beforehand. A new clustering validity index is designed from the standpoint of sample geometry. Based on this index, a new method for determining the optimal number of clusters in the K-means clustering algorithm is proposed.
231
Authors: Jin Yan Tang, Yue Lei Xie, Cheng Cheng Peng
Abstract: In this paper, a sub-array division technique using the K-means algorithm for spherical conformal arrays is proposed. All elements of a spherical conformal array can be divided into a few sub-arrays by employing the K-means algorithm, and the standard multiple signal classification (MUSIC) algorithm is applied on these sub-arrays to estimate the signals' direction of arrival (DOA). Simulations of DOA estimation on a rotational spherical conformal array show that the proposed method improves DOA resolution compared with existing methods.
2884