Journal: International Journal of Intelligent Systems and Applications

Density Based Initialization Method for K-Means Clustering Algorithm



Abstract

Data clustering is a basic technique for revealing the structure of a data set. K-means is a widely accepted clustering method that follows a partitional approach, dividing the given data set into non-overlapping groups. Unfortunately, it has the pitfall of choosing the initial cluster centers at random. Due to its gradient nature, the algorithm is highly sensitive to the initial seed values. In this paper, we propose a kernel density-based method to compute initial seed values for the k-means algorithm. The idea is to select initial points from the denser regions, because such points truly reflect the properties of the overall data set. Consequently, the method avoids selecting outliers as initial seed values. We have verified the proposed method on real data sets using several internal and external validity measures. The experimental analysis shows that the proposed method performs better than k-means, k-means++, and other recent initialization methods.
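
The abstract does not spell out the selection rule, so the following is only a minimal sketch of the general idea (seeds drawn from dense regions, outliers skipped), not the paper's actual method. The function names `gaussian_density` and `density_seeds`, the Gaussian kernel, the fixed `bandwidth`, the mean-density cutoff, and the "density times distance to nearest seed" score are all assumptions made for illustration.

```python
import math


def gaussian_density(points, bandwidth):
    """Return an unnormalized Gaussian kernel density estimate at each point."""
    densities = []
    for p in points:
        total = 0.0
        for q in points:
            sq_dist = sum((a - b) ** 2 for a, b in zip(p, q))
            total += math.exp(-sq_dist / (2.0 * bandwidth ** 2))
        densities.append(total / len(points))
    return densities


def density_seeds(points, k, bandwidth=1.0):
    """Pick k seeds from dense regions (assumed heuristic, not the paper's rule)."""
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    densities = gaussian_density(points, bandwidth)

    # Assumption: points below the mean density are treated as likely outliers.
    mean_density = sum(densities) / len(densities)
    candidates = [i for i, d in enumerate(densities) if d >= mean_density]
    if len(candidates) < k:
        # Too few dense points to supply k seeds: fall back to all points.
        candidates = list(range(len(points)))

    # First seed: the candidate with the highest estimated density.
    chosen = [max(candidates, key=lambda i: densities[i])]
    while len(chosen) < k:
        best_idx, best_score = None, -1.0
        for i in candidates:
            if i in chosen:
                continue
            # Distance from this candidate to its nearest already-chosen seed.
            nearest = min(math.dist(points[i], points[j]) for j in chosen)
            # Assumed score: prefer points that are dense and far from chosen seeds.
            score = densities[i] * nearest
            if score > best_score:
                best_idx, best_score = i, score
        chosen.append(best_idx)
    return [points[i] for i in chosen]


# Two tight groups plus one isolated point that should not become a seed.
data = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
        (5.0, 5.0), (5.1, 5.0), (5.0, 5.1),
        (20.0, 20.0)]
seeds = density_seeds(data, k=2, bandwidth=0.5)
```

The resulting seeds could then be passed to any k-means implementation as its initial centers. The pairwise density computation is quadratic in the number of points, which is acceptable for a sketch but would need a faster estimator on large data sets.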
