Published in: Journal of Software
An Improved K-means Algorithm Based on Structure Features

Abstract

In K-means clustering, we are given a set of n data points in multidimensional space, and the problem is to determine the number k of clusters. In this paper, we present three methods for determining the true number of spherical Gaussian clusters in the presence of additional noise features. Our algorithms take into account the structure of Gaussian data sets and the initial centroids. Each of the three algorithms has its own emphasis and characteristics. The first method uses the Minkowski distance as a measure of similarity, which is suitable for discovering non-convex spherical shapes or clusters that differ greatly in size. The second method uses a feature-weighted Minkowski distance, which emphasizes that different features differ in importance for the clustering results. The third method combines the Minkowski distance with the best feature factors. We experiment with a variety of general evaluation indexes on Gaussian data sets with and without noise features. The results show that the algorithms achieve higher precision than the traditional K-means algorithm.
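
To make the distance measures named in the abstract concrete, here is a minimal Python sketch of a Minkowski distance with optional per-feature weights, together with a nearest-centroid assignment step. The abstract does not give the paper's exact weighting formula, so the form used here (each feature difference scaled by its weight before being raised to the power p) is an assumption, and the names `minkowski` and `assign_to_nearest` are hypothetical. It illustrates the general idea only and is not the authors' algorithm.

```python
def minkowski(x, y, p=2.0, weights=None):
    """Minkowski distance of order p between two equal-length points.

    weights: optional non-negative per-feature weights. The scheme used
    here, (w_i * |x_i - y_i|) ** p, is an assumption made for illustration;
    the source abstract does not state the exact formula.
    """
    if len(x) != len(y):
        raise ValueError("points must have the same number of features")
    if weights is None:
        weights = [1.0] * len(x)  # unweighted case: plain Minkowski distance
    total = sum((w * abs(a - b)) ** p for a, b, w in zip(x, y, weights))
    return total ** (1.0 / p)


def assign_to_nearest(points, centroids, p=2.0, weights=None):
    """Return, for each point, the index of its nearest centroid."""
    return [
        min(range(len(centroids)),
            key=lambda j: minkowski(pt, centroids[j], p, weights))
        for pt in points
    ]


# Usage: the second feature acts as a noise feature in this toy example.
point = [(0.0, 9.0)]
centroids = [(0.0, 0.0), (3.0, 9.0)]
unweighted = assign_to_nearest(point, centroids)                    # [1]
downweighted = assign_to_nearest(point, centroids, weights=[1, 0])  # [0]
```

With equal weights the noise feature dominates and the point is assigned to the second centroid. Setting that feature's weight to zero lets the informative first feature decide, which is the intuition behind weighting features differently.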