An Adaptive K-Means Initialization Method Based on Data Density

韩最蛟

Abstract

The K-means clustering algorithm is widely used in data mining and machine learning. However, the choice of initial cluster centroids strongly affects the overall clustering result, so how to initialize K-means reasonably has become an important research direction. This paper proposes an adaptive method for selecting initial cluster centers based on the intrinsic density of the data. The method has two stages. The first stage defines data density and, on that basis, selects candidate initial centroids that satisfy the required conditions. The second stage post-processes the selected candidates so that their number matches the number of classes in the data. Experiments show that the proposed method has the following advantages: 1) it automatically discovers the density of the data distribution in a dataset and finds reasonable initial cluster centroids; 2) it is robust to outliers and noise; 3) it reduces the number of iterations of the K-means algorithm; 4) it is easy to implement.
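The two-stage idea in the abstract can be illustrated with a minimal sketch. The abstract does not state the paper's density definition or its post-processing rule, so everything specific here is an assumption made for illustration: the function name `density_init`, the `radius` parameter, density measured as the number of neighbours within `radius`, the mean-density threshold for candidates, and the greedy farthest-point reduction to `k` centroids. It is not the author's algorithm.

```python
import math


def density_init(points, k, radius):
    """Pick k initial centroids from dense, well-separated points.

    Illustrative assumptions (not taken from the paper):
    - density of a point = number of other points within `radius`;
    - candidates = points whose density is at least the mean density;
    - post-processing = greedy farthest-point selection among candidates.
    """
    n = len(points)

    # Stage 1: compute a density for every point and keep dense candidates.
    density = [
        sum(1 for j in range(n)
            if j != i and math.dist(points[i], points[j]) <= radius)
        for i in range(n)
    ]
    mean_density = sum(density) / n
    candidates = [i for i in range(n) if density[i] >= mean_density]

    # Stage 2: reduce the candidates to exactly k centroids.
    # Start from the densest candidate, then repeatedly add the candidate
    # that is farthest from all centroids chosen so far.
    chosen = [max(candidates, key=lambda i: density[i])]
    while len(chosen) < k:
        remaining = [i for i in candidates if i not in chosen]
        if not remaining:
            raise ValueError("fewer candidate points than k")
        nxt = max(
            remaining,
            key=lambda i: min(math.dist(points[i], points[c]) for c in chosen),
        )
        chosen.append(nxt)
    return [points[i] for i in chosen]


# Two small grid-shaped clusters plus one isolated outlier.
blob_a = [(x * 0.1, y * 0.1) for x in range(5) for y in range(5)]
blob_b = [(5 + x * 0.1, 5 + y * 0.1) for x in range(5) for y in range(5)]
outlier = [(20.0, 20.0)]
centroids = density_init(blob_a + blob_b + outlier, k=2, radius=1.0)
print(centroids)
```

In this toy dataset the outlier has no neighbours within `radius`, so it falls below the mean density and never becomes a candidate, which is one simple way a density criterion can give the robustness to outliers that the abstract claims. The returned centroids would then be passed to an ordinary K-means loop as its starting centers.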

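Bibliographic details

Source
《计算机应用与软件》 (Computer Applications and Software) | 2014, No. 2 | pp. 182-187 | 6 pages
Author
韩最蛟
Affiliation

Department of Computer Science, Sichuan Administration Institute (四川行政学院计算机系), Chengdu, Sichuan 610072

Original format: PDF
Language of text: Chinese
CLC classification: Computer simulation
Keywords
Clustering; K-means; initialization; initial cluster center selection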

