Journal: Computer Engineering & Science (《计算机工程与科学》)

A Dual Clustering Method for Mixed-Attribute Data Sets (面向混合属性数据集的双重聚类方法)

Abstract

To meet the need for effective preprocessing of mixed-attribute data sets in complex information environments, this paper proposes a dual clustering method. The method combines a construction algorithm for a dual near-neighbor undirected graph (or an improved version of that algorithm) with one of three clustering algorithms that operate on the graph: one based on merging disjoint sets, one based on breadth-first search, and one based on depth-first search. Simulation experiments on artificial data sets and UCI standard data sets verify that the three clustering algorithms produce the same final results even though they use different search strategies. The experiments also show that, on data sets with an apparent cluster structure and no near-neighbor noise, the dual clustering method often achieves better clustering accuracy than the k-means algorithm and the AP algorithm, which indicates that the method is reasonably effective and practical. Finally, some directions for future research are given to help uncover and promote the practical value of the method.
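The claim that the three graph-clustering strategies agree can be illustrated with a small sketch. The abstract does not define the dual near-neighbor graph, so this example assumes it is a mutual k-nearest-neighbor graph (an edge joins i and j only when each is among the other's k nearest neighbors). The distance function `mixed_distance`, the parameter `k`, and all function names are hypothetical choices for illustration, not the paper's algorithms.

```python
from collections import deque

def mixed_distance(a, b):
    """Hypothetical mixed-attribute distance: |x - y| for numeric
    attributes, 0/1 mismatch for categorical ones (an assumption)."""
    total = 0.0
    for x, y in zip(a, b):
        if isinstance(x, (int, float)) and isinstance(y, (int, float)):
            total += abs(x - y)
        else:
            total += 0.0 if x == y else 1.0
    return total

def mutual_knn_graph(points, k):
    """Adjacency lists of an undirected graph that keeps edge (i, j) only
    if i and j are in each other's k nearest neighbors (assumed reading
    of "dual near neighbor")."""
    n = len(points)
    knn = []
    for i in range(n):
        others = sorted((j for j in range(n) if j != i),
                        key=lambda j: (mixed_distance(points[i], points[j]), j))
        knn.append(set(others[:k]))
    return [sorted(j for j in knn[i] if i in knn[j]) for i in range(n)]

def cluster_union_find(adj):
    """Connected components by merging disjoint sets."""
    parent = list(range(len(adj)))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, neighbors in enumerate(adj):
        for j in neighbors:
            ri, rj = find(i), find(j)
            if ri != rj:
                parent[rj] = ri
    return [find(i) for i in range(len(adj))]

def cluster_search(adj, breadth_first):
    """Connected components by queue-based (BFS) or stack-based (DFS) traversal."""
    labels = [-1] * len(adj)
    for start in range(len(adj)):
        if labels[start] != -1:
            continue
        labels[start] = start
        frontier = deque([start])
        while frontier:
            node = frontier.popleft() if breadth_first else frontier.pop()
            for nb in adj[node]:
                if labels[nb] == -1:
                    labels[nb] = start
                    frontier.append(nb)
    return labels

def as_partition(labels):
    """Turn per-point labels into a sorted list of clusters for comparison."""
    groups = {}
    for i, lab in enumerate(labels):
        groups.setdefault(lab, []).append(i)
    return sorted(groups.values())

# Toy mixed-attribute data: one numeric and one categorical attribute.
points = [(0.0, "a"), (0.1, "a"), (0.2, "a"),
          (5.0, "b"), (5.1, "b"), (5.2, "b")]
adj = mutual_knn_graph(points, k=2)
by_sets = as_partition(cluster_union_find(adj))
by_bfs = as_partition(cluster_search(adj, breadth_first=True))
by_dfs = as_partition(cluster_search(adj, breadth_first=False))
assert by_sets == by_bfs == by_dfs
print(by_sets)
```

Under this assumed reading, all three routines compute the connected components of the same graph, which is why they return the same partition regardless of traversal order.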
