
On the use of divergence distance in fuzzy clustering


Abstract

Clustering algorithms divide a dataset into a set of classes (clusters) such that similar data objects are assigned to the same cluster. When the boundary between clusters is ill defined, so that the same data object may belong to more than one class, the notion of fuzzy clustering becomes relevant. In this case, each datum belongs to a given class with some membership grade between 0 and 1. The most prominent fuzzy clustering algorithm is fuzzy c-means, introduced by Bezdek (Pattern recognition with fuzzy objective function algorithms, 1981) as a fuzzification of the k-means or ISODATA algorithm. Several research issues have since been raised regarding both the objective function to be minimized and the optimization constraints, which help to identify the proper cluster shape (Jain et al., ACM Computing Surveys 31(3):264-323, 1999). This paper addresses clustering by evaluating the distance between fuzzy sets in a feature space. In particular, the fuzzy clustering optimization problem is reformulated for the case where the distance is given as a divergence distance, which builds a bridge to the notion of probabilistic distance. This leads to a modified fuzzy clustering that implicitly involves the variance-covariance of the input terms. The optimal solution of the underlying optimization problem is derived, and its existence and uniqueness are demonstrated. The performance of the algorithm is assessed through two numerical applications: the first involves clustering of Gaussian membership functions, and the second tackles the well-known Iris dataset. The results are compared with standard fuzzy c-means (FCM) and discussed.
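For orientation, the baseline that the abstract compares against can be sketched as follows. This is a minimal illustration of standard fuzzy c-means with squared Euclidean distance, not the divergence-based variant proposed in the paper, whose distance and update rules are not given in the abstract. The function names (`sq_dist`, `fuzzy_c_means`), the fuzzifier default `m=2.0`, the random membership initialization, and the stopping tolerance are all assumptions made for the example.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def fuzzy_c_means(points, c, m=2.0, max_iter=100, tol=1e-6, seed=0):
    """Standard FCM sketch (assumed defaults; not the paper's variant).

    Alternates the two classical updates until memberships stabilize:
      centers:     v_j  = sum_i u_ij^m x_i / sum_i u_ij^m
      memberships: u_ij = d_ij^(-1/(m-1)) / sum_k d_ik^(-1/(m-1)),
    where d_ij is the squared distance from point i to center j.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])

    # Random initial memberships; each row is normalized to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    for _ in range(max_iter):
        # Center update: membership-weighted mean of the data.
        centers = []
        for j in range(c):
            w = [u[i][j] ** m for i in range(n)]
            w_sum = sum(w)
            centers.append(
                [sum(w[i] * points[i][d] for i in range(n)) / w_sum
                 for d in range(dim)]
            )

        # Membership update from distances to the new centers.
        new_u = []
        for i in range(n):
            d2 = [sq_dist(points[i], centers[j]) for j in range(c)]
            zeros = [j for j, v in enumerate(d2) if v == 0.0]
            if zeros:
                # Point coincides with a center: give it full membership there.
                row = [1.0 / len(zeros) if j in zeros else 0.0
                       for j in range(c)]
            else:
                inv = [v ** (-1.0 / (m - 1.0)) for v in d2]
                inv_sum = sum(inv)
                row = [v / inv_sum for v in inv]
            new_u.append(row)

        # Stop when the largest membership change is below the tolerance.
        shift = max(abs(new_u[i][j] - u[i][j])
                    for i in range(n) for j in range(c))
        u = new_u
        if shift < tol:
            break

    return centers, u
```

Calling `fuzzy_c_means(points, c=2)` on a list of coordinate tuples returns the cluster centers and one membership row per point, with each row summing to 1. A divergence-based variant would presumably replace `sq_dist` and the resulting update rules, but the abstract does not specify that distance, so it is left out here.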
