International Conference on Advanced Cognitive Technologies and Applications
Missing Categorical Data Imputation for FCM Clusterings of Mixed Incomplete Data

Abstract

Data mining is related to human cognitive ability, and fuzzy clustering is one of its popular methods. The fuzzy c-means (FCM) clustering method is normally applied to numerical data. However, most data in databases are both categorical and numerical. To date, clustering methods have been developed to analyze only complete data. In data-intensive classification systems, we sometimes encounter data sets that contain one or more missing feature values (incomplete data), and traditional clustering methods cannot be used for such data. We therefore study clustering methods that can handle mixed numerical and categorical incomplete data. In this paper, we propose algorithms that combine a missing categorical data imputation method with distances between numerical data that contain missing values. Finally, we show through a real-data experiment that the proposed method is more effective than clustering without imputation as the missing ratio becomes higher.
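
The abstract does not say which imputation rule or distance the authors use, so the following is only a minimal sketch of the general idea under stated assumptions: missing categorical values are filled with the most frequent observed category (mode imputation), numerical records with missing values are compared to a cluster center with a partial distance over the observed features only, and memberships follow the standard FCM update. All function names and the `MISSING` marker are hypothetical and chosen for illustration.

```python
from collections import Counter

MISSING = None  # assumed marker for a missing feature value


def impute_categorical_mode(column):
    """Fill missing categorical values with the most frequent observed category.

    Mode imputation is an assumption here, not the rule from the paper.
    """
    observed = [v for v in column if v is not MISSING]
    if not observed:
        raise ValueError("column has no observed values")
    mode = Counter(observed).most_common(1)[0][0]
    return [mode if v is MISSING else v for v in column]


def partial_sq_distance(x, center):
    """Squared Euclidean distance over the observed numeric features of x,
    rescaled by (total features / observed features)."""
    pairs = [(a, c) for a, c in zip(x, center) if a is not MISSING]
    if not pairs:
        raise ValueError("record has no observed numeric features")
    return len(x) / len(pairs) * sum((a - c) ** 2 for a, c in pairs)


def mismatch_count(cats, center_cats):
    """Simple matching dissimilarity: number of positions whose categories differ."""
    return sum(a != b for a, b in zip(cats, center_cats))


def fcm_memberships(sq_distances, m=2.0):
    """Standard FCM membership update for one record.

    sq_distances holds the squared distance to each cluster center;
    m > 1 is the fuzzifier.
    """
    zeros = [d == 0 for d in sq_distances]
    if any(zeros):
        # The record coincides with at least one center: share membership among those.
        return [z / sum(zeros) for z in zeros]
    power = 1.0 / (m - 1.0)
    inverse = [(1.0 / d) ** power for d in sq_distances]
    total = sum(inverse)
    return [v / total for v in inverse]


if __name__ == "__main__":
    print(impute_categorical_mode(["red", MISSING, "blue", "red"]))
    print(partial_sq_distance([1.0, MISSING, 3.0], [0.0, 5.0, 1.0]))
    print(fcm_memberships([1.0, 3.0]))
```

In a mixed-data setting, the numerical partial distance and the categorical mismatch count would typically be combined into one dissimilarity, for example as a weighted sum; the weighting is a further design choice that the abstract leaves open.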