Conference: International Joint Conference on Artificial Intelligence

Dependency Clustering of Mixed Data with Gaussian Mixture Copulas

Abstract

Heterogeneous data with complex feature dependencies is common in real-world applications. Clustering algorithms for mixed (continuous and discrete-valued) features often do not adequately model dependencies and are limited to modeling meta-Gaussian distributions. Copulas, which provide a modular parameterization of joint distributions, can model a variety of dependencies, but their use with discrete data remains limited due to challenges in parameter inference. In this paper, we use Gaussian mixture copulas for clustering, in order to model complex dependencies beyond those captured by meta-Gaussian distributions. We design a new, efficient, semiparametric algorithm to approximately estimate the parameters of the copula that can fit continuous, ordinal, and binary data. We analyze the conditions for obtaining consistent estimates and empirically demonstrate performance improvements over state-of-the-art methods of correlation clustering on synthetic and benchmark datasets.
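
To illustrate what "modular parameterization" means for mixed data, here is a minimal sketch, not the paper's algorithm. It uses a single bivariate Gaussian copula (the meta-Gaussian case that the paper aims to go beyond) to couple one continuous and one ordinal feature. The function name `sample_mixed_gaussian_copula`, the Exponential(1) marginal, and the ordinal cut points 0.3 and 0.7 are all assumptions chosen for illustration.

```python
import math
import random
from statistics import NormalDist


def sample_mixed_gaussian_copula(n, rho, seed=0):
    """Draw n (continuous, ordinal) pairs coupled by a bivariate Gaussian copula.

    Illustrative only: the marginals and cut points below are assumptions.
    """
    rng = random.Random(seed)
    std_normal = NormalDist()  # standard normal, mean 0 and sigma 1
    rows = []
    for _ in range(n):
        # Step 1: correlated standard normals with correlation rho.
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        # Step 2: map to uniforms; (u1, u2) is a draw from the Gaussian copula.
        u1 = std_normal.cdf(z1)
        u2 = std_normal.cdf(z2)
        # Step 3: apply each marginal's inverse CDF independently.
        # Continuous marginal (assumed): Exponential(1); clamp u1 to avoid log(0).
        x_cont = -math.log(1.0 - min(u1, 1.0 - 1e-12))
        # Ordinal marginal (assumed): levels 0, 1, 2 with cut points 0.3 and 0.7.
        x_ord = 0 if u2 < 0.3 else (1 if u2 < 0.7 else 2)
        rows.append((x_cont, x_ord))
    return rows


if __name__ == "__main__":
    for x_cont, x_ord in sample_mixed_gaussian_copula(5, rho=0.8, seed=1):
        print(f"{x_cont:.3f}\t{x_ord}")
```

The dependence structure (here, `rho`) and the marginals are specified separately, which is the modularity the abstract refers to. A Gaussian mixture copula would replace steps 1 and 2 with draws from a Gaussian mixture passed through that mixture's own marginal CDFs, allowing dependence patterns that a single Gaussian copula cannot express.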
