Conference: Principles of Data Mining and Knowledge Discovery

ZigZag, a new clustering algorithm to analyze categorical variable cross-classification tables


Abstract

This paper proposes ZigZag, a new clustering algorithm that works on cross-classification tables of categorical variables. ZigZag simultaneously creates two partitions, one of the row categories and one of the column categories, according to the equivalence relation "to have the same conditional mode". These two partitions are put in one-to-one and onto correspondence, which yields row-column clusters. Thus, we have an efficient KDD tool which can be applied to any database. Moreover, ZigZag visualizes predictive association for nominal data in the sense of Guttman: predicting a variable Y conditionally on another variable X consists in choosing the conditionally most probable category of Y when X is known, and the power of this rule is evaluated by the mean proportional reduction in error, denoted $\lambda_{Y/X}$. It would then appear that the mapping furnished by ZigZag plays for nominal data the same role that the scatter diagram and the curves of conditional means or the straight regression line play for quantitative data, the usefulness of the former growing with the values of $\lambda_{Y/X}$ and $\lambda_{X/Y}$, that of the latter with the correlation ratio or $R^2$.
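
The two ingredients named in the abstract, the conditional mode and the proportional reduction in error, can be illustrated with a small sketch. It is not the ZigZag procedure itself, since the abstract does not say how the row and column partitions are matched one to one. It only groups categories that share a conditional mode and computes the measure under the standard Goodman-Kruskal formula, $\lambda_{Y/X} = (\sum_i \max_j n_{ij} - \max_j n_{\cdot j}) / (n - \max_j n_{\cdot j})$, which is assumed here to be the measure meant. The function names, the tie-breaking rule and the example counts are all hypothetical.

```python
from collections import defaultdict


def conditional_modes(table):
    """Index of the largest count in each row (the row's conditional mode).

    Ties go to the lowest column index; this is an arbitrary choice made
    for the sketch, not something stated in the abstract.
    """
    return [max(range(len(row)), key=lambda j: row[j]) for row in table]


def group_by_mode(table):
    """Group row indices that share the same conditional mode."""
    groups = defaultdict(list)
    for i, mode in enumerate(conditional_modes(table)):
        groups[mode].append(i)
    return dict(groups)


def transpose(table):
    """Swap the roles of rows (X) and columns (Y)."""
    return [list(col) for col in zip(*table)]


def pre_lambda(table):
    """Proportional reduction in error when predicting the column variable Y
    from the row variable X by the conditional mode (lambda_{Y/X})."""
    n = sum(sum(row) for row in table)
    best_marginal = max(sum(col) for col in zip(*table))
    best_conditional = sum(max(row) for row in table)
    if n == best_marginal:
        return 0.0  # Y is constant: no error to reduce (convention of this sketch)
    return (best_conditional - best_marginal) / (n - best_marginal)


# Hypothetical 4 x 3 table of counts: rows are categories of X, columns of Y.
counts = [
    [20, 5, 0],
    [15, 10, 5],
    [2, 3, 25],
    [1, 12, 4],
]

row_groups = group_by_mode(counts)             # rows sharing a modal column
col_groups = group_by_mode(transpose(counts))  # columns sharing a modal row
lambda_y_given_x = pre_lambda(counts)
lambda_x_given_y = pre_lambda(transpose(counts))
```

On this made-up table, rows 0 and 1 share modal column 0 and therefore fall in the same row class, and $\lambda_{Y/X} = (72 - 38)/(102 - 38) = 0.53125$: knowing X removes about half of the prediction errors made when always guessing the marginal mode of Y.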
