Maximum entropy based significance of itemsets

Nikolaj Tatti

Abstract

We consider the problem of defining the significance of an itemset. We say that the itemset is significant if we are surprised by its frequency when compared to the frequencies of its sub-itemsets. In other words, we estimate the frequency of the itemset from the frequencies of its sub-itemsets and compute the deviation between the real value and the estimate. For the estimation we use Maximum Entropy and for measuring the deviation we use Kullback–Leibler divergence. A major advantage compared to the previous methods is that we are able to use richer models whereas the previous approaches only measure the deviation from the independence model. We show that our measure of significance goes to zero for derivable itemsets and that we can use the rank as a statistical test. Our empirical results demonstrate that for our real datasets the independence assumption is too strong but applying more flexible models leads to good results.
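
The estimate-then-compare idea described in the abstract can be illustrated with a small sketch. This is not the paper's algorithm, only a minimal example under stated assumptions: the model is constrained by the frequencies of all proper non-empty sub-itemsets, the maximum entropy distribution is approximated by plain iterative scaling, and the function names (`empirical_distribution`, `frequency`, `maxent_from_subsets`, `kl_divergence`) are hypothetical helpers introduced here.

```python
from itertools import combinations, product
from math import log


def empirical_distribution(rows, k):
    """Fraction of rows equal to each of the 2**k binary configurations."""
    counts = {x: 0 for x in product((0, 1), repeat=k)}
    for row in rows:
        counts[tuple(row)] += 1
    n = len(rows)
    return {x: c / n for x, c in counts.items()}


def frequency(dist, subset):
    """Probability that every item in `subset` is present."""
    return sum(p for x, p in dist.items() if all(x[i] for i in subset))


def maxent_from_subsets(p, k, iterations=200):
    """Approximate the maximum entropy distribution that matches the
    frequencies of all proper non-empty sub-itemsets (assumed model)."""
    subsets = [s for r in range(1, k) for s in combinations(range(k), r)]
    targets = {s: frequency(p, s) for s in subsets}
    q = {x: 1.0 / 2 ** k for x in p}  # start from the uniform distribution
    for _ in range(iterations):
        for s in subsets:
            current, target = frequency(q, s), targets[s]
            if current in (0.0, 1.0):
                continue  # nothing left to rescale for this constraint
            for x in q:
                if all(x[i] for i in s):
                    q[x] *= target / current
                else:
                    q[x] *= (1 - target) / (1 - current)
    return q


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) in nats."""
    return sum(px * log(px / q[x]) for x, px in p.items() if px > 0)


# Two items that always occur together: the sub-itemset frequencies
# alone suggest independence, so the pair is "surprising".
rows = [(0, 0), (1, 1)]
p = empirical_distribution(rows, 2)
q = maxent_from_subsets(p, 2)
print(kl_divergence(p, q))  # close to log(2), about 0.693
```

For a two-item itemset the constraints are just the two single-item frequencies, so the sketch reduces to a comparison with the independence model. With three or more items the pairwise (and higher) sub-itemset frequencies enter the model as well, which corresponds to the richer models the abstract contrasts with independence.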

Bibliographic record

Source: Knowledge and Information Systems, 2008, Issue 1, pp. 57-77 (21 pages)
Author: Nikolaj Tatti
Original format: PDF
Language: English
Keywords: Binary data mining; Itemsets; Maximum entropy

Similar literature

1. Maximum entropy based significance of itemsets [J]. Nikolaj Tatti. Knowledge and Information Systems. 2008, Issue 1.
2. Criterion of Significance Level for Selection of Order of Spectral Estimation of Entropy Maximum [J]. V. V. Savchenko, A. V. Savchenko. Radioelectronics and Communications Systems. 2019, Issue 5.
3. Significance of higher moments for complete characterization of the travel time probability density function in heterogeneous porous media using the maximum entropy principle [J]. Hrvoje Gotovac, Vladimir Cvetkovic, Roko Andricevic. Water Resources Research. 2010, Issue 5.
4. Maximum Entropy Based Significance of Itemsets [C]. Nikolaj Tatti. International Conference on Data Mining. 2007.
5. Statistical machine translation: Maximum entropy based translation models and search algorithms [D]. Garcia Varea, Ismael. 2003.
6. Agricultural Water Resources Management Using Maximum Entropy and Entropy-Weight-Based TOPSIS Methods [O]. Mo Li, Hao Sun, Vijay P. Singh. 2019.
7. Maximum entropy based significance of itemsets [O]. Nikolaj Tatti. 2008.
