Journal: 《计算机应用与软件》 (Computer Applications and Software)

Research on Mutual Information, Redundancy, and the Generalized Likelihood Ratio

         

Abstract

Statistics offers many methods for describing dependence between variables, but most require the random variables to follow one or more particular probability distributions or to satisfy certain assumptions. Mutual information measures dependence between random variables on the basis of entropy, and it requires neither a specific distribution nor special assumptions. In some studies, redundancy has also been used as a measure similar to mutual information for assessing relationships between variables. This paper examines the concepts of redundancy and mutual information in depth and applies them to multidimensional categorical data. It finds that, under several log-linear models of independence, the mutual information and redundancy of categorical data can be expressed as functions of the generalized likelihood ratio. The generalized likelihood ratio is highly sensitive to sample size, whereas the mutual information and redundancy of categorical data depend on the cell probabilities rather than on the sample size. Mutual information and redundancy can therefore be used to assess association among categorical variables without special assumptions and without being affected by sample size. Worked examples indicate that, for multidimensional data, redundancy performs better than mutual information.
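The relationship described in the abstract can be illustrated for the simplest case, a two-way contingency table under the independence model. There the likelihood-ratio statistic satisfies G = 2·N·I, where N is the sample size and I is the mutual information in nats computed from the observed cell proportions. The sketch below is a minimal illustration under that assumption, not the paper's own procedure: the function names and the example counts are hypothetical, and it covers only mutual information in two dimensions.

```python
import math


def _margins(table):
    """Return (total count, row totals, column totals) of a two-way count table."""
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    return sum(row_tot), row_tot, col_tot


def mutual_information(table):
    """Mutual information in nats, estimated from the observed cell proportions."""
    n, row_tot, col_tot = _margins(table)
    mi = 0.0
    for i, row in enumerate(table):
        for j, count in enumerate(row):
            if count > 0:  # a zero cell contributes nothing (0 * log 0 := 0)
                # p_ij / (p_i. * p_.j) simplifies to count * n / (row_i * col_j)
                mi += (count / n) * math.log(count * n / (row_tot[i] * col_tot[j]))
    return mi


def g_statistic(table):
    """Likelihood-ratio statistic G for the two-way independence model."""
    n, row_tot, col_tot = _margins(table)
    g = 0.0
    for i, row in enumerate(table):
        for j, count in enumerate(row):
            if count > 0:
                expected = row_tot[i] * col_tot[j] / n  # fitted count under independence
                g += 2.0 * count * math.log(count / expected)
    return g


# Hypothetical counts: the same cell proportions at two sample sizes.
small = [[30, 10], [10, 50]]                      # N = 100
large = [[10 * c for c in row] for row in small]  # N = 1000

for name, table in (("small", small), ("large", large)):
    print(name, "MI =", mutual_information(table), "G =", g_statistic(table))
```

Because both tables have identical cell proportions, the mutual information is the same for each, while G grows tenfold with the sample size. This matches the abstract's point that G is sensitive to sample size and mutual information is not.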
