Journal: Biocybernetics and Biomedical Engineering

Exploiting Stein's paradox in analysing sparse data from genome-wide association studies

Abstract

Unbiased estimation was regarded as the accepted gold standard of statistical analysis until Stein discovered a surprising phenomenon specific to multivariate spaces. The so-called Stein's paradox arises in estimating the mean of a multivariate standard normal random variable. Stein showed that the natural and intuitive estimate of a multivariate mean, namely the observed vector itself, is not even admissible and can be improved upon under squared-error loss when the dimension is greater than or equal to three. Later, Stein and his student James developed the so-called James-Stein estimator, a shrunken estimate of the mean that has uniformly smaller risk for all values in the parameter space. The paradox at first appeared unintuitive and even unacceptable, but it was later recognised as one of the most influential discoveries in statistical science. Today the 'shrinkage principle' permeates the statistical technology for analysing multivariate data, and its application is not confined to estimating the mean but extends to the covariance structure of multivariate data. We develop shrinkage versions of both linear and quadratic discriminant analysis and apply them to sparse multivariate gene expression data obtained at the Centre for Biomedical Informatics (CBI) in Prague. (C) 2014 Nalecz Institute of Biocybernetics and Biomedical Engineering. Published by Elsevier Urban & Partner Sp. z o.o. All rights reserved.
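
The shrinkage idea described in the abstract can be illustrated with a minimal sketch of the classical James-Stein estimator. It is not the paper's own code. It assumes a single observation x drawn from N(theta, sigma2 * I_p) with known variance and shrinks toward the origin. The function names `james_stein` and `mc_risk`, the `sigma2` parameter, and the Monte Carlo settings are hypothetical choices made for this illustration.

```python
import numpy as np


def james_stein(x, sigma2=1.0):
    """Classical James-Stein estimate of theta from one draw x ~ N(theta, sigma2 * I_p).

    Assumption for this sketch: sigma2 is known and the shrinkage target is the origin.
    """
    x = np.asarray(x, dtype=float)
    p = x.size
    if p < 3:
        raise ValueError("James-Stein shrinkage requires dimension p >= 3")
    # Shrinkage factor 1 - (p - 2) * sigma2 / ||x||^2, applied to every coordinate.
    factor = 1.0 - (p - 2) * sigma2 / np.dot(x, x)
    return factor * x


def mc_risk(theta, n_rep=20000, seed=0):
    """Monte Carlo estimate of squared-error risk for the observed vector
    and for the James-Stein estimate, with unit variance."""
    theta = np.asarray(theta, dtype=float)
    p = theta.size
    rng = np.random.default_rng(seed)
    # Each row is one independent observation x ~ N(theta, I_p).
    x = rng.normal(loc=theta, scale=1.0, size=(n_rep, p))
    sq_norm = np.sum(x ** 2, axis=1, keepdims=True)
    js = (1.0 - (p - 2) / sq_norm) * x
    risk_obs = np.mean(np.sum((x - theta) ** 2, axis=1))
    risk_js = np.mean(np.sum((js - theta) ** 2, axis=1))
    return risk_obs, risk_js


if __name__ == "__main__":
    print(james_stein([3.0, 4.0, 0.0, 0.0, 0.0]))
    print(mc_risk(np.zeros(10)))
```

For theta at the origin in ten dimensions, the observed vector has risk 10 while the James-Stein estimate has theoretical risk 2, so the simulated gap should be large. The gain shrinks as theta moves away from the shrinkage target, but under these assumptions the James-Stein risk stays below that of the observed vector.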
