
Stability


Abstract

Reproducibility is imperative for any scientific discovery. More often than not, modern scientific findings rely on statistical analysis of high-dimensional data. At a minimum, reproducibility manifests itself in the stability of statistical results relative to "reasonable" perturbations to the data and to the model used. The jackknife, bootstrap, and cross-validation are based on perturbations to data, while robust statistics methods deal with perturbations to models. In this article, a case is made for the importance of stability in statistics. First, we motivate the necessity of stability for interpretable and reliable encoding models from brain fMRI signals. Second, we find strong evidence in the literature to demonstrate the central role of stability in statistical inference, such as sensitivity analysis and effect detection. Third, a smoothing parameter selector based on estimation stability (ES), ES-CV, is proposed for the Lasso, in order to bring stability to bear on cross-validation (CV). ES-CV is then used in the encoding models to reduce the number of predictors by 60% with almost no loss (1.3%) of prediction performance across over 2,000 voxels. Finally, a novel "stability" argument is seen to drive new results that shed light on the intriguing interactions between sample-to-sample variability and heavier-tailed error distributions (e.g., double-exponential) in high-dimensional regression models with p predictors and n independent samples. In particular, when p/n → κ ∈ (0.3, 1) and the error distribution is double-exponential, the Ordinary Least Squares (OLS) estimator is better than the Least Absolute Deviation (LAD) estimator.
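To make the idea of stability under data perturbation concrete, the following is a minimal sketch, not the article's ES-CV procedure. It refits a model on the data with one block of samples deleted at a time, then measures how much the fitted values vary across these refits relative to their average. For simplicity it uses ordinary least squares rather than the Lasso, and the function name `stability_score`, the block count, and the simulated data are all assumptions made for illustration.

```python
import numpy as np

def stability_score(X, y, n_blocks=5, seed=0):
    """Normalized variability of OLS fitted values across delete-one-block refits.

    Hypothetical helper for illustration; smaller values mean the fit changes
    less when the data are perturbed.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Randomly partition the sample indices into blocks.
    blocks = np.array_split(rng.permutation(n), n_blocks)
    fits = []
    for held_out in blocks:
        keep = np.setdiff1d(np.arange(n), held_out)
        # Refit on the perturbed data set (one block removed).
        beta, *_ = np.linalg.lstsq(X[keep], y[keep], rcond=None)
        # Evaluate the refit on the full design matrix.
        fits.append(X @ beta)
    fits = np.array(fits)                      # shape (n_blocks, n)
    mean_fit = fits.mean(axis=0)
    # Average squared deviation of each refit from the mean fit.
    spread = np.mean(np.sum((fits - mean_fit) ** 2, axis=1))
    return spread / np.sum(mean_fit ** 2)

# Simulated example: same design and noise draws, two noise levels.
rng = np.random.default_rng(1)
n, p = 200, 5
X = rng.normal(size=(n, p))
beta_true = np.ones(p)
noise = rng.normal(size=n)

score_low = stability_score(X, X @ beta_true + 0.1 * noise)
score_high = stability_score(X, X @ beta_true + 5.0 * noise)
print(f"low-noise score:  {score_low:.3e}")
print(f"high-noise score: {score_high:.3e}")
```

In this toy setting the score grows with the noise level, which matches the intuition that a noisier problem yields fits that are less stable under perturbations of the data. A stability-based tuning rule would compute such a score for each candidate tuning parameter; how the article combines it with cross-validation is not specified in the abstract.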
