Biometrics: Journal of the Biometric Society: An International Society Devoted to the Mathematical and Statistical Aspects of Biology

Simulation‐selection‐extrapolation: Estimation in high‐dimensional errors‐in‐variables models



Abstract

Errors‐in‐variables models in high‐dimensional settings pose two challenges in application. First, the number of observed covariates is larger than the sample size, while only a small number of covariates are true predictors under an assumption of model sparsity. Second, the presence of measurement error can result in severely biased parameter estimates, and also affects the ability of penalized methods such as the lasso to recover the true sparsity pattern. A new estimation procedure called SIMulation‐SELection‐EXtrapolation (SIMSELEX) is proposed. This procedure makes double use of lasso methodology. First, the lasso is used to estimate sparse solutions in the simulation step, after which a group lasso is implemented to do variable selection. The SIMSELEX estimator is shown to perform well in variable selection, and has significantly lower estimation error than naive estimators that ignore measurement error. SIMSELEX can be applied in a variety of errors‐in‐variables settings, including linear models, generalized linear models, and Cox survival models. The Supporting Information further shows how SIMSELEX can be applied to spline‐based regression models. A simulation study is conducted to compare the SIMSELEX estimators to existing methods in the linear and logistic model settings, and to evaluate performance compared to naive methods in the Cox and spline models. Finally, the method is used to analyze a microarray dataset that contains gene expression measurements of favorable histology Wilms tumors.
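The simulation and extrapolation steps named in the abstract follow the idea of classic simulation-extrapolation (SIMEX). The sketch below illustrates only that underlying idea for a single covariate, and it is not the SIMSELEX estimator: it uses ordinary least squares where the abstract uses the lasso, and it omits the group-lasso selection step entirely. The measurement-error standard deviation `sigma_u` is assumed known, the quadratic extrapolant and the grid of `lambdas` are illustrative choices, and all function names are hypothetical.

```python
import random


def ols_slope(x, y):
    """Least-squares slope of y on x (model with intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sxx


def solve3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        piv = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, 3):
            f = m[r][col] / m[col][col]
            for c in range(col, 4):
                m[r][c] -= f * m[col][c]
    sol = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        s = m[r][3] - sum(m[r][c] * sol[c] for c in range(r + 1, 3))
        sol[r] = s / m[r][r]
    return sol


def quadratic_extrapolate(lams, vals, target=-1.0):
    """Fit vals ~ a + b*lam + c*lam^2 by least squares, evaluate at target."""
    rows = [[1.0, t, t * t] for t in lams]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * v for r, v in zip(rows, vals)) for i in range(3)]
    a, b, c = solve3(xtx, xty)
    return a + b * target + c * target * target


def simex_slope(w, y, sigma_u, lambdas=(0.0, 0.5, 1.0, 1.5, 2.0),
                n_rep=20, seed=0):
    """Classic SIMEX for one error-prone covariate w (sigma_u assumed known)."""
    rng = random.Random(seed)
    means = []
    for lam in lambdas:
        reps = 1 if lam == 0 else n_rep  # no extra noise is added at lam = 0
        estimates = []
        for _ in range(reps):
            # Simulation step: inflate the error variance by a factor (1 + lam).
            w_lam = [wi + (lam ** 0.5) * sigma_u * rng.gauss(0, 1) for wi in w]
            estimates.append(ols_slope(w_lam, y))
        means.append(sum(estimates) / len(estimates))
    # Extrapolation step: lam = -1 corresponds to zero measurement error.
    return quadratic_extrapolate(lambdas, means, target=-1.0)


def demo(n=2000, beta=2.0, sigma_u=0.5, seed=1):
    """Simulated data: true slope beta, covariate observed with error."""
    rng = random.Random(seed)
    x = [rng.gauss(0, 1) for _ in range(n)]
    y = [beta * xi + rng.gauss(0, 0.5) for xi in x]
    w = [xi + rng.gauss(0, sigma_u) for xi in x]  # error-prone measurement
    return ols_slope(w, y), simex_slope(w, y, sigma_u)


if __name__ == "__main__":
    naive, corrected = demo()
    print("naive slope:", naive)
    print("SIMEX slope:", corrected)
```

In this toy setting the naive slope is attenuated toward zero, and the extrapolated estimate should land closer to the true slope. A quadratic extrapolant typically removes only part of the bias, so the corrected value is not expected to match the truth exactly.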
