Computational Statistics & Data Analysis

An extended variable inclusion and shrinkage algorithm for correlated variables



Abstract

The problem of variable selection for linear regression in a high-dimensional model is considered. A new method, called Extended-VISA (Ext-VISA), is proposed to simultaneously select variables and encourage a grouping effect, where strongly correlated predictors tend to be in or out of the model together. Moreover, Ext-VISA is capable of selecting a sparse model while avoiding the overshrinkage of a Lasso-type estimator. It combines the idea of the VISA algorithm, which avoids the overshrinkage of regression coefficients, with that of Lasso-type estimators based on an $\ell_1 + \ell_2$ penalty, which overcome the limited grouping effect of the Lasso in high dimensions. It is based on a modified VISA algorithm, so it is also computationally efficient. Three interesting cases of Ext-VISA are examined. The first case is Smooth-VISA (SVISA), where the variations among successive regression coefficients are low. The second case is VISA-Net (VNET), where the correlations between predictors are taken into account. The third case is Laplacian-VISA (LVISA), where the predictors are measured on an undirected graph. A theoretical sparsity inequality for Ext-VISA is established. A detailed simulation study in low- and high-dimensional settings is performed, which illustrates the advantages of the new approach relative to several other possible methods. Finally, VNET, SVISA and LVISA are applied to a GC-retention data set.
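The abstract does not state the penalties explicitly. As a rough illustration only, the sketch below assumes the common form of an $\ell_1$ term plus a quadratic term $\beta^\top Q \beta$ and minimises it by plain proximal gradient descent. It is not the Ext-VISA or VISA algorithm from the paper. The function names, the choice of solver, and the matrices $Q$ (identity for an elastic-net-style penalty, squared first differences for a smoothness penalty, a graph Laplacian for predictors on an undirected graph) are assumptions made for this example.

```python
import numpy as np

def smooth_matrix(p):
    """Assumed smoothness penalty: Q = D^T D, so b^T Q b = sum_j (b[j+1] - b[j])^2."""
    D = np.diff(np.eye(p), axis=0)  # first-difference operator, shape (p-1, p)
    return D.T @ D

def laplacian_matrix(adjacency):
    """Unnormalised Laplacian (degree matrix minus adjacency) of an undirected graph."""
    A = np.asarray(adjacency, dtype=float)
    return np.diag(A.sum(axis=1)) - A

def soft_threshold(z, t):
    """Proximal operator of t * ||.||_1."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def l1_quadratic_fit(X, y, Q, lam1, lam2, n_iter=5000):
    """Hypothetical helper: minimise
        0.5 * ||y - X b||^2 + lam1 * ||b||_1 + 0.5 * lam2 * b^T Q b
    by proximal gradient descent (ISTA). Q must be symmetric positive semidefinite.
    """
    H = X.T @ X + lam2 * Q                  # Hessian of the smooth part
    step = 1.0 / np.linalg.eigvalsh(H)[-1]  # 1 / largest eigenvalue (Lipschitz constant)
    Xty = X.T @ y
    b = np.zeros(X.shape[1])
    for _ in range(n_iter):
        grad = H @ b - Xty
        b = soft_threshold(b - step * grad, step * lam1)
    return b

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=50)
    X = np.column_stack([x, x, rng.normal(size=50)])  # first two predictors identical
    y = 2.0 * x + rng.normal(scale=0.1, size=50)
    # With Q = identity the objective is strictly convex, so the two identical
    # predictors receive the same coefficient (the grouping effect).
    print(l1_quadratic_fit(X, y, np.eye(3), lam1=1.0, lam2=1.0))
```

Under these assumptions, swapping `np.eye(3)` for `smooth_matrix(3)` or for `laplacian_matrix(...)` built from an adjacency matrix changes only the quadratic term. The VISA-style step that counteracts overshrinkage is not reproduced here.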
