Objective This study proposes a two-stage analysis strategy that combines the sequence kernel association test (SKAT) with penalized regression models, so as to draw on the advantages of both types of methods and to offer methodological guidance for genetic association studies.
Methods SKAT, LASSO, elastic net (EN), the two-stage strategies (SKAT+EN, SKAT+LASSO, EN+SKAT, LASSO+SKAT), and the bi-level variable selection models (cMCP, Gel) were applied to the Genetic Analysis Workshop 18 data, and their performance in association analysis was compared.
Results At the gene level, SKAT had the highest average sensitivity and the highest average Youden index. For every method except SKAT, the rate at which a truly associated gene was selected was related to the number of SNPs within the gene and to the proportion of diastolic blood pressure (DBP) variance the gene explained. At the SNP level, EN had the highest sensitivity; EN+SKAT had the highest Youden index, followed by EN. All of the strategies selected MAP4, the truly associated gene with the largest contribution to DBP, together with its SNPs.
Conclusion The combined EN and SKAT strategy can screen the main phenotype-associated genes and SNPs from millions of SNPs and supports statistical inference at the gene level. It has high sensitivity while restraining false positives, and it offers a relatively efficient analysis strategy and useful clues for future studies of genetic mechanisms.
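The methods above are compared by sensitivity and the Youden index. As a minimal sketch of how those two quantities could be computed for one selection result, the Python snippet below uses the standard definition J = sensitivity + specificity - 1. The function name `selection_metrics`, its arguments, and the SNP labels are hypothetical illustrations, not taken from the study.

```python
def selection_metrics(selected, causal, candidates):
    """Return (sensitivity, specificity, Youden index) for one selection result.

    selected   -- variants (or genes) a method reported as associated
    causal     -- variants (or genes) that are truly associated
    candidates -- all variants (or genes) that were tested
    Assumes `selected` and `causal` are both subsets of `candidates`, and that
    there is at least one causal and one non-causal candidate.
    """
    selected, causal, candidates = set(selected), set(causal), set(candidates)
    tp = len(selected & causal)               # true positives
    fn = len(causal - selected)               # causal variants that were missed
    fp = len(selected - causal)               # false positives
    tn = len(candidates - selected - causal)  # correctly left unselected
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity, sensitivity + specificity - 1


# Hypothetical example: 10 tested SNPs, 2 truly associated, 2 selected.
candidates = {f"snp{i}" for i in range(1, 11)}
sens, spec, youden = selection_metrics(
    selected={"snp1", "snp3"},
    causal={"snp1", "snp2"},
    candidates=candidates,
)
print(sens, spec, youden)
```

A method that selects many variants can reach high sensitivity at the cost of specificity. The Youden index penalizes that trade-off, which is consistent with EN having the highest SNP-level sensitivity while EN+SKAT has the highest Youden index.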