Conference: IEEE International Conference on Bioinformatics and Bioengineering

An Associative Classification Based Approach for Detecting SNP-SNP Interactions in High Dimensional Genome



Abstract

Many studies have described genotype-phenotype relationships by identifying genetic variants associated with a specific disease. Researchers are paying increasing attention to interactions between SNPs that are strongly associated with disease in the absence of main effects. In this context, a number of machine learning and data mining tools have been applied to identify combinations of multi-locus SNPs in higher-order data. However, none of the current models can identify useful SNP-SNP interactions in high-dimensional genome data. Detecting these interactions is challenging because of bio-molecular complexity and computational limitations. The goal of this research was to implement associative classification and study its effectiveness for detecting epistasis in balanced and imbalanced datasets. The proposed approach was evaluated for two-locus epistatic interactions using simulated data. The datasets were generated for five different penetrance functions by varying heritability, minor allele frequency, and sample size. In total, 23,400 datasets were generated, and several experiments were conducted to identify the disease-causing SNP interactions. The classification accuracy of the proposed approach was compared with that of previous approaches. Although associative classification showed only a relatively small improvement in accuracy for balanced datasets, it outperformed existing approaches for higher-order multi-locus interactions in imbalanced datasets.
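To make the setup concrete, the following is a minimal illustrative sketch in Python, not the authors' implementation. It simulates genotypes for a two-locus model with no main effects and then mines class association rules of the form (SNP pair genotypes) -> case/control, which is the general idea behind associative classification. The penetrance table `PENETRANCE`, the function names `simulate` and `mine_rules`, and the support and confidence thresholds are all assumptions chosen for illustration; the abstract does not specify the five penetrance functions or the rule-mining settings used in the paper.

```python
import random
from collections import Counter
from itertools import combinations

# Hypothetical penetrance table (NOT from the paper): P(case | genotype A, genotype B).
# With a minor allele frequency of 0.5 each locus alone gives P(case) = 0.5,
# so the disease signal is visible only in the two-locus combination.
PENETRANCE = [
    [0.1, 0.9, 0.1],
    [0.9, 0.1, 0.9],
    [0.1, 0.9, 0.1],
]

def simulate(n_samples, n_snps, maf, penetrance, causal=(0, 1), seed=0):
    """Draw genotypes (0/1/2 minor-allele counts) under Hardy-Weinberg
    proportions and assign case (1) / control (0) from the penetrance table."""
    rng = random.Random(seed)
    weights = [(1 - maf) ** 2, 2 * maf * (1 - maf), maf ** 2]
    data = []
    for _ in range(n_samples):
        genotypes = rng.choices([0, 1, 2], weights=weights, k=n_snps)
        p_case = penetrance[genotypes[causal[0]]][genotypes[causal[1]]]
        data.append((genotypes, 1 if rng.random() < p_case else 0))
    return data

def mine_rules(data, n_snps, min_support=0.05, min_confidence=0.8):
    """Mine rules (SNP_i = a, SNP_j = b) -> label that meet the given
    support and confidence thresholds, sorted by confidence then support."""
    n = len(data)
    counts = Counter()  # (i, a, j, b, label) -> number of matching samples
    for genotypes, label in data:
        for i, j in combinations(range(n_snps), 2):
            counts[(i, genotypes[i], j, genotypes[j], label)] += 1
    rules = []
    for (i, a, j, b, label), c in counts.items():
        total = c + counts.get((i, a, j, b, 1 - label), 0)
        support = c / n          # fraction of all samples matching rule and label
        confidence = c / total   # fraction of matching genotypes with this label
        if support >= min_support and confidence >= min_confidence:
            rules.append(((i, a, j, b), label, support, confidence))
    rules.sort(key=lambda r: (-r[3], -r[2]))
    return rules

data = simulate(n_samples=2000, n_snps=5, maf=0.5, penetrance=PENETRANCE)
rules = mine_rules(data, n_snps=5)
for (i, a, j, b), label, support, confidence in rules[:5]:
    print(f"SNP{i}={a} & SNP{j}={b} -> class {label} "
          f"(support={support:.3f}, confidence={confidence:.3f})")
```

In this toy setting the high-confidence rules should involve only the causal pair (SNPs 0 and 1), while pairs containing a non-causal SNP stay near 50% confidence. A full associative classifier would additionally rank and prune the rules and use them to predict the class of unseen samples; imbalanced datasets would also call for class-aware thresholds, which this sketch does not attempt.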
