
Genetic programming optimized neural networks for identifying gene-gene interactions.


Abstract

The identification and characterization of susceptibility genes for common, complex human diseases present several difficult challenges for human geneticists. Many disease susceptibility genes exhibit effects that depend partially or solely on interactions with other genes. These interactions, known as epistasis, are difficult to detect using traditional statistical methods because of several important limitations [Templeton 2000]. Interactions are hard to identify because, in high dimensions, many contingency table cells are empty, which leads to large coefficient estimates and large standard errors [Hosmer and Lemeshow 2000]. This is sometimes referred to as the curse of dimensionality. One way to deal with this issue is to collect a very large sample to reduce the number of empty cells. This can, however, be prohibitively expensive. The alternative is to develop new statistical methods that have improved power to identify high-order interactions in relatively small samples.

Many groups have used neural networks (NN) as a new statistical approach. Neural networks are a supervised pattern recognition method commonly used in many fields for data mining. Defining the NN architecture is crucial for success in data mining, and this can be challenging when the underlying model of the data is unknown. Therefore, we will use genetic programming (GP) to optimize the architecture of the NN, an approach abbreviated GPNN. Through simulation studies, we will validate this new statistical approach and estimate its power for detecting interactions. We will then compare the performance of this approach with that of a traditional neural network methodology. Finally, we will analyze two different breast cancer case-control data sets with the optimal neural network approach to detect gene-gene interactions associated with sporadic breast cancer.

The goal of this study is to develop a new statistical methodology that has improved power for detecting gene-gene interactions in common, complex diseases and to demonstrate its utility in both simulated data and real case-control data.
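The empty-cell problem described in the abstract can be illustrated with a small simulation. The sketch below is not taken from the thesis: it assumes biallelic loci with three genotypes each, draws genotypes and case-control status uniformly at random (an unrealistic simplification), and uses a hypothetical function name, `empty_cell_fraction`. It only shows how quickly the genotype-by-status contingency table outgrows a fixed sample.

```python
import random

GENOTYPES = (0, 1, 2)  # assumed coding for three genotypes at a biallelic locus


def total_cells(n_loci):
    """Number of cells in the genotype-by-status table: 2 * 3**n_loci."""
    return 2 * len(GENOTYPES) ** n_loci


def empty_cell_fraction(n_loci, n_samples, seed=0):
    """Fraction of contingency table cells that receive no observations.

    Genotypes and status are drawn uniformly at random, which is an
    illustrative assumption, not a model of real genetic data.
    """
    rng = random.Random(seed)
    observed = set()
    for _ in range(n_samples):
        genotype = tuple(rng.choice(GENOTYPES) for _ in range(n_loci))
        status = rng.choice((0, 1))  # 0 = control, 1 = case
        observed.add((genotype, status))
    return 1 - len(observed) / total_cells(n_loci)


if __name__ == "__main__":
    # Same sample size, increasing number of interacting loci.
    for n_loci in (1, 2, 4, 6, 8, 10):
        frac = empty_cell_fraction(n_loci, n_samples=400)
        print(f"{n_loci:2d} loci: {total_cells(n_loci):6d} cells, "
              f"{frac:.1%} empty")
```

With ten loci the table has 118,098 cells, so a sample of 400 individuals can fill at most 400 of them, and more than 99% of cells are necessarily empty regardless of how the data are distributed.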

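Bibliographic record

  • Author

    Ritchie, Marylyn DeRiggi

  • Author affiliation

    Vanderbilt University

  • Degree-granting institution: Vanderbilt University
  • Subject: Biology, Genetics
  • Degree: Ph.D.
  • Year: 2004
  • Pages: 163
  • Original format: PDF
  • Language of text: English
  • Chinese Library Classification: Genetics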
