Quantitative, predictive modeling of biochemical networks: A machine learning and information-theoretic approach.

Abstract

In the post-genomic era, developments in high-throughput technologies and statistical inference have automated the task of enumerating cellular components and their interactions, generating an unprecedented volume of data about biochemical networks. Although interaction data provide more detail than genomic sequences alone, they do not by themselves generate new biological insight. Quantitative analysis of biochemical networks can be used to elucidate mechanisms of normal and pathological cell function, make experimental predictions, and provide novel biological insights. In this dissertation we apply tools from machine learning and information theory to analyze such networks at multiple scales, building quantitative and predictive models to resolve three open questions in biology.

At the scale of small networks, we investigate the input-output relation of stochastic transduction networks and find that all such networks can transmit input information optimally. Our analysis poses a potential solution for the "cross-talk" dilemma, whereby multiple distinct inputs converging onto a single pathway reliably trigger appropriate output responses. Networks can robustly transmit information in the presence of large (tenfold) parameter perturbations, and simultaneously adapt their behavior without significantly compromising transmission fidelity. We demonstrate that some features of the network can impart an advantage for noise regulation and therefore information transmission. Predictions are validated using a database of known transcription factors.

Using counts of subgraphs appearing in known protein-protein interaction networks, we exploit a recently developed machine learning algorithm to discriminate between proposed network evolution mechanisms. We then classify the protein-protein interaction networks of D. melanogaster and S. cerevisiae as arising from one of these mechanisms. Our results predict that the protein networks most closely resemble a particular duplication-mutation mechanism. The predictions are confident (with high probability), consistent (over species and feature spaces), and robust (with respect to errors in the interaction data). In addition, these results provide statistical validation of recent evidence from comparative genomic sequence data demonstrating the importance of functional preservation and paralog self-linking.

Finally, at the scale of the global organization of the network, we adapt a principled, information-theoretic, unsupervised learning algorithm to identify modules in a network. Testing the approach on synthetic networks, we find that the algorithm compares favorably to the standard module discovery technique in terms of identifying the correct modules and determining the true number of modules. We present the first quantitative, parameter-free definition of network modularity, a dimensionless number between 0 and 1, which we then use to demonstrate that the E. coli genetic regulatory network is modular. Applications to a social network of physics collaborators and the E. coli network are presented.
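As a rough illustration of what "counts of subgraphs" can mean as network features, the sketch below counts three very small patterns (edges, two-edge paths, and triangles) in an undirected graph. It is not the dissertation's feature set or classifier: the function name `subgraph_counts`, the choice of patterns, and the toy edge list are all assumptions made for this example, and self-loops are ignored for simplicity.

```python
from itertools import combinations


def subgraph_counts(edges):
    """Count edges, wedges (two-edge paths), and triangles in an undirected graph.

    Hypothetical helper for illustration only; `edges` is an iterable of
    (node, node) pairs. Duplicate edges are collapsed, self-loops skipped.
    """
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # simplification: self-loops are ignored in this sketch
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    # Every undirected edge appears in two adjacency sets.
    n_edges = sum(len(nbrs) for nbrs in adj.values()) // 2

    # A wedge is a pair of distinct neighbours of a centre node.
    # This count includes wedges that are part of a triangle.
    wedges = sum(len(nbrs) * (len(nbrs) - 1) // 2 for nbrs in adj.values())

    # A wedge is closed when its two end nodes are also adjacent.
    # Each triangle is seen once from each of its three corners.
    closed = sum(
        1
        for nbrs in adj.values()
        for a, b in combinations(nbrs, 2)
        if b in adj[a]
    )
    return {"edges": n_edges, "wedges": wedges, "triangles": closed // 3}


# Toy graph (made up): a triangle A-B-C with a pendant node D attached to C.
toy_edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]
print(subgraph_counts(toy_edges))
```

Count vectors of this kind, computed for many graphs grown under different candidate mechanisms, could in principle serve as inputs to a classifier; the specific subgraphs and learning algorithm used in the dissertation are not given in this record.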

Bibliographic record

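  • Author

    Ziv, Etay.

  • Author affiliation

    Columbia University.

  • Degree-granting institution: Columbia University.
  • Subject: Engineering, Biomedical; Biology, Bioinformatics; Biophysics, General.
  • Degree: Ph.D.
  • Year: 2008
  • Pages: 224 p.
  • Total pages: 224
  • Original format: PDF
  • Language of text: English
  • Chinese Library Classification: Biomedical engineering; Biophysics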
