Functional classification of divergent protein sequences and molecular evolution of multi-domain proteins.

Abstract

Transmembrane proteins and multi-domain proteins together make up more than 80% of the total proteins in any eukaryotic proteome. Accurately classifying such proteins into functional classes is therefore an important task. Furthermore, understanding the molecular evolution of multi-domain proteins is important because it shows how various domains fuse to form more complex proteins and acquire new functions, possibly affecting evolution at the organismal level. In this thesis, I first investigated the performance of several protein classifiers using one of the most divergent transmembrane protein families, the G-protein-coupled receptor (GPCR) superfamily, as an example. Alignment-free classifiers based on support vector machines using simple amino acid compositions were effective in remote-similarity detection, even from short fragmented sequences. A support vector machine using local pairwise-alignment scores showed very well-balanced performance, whereas profile hidden Markov models were generally highly specific and well suited for classifying well-established protein family members. We suggested that different types of protein classifiers should be applied to achieve optimal mining power. Combinations of multiple protein classification methods, including some of these, were applied to identify especially divergent plant GPCRs (or seven-transmembrane receptors) in the Arabidopsis thaliana genome. We identified 394 proteins as candidates and provided a prioritized list of 54 proteins for further investigation.

For multi-domain protein families, the distribution of urea amidolyase, urea carboxylase, and sterol-sensing domain (SSD) proteins across kingdoms was investigated. Molecular evolutionary analysis showed that the urea amidolyase genes, currently found only in fungi among eukaryotes, are the result of a horizontal gene transfer event from proteobacteria. Urea carboxylase genes, currently found in fungi and a limited number of other organisms, were also likely derived from another ancestral gene in bacteria. Finally, we showed the possibility of a bacterial origin of the eukaryotic SSD-containing proteins and that these ancestral sequences evolved into four different SSD-containing proteins that acquired specific functions. Two groups of SSD-containing proteins seem to have been formed by domain acquisition before the divergence of the fungal and metazoan lineages.
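
The abstract refers to alignment-free classifiers built on "simple amino acid compositions" but does not describe the exact feature encoding. As a minimal sketch only, and assuming the composition is the relative frequency of each of the 20 standard residues, the feature vector that such a classifier could take as input might be computed as follows. The function name `amino_acid_composition` and the example fragment are hypothetical and are not taken from the thesis.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed order so every vector is comparable.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return the relative frequency of each standard residue (hypothetical helper).

    Non-standard characters (gaps, X, etc.) are ignored. This is an assumed
    encoding for illustration, not the encoding used in the thesis.
    """
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


# Hypothetical short fragment; no alignment is needed to compute its features.
fragment = "MKTLLVAGGA"
features = amino_acid_composition(fragment)
```

Because the vector has a fixed length of 20 regardless of sequence length, it can be computed for short fragments as well as full-length proteins, which is consistent with the abstract's remark about fragmented sequences. Vectors of this kind could then be passed to any support vector machine implementation.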

Bibliographic record

  • Author

    Strope, Pooja K.

  • Author affiliation

    The University of Nebraska - Lincoln

  • Degree-granting institution: The University of Nebraska - Lincoln
  • Subject: Biology, Molecular; Biology, Bioinformatics
  • Degree: Ph.D.
  • Year: 2011
  • Pages: 224 p.
  • Total pages: 224
  • Original format: PDF
  • Language of text: English
