Source: The 9th World Multi-Conference on Systemics, Cybernetics and Informatics (WMSCI 2005), vol. 8
Genome-wide protein structural annotation database, GTOP and a correction of the proportions of structural annotations in the bacterial genomes

Abstract

Large-scale genome projects generate an unprecedented number of protein sequences. Predicting the 3D structures of these sequences provides important clues as to their functions. We constructed the Genomes TO Protein structures and functions (GTOP) database, which contains protein fold predictions for a huge number of sequences. Predictions are mainly carried out with the homology search program PSI-BLAST, currently the most popular of the high-sensitivity profile search methods. Some genomes analyzed in GTOP show exceptionally low percentages of structural annotations. Length distributions of amino acid sequences indicated that the annotated genes in these genomes could include ORFs that occur by chance, which are open for translation but are not real protein-coding sequences. We corrected the total numbers of genes in these genomes by reviewing ORFs without sequence similarity to any known protein, which are referred to as "orphan" genes, and this in turn corrected the proportions of structural annotations.
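
The arithmetic behind the correction can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function names and the example counts are hypothetical, and it assumes that every rejected orphan ORF had no structural annotation, so only the denominator (the total gene count) changes.

```python
def annotation_proportion(annotated: int, total_genes: int) -> float:
    """Fraction of genes in a genome that carry a structural annotation."""
    if total_genes <= 0:
        raise ValueError("total_genes must be positive")
    return annotated / total_genes


def corrected_proportion(annotated: int, total_genes: int, rejected_orphans: int) -> float:
    """Proportion after dropping orphan ORFs judged not to be real genes.

    Assumption: rejected orphans are all unannotated, so the numerator
    stays the same and only the total gene count shrinks.
    """
    if not 0 <= rejected_orphans <= total_genes - annotated:
        raise ValueError("rejected_orphans must lie between 0 and the number of unannotated genes")
    return annotation_proportion(annotated, total_genes - rejected_orphans)


# Hypothetical counts, chosen only to show the direction of the effect.
before = annotation_proportion(annotated=1000, total_genes=4000)
after = corrected_proportion(annotated=1000, total_genes=4000, rejected_orphans=1500)
print(f"before: {before:.2f}, after: {after:.2f}")
```

Under these assumptions, removing chance ORFs can only raise the proportion, which is consistent with the abstract's observation that the affected genomes showed unusually low percentages before the correction.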
