Journal: Fly

A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff: SNPs in the genome of Drosophila melanogaster strain w1118; iso-2; iso-3.


Abstract

We describe a new computer program, SnpEff, for rapidly categorizing the effects of variants in genome sequences. Once a genome is sequenced, SnpEff annotates variants based on their genomic locations and predicts coding effects. Annotated genomic locations include intronic, untranslated region, upstream, downstream, splice site, or intergenic regions. Coding effects such as synonymous or non-synonymous amino acid replacement, start codon gains or losses, stop codon gains or losses, or frame shifts can be predicted. Here the use of SnpEff is illustrated by annotating ~356,660 candidate SNPs in ~117 Mb of unique sequence, representing a substitution rate of ~1/305 nucleotides, between the Drosophila melanogaster w(1118); iso-2; iso-3 strain and the reference y(1); cn(1) bw(1) sp(1) strain. We show that ~15,842 SNPs are synonymous and ~4,467 SNPs are non-synonymous (N/S ~0.28). The remaining SNPs are in other categories, such as stop codon gains (38 SNPs), stop codon losses (8 SNPs), and start codon gains (297 SNPs) in the 5'UTR. We found, as expected, that the SNP frequency is proportional to the recombination frequency (i.e., highest in the middle of chromosome arms). We also found that start-gain or stop-lost SNPs in Drosophila melanogaster often result in additions of N-terminal or C-terminal amino acids that are conserved in other Drosophila species. It appears that the 5' and 3' UTRs are reservoirs of genetic variation that changes the termini of proteins during evolution of the Drosophila genus. As genome sequencing is becoming inexpensive and routine, SnpEff enables rapid analyses of whole-genome sequencing data to be performed by an individual laboratory.
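
The coding-effect categories named in the abstract can be illustrated with a minimal sketch. This is not SnpEff's implementation: the function name `classify_codon_change`, its parameters, and the returned labels are hypothetical, and the standard genetic code is assumed. It handles only a single-base substitution inside one codon and does not cover start-codon gains in the 5'UTR, frame shifts, or non-coding locations.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumption: no
# organelle or alternative code is needed for this illustration).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def classify_codon_change(ref_codon, pos, alt_base, is_first_codon=False):
    """Classify a single-base substitution within a codon (hypothetical helper).

    ref_codon      -- reference codon, e.g. "GAA"
    pos            -- 0-based position of the substituted base within the codon
    alt_base       -- the alternative base at that position
    is_first_codon -- True if this codon is the annotated start codon
    """
    alt_codon = ref_codon[:pos] + alt_base + ref_codon[pos + 1:]
    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]

    # Losing the annotated ATG start codon takes priority over other labels.
    if is_first_codon and ref_codon == "ATG" and alt_codon != "ATG":
        return "start_lost"
    if ref_aa == "*" and alt_aa != "*":
        return "stop_lost"
    if ref_aa != "*" and alt_aa == "*":
        return "stop_gained"
    if ref_aa == alt_aa:
        return "synonymous"
    return "non_synonymous"


if __name__ == "__main__":
    print(classify_codon_change("GAA", 2, "G"))  # GAA -> GAG, both Glu
    print(classify_codon_change("GAA", 0, "A"))  # GAA -> AAA, Glu -> Lys
    print(classify_codon_change("TAC", 2, "A"))  # TAC -> TAA, Tyr -> stop
    # N/S ratio from the counts quoted in the abstract
    print(round(4467 / 15842, 2))
```

Counting the labels over all coding SNPs would give the synonymous and non-synonymous totals; with the counts quoted above, 4,467 / 15,842 rounds to the reported N/S of about 0.28.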
