Source: PLoS One

Comparison of sequencing methods and data processing pipelines for whole genome sequencing and minority single nucleotide variant (mSNV) analysis during an influenza A/H5N8 outbreak



Abstract

As high-throughput sequencing technologies become more widely adopted for analysing pathogens in disease outbreaks, there needs to be assurance that different sequencing technologies and approaches to data analysis yield reliable and comparable results. Conversely, understanding where agreement cannot be achieved provides insight into the limitations of these approaches and allows efforts to be focused on the parts of the process that need improvement. This manuscript describes the next-generation sequencing of three closely related viruses, each analysed using different sequencing strategies, sequencing instruments and data processing pipelines. To determine the comparability of consensus sequences and of minority (sub-consensus) single nucleotide variant (mSNV) identification, the biological samples, the sequence data from three sequencing platforms and the quality-trimmed alignment files (*.bam) of the raw data for three influenza A/H5N8 viruses were shared. The analysis demonstrated that variation in the final result could be attributed to all stages of the process, but the most critical factors were the well-known homopolymer errors introduced by 454 sequencing and the alignment processes in the different data processing pipelines, which affected the consistency of mSNV detection. Homopolymer errors aside, there was generally good agreement between the consensus sequences obtained for all combinations of sequencing platforms and data processing pipelines. Minority variant analysis, however, will need a different level of careful standardization and awareness of the possible limitations, as shown in this study.
