Journal: Source Code for Biology and Medicine

ROVER variant caller: read-pair overlap considerate variant-calling software applied to PCR-based massively parallel sequencing datasets


Abstract

Background: We recently described Hi-Plex, a highly multiplexed PCR-based target-enrichment system for massively parallel sequencing (MPS), which allows the uniform definition of library size so that subsequent paired-end sequencing can achieve complete overlap of read pairs. Variant calling from Hi-Plex-derived datasets can thus rely on the identification of variants appearing in both reads of read-pairs, permitting stringent filtering of sequencing chemistry-induced errors. These principles underlie ROVER software (derived from Read Overlap PCR-MPS variant caller), which we have recently used to report the screening for genetic mutations in the breast cancer predisposition gene PALB2. Here, we describe the algorithms underlying ROVER and its usage.

Results: ROVER enables users to quickly and accurately identify genetic variants from PCR-targeted, overlapping paired-end MPS datasets. The software is open source and its thresholds can be tailored, which makes it accessible to a range of PCR-MPS users.

Methods: ROVER is implemented in Python and runs on all popular POSIX-like operating systems (Linux, OS X). The software accepts a tab-delimited text file listing the coordinates of the target-specific primers used for targeted enrichment, based on a specified genome build. It also accepts aligned sequence files resulting from mapping to the same genome build. ROVER identifies the amplicon that a given read-pair represents and removes the primer sequences by using the mapping coordinates and primer coordinates. It considers overlapping read-pairs with respect to the primer-intervening sequence. Only when a variant is observed in both reads of a read-pair does the signal contribute to a tally of read-pairs containing or not containing the variant. User-defined thresholds set the minimum number and the minimum proportion of read-pairs in which a variant must be observed for a call to be made. ROVER also reports the depth of coverage across amplicons to facilitate the identification of any regions that may require further screening.

Conclusions: ROVER can facilitate rapid and accurate genetic variant calling for a broad range of PCR-MPS users.
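
The calling rule described in the Methods can be illustrated with a minimal Python sketch. This is not ROVER's source code: the `Amplicon` record, the `call_variants` function, the `(position, ref, alt)` variant tuples and the threshold parameter names are all hypothetical, and the sketch assumes that each read-pair has already been assigned to its amplicon and that the variants in each read have already been extracted from the alignment.

```python
from collections import Counter
from typing import NamedTuple


class Amplicon(NamedTuple):
    """Hypothetical record for one amplicon from the primer coordinate file."""
    chrom: str
    fwd_start: int  # first base of the forward primer (1-based, inclusive)
    fwd_end: int    # last base of the forward primer
    rev_start: int  # first base of the reverse primer
    rev_end: int    # last base of the reverse primer

    @property
    def target(self) -> range:
        """Primer-intervening positions: the amplicon with both primers removed."""
        return range(self.fwd_end + 1, self.rev_start)


def call_variants(read_pairs, amplicon, min_pairs, min_fraction):
    """Call variants that both reads of enough read-pairs agree on.

    read_pairs: iterable of (variants_in_read1, variants_in_read2), where each
    element is a set of (position, ref, alt) tuples for one read. All pairs are
    assumed to belong to `amplicon` already.
    Returns {variant: (supporting_pairs, fraction_of_pairs)}.
    """
    total_pairs = 0
    tally = Counter()
    for read1_variants, read2_variants in read_pairs:
        total_pairs += 1
        # A variant counts only if it is seen in BOTH reads of the pair and
        # lies between the primers, not inside a primer sequence.
        agreed = {
            v for v in read1_variants & read2_variants
            if v[0] in amplicon.target
        }
        tally.update(agreed)
    # Apply both user-defined thresholds: absolute count and proportion.
    return {
        variant: (count, count / total_pairs)
        for variant, count in tally.items()
        if count >= min_pairs and count / total_pairs >= min_fraction
    }


# Example with made-up coordinates and arbitrary thresholds.
amp = Amplicon("chr16", 100, 119, 200, 219)
pairs = [
    ({(150, "A", "G")}, {(150, "A", "G")}),                   # both reads agree
    ({(150, "A", "G")}, {(150, "A", "G"), (160, "C", "T")}),  # 160 in one read only
    ({(170, "T", "C")}, set()),                               # one read only
    (set(), set()),                                           # no variants
    ({(110, "G", "A")}, {(110, "G", "A")}),                   # inside a primer
]
calls = call_variants(pairs, amp, min_pairs=2, min_fraction=0.3)
print(calls)  # {(150, 'A', 'G'): (2, 0.4)}
```

In this example the variant at position 150 is supported by two of five read-pairs and passes both thresholds. The variants at 160 and 170 appear in only one read of their pair and are discarded as likely sequencing errors, and the variant at 110 is ignored because it falls within the forward primer. The threshold values here are arbitrary and are not ROVER's defaults.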