Source: Algorithms in Bioinformatics (conference proceedings)

Maximum Likelihood Estimation of Incomplete Genomic Spectrum from HTS Data


Abstract

High-throughput sequencing (HTS) makes it possible to process samples containing multiple genomic sequences and then estimate their frequencies or even assemble them. Maximum likelihood estimation of the sequence frequencies from observed reads can be performed efficiently with the expectation-maximization (EM) method, provided that the sequences present in the sample are known. Frequently, such knowledge is incomplete: in RNA-Seq, for example, not all isoforms are known, and when sequencing viral quasispecies, their sequences are unknown. We propose to enhance EM with a virtual string and incorporate it into frequency estimation tools for RNA-Seq and quasispecies sequencing. Our simulations show that EM enhanced with the virtual string estimates string frequencies more accurately than the original methods, and that it can find the reads from missing quasispecies, thus enabling their reconstruction.
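The idea in the abstract can be illustrated with a minimal sketch of EM frequency estimation in which one extra "virtual" component absorbs reads that the known sequences explain poorly. The abstract does not specify how the virtual string is constructed or weighted, so everything here is an assumption made for illustration: the function names `em_frequencies` and `add_virtual_string`, the read-by-string weight matrix, and the constant `virtual_weight` are hypothetical and are not taken from the paper.

```python
def em_frequencies(weights, n_iter=200, tol=1e-9):
    """Estimate string frequencies by EM.

    weights[j][i] is the (assumed) likelihood that read j was emitted
    by string i. Returns one frequency per string, summing to 1.
    """
    n_strings = len(weights[0])
    freqs = [1.0 / n_strings] * n_strings  # uniform starting point
    for _ in range(n_iter):
        counts = [0.0] * n_strings
        # E-step: split each read among strings in proportion to f_i * w_ji.
        for row in weights:
            total = sum(f * w for f, w in zip(freqs, row))
            if total == 0.0:
                continue  # read not explained by any string; it is ignored
            for i, (f, w) in enumerate(zip(freqs, row)):
                counts[i] += f * w / total
        # M-step: renormalize the expected read counts into frequencies.
        norm = sum(counts)
        if norm == 0.0:
            return freqs  # no read could be assigned at all
        new_freqs = [c / norm for c in counts]
        converged = max(abs(a - b) for a, b in zip(freqs, new_freqs)) < tol
        freqs = new_freqs
        if converged:
            break
    return freqs


def add_virtual_string(weights, virtual_weight=0.05):
    """Append a hypothetical virtual string that can emit every read
    with the same small weight (an illustrative assumption)."""
    return [list(row) + [virtual_weight] for row in weights]


# Toy data: two known strings, six reads. The last two reads match
# neither known string, as if they came from a missing sequence.
reads = [
    [1.0, 0.0],
    [1.0, 0.0],
    [0.0, 1.0],
    [1.0, 1.0],
    [0.0, 0.0],
    [0.0, 0.0],
]

plain = em_frequencies(reads)
with_virtual = em_frequencies(add_virtual_string(reads))
print("known strings only:", plain)
print("with virtual string:", with_virtual)
```

Without the virtual column, the two unexplained reads are silently dropped and the known strings share all of the frequency mass. With it, those reads are assigned to the virtual string, whose estimated frequency then indicates how much of the sample the known sequences fail to account for. The reads assigned mostly to the virtual string are the natural candidates for reconstructing a missing sequence.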
