
PLDSRC: A Multi-threaded Compressor/Decompressor for Massive DNA Sequencing Data



Abstract

To cope with the rapid growth of DNA sequencing data, it is important to study high-efficiency compression techniques that reduce the cost of storing massive amounts of sequencing data. In this paper, we propose a parallel DNA data compressor/decompressor, PLDSRC, based on the well-known serial DSRC software. We first analyze the compression and decompression algorithms in DSRC and identify three basic operations, namely read, work, and write. We then propose a single-pipeline parallel algorithm to accelerate the compression/decompression procedure. To further exploit today's popular multi-core, multi-socket systems based on the non-uniform memory access (NUMA) architecture, we extend the single-pipeline approach to the multi-pipeline case. Experiments on two different platforms show that PLDSRC, in both its single- and multi-pipeline forms, greatly speeds up DNA sequencing data compression/decompression while maintaining the same compression ratio. The results indicate that the maximum speedups of PLDSRC for compression and decompression are around 24.71x and 22.00x, respectively, compared with the serial DSRC software.
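
The read, work, and write operations named in the abstract map naturally onto a three-stage pipeline. The following is a minimal Python sketch of that general idea, not the PLDSRC implementation: the function names `pipeline_compress` and `pipeline_decompress`, the block size, and the worker count are hypothetical, and `zlib` stands in for DSRC's actual compression algorithm, which the abstract does not describe. NUMA-aware multi-pipeline placement is not modeled.

```python
import queue
import threading
import zlib


def pipeline_compress(data, block_size=1 << 16, n_workers=4):
    """Compress data block by block with a read -> work -> write pipeline.

    Hypothetical illustration: zlib is a stand-in for the real codec.
    """
    in_q = queue.Queue(maxsize=2 * n_workers)  # reader -> workers (bounded)
    out_q = queue.Queue()                      # workers -> writer
    output = []                                # compressed blocks, input order

    def reader():
        # Read stage: cut the input into numbered blocks.
        for idx, start in enumerate(range(0, len(data), block_size)):
            in_q.put((idx, data[start:start + block_size]))
        for _ in range(n_workers):
            in_q.put(None)  # one stop marker per worker

    def worker():
        # Work stage: compress blocks independently of each other.
        while True:
            item = in_q.get()
            if item is None:
                out_q.put(None)  # tell the writer this worker is done
                return
            idx, block = item
            out_q.put((idx, zlib.compress(block)))

    def writer():
        # Write stage: blocks may arrive out of order, so buffer them
        # and emit strictly by block index.
        pending, next_idx, finished = {}, 0, 0
        while finished < n_workers:
            item = out_q.get()
            if item is None:
                finished += 1
                continue
            pending[item[0]] = item[1]
            while next_idx in pending:
                output.append(pending.pop(next_idx))
                next_idx += 1

    threads = [threading.Thread(target=reader), threading.Thread(target=writer)]
    threads += [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return output


def pipeline_decompress(blocks):
    """Inverse of pipeline_compress (serial, for checking round trips)."""
    return b"".join(zlib.decompress(b) for b in blocks)
```

Because each block is compressed independently and written back in its original order, the output of this sketch does not depend on how many workers run, which mirrors the abstract's claim that parallelization leaves the compression ratio unchanged.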

