Conference: IEEE Asia Pacific Conference on Circuits and Systems

Efficient data transfer scheme using word-pair-encoding-based compression for large-scale text-data processing


Abstract

Large-scale data processing is common in fields such as data mining and genome mapping. To accelerate such processing, graphics processing units (GPUs) and field-programmable gate arrays (FPGAs) are used. However, the time spent transferring large volumes of data between the accelerator and the host computer is a major performance bottleneck. In this paper, we use a word-pair-encoding method to compress the data to 25% of its original size. The encoded data can be decoded from any position without decoding the whole data file, and for some algorithms the encoded data can be processed without decoding at all. Using text search based on the Burrows-Wheeler algorithm, we show that the data volume and transfer time can be reduced by over 70%.
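
The abstract does not spell out the encoding itself, so the following is only a minimal sketch of one generic pair-encoding scheme, not the authors' method. It assumes 7-bit ASCII input and maps the most frequent adjacent character pairs to single-byte codes in the range 128 to 255. The names build_pair_table, encode, and decode are hypothetical. Because every encoded byte expands on its own through the table, decoding can start at any byte of the encoded stream, which illustrates the random-access property the abstract describes. This toy scheme can at best halve the data, so it does not reproduce the 25% figure reported in the paper.

```python
from collections import Counter
from typing import Dict, Optional


def build_pair_table(text: str, max_codes: int = 128) -> Dict[str, int]:
    """Assign codes 128..255 to the most frequent adjacent character pairs."""
    pairs = Counter(text[i:i + 2] for i in range(len(text) - 1))
    return {pair: 128 + k
            for k, (pair, _) in enumerate(pairs.most_common(max_codes))}


def encode(text: str, table: Dict[str, int]) -> bytes:
    """Greedy left-to-right encoding: one byte per known pair,
    otherwise one byte per literal character."""
    if not text.isascii():
        raise ValueError("this sketch assumes 7-bit ASCII input")
    out = bytearray()
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if pair in table:
            out.append(table[pair])   # pair code (>= 128)
            i += 2
        else:
            out.append(ord(text[i]))  # literal character (< 128)
            i += 1
    return bytes(out)


def decode(data: bytes, table: Dict[str, int],
           start: int = 0, count: Optional[int] = None) -> str:
    """Decode `count` encoded bytes beginning at byte offset `start`.
    No earlier bytes need to be read, since each byte decodes independently."""
    inverse = {code: pair for pair, code in table.items()}
    end = len(data) if count is None else start + count
    return "".join(inverse.get(b, chr(b)) for b in data[start:end])


if __name__ == "__main__":
    genome = "ACGTACGTACGTTTGA" * 4
    table = build_pair_table(genome)
    packed = encode(genome, table)
    print(len(genome), len(packed))
    print(decode(packed, table, start=3, count=2))
```

Offsets passed to decode refer to positions in the encoded stream. Mapping an offset in the original text to an encoded offset would need an extra index, which this sketch leaves out.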
