Indian Journal of Science and Technology

Design and Implementation of GPU-based File Similarity Evaluation System

Abstract

Background/Objectives: Storage and backup systems are now widely used, and the amount of duplicated data has increased drastically. To minimize storage size and use network bandwidth efficiently, we propose a deduplication system and file similarity measurement schemes that run on GPGPUs. The GPGPUs are applied to the file similarity measurement to speed up computation.

Methods/Statistical Analysis: To handle the problems that arise when the measurement is parallelized, we compare two implementations, one based on shared memory and one based on preprocessing. In addition, we propose an alternative to the Rabin fingerprinting algorithm that lessens the computational burden on the GPUs. We compare the systems by elapsed time on several files.

Findings: First, experiments showed that preprocessing was slightly faster than the shared memory scheme in handling the overlapped region of consecutive data segments assigned to different cores. This region must be shared by the two cores for fingerprinting. With GPGPU parallelization and the preprocessing technique, the proposed system outperformed systems running on a multi-core CPU, and the gain was larger for bigger files. In addition, replacing Rabin fingerprinting with the alternative algorithm made the system three times faster. The alternative avoids the computational burden of Rabin fingerprinting and gives results comparable to those of the Rabin-based system.

Improvements: The procedure will benefit deduplication systems in determining file similarity and in finding the duplicated regions of two files. We sped up the file similarity measurement through parallelization on GPGPUs, using two methods for handling the overlaps between consecutive data segments and an alternative fingerprinting algorithm.
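
The overlap problem described in the abstract can be illustrated with a small CPU-only sketch. It is not the authors' implementation. It assumes a plain polynomial rolling hash (Rabin-Karp style) in place of true Rabin fingerprinting and of the paper's alternative algorithm, uses Python threads as a stand-in for GPU cores, and scores similarity with a Jaccard ratio. The constants WINDOW, BASE and MOD and every function name are hypothetical. The "preprocessing" idea is modeled by copying the first WINDOW - 1 bytes of the next segment onto the end of each segment, so that each worker can fingerprint its segment without reading a neighbour's data.

```python
from concurrent.futures import ThreadPoolExecutor

WINDOW = 8            # bytes per fingerprint window (assumed value)
BASE = 257            # polynomial base (assumed value)
MOD = (1 << 61) - 1   # large prime modulus (assumed value)


def window_fingerprints(chunk, window=WINDOW):
    """Return the rolling-hash value of every `window`-byte substring of `chunk`."""
    if len(chunk) < window:
        return set()
    top = pow(BASE, window - 1, MOD)  # weight of the byte that leaves the window
    h = 0
    for b in chunk[:window]:
        h = (h * BASE + b) % MOD
    prints = {h}
    for i in range(window, len(chunk)):
        # Drop the oldest byte, shift, and append the newest byte.
        h = ((h - chunk[i - window] * top) * BASE + chunk[i]) % MOD
        prints.add(h)
    return prints


def split_with_overlap(data, segment, window=WINDOW):
    """Cut `data` into segments, each extended by window - 1 bytes of its successor.

    The extension is the overlapped region: with it, every window that starts
    inside a segment can be hashed from that segment alone.
    """
    return [data[i:i + segment + window - 1] for i in range(0, len(data), segment)]


def fingerprints(data, segment=64, workers=4):
    """Fingerprint all windows of `data`, one overlapped segment per task."""
    chunks = split_with_overlap(data, segment)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = list(pool.map(window_fingerprints, chunks))
    return set().union(*parts)


def similarity(a, b):
    """Jaccard ratio of the two fingerprint sets (an assumed similarity score)."""
    fa, fb = fingerprints(a), fingerprints(b)
    if not fa and not fb:
        return 1.0 if a == b else 0.0
    return len(fa & fb) / len(fa | fb)
```

Because each window starts in exactly one segment, the union of the per-segment sets equals the set obtained by hashing the whole file sequentially. Threads in CPython do not give a real speedup for this workload; they only mark where independent GPU cores would run.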