Conference: IEEE Intl Conf on Ubiquitous Computing & Communications
Efficient Online Stream Deduplication for Network Block Storage

Abstract

Deduplication is an effective technique for optimizing storage utilization in data centers and cloud storage systems. It splits data into chunks and then identifies whether each chunk is unique. Fixed-size chunking (FSC), which places a chunk boundary at a fixed interval of bytes, is widely used in deduplication. Although it is simple and efficient, FSC can suffer from the boundary-shift problem, which usually decreases the deduplication rate. Content-defined chunking (CDC) has been proposed to solve this problem. However, applying CDC to deduplication for network block storage poses two challenges. The first is how to establish a mapping between the stream offsets of a deduplicated chunk and its block address; the second is how to design an efficient index structure for organizing chunk metadata on disk. In this paper, we design two structures to solve the mapping problem and implement two backends, based on B+ trees and a hash table respectively, to store metadata on network block storage devices. To achieve better search performance on disk, we reduce the size of the hash table and shrink the lookup range. We evaluate our schemes with real-world workloads. The experimental results show that our schemes achieve excellent search performance at an acceptable cost in space.
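
The boundary-shift problem mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say which CDC algorithm the paper uses, so the cut rule here (a BLAKE2b hash over a trailing window, recomputed at every byte rather than rolled) and every name and parameter (`fixed_size_chunks`, `content_defined_chunks`, `duplicate_fraction`, `demo`, the window, mask, and size limits) are assumptions chosen for illustration only.

```python
import hashlib
import random


def fixed_size_chunks(data: bytes, size: int = 64) -> list:
    """FSC: cut every `size` bytes, regardless of content."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def content_defined_chunks(data: bytes, window: int = 16, mask: int = 0x3F,
                           min_size: int = 16, max_size: int = 256) -> list:
    """Toy CDC (assumed rule): cut where a hash of the trailing `window`
    bytes has its masked low bits equal to zero, or at `max_size`."""
    chunks, start = [], 0
    for i in range(len(data)):
        length = i - start + 1
        if length < min_size:  # min_size >= window, so the window is always full
            continue
        digest = hashlib.blake2b(data[i - window + 1:i + 1], digest_size=4).digest()
        at_boundary = (int.from_bytes(digest, "big") & mask) == 0
        if at_boundary or length >= max_size:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])  # trailing partial chunk
    return chunks


def duplicate_fraction(old_chunks: list, new_chunks: list) -> float:
    """Fraction of bytes in `new_chunks` whose chunk fingerprint already
    appears among `old_chunks`."""
    seen = {hashlib.sha256(c).digest() for c in old_chunks}
    dup = sum(len(c) for c in new_chunks if hashlib.sha256(c).digest() in seen)
    return dup / sum(len(c) for c in new_chunks)


def demo(n: int = 8192, seed: int = 0):
    """Insert one byte at the front of a stream and compare both chunkers."""
    rng = random.Random(seed)
    original = bytes(rng.getrandbits(8) for _ in range(n))
    shifted = b"X" + original  # a single inserted byte shifts every later offset
    fsc = duplicate_fraction(fixed_size_chunks(original), fixed_size_chunks(shifted))
    cdc = duplicate_fraction(content_defined_chunks(original),
                             content_defined_chunks(shifted))
    return fsc, cdc
```

Calling `demo()` returns the duplicate fraction for each chunker. With FSC, every chunk after the insertion covers different bytes than before, so no fingerprints match; with the toy CDC, cut points depend on content, so the chunk boundaries realign shortly after the inserted byte and most chunks are recognized as duplicates. A production chunker would normally use a rolling hash instead of rehashing the window at each position.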
