Journal: International Journal of Computer Systems Science & Engineering

Towards an accurate evaluation of deduplicated storage systems

Abstract

Deduplication has proven to be a valuable technique for eliminating duplicate data in backup and archival systems and is now being applied to new storage environments with distinct requirements and performance trade-offs. In particular, deduplication systems now target large-scale cloud computing storage infrastructures that hold unprecedented data volumes with a significant share of duplicate content. It is, however, hard to assess how useful deduplication is in a particular setting and which techniques provide the best results. Existing disk I/O benchmarks generate data content with simplistic approaches, which yields unrealistic amounts of duplicates and therefore does not evaluate deduplication systems accurately. Moreover, deduplication systems now target heterogeneous storage environments with specific duplication ratios, which benchmarks must also simulate. We address these issues with DEDISbench, a novel micro-benchmark for evaluating the disk I/O performance of block-based deduplication systems. As the main contribution, DEDISbench generates content by following realistic duplicate content distributions extracted from real datasets. Then, as a second contribution, we analyze and extract the duplicates found on three real storage systems, proving that DEDISbench can easily simulate several workloads. The usefulness of DEDISbench is shown by comparing it with the open-source disk I/O micro-benchmarks Bonnie++ and IOzone in assessing two open-source deduplication systems, Opendedup and Lessfs, using Ext4 as a baseline. Our results lead to novel insights into the performance of these file systems.
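
To make the idea of generating content from a duplicate distribution more concrete, the following is a minimal Python sketch. It is not DEDISbench's actual implementation: the function names `build_block_ids` and `block_content`, the 4096-byte block size, and the example histogram are all assumptions chosen for illustration. The histogram maps "number of copies" to "number of distinct blocks with that many copies", which is one plausible way to represent a distribution extracted from a real dataset.

```python
import hashlib
import random


def build_block_ids(copies_histogram, seed=0):
    """Expand {copies_per_block: number_of_distinct_blocks} into a shuffled write order.

    Hypothetical helper: each distinct block gets an integer id, repeated
    as many times as the histogram says it should be duplicated.
    """
    rng = random.Random(seed)
    ids = []
    next_id = 0
    for copies, n_blocks in sorted(copies_histogram.items()):
        for _ in range(n_blocks):
            ids.extend([next_id] * copies)
            next_id += 1
    rng.shuffle(ids)  # interleave duplicates instead of writing them back to back
    return ids


def block_content(block_id, block_size=4096):
    """Derive block bytes deterministically from the id, so equal ids give equal content."""
    out = bytearray()
    counter = 0
    while len(out) < block_size:
        out += hashlib.sha256(f"{block_id}:{counter}".encode()).digest()
        counter += 1
    return bytes(out[:block_size])


# Assumed example histogram: 6 blocks written once, 2 blocks written
# 3 times each, and 1 block written 5 times.
histogram = {1: 6, 3: 2, 5: 1}
write_order = build_block_ids(histogram)
blocks = [block_content(i) for i in write_order]
print(len(blocks), len(set(blocks)))
```

With this assumed histogram the sketch issues 17 block writes containing 9 distinct blocks, so a block-level deduplication system would see the intended duplicate ratio rather than all-unique or all-identical content.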