
JVM-Bypass for Efficient Hadoop Shuffling


Abstract

Hadoop employs a Java-based network transport stack on top of the Java Virtual Machine (JVM) for data shuffling and merging. Our examination reveals that the JVM introduces a significant amount of overhead on the data processing capability of the native interface. Furthermore, the JVM constrains the use of high-performance networking mechanisms such as RDMA (Remote Direct Memory Access), which has established itself as an effective data movement technology in many networking environments because of its low latency, high bandwidth, low CPU utilization, and energy efficiency. In this paper, we introduce a plug-in library called JVM-Bypass Shuffling (JBS) for Hadoop data shuffling. JBS avoids Java-based transport protocols during shuffling, removing the overhead and limitations of the JVM. In addition, we design JBS as a portable library that can leverage both TCP/IP and RDMA on different network systems such as InfiniBand and 1/10 Gigabit Ethernet. We have designed and implemented JBS as part of Hadoop acceleration. It has been transferred to Mellanox as the software product UDA (Unstructured Data Accelerator) and used to enable our studies on a variety of merging algorithms. Our performance evaluation demonstrates that JBS can reduce the execution time of Hadoop jobs by up to 66.3% and lower CPU utilization by 48.1%.
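To put the headline figure in perspective, a percentage reduction in execution time can be converted into a speedup factor. The sketch below is only illustrative arithmetic: the function name `speedup_from_reduction` is hypothetical and not part of JBS or Hadoop, and since the abstract reports 66.3% as an upper bound ("up to"), the result is a best case rather than a typical one.

```python
def speedup_from_reduction(reduction_pct: float) -> float:
    """Convert a percentage reduction in execution time into a speedup factor.

    If a job that took T now takes T * (1 - r), the speedup is 1 / (1 - r).
    """
    if not 0 <= reduction_pct < 100:
        raise ValueError("reduction must be in the range [0, 100)")
    return 1.0 / (1.0 - reduction_pct / 100.0)


# Best-case figure quoted in the abstract (an upper bound, not an average).
best_case = speedup_from_reduction(66.3)
print(f"{best_case:.2f}x")
```

Under that reading, the best reported case corresponds to a job finishing in roughly one third of its original time.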
