
A Case Study of Accelerating Apache Spark with FPGA


Abstract

Apache Spark is an efficient distributed computing framework for big data processing. It supports in-memory computation on Resilient Distributed Datasets (RDDs) and provides reusability, fault tolerance, and real-time stream processing. However, tasks in the Spark framework run only on the CPU, whose low degree of parallelism and poor power efficiency may restrict the performance and scalability of the cluster. Heterogeneous accelerators such as FPGAs, GPUs, and MIC (Many Integrated Core) processors are more efficient than general-purpose processors for big data processing, and can therefore improve the performance and power dissipation of the data center. In this work, we propose a framework that integrates an FPGA accelerator into a Spark cluster. We use the FPGA to accelerate Spark tasks developed in Python, so that the main computing load is performed on the FPGA instead of the CPU. We illustrate the performance of the FPGA-based Spark framework with a case study of 2D-FFT acceleration. The results show that the FPGA-based Spark implementation achieves a 1.79x speedup over the CPU implementation.
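The abstract does not say how the 2D-FFT workload is split across Spark tasks. As a minimal sketch, assuming the standard row-column decomposition, a 2D FFT can be computed as independent 1D FFTs over the rows followed by independent 1D FFTs over the columns. Each 1D transform is a self-contained unit of work, which is the kind of task a cluster could distribute or hand to an accelerator. The names `fft1d`, `fft2d`, and the `fft_kernel` parameter are illustrative assumptions, not part of the paper's framework, and the example runs entirely on the CPU in plain Python.

```python
import cmath


def fft1d(x):
    """Recursive radix-2 Cooley-Tukey FFT. len(x) must be a power of two."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a non-zero power of two")
    if n == 1:
        return [complex(x[0])]
    even = fft1d(x[0::2])
    odd = fft1d(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        # Twiddle factor applied to the odd-indexed half.
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def transpose(m):
    """Swap rows and columns of a rectangular list-of-lists matrix."""
    return [list(col) for col in zip(*m)]


def fft2d(matrix, fft_kernel=fft1d):
    """2D FFT by row-column decomposition.

    fft_kernel is a hypothetical hook: any function mapping one row to its
    1D FFT. In an accelerated setup it is where an offloaded kernel would
    be called; here it defaults to the pure-Python fft1d above.
    """
    # Pass 1: every row is independent, so the rows could be processed in parallel.
    row_pass = [fft_kernel(row) for row in matrix]
    # Pass 2: transform the columns by transposing and reusing the row kernel.
    col_pass = [fft_kernel(col) for col in transpose(row_pass)]
    return transpose(col_pass)


if __name__ == "__main__":
    # A unit impulse at the origin transforms to a constant spectrum.
    print(fft2d([[1, 0], [0, 0]]))
```

Because the two passes only ever call `fft_kernel` on one row at a time, swapping in a different 1D implementation does not change the surrounding logic.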
