Conference: ACM/IEEE Annual International Symposium on Computer Architecture
Bonsai: High-Performance Adaptive Merge Tree Sorting

Abstract

Sorting is a key computational kernel in many big data applications. Most sorting implementations focus on a specific input size, record width, and hardware configuration. This has created a wide array of sorters that are optimized only for a narrow application domain.

In this work we show that merge trees can be implemented on FPGAs to offer state-of-the-art performance over many problem sizes. We introduce a novel merge tree architecture and develop Bonsai, an adaptive sorting solution that takes into consideration the off-chip memory bandwidth and the amount of on-chip resources to optimize sorting time. FPGA programmability allows us to leverage Bonsai to quickly implement the optimal merge tree configuration for any problem size and memory hierarchy.

Using Bonsai, we develop a state-of-the-art sorter that specifically targets DRAM-scale sorting on AWS EC2 F1 instances. For array sizes of 4-32 GB, our implementation achieves a minimum speedup of 2.3x, 1.3x, and 1.2x, and a maximum speedup of 2.5x, 3.7x, and 1.3x, over the best designs on CPUs, FPGAs, and GPUs, respectively. Our design exhibits 3.3x better bandwidth-efficiency compared to the best previous sorting implementations. Finally, we demonstrate that Bonsai can tune our design over a wide range of problem sizes (megabyte to terabyte) and memory hierarchies, including DDR DRAMs, high-bandwidth memories (HBMs), and solid-state disks (SSDs).
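As a rough software illustration of the merge tree idea mentioned in the abstract, the sketch below builds a binary tree of 2-way mergers and sorts by streaming sorted runs through it in repeated passes. It is not the Bonsai architecture or its FPGA implementation; the names `merge2`, `merge_tree`, `merge_tree_sort`, and the parameters `leaves` and `run_len` are hypothetical and chosen only for this example.

```python
from typing import Iterator, List, Tuple

_END = object()  # sentinel marking an exhausted input stream


def merge2(a: Iterator[int], b: Iterator[int]) -> Iterator[int]:
    """One tree node: merge two sorted streams into one sorted stream."""
    x, y = next(a, _END), next(b, _END)
    while x is not _END and y is not _END:
        if x <= y:
            yield x
            x = next(a, _END)
        else:
            yield y
            y = next(b, _END)
    # Drain whichever input still has elements.
    while x is not _END:
        yield x
        x = next(a, _END)
    while y is not _END:
        yield y
        y = next(b, _END)


def merge_tree(runs: List[List[int]]) -> Iterator[int]:
    """Connect 2-way mergers level by level until one output stream remains."""
    streams: List[Iterator[int]] = [iter(r) for r in runs]
    if not streams:
        return iter(())
    while len(streams) > 1:
        next_level = [merge2(streams[i], streams[i + 1])
                      for i in range(0, len(streams) - 1, 2)]
        if len(streams) % 2:  # an odd stream passes through unchanged
            next_level.append(streams[-1])
        streams = next_level
    return streams[0]


def merge_tree_sort(data: List[int], leaves: int = 4,
                    run_len: int = 2) -> Tuple[List[int], int]:
    """Sort `data` with a tree of `leaves` inputs; also return the pass count."""
    # Initial sorted runs (illustrative; any presorter would do).
    runs = [sorted(data[i:i + run_len]) for i in range(0, len(data), run_len)]
    passes = 0
    while len(runs) > 1:
        # Each pass merges groups of up to `leaves` runs through the tree.
        runs = [list(merge_tree(runs[i:i + leaves]))
                for i in range(0, len(runs), leaves)]
        passes += 1
    return (runs[0] if runs else []), passes
```

In this toy model, a tree with more leaves needs fewer passes over the data, at the cost of more merger nodes. This is only a loose analogy to the trade-off between memory bandwidth and on-chip resources that the abstract says Bonsai takes into account.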