Conference: International Symposium on Parallel and Distributed Computing
Automatic Optimization of In-Flight Memory Transactions for GPU Accelerators Based on a Domain-Specific Language for Medical Imaging

Abstract

Efficient utilization of memory bandwidth on GPU accelerators is crucial for memory-bound applications. In medical imaging, the performance of many kernels is limited by the available memory bandwidth because only a few operations are performed per pixel. Such kernels can exploit only a fraction of the compute power that GPU accelerators provide, and their performance is determined by memory bandwidth. As a remedy, this paper investigates the optimal utilization of the available memory bandwidth by increasing the number of in-flight memory transactions. Instead of doing this manually for each GPU accelerator, the required CUDA and OpenCL code is generated automatically from descriptions in a Domain-Specific Language (DSL) for the considered application domain. In addition, the DSL is extended to support global reduction operators. We show that the generated target-specific code significantly improves bandwidth utilization for memory-bound kernels and achieves performance competitive with the GPU back end of the widely used image processing library OpenCV.
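To make the idea of in-flight memory transactions concrete, the following is a minimal back-of-envelope sketch based on Little's law (concurrency = throughput × latency). It is not taken from the paper and does not reproduce its code generator. The function names and all device figures (bandwidth, latency, transaction size, number of resident warps) are hypothetical values chosen only for illustration.

```python
import math


def required_in_flight_bytes(bandwidth_gb_s: float, latency_ns: float) -> float:
    """Little's law: bytes that must be in flight to sustain the given bandwidth.

    GB/s * ns = (1e9 B/s) * (1e-9 s) = bytes, so the units cancel directly.
    """
    return bandwidth_gb_s * latency_ns


def required_transactions(bandwidth_gb_s: float, latency_ns: float,
                          transaction_bytes: int) -> int:
    """Number of memory transactions that must be outstanding at any time."""
    in_flight = required_in_flight_bytes(bandwidth_gb_s, latency_ns)
    return math.ceil(in_flight / transaction_bytes)


def loads_per_warp(bandwidth_gb_s: float, latency_ns: float,
                   transaction_bytes: int, resident_warps: int) -> int:
    """Independent loads each warp should issue before stalling.

    Assumes (for illustration) that one warp-wide load maps to exactly
    one memory transaction.
    """
    needed = required_transactions(bandwidth_gb_s, latency_ns, transaction_bytes)
    return math.ceil(needed / resident_warps)


# Hypothetical device figures, not measurements from the paper.
BANDWIDTH_GB_S = 200.0    # assumed peak memory bandwidth
LATENCY_NS = 500.0        # assumed global-memory latency
TRANSACTION_BYTES = 128   # assumed size of one memory transaction
RESIDENT_WARPS = 448      # assumed warps resident on the whole device

if __name__ == "__main__":
    print(required_in_flight_bytes(BANDWIDTH_GB_S, LATENCY_NS))
    print(required_transactions(BANDWIDTH_GB_S, LATENCY_NS, TRANSACTION_BYTES))
    print(loads_per_warp(BANDWIDTH_GB_S, LATENCY_NS, TRANSACTION_BYTES, RESIDENT_WARPS))
```

Under these assumed figures, a kernel whose warps each issue a single load before waiting could not keep enough transactions outstanding. This is the kind of gap that processing several pixels per thread is meant to close.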