Journal: IEICE Transactions on Information and Systems

GPU-Chariot: A Programming Framework for Stream Applications Running on Multi-GPU Systems


Abstract

This paper presents a stream programming framework, named GPU-chariot, for accelerating stream applications running on graphics processing units (GPUs). The main contribution of our framework is that it realizes efficient software pipelines on multi-GPU systems by enabling out-of-order execution of CPU functions, kernels, and data transfers. To achieve this out-of-order execution, we apply a runtime scheduler that not only maximizes the utilization of system resources but also encapsulates the number of GPUs available in the system. In addition, we implement a load-balancing capability to flow data efficiently through multiple GPUs. Furthermore, a callback interface enables overlapping execution of functions in third-party libraries. By using kernels with different performance bottlenecks, we show that our out-of-order execution is up to 20% faster than in-order execution. Finally, we conduct several case studies on a 4-GPU system and demonstrate the advantages of GPU-chariot over a manually pipelined code. We conclude that GPU-chariot can be useful when developing stream applications with software pipelines on multiple GPUs and CPUs.
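To make the contrast between in-order and out-of-order execution concrete, the following is a minimal toy simulation, not GPU-chariot's scheduler or API. It assumes a hypothetical four-stage pipeline per data chunk (host-to-device copy, kernel, device-to-host copy, CPU post-processing) with made-up durations, one shared copy engine, one GPU, and one CPU. All names (`Op`, `STAGES`, `build_ops`, `in_order_makespan`, `out_of_order_makespan`) are illustrative assumptions.

```python
from collections import namedtuple

# One unit of work: stage index within its chunk, the resource it needs, and its cost.
Op = namedtuple("Op", "chunk stage resource duration")

# Hypothetical per-chunk pipeline: (name, resource, duration in arbitrary time units).
STAGES = [("h2d", "copy", 2), ("kernel", "gpu", 3), ("d2h", "copy", 2), ("post", "cpu", 1)]


def build_ops(num_chunks):
    """Issue order is chunk-major: every stage of chunk 0, then chunk 1, and so on."""
    return [Op(c, i, res, dur)
            for c in range(num_chunks)
            for i, (_name, res, dur) in enumerate(STAGES)]


def in_order_makespan(ops):
    """FIFO dispatch: an op may not start before the op issued ahead of it has started."""
    free_at, done_at = {}, {}   # resource -> idle time, (chunk, stage) -> finish time
    prev_start = 0
    for op in ops:
        dep = done_at.get((op.chunk, op.stage - 1), 0)
        start = max(prev_start, dep, free_at.get(op.resource, 0))
        free_at[op.resource] = done_at[(op.chunk, op.stage)] = start + op.duration
        prev_start = start
    return max(done_at.values(), default=0)


def out_of_order_makespan(ops):
    """Greedy dispatch: when a resource is idle, start the first ready op that needs it."""
    pending = list(ops)
    free_at, done_at = {}, {}
    now = 0
    while pending:
        for op in list(pending):
            prev = (op.chunk, op.stage - 1)
            dep_done = op.stage == 0 or done_at.get(prev, float("inf")) <= now
            if dep_done and free_at.get(op.resource, 0) <= now:
                free_at[op.resource] = done_at[(op.chunk, op.stage)] = now + op.duration
                pending.remove(op)
        if pending:
            # Jump the clock to the next completion event.
            now = min(t for t in done_at.values() if t > now)
    return max(done_at.values(), default=0)


if __name__ == "__main__":
    ops = build_ops(4)
    print("in-order makespan:    ", in_order_makespan(ops))
    print("out-of-order makespan:", out_of_order_makespan(ops))
```

In this toy setup the greedy schedule finishes sooner because copies for later chunks overlap with kernels for earlier ones. The size of the gap is an artifact of the assumed durations and issue order; it is unrelated to the 20% figure reported in the abstract.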
