2013 IEEE 31st International Conference on Computer Design

Increasing GPU throughput using kernel interleaved thread block scheduling

Abstract

The number of active threads required to achieve peak application throughput on graphics processing units (GPUs) depends largely on the ratio of the time spent on computation to the time spent accessing data from memory. While compute-intensive applications can achieve peak throughput with a low number of threads, memory-intensive applications might not achieve good throughput even at the maximum supported thread count. In this paper, we study the effects of scheduling work from multiple applications on the same GPU core. We claim that interleaving workloads from different applications on a GPU core can improve the utilization of computational units and reduce the load on the memory subsystem. Experiments on 17 application pairs from the Rodinia benchmark suite show that overall throughput increases by 7% on average.
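The dependence of the required thread count on the compute-to-memory time ratio can be illustrated with a simple latency-hiding model. The sketch below is not taken from the paper: it assumes each thread alternates a fixed number of compute cycles with a fixed number of memory-wait cycles, ignores memory bandwidth limits, and uses hypothetical function names and cycle counts chosen only for illustration.

```python
import math


def threads_for_peak(compute_cycles, memory_cycles):
    """Smallest thread count that keeps the compute units busy.

    Toy assumption: while one thread waits memory_cycles on memory,
    other threads fill the gap with compute_cycles of work each.
    """
    return math.ceil((compute_cycles + memory_cycles) / compute_cycles)


def compute_utilization(threads, compute_cycles, memory_cycles):
    """Fraction of time the compute units are busy, capped at 1.0."""
    busy = threads * compute_cycles / (compute_cycles + memory_cycles)
    return min(1.0, busy)


# Hypothetical kernels (cycle counts are illustrative, not measured).
compute_heavy = (80, 20)   # 80 compute cycles per 20 memory-wait cycles
memory_heavy = (5, 400)    # 5 compute cycles per 400 memory-wait cycles
max_threads = 48           # assumed per-core thread limit for the example

print(threads_for_peak(*compute_heavy))
print(threads_for_peak(*memory_heavy))
print(round(compute_utilization(max_threads, *memory_heavy), 3))
```

Under these assumptions the compute-heavy kernel saturates the compute units with very few threads, while the memory-heavy kernel would need more threads than the assumed limit allows, which mirrors the qualitative observation in the abstract.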