Published in: Computer Architecture News

Energy-efficient Mechanisms for Managing Thread Context in Throughput Processors

Abstract

Modern graphics processing units (GPUs) use a large number of hardware threads to hide both function unit and memory access latency. Extreme multithreading requires a complicated thread scheduler as well as a large register file, which is expensive to access in terms of both energy and latency. We present two complementary techniques for reducing energy on massively threaded processors such as GPUs. First, we examine register file caching to replace accesses to the large main register file with accesses to a smaller structure containing the immediate register working set of active threads. Second, we investigate a two-level thread scheduler that maintains a small set of active threads to hide ALU and local memory access latency and a larger set of pending threads to hide main memory latency. Combined with register file caching, a two-level thread scheduler provides a further reduction in energy by limiting the allocation of temporary register cache resources to only the currently active subset of threads. We show that on average, across a variety of real-world graphics and compute workloads, a 6-entry per-thread register file cache reduces the number of reads and writes to the main register file by 50% and 59%, respectively. We further show that the active thread count can be reduced by a factor of 4 with minimal impact on performance, resulting in a 36% reduction of register file energy.
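
To make the register file caching idea concrete, the following is a minimal Python sketch, not the authors' design. It models one thread's small register file cache in front of the main register file and counts how many accesses still reach the main file. The replacement policy (FIFO), the write-only allocation, the write-back on flush, and the instruction stream are all assumptions chosen for illustration; the abstract specifies only the 6-entry per-thread size. The names `RegisterFileCache`, `run`, and `program` are hypothetical.

```python
from collections import OrderedDict


class RegisterFileCache:
    """Toy model of one thread's register file cache (RFC).

    Assumptions (not taken from the paper): FIFO replacement, allocation
    on writes only, and write-back of every evicted or flushed entry.
    """

    def __init__(self, entries=6):
        self.entries = entries
        self.cached = OrderedDict()  # register name -> None, oldest first
        self.mrf_reads = 0           # reads that reached the main register file
        self.mrf_writes = 0          # writes that reached the main register file

    def read(self, reg):
        # A hit is served by the small cache; a miss goes to the main file.
        if reg not in self.cached:
            self.mrf_reads += 1

    def write(self, reg):
        # Results are written into the cache; overwriting a cached
        # register costs no main-register-file access.
        if reg in self.cached:
            return
        if len(self.cached) == self.entries:
            self.cached.popitem(last=False)  # evict the oldest entry
            self.mrf_writes += 1             # and write it back
        self.cached[reg] = None

    def flush(self):
        # Called when the thread leaves the active set: write back everything.
        self.mrf_writes += len(self.cached)
        self.cached.clear()


def run(program, entries=6):
    """Replay (dst, srcs) instructions and return (mrf_reads, mrf_writes)."""
    rfc = RegisterFileCache(entries)
    for dst, srcs in program:
        for src in srcs:
            rfc.read(src)
        rfc.write(dst)
    rfc.flush()
    return rfc.mrf_reads, rfc.mrf_writes


# Hypothetical instruction stream: each result is consumed soon after it is produced.
program = [
    ("r1", ("r0",)),
    ("r2", ("r1", "r0")),
    ("r1", ("r2", "r1")),
    ("r3", ("r1", "r2")),
]
baseline_reads = sum(len(srcs) for _, srcs in program)  # without a cache, every read hits the main file
baseline_writes = len(program)                          # without a cache, every write hits the main file
reads, writes = run(program)
print(f"main RF reads:  {reads} of {baseline_reads}")
print(f"main RF writes: {writes} of {baseline_writes}")
```

The `flush` call stands in for the interaction with the two-level scheduler described in the abstract: only threads in the small active set hold cache entries, so a thread that moves to the pending set gives its entries back. In this sketch that is modeled as writing every cached value to the main register file, which is a simplifying assumption.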
