Conference: 2013 IEEE 31st International Conference on Computer Design

FastLanes: An FPGA accelerated GPU microarchitecture simulator


Abstract

Graphics Processing Units (GPUs) have emerged as a general-purpose computing platform that attracts significant research effort. Currently, GPU architecture research relies on time-consuming software simulation to evaluate microarchitectural innovations. In this paper, we propose FastLanes, an FPGA-based simulator for a generic GPU microarchitecture, to enable hardware-accelerated simulation. FastLanes consists of a functional model and a timing model, both implemented on the FPGA. The functional model implements the full functionality of one GPU multiprocessor and emulates multiple multiprocessors via time-division multiplexing. We develop a hybrid implementation strategy in which some GPU logic is mapped directly to the FPGA while the remaining logic is simulated by reusing the same FPGA logic. A corresponding context-shifting mechanism moves the execution states of threads between the FPGA and external on-board memory. This mechanism makes it possible to simulate hundreds of GPU cores on a single FPGA evaluation board. Driven by the functional simulation results, the timing model takes the detailed configuration of the GPU microarchitecture into account to derive performance estimates. A compiler tool-chain is also developed to allow NVIDIA GPU binaries to execute on FastLanes. Experimental results show that FastLanes outperforms its software equivalent by up to two orders of magnitude.
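The time-division multiplexing and context shifting described in the abstract can be illustrated with a small software sketch. This is not the FastLanes implementation: the names `Context`, `PhysicalCore`, and `simulate`, the toy one-operation instruction set, and the round-robin schedule with a fixed time slice are all assumptions made for illustration. The idea shown is only that one physical core can stand in for many logical multiprocessors if their execution state is swapped to and from an external store.

```python
from dataclasses import dataclass, field


@dataclass
class Context:
    """Execution state of one logical multiprocessor (hypothetical layout)."""
    pc: int = 0
    regs: list = field(default_factory=lambda: [0] * 4)


class PhysicalCore:
    """Stands in for the single multiprocessor mapped onto the FPGA."""

    def __init__(self):
        self.ctx = Context()

    def load(self, ctx):
        # Context shift in: copy state from external memory into the core.
        self.ctx = Context(ctx.pc, list(ctx.regs))

    def store(self):
        # Context shift out: copy state from the core back to external memory.
        return Context(self.ctx.pc, list(self.ctx.regs))

    def step(self, program):
        # Toy instruction (dest, imm): regs[dest] += imm, then advance the PC.
        dest, imm = program[self.ctx.pc]
        self.ctx.regs[dest] += imm
        self.ctx.pc += 1


def simulate(program, num_logical, slice_len=2):
    """Round-robin num_logical multiprocessors over one physical core."""
    external_memory = [Context() for _ in range(num_logical)]
    core = PhysicalCore()
    context_shifts = 0
    while any(c.pc < len(program) for c in external_memory):
        for i in range(num_logical):
            if external_memory[i].pc >= len(program):
                continue  # this logical multiprocessor has finished
            core.load(external_memory[i])
            for _ in range(slice_len):
                if core.ctx.pc >= len(program):
                    break
                core.step(program)
            external_memory[i] = core.store()
            context_shifts += 1
    return external_memory, context_shifts


if __name__ == "__main__":
    states, shifts = simulate([(0, 1), (1, 2), (0, 3)], num_logical=3)
    print(shifts, [s.regs for s in states])
```

In this sketch the cost of emulating more multiprocessors appears as extra load/store pairs rather than extra hardware, which is the trade-off the abstract attributes to the context-shifting mechanism.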
