This paper presents a novel technique for cycle-accurate simulation of the Central Processing Unit (CPU) of a modern superscalar processor, the UltraSPARC III Cu. Rather than fully modelling the CPU microarchitecture, as is traditional, the technique adds a module to an existing fetch-decode-execute style CPU simulator. It is also suitable for accurate SMP modelling. The main functions of the module are the simulation of instruction grouping, register interlocks and the store buffer. Its simple table-driven implementation makes it easy to modify when exploring microarchitectural variations. The technique incurs a 40% loss of simulation speed, compared with the tenfold or greater slowdown of fully implementing the detailed microarchitecture. The technique is validated against an actual UltraSPARC III Cu processor and achieves high levels of accuracy over a range of scientific benchmarks.
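To illustrate the idea of a table-driven timing module attached to a functional simulator, the following is a minimal sketch and not the authors' implementation. It assumes a hypothetical `TimingModule.issue` hook that the fetch-decode-execute loop would call once per executed instruction. The opcode table, pipeline capacities, latencies and group width are placeholder values chosen for illustration, not UltraSPARC III Cu figures, and the store buffer is omitted.

```python
from dataclasses import dataclass, field

# Hypothetical table: opcode class -> (pipeline name, result latency in cycles).
# All numbers are illustrative placeholders, not UltraSPARC III Cu values.
OP_TABLE = {
    "alu": ("int", 1),
    "load": ("mem", 2),
    "fadd": ("fp", 4),
}

# Hypothetical number of instructions each pipeline accepts per group.
PIPE_SLOTS = {"int": 2, "mem": 1, "fp": 1}

# Hypothetical maximum number of instructions per issue group.
GROUP_WIDTH = 4


@dataclass
class TimingModule:
    """Tracks issue cycles for an in-order stream of executed instructions."""

    cycle: int = 0                              # cycle of the current group
    issued: int = 0                             # instructions in current group
    used: dict = field(default_factory=dict)    # pipeline -> slots used
    ready: dict = field(default_factory=dict)   # register -> cycle value is ready

    def _new_group(self, cycle):
        """Close the current group and open an empty one at `cycle`."""
        self.cycle = cycle
        self.issued = 0
        self.used = {}

    def issue(self, op, srcs=(), dst=None):
        """Account for one instruction and return the cycle it issues in."""
        pipe, latency = OP_TABLE[op]

        # Register interlock: wait until every source register is ready.
        earliest = max([self.ready.get(r, 0) for r in srcs], default=0)
        if earliest > self.cycle:
            self._new_group(earliest)

        # Grouping: a full group or a busy pipeline pushes us to the next cycle.
        if (self.issued >= GROUP_WIDTH
                or self.used.get(pipe, 0) >= PIPE_SLOTS[pipe]):
            self._new_group(self.cycle + 1)

        self.issued += 1
        self.used[pipe] = self.used.get(pipe, 0) + 1
        if dst is not None:
            self.ready[dst] = self.cycle + latency
        return self.cycle
```

In this sketch, exploring a microarchitectural variation amounts to editing `OP_TABLE`, `PIPE_SLOTS` or `GROUP_WIDTH`; the functional simulator itself is untouched and only pays for one `issue` call per instruction.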