Conference: International Symposium on Performance Analysis of Systems and Software

Low-Overhead Dynamic Instruction Mix Generation Using Hybrid Basic Block Profiling


Abstract

Dynamic instruction mixes form an important part of the toolkits of performance tuners, compiler writers, and CPU architects. Instruction mixes are traditionally generated using software instrumentation, an accurate yet slow method that is normally limited to user-mode code. We present a new method for generating instruction mixes using the Performance Monitoring Unit (PMU) of the CPU. Compared to software instrumentation, it has very low overhead, extends coverage to kernel-mode execution, and causes only a very modest decrease in accuracy. To achieve this level of accuracy, we develop a new PMU-based data collection method, Hybrid Basic Block Profiling (HBBP). HBBP uses simple machine learning techniques to choose, on a per-basic-block basis, between data from two conventional sampling methods, Event Based Sampling (EBS) and Last Branch Records (LBR). We implement a profiling tool based on HBBP and report on experiments with the industry-standard SPEC CPU2006 suite, as well as with two large-scale scientific codes. On the tested benchmarks, we observe a runtime improvement of up to 76x compared to software instrumentation, reducing wait times from hours to minutes. Instruction attribution errors average 2.1%. The results indicate that HBBP provides a favorable tradeoff between accuracy and speed, making it a suitable candidate for use in production environments.
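To make the idea concrete, here is a minimal sketch of how an instruction mix could be assembled from per-basic-block execution counts once one count source has been chosen per block. Everything in it is an assumption for illustration: the block names, the opcode lists, the sample counts, and above all the `choose_count` rule, which is an arbitrary length threshold standing in for the learned selector. The abstract says only that HBBP uses "simple machine learning techniques" and does not describe the model, so this should not be read as the authors' method.

```python
from collections import Counter

# Hypothetical static view of two basic blocks: the opcodes each one contains.
BLOCKS = {
    "bb0": ["mov", "add", "cmp", "jne"],
    "bb1": ["mov", "mul", "jmp"],
}

def choose_count(ebs_count, lbr_count, block_len, threshold=3):
    """Pick one execution-count estimate for a block.

    Placeholder rule (an assumption, not the paper's model): use the
    LBR-derived count for blocks of at most `threshold` instructions,
    and the EBS-derived count otherwise.
    """
    return lbr_count if block_len <= threshold else ebs_count

def instruction_mix(blocks, ebs, lbr):
    """Weight each block's opcodes by its chosen execution count."""
    mix = Counter()
    for name, opcodes in blocks.items():
        count = choose_count(ebs[name], lbr[name], len(opcodes))
        for op in opcodes:
            mix[op] += count
    return mix

# Made-up per-block execution-count estimates from the two sampling methods.
ebs = {"bb0": 100, "bb1": 40}
lbr = {"bb0": 90, "bb1": 50}

mix = instruction_mix(BLOCKS, ebs, lbr)
total = sum(mix.values())
for op, n in mix.most_common():
    print(f"{op:4s} {n:5d} {100.0 * n / total:5.1f}%")
```

The point of the sketch is the structure: each basic block has a fixed opcode list, so the dynamic mix reduces to a sum of those lists weighted by per-block execution counts, and any per-block choice between EBS and LBR estimates feeds directly into the result.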
