Optimizing the cache performance of non-numeric applications.

Abstract

The latency of accessing instructions and data from the memory subsystem is an increasingly crucial performance bottleneck in modern computer systems. While cache hierarchies are an important first step, they alone cannot solve the problem. Further, though a variety of latency-hiding techniques have been proposed, their success has been largely limited to regular, numeric applications. Few promising latency-hiding techniques that can handle irregular, non-numeric codes have been proposed, in spite of the popularity of such codes in computer applications.

This dissertation investigates hardware and software techniques for coping with the instruction-access latency and data-access latency in non-numeric applications. To deal with instruction-access latency, we propose cooperative instruction prefetching, a novel technique which significantly outperforms state-of-the-art instruction prefetching schemes by being able to prefetch more aggressively and much further ahead of time while at the same time substantially reducing the amount of useless prefetches.

To cope with data-access latency, we investigate three complementary techniques. First, we study how to use compiler-inserted data prefetching to tolerate the latency of accessing pointer-based data structures. To schedule prefetches early enough, we design three prefetching schemes to overcome the pointer-chasing problem associated with these data structures, and we automate them in an optimizing research compiler. Second, we study how to safely perform an important class of locality optimizations, namely dynamic data layout optimizations, in non-numeric codes. Specifically, we propose the use of an architectural mechanism called memory forwarding which can guarantee the safety of data relocation, thereby enabling many aggressive data layout optimizations (which also facilitate prefetching) that cannot be safely performed using current hardware or compiler technology. Finally, in an effort to minimize the overheads of latency tolerance techniques, we propose new cache miss prediction techniques based on correlation profiling. By correlating cache miss behaviors with dynamic execution contexts, these techniques can accurately isolate dynamic miss instances and so pay the latency tolerance overhead only when there would have been cache misses.

Detailed design considerations and experimental evaluations are provided for our proposed techniques, confirming them as viable solutions for coping with memory latency in non-numeric applications.
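
The pointer-chasing problem mentioned in the abstract can be illustrated with a short sketch. This is not taken from the dissertation and is not claimed to be one of its three schemes. It is a generic example of software prefetching on a linked list, assuming a GCC- or Clang-compatible compiler that provides `__builtin_prefetch`. The `struct node` layout and the `sum_list` function are hypothetical names chosen for illustration.

```c
#include <stddef.h>

/* Hypothetical list node; the layout is an assumption for this sketch. */
struct node {
    int value;
    struct node *next;
};

/*
 * Sum the values in a singly linked list, issuing a prefetch for the
 * next node before doing the work on the current one.
 *
 * __builtin_prefetch(addr, rw, locality) is a GCC/Clang builtin:
 * rw = 0 requests a prefetch for reading, locality = 3 asks to keep
 * the line in all cache levels. A data prefetch of an invalid address
 * (such as NULL at the end of the list) does not fault.
 */
long sum_list(const struct node *head)
{
    long total = 0;
    for (const struct node *p = head; p != NULL; p = p->next) {
        __builtin_prefetch(p->next, 0, 3);
        total += p->value;
    }
    return total;
}
```

The sketch also shows why pointer chasing is hard to hide: the address of the node two steps ahead is unknown until `p->next` has itself been loaded, so a prefetch issued this way can overlap at most one node's worth of work with the memory access.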

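Bibliographic record

  • Author: Luk, Chi-Keung
  • Author affiliation: University of Toronto (Canada)
  • Degree-granting institution: University of Toronto (Canada)
  • Subject: Computer Science
  • Degree: Ph.D.
  • Year: 2000
  • Pages: 217
  • Original format: PDF
  • Language of text: English
  • Chinese Library Classification: Automation technology, computer technology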
