Conference: 2015 International Conference on Parallel Architecture and Compilation

RC3: Consistency Directed Cache Coherence for x86-64 with RC Extensions

Abstract

The recent convergence towards programming-language-based memory consistency models has sparked renewed interest in lazy cache coherence protocols. These protocols exploit synchronization information by enforcing coherence only at synchronization boundaries via self-invalidation. As a result, such protocols do not require sharer tracking, which benefits scalability. On the downside, such protocols are only readily applicable to a restricted set of consistency models, such as Release Consistency (RC), which expose synchronization information explicitly. In particular, existing architectures with stricter consistency models (such as x86-64) cannot readily make use of lazy coherence protocols without either changing the architecture's consistency model to (a variant of) RC at the expense of backwards compatibility, or adapting the protocol to satisfy the stricter consistency model, thereby failing to benefit from synchronization information. We show an approach for the x86-64 architecture that is a compromise between the two. First, we propose a mechanism to convey synchronization information via a simple ISA extension, while retaining backwards compatibility with legacy code and older microarchitectures. Second, we propose RC3, a scalable hardware cache coherence protocol for RCtso, the resulting memory consistency model. RC3 does not track sharers and relies on self-invalidation on acquires. To satisfy RCtso efficiently, the protocol reduces self-invalidations transitively using per-L1 timestamps only. RC3 outperforms a conventional lazy RC protocol by 12%, achieving performance comparable to a MESI directory protocol for RC-optimized programs. RC3's storage overhead per cache line scales logarithmically with increasing core count, and RC3 reduces on-chip coherence storage overheads by 45% compared to a related approach specifically targeting TSO.
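The general idea of lazy coherence via self-invalidation, as described in the abstract, can be illustrated with a toy software model. This is a minimal sketch under stated assumptions, not the RC3 protocol: it omits RCtso ordering and the per-L1 timestamp mechanism entirely, and the names `SharedMemory`, `L1Cache`, `acquire`, and `release` are hypothetical, chosen only for illustration. It shows that a shared level with no sharer list can still deliver fresh data, provided each private cache writes back on release and drops its clean lines on acquire.

```python
class SharedMemory:
    """Hypothetical stand-in for the shared cache level; it keeps no sharer list."""

    def __init__(self):
        self.data = {}


class L1Cache:
    """Hypothetical private cache that relies on self-invalidation, not on invalidation messages."""

    def __init__(self, shared):
        self.shared = shared
        self.lines = {}      # addr -> locally cached value
        self.dirty = set()   # addresses written since the last release

    def load(self, addr):
        # A miss fetches from the shared level; a hit may return a stale value.
        if addr not in self.lines:
            self.lines[addr] = self.shared.data.get(addr, 0)
        return self.lines[addr]

    def store(self, addr, value):
        # Stores stay local until the next release.
        self.lines[addr] = value
        self.dirty.add(addr)

    def release(self):
        # Publish all local writes to the shared level.
        for addr in self.dirty:
            self.shared.data[addr] = self.lines[addr]
        self.dirty.clear()

    def acquire(self):
        # Self-invalidate: keep only lines holding unpublished local writes.
        self.lines = {a: v for a, v in self.lines.items() if a in self.dirty}


shared = SharedMemory()
producer, consumer = L1Cache(shared), L1Cache(shared)

consumer.load("x")            # consumer caches the initial value
producer.store("x", 42)
producer.release()            # the write becomes visible at the shared level

stale = consumer.load("x")    # no acquire yet, so the old cached copy is returned
consumer.acquire()            # synchronization boundary: clean lines are dropped
fresh = consumer.load("x")    # the miss refetches the published value
```

In this model the consumer pays for every acquire by discarding all of its clean lines, including ones that were never stale. That cost is the kind of overhead the abstract says RC3 reduces with per-L1 timestamps; how it does so is not modeled here.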
