
Architectural support for scalable speculative parallelization in shared-memory multiprocessors



Abstract

Speculative parallelization aggressively executes in parallel codes that cannot be fully parallelized by the compiler. Past proposals of hardware schemes have mostly focused on single-chip multiprocessors (CMPs), whose effectiveness is necessarily limited by their small size. Very few schemes have attempted this technique in the context of scalable shared-memory systems.
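The loops this technique targets are those whose memory references cannot be disambiguated at compile time. As a hypothetical illustration (not drawn from the paper's benchmarks), the C loop below indexes an array through subscripts known only at run time, so the compiler must conservatively serialize it even if most iterations happen to be independent; speculative parallelization runs the iterations in parallel and squashes only those that actually violate a cross-iteration dependence.

    /* Hypothetical non-analyzable loop (illustrative only): the subscripts
     * come from index arrays whose contents are unknown at compile time, so
     * the compiler cannot prove that iterations are independent and must
     * assume a serial order. Under speculative parallelization, iterations
     * run in parallel and any iteration that turns out to violate a
     * cross-iteration dependence is squashed and re-executed. */
    void sparse_update(double *a, const int *src, const int *dst, int n)
    {
        for (int i = 0; i < n; i++) {
            /* If dst[i] equals src[j] for some later iteration j, there is
             * a true cross-iteration dependence; statically this cannot be
             * ruled out. */
            a[dst[i]] += a[src[i]];
        }
    }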

In this paper, we present and evaluate a new hardware scheme for scalable speculative parallelization. This design needs relatively simple hardware and is efficiently integrated into a cache-coherent NUMA system. We have designed the scheme in a hierarchical manner that largely abstracts away the internals of the node. We effectively utilize a speculative CMP as the building block for our scheme.

Simulations show that the architecture proposed delivers good speedups at a modest hardware cost. For a set of important non-analyzable scientific loops, we report average speedups of 4.2 for 16 processors. We show that support for per-word speculative state is required by our applications, or else the performance suffers greatly.
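The per-word requirement can be pictured with a small software sketch. The following is a minimal illustration under stated assumptions, not the paper's hardware design: it assumes 64-byte cache lines holding 8 words and keeps one read bit and one write bit per word of speculative state for a task. With only per-line bits, a write by a predecessor task to one word of a line would be flagged as a violation even when the successor task had read a different word of the same line; per-word bits report a violation only when the same word is involved.

    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    /* Minimal software sketch (assumed parameters, not the paper's design):
     * 64-byte cache lines holding 8 words, with one read bit and one write
     * bit per word of speculative state kept for each task. */
    #define WORDS_PER_LINE 8

    typedef struct {
        uint8_t read_bits;   /* bit w set: task speculatively read word w  */
        uint8_t write_bits;  /* bit w set: task speculatively wrote word w */
    } line_state_t;

    /* Record a speculative access by a task to word w of this line. */
    static void record_access(line_state_t *s, int w, bool is_write)
    {
        if (is_write) s->write_bits |= (uint8_t)(1u << w);
        else          s->read_bits  |= (uint8_t)(1u << w);
    }

    /* A predecessor task writes word w of the line: with per-word state,
     * the successor is squashed only if it already read that same word.
     * With per-line state we could only test (succ->read_bits != 0), which
     * also fires when the successor read a *different* word of the line --
     * a false violation caused by false sharing. */
    static bool raw_violation_per_word(const line_state_t *succ, int w)
    {
        return (succ->read_bits >> w) & 1u;
    }

    int main(void)
    {
        line_state_t succ = {0, 0};
        record_access(&succ, 3, false);          /* successor reads word 3 */

        /* Predecessor writes word 5: same line, different word. */
        printf("write word 5 -> %s\n",
               raw_violation_per_word(&succ, 5) ? "squash" : "no squash");

        /* Predecessor writes word 3: the word the successor already read. */
        printf("write word 3 -> %s\n",
               raw_violation_per_word(&succ, 3) ? "squash" : "no squash");
        return 0;
    }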

