Conference: International Symposium on Performance Analysis of Systems and Software

Evaluating Performance Tradeoffs on the Radeon Open Compute Platform

Abstract

GPUs have been shown to deliver impressive computing performance, while also providing high energy efficiency, across a wide range of high-performance and embedded system workloads. However, limited support for efficient communication and synchronization between the CPU and the GPU impacts our ability to fully exploit the benefits of heterogeneous systems. Recently, the Heterogeneous System Architecture (HSA) was introduced to address these issues with synchronization and communication, but given the low-level nature of HSA, it was not easily adopted by the broader programming community. In 2016, AMD described the Radeon Open Compute (ROC) platform that brings high-level programming frameworks such as OpenCL, HC++, and HIP to end users. These high-level programming frameworks offer a simpler programming experience by wrapping complex HSA APIs, while still delivering the power of HSA. To date, there has been little evaluation of the potential performance benefits and trade-offs of leveraging the ROC platform. In this work, we evaluate the performance of the ROC platform using the Hetero-Mark and DNNMark benchmark suites. Equipped with Hetero-Mark, we compare the performance of different programming frameworks, including OpenCL, HC++, and HIP on both integrated APUs and discrete GPUs. We also present three new CPU-GPU collaborative patterns and employ three new benchmarks to evaluate system-level atomics. With DNNMark and a new DNN Face Detection benchmark, we evaluate the performance of ROC libraries including rocBLAS and MIOpen. We also provide guidance on best practices to programmers when developing applications leveraging the ROC platform.
