Offloading strategies for Stencil kernels on the KNC Xeon Phi architecture: Accuracy versus performance

Hernandez Mario; Cebrian Juan M.; Cecilia Jose M.; Garcia Jose M.

Abstract

The ever-increasing computational requirements of HPC and service provider applications are becoming a great challenge for hardware and software designers. These requirements are reaching levels where isolated development in either field is not enough to meet the challenge. A holistic view of computational thinking is therefore the only way to succeed in real scenarios. However, this is not a trivial task, as it requires, among other things, hardware-software codesign. On the hardware side, most high-throughput computers are designed for heterogeneity, where accelerators (e.g. Graphics Processing Units (GPUs), Field-Programmable Gate Arrays (FPGAs), etc.) are connected to the host CPUs through a high-bandwidth bus such as PCI-Express. Applications, whether via programmers, compilers, or the runtime, must orchestrate data movement, synchronization, and so on among devices with different compute and memory capabilities. This increases programming complexity and may reduce overall application performance. This article evaluates different offloading strategies to leverage heterogeneous systems based on several cards with first-generation Xeon Phi coprocessors (Knights Corner). We use an 11-point 3-D Stencil kernel that models heat dissipation as a case study. Our results reveal substantial performance improvements when using several accelerator cards. Additionally, we show that computing an approximate result by reducing the communication overhead can yield 23% performance gains for double-precision data sets.
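The accuracy-versus-performance trade-off described in the abstract can be illustrated with a small sketch. The abstract does not give the shape of the 11-point stencil or say how communication is reduced, so everything below is an assumption made for illustration: a simpler 7-point heat stencil, two z-slabs standing in for two coprocessor cards, and a hypothetical `exchange_every` parameter that skips halo exchanges between time steps. It is not the authors' implementation.

```python
# Illustrative sketch only. Assumptions (not from the article): a 7-point
# stencil instead of the 11-point kernel, two slabs instead of real
# coprocessor cards, and halo exchange skipping as the "reduced communication".

def clone(planes):
    """Deep-copy a list of z-planes, each a list of rows of floats."""
    return [[row[:] for row in plane] for plane in planes]

def make_grid(nx, ny, nz, hot=100.0):
    """Grid indexed g[z][y][x]: cold everywhere except one hot centre cell."""
    g = [[[0.0] * nx for _ in range(ny)] for _ in range(nz)]
    g[nz // 2][ny // 2][nx // 2] = hot
    return g

def sweep(src, alpha=0.1):
    """One Jacobi time step on interior cells; outermost cells stay fixed."""
    nz, ny, nx = len(src), len(src[0]), len(src[0][0])
    dst = clone(src)
    for z in range(1, nz - 1):
        for y in range(1, ny - 1):
            for x in range(1, nx - 1):
                c = src[z][y][x]
                nb = (src[z - 1][y][x] + src[z + 1][y][x]
                      + src[z][y - 1][x] + src[z][y + 1][x]
                      + src[z][y][x - 1] + src[z][y][x + 1])
                dst[z][y][x] = c + alpha * (nb - 6.0 * c)
    return dst

def run_split(grid, steps, exchange_every, alpha=0.1):
    """Run the stencil on two z-slabs, refreshing halo planes only every
    `exchange_every` steps. Returns (merged grid, number of exchanges)."""
    h = len(grid) // 2
    a = clone(grid[:h + 1])   # planes 0..h; the last plane is a halo copy
    b = clone(grid[h - 1:])   # planes h-1..end; the first plane is a halo copy
    exchanges = 0
    for step in range(1, steps + 1):
        a, b = sweep(a, alpha), sweep(b, alpha)
        if step % exchange_every == 0:
            a[h] = [row[:] for row in b[1]]      # slab b's first owned plane
            b[0] = [row[:] for row in a[h - 1]]  # slab a's last owned plane
            exchanges += 1
    return a[:h] + b[1:], exchanges

def max_abs_diff(p, q):
    """Largest absolute cell-wise difference between two grids."""
    return max(abs(u - v)
               for pp, qp in zip(p, q)
               for pr, qr in zip(pp, qp)
               for u, v in zip(pr, qr))

if __name__ == "__main__":
    grid = make_grid(8, 8, 8)
    reference = grid
    for _ in range(8):
        reference = sweep(reference)
    for k in (1, 2, 4):
        result, n = run_split(grid, 8, k)
        print(f"exchange_every={k}: {n} exchanges, "
              f"max error={max_abs_diff(reference, result):.6f}")
```

With `exchange_every=1` the split run reproduces the single-grid result. Larger values exchange less often and let the halo planes go stale, which is where the approximation error in this sketch comes from.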


Bibliographic record

Source: International Journal of High Performance Computing Applications, 2020, Issue 2, pp. 199-207 (9 pages)
Authors: Hernandez Mario; Cebrian Juan M.; Cecilia Jose M.; Garcia Jose M.
Affiliations:

Univ Autonoma Guerrero Fac Ingn Guerrero Mexico;

BSC Barcelona Spain;

Univ Catolica San Antonio Murcia Dept Ingn Informat Murcia Spain;

Univ Murcia Dept Ingn & Tecnol Comp Murcia Spain;

Indexed in: Science Citation Index (SCI); Engineering Index (EI)
Original format: PDF
Language: English
Keywords: Offloading computation; Stencil codes; approximate computing; heterogeneous computing

