A parallel optimization method for stencil computation on the domain that is bigger than memory capacity of GPUs

Abstract

The problem size of stencil computation on a GPU cluster is limited by the memory capacity of the GPUs, which is typically smaller than that of host memory. This paper proposes and evaluates a parallel optimization method for stencil computation that targets scalability, problem sizes larger than the GPU memory capacity, and high performance. It uses 2D decomposition to achieve scalability across GPUs. It then allows a larger sub-domain on each GPU to support a larger overall problem size. It applies temporal blocking to improve the memory access locality of the stencil computation, and it reuses earlier results to avoid redundant computation and so obtain higher performance. An evaluation of stencil simulation on a 3D domain shows that the new method achieves good scalability for 7-point and 19-point stencils on GPUs, 1.45 times and 1.72 times better than other methods on average.
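The temporal blocking idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a 1D 3-point averaging stencil instead of the 3D 7-point and 19-point stencils, runs on the CPU in plain Python, and recomputes halo cells redundantly rather than reusing earlier results as the paper proposes. The names `step`, `naive`, `temporal_blocked`, `block`, and `t_block` are hypothetical. Each block is loaded once together with a halo of `t` cells per side, advanced `t` time steps locally, and only its interior is written back.

```python
def step(u):
    """One Jacobi sweep of a 3-point averaging stencil; the end cells stay fixed."""
    if len(u) < 3:
        return list(u)
    inner = [(u[i - 1] + u[i] + u[i + 1]) / 3.0 for i in range(1, len(u) - 1)]
    return [u[0]] + inner + [u[-1]]


def naive(u, steps):
    """Reference version: sweep the whole domain once per time step."""
    for _ in range(steps):
        u = step(u)
    return u


def temporal_blocked(u, steps, block, t_block):
    """Advance the domain block by block, up to t_block time steps per load."""
    n = len(u)
    done = 0
    while done < steps:
        t = min(t_block, steps - done)
        out = list(u)
        for lo in range(0, n, block):
            hi = min(lo + block, n)
            # Load the block plus a halo of t cells per side, clipped at the domain edge.
            a, b = max(lo - t, 0), min(hi + t, n)
            local = u[a:b]
            for _ in range(t):
                local = step(local)
            # Cells closer than t to a cut edge are stale; keep only the block interior.
            out[lo:hi] = local[lo - a:hi - a]
        u = out
        done += t
    return u


if __name__ == "__main__":
    u0 = [float(i * i % 7) for i in range(40)]
    ref = naive(u0, 10)
    blk = temporal_blocked(u0, 10, block=8, t_block=3)
    print(max(abs(x - y) for x, y in zip(ref, blk)))
```

In this sketch, `block` stands in for the part of the domain that fits in GPU memory at once. A larger `t_block` means fewer loads per block but a wider halo, so more redundant work, which is the cost the paper's result reuse is meant to remove.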

Bibliographic details

Source
IEEE International Conference on Cluster Computing, 2013, pp. 1-8 (8 pages)
Authors
Jin Guanghao; Endo Toshio; Matsuoka Satoshi
Original format: PDF
Keywords
GPU cluster; memory capacity; parallel optimization; stencil computation; temporal blocking

Similar literature

1. Evaluating optimizations that reduce global memory accesses of stencil computations in GPGPUs [J]. Thiago Carrijo Nasciutti, Jairo Panetta, Pedro Pais Lopes. Concurrency, practice and experience. 2019, no. 18
2. Accelerating stencil computation on GPGPU by novel mapping method between the global memory and the shared memory [J]. Mo Tieqiang, Li Renfa. Computing and informatics. 2018, no. 3
3. A parallel optimization method for stencil computation on the domain that is bigger than memory capacity of GPUs [C]. Jin Guanghao, Endo Toshio, Matsuoka Satoshi. IEEE International Conference on Cluster Computing. 2013
4. Optimization of Stencil Computations on GPUs [D]. Rawat, Prashant Singh. 2018
5. Novel Hybrid GPU-CPU Implementation of Parallelized Monte Carlo Parametric Expectation Maximization Estimation Method for Population Pharmacokinetic Data Analysis [O]. C. M. Ng. 2013
6. 3.5D blocking optimization for stencil computations on modern CPUs and GPUs [O]. Anthony Nguyen, Nadathur Satish, Jatin Chhugani, 2013
