Conference: IEEE International Conference on Cluster Computing

A parallel optimization method for stencil computation on the domain that is bigger than memory capacity of GPUs

Abstract

The problem size of a stencil computation on a GPU cluster is limited by the memory capacity of the GPUs, which is typically smaller than that of host memory. This paper proposes and evaluates a parallel optimization method for stencil computation that targets three goals: scalability, problem sizes larger than the memory capacity of the GPUs, and high performance. The method uses 2D decomposition to scale across GPUs, and it enables a larger sub-domain on each GPU so that the overall problem size can grow. It applies temporal blocking to improve the memory access locality of the stencil computation and reuses earlier results to avoid redundant computation, which yields higher performance. An evaluation of stencil simulation on a 3D domain shows that the new method achieves good scalability for 7-point and 19-point stencils on GPUs, on average 1.45 times and 1.72 times better than other methods, respectively.
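
The temporal blocking idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes a 1D 3-point averaging stencil in place of the 3D 7-point and 19-point stencils, plain Python lists in place of host and GPU memory, and hypothetical names (`step`, `naive`, `temporally_blocked`, `chunk`, `tb`). Each chunk is loaded once with a halo of `tb` cells per side, advanced `tb` time steps locally, and only its interior is written back. The halo is recomputed redundantly here; the reuse of earlier results that the abstract describes is not modeled.

```python
def step(u):
    """One Jacobi sweep of a 3-point averaging stencil; both end values stay fixed."""
    out = u[:]
    for i in range(1, len(u) - 1):
        out[i] = (u[i - 1] + u[i] + u[i + 1]) / 3.0
    return out


def naive(u, steps):
    """Reference version: sweep the whole domain once per time step."""
    for _ in range(steps):
        u = step(u)
    return u


def temporally_blocked(u, steps, chunk, tb):
    """Advance u by `steps` sweeps, loading `chunk` cells at a time
    and running up to `tb` sweeps per load (assumed stand-in for a GPU pass)."""
    n = len(u)
    done = 0
    while done < steps:
        t = min(tb, steps - done)          # sweeps in this temporal block
        new = u[:]
        for start in range(0, n, chunk):
            end = min(start + chunk, n)
            lo = max(start - t, 0)         # halo of t cells on each side,
            hi = min(end + t, n)           # clipped at the domain boundary
            local = u[lo:hi]               # stands in for a host-to-device copy
            for _ in range(t):
                local = step(local)        # halo edges go stale by one cell per sweep
            # after t sweeps only the interior [start, end) is still valid
            new[start:end] = local[start - lo:end - lo]
        u = new
        done += t
    return u
```

With a domain of 50 cells, `chunk=8`, and `tb=4`, the blocked version moves each chunk once per four sweeps instead of once per sweep, at the cost of recomputing up to eight halo cells per chunk in each block.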
