Journal: Computer Engineering and Design (《计算机工程与设计》)

Research and Implementation of the PP Problem in the FMM Algorithm on GPU

         

Abstract

Current GPU implementations of the PP problem in the fast multipole method (FMM) suffer from shortcomings such as load imbalance and a computational scale limited by the size of video memory. To address them, a new implementation based on the Compute Unified Device Architecture (CUDA) platform is presented. It uses the Box as the unit of parallelism, allocates a buffer in main memory, and pipelines the computation across multiple threads, so that it suits the heterogeneous architecture formed by the CPU and GPU and makes full use of the high parallelism of the CUDA programming model to accelerate the PP problem. Experimental results show that CUDA acceleration markedly reduces the computation time of the PP problem and improves the efficiency of the whole FMM simulation, making the method suitable for real-time simulation of various N-body problems.
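
The abstract gives no code, so the following is only a minimal sketch of the two ideas it names: treating each Box as the unit of work, and staging Boxes in batches that fit a fixed-size buffer so the problem size is not tied to video memory. It assumes that PP denotes the direct particle-to-particle near-field sum with a 1/r kernel. Plain Python stands in for the CUDA kernel, the multi-threaded pipeline is omitted, and every name (`pp_box`, `make_batches`, `pp_near_field`, `capacity`) is hypothetical rather than taken from the paper.

```python
import math


def pp_box(targets, sources):
    """Direct 1/r potential at each target from all sources.

    Coincident points (r == 0) are skipped, which also removes self-interaction.
    """
    out = []
    for (tx, ty, tz) in targets:
        phi = 0.0
        for (sx, sy, sz, q) in sources:
            r = math.sqrt((tx - sx) ** 2 + (ty - sy) ** 2 + (tz - sz) ** 2)
            if r > 0.0:
                phi += q / r
        out.append(phi)
    return out


def make_batches(boxes, neighbors, capacity):
    """Greedily pack boxes into batches whose staged particle count fits `capacity`.

    `capacity` is an assumed stand-in for the buffer (or video memory) size,
    measured in particles. A single box larger than `capacity` still gets
    its own batch.
    """
    batches, current, used = [], [], 0
    for b in boxes:
        # Particles to stage for box b: its own plus those of its neighbors.
        cost = len(boxes[b]) + sum(len(boxes[n]) for n in neighbors[b])
        if current and used + cost > capacity:
            batches.append(current)
            current, used = [], 0
        current.append(b)
        used += cost
    if current:
        batches.append(current)
    return batches


def pp_near_field(boxes, neighbors, capacity):
    """Near-field potentials per box, processed one buffer-sized batch at a time."""
    phi = {}
    for batch in make_batches(boxes, neighbors, capacity):
        for b in batch:  # on a GPU, each box here could map to one thread block
            sources = list(boxes[b])
            for n in neighbors[b]:
                sources.extend(boxes[n])
            targets = [(x, y, z) for (x, y, z, _q) in boxes[b]]
            phi[b] = pp_box(targets, sources)
    return phi


if __name__ == "__main__":
    # Two boxes on a line; each particle is (x, y, z, charge).
    boxes = {0: [(0.0, 0.0, 0.0, 1.0), (1.0, 0.0, 0.0, 2.0)],
             1: [(3.0, 0.0, 0.0, 1.0)]}
    neighbors = {0: [1], 1: [0]}
    print(make_batches(boxes, neighbors, capacity=3))
    print(pp_near_field(boxes, neighbors, capacity=3))
```

In this sketch the batch size changes only how much data is staged at once, not the result, which is the property that lets the problem size exceed the buffer. Assigning one Box per work unit keeps the work granularity tied to the tree structure, although Boxes with very different particle counts would still need further balancing.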
