Journal: Geoscientific Model Development Discussions

Parallel I∕O in Flexible Modelling System (FMS) and Modular Ocean Model 5 (MOM5)



Abstract

We present an implementation of parallel I∕O in the Modular Ocean Model (MOM), a numerical ocean model used for climate forecasting, and determine its optimal performance over a range of tuning parameters. Our implementation uses the parallel API of the netCDF library, and we investigate the potential bottlenecks associated with the model configuration, the netCDF implementation, the underpinning MPI-IO library and its implementations, and the Lustre filesystem. We investigate the performance of a global 0.25° resolution model using 240 and 960 CPUs. The best performance is observed when we limit the number of contiguous I∕O domains on each compute node and assign one MPI rank to aggregate and write the data from each node, while ensuring that all nodes participate in writing this data to our Lustre filesystem. These best-performance configurations are then applied to a higher-resolution 0.1° global model using 720 and 1440 CPUs, where we observe even greater performance improvements. In all cases, the tuned parallel I∕O implementation achieves much faster write speeds than serial single-file I∕O, with write speeds up to 60 times faster at higher resolutions. Under the constraints outlined above, we observe that performance scales as the number of compute nodes and I∕O aggregators is increased, ensuring the continued scalability of the I∕O-intensive MOM5 model runs that will be used in our next-generation higher-resolution simulations.
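The "one aggregating rank per node, all nodes writing" layout described in the abstract can be illustrated with a small sketch. It is not the authors' code. The function names and the figure of 16 ranks per node are hypothetical, and it assumes MPI ranks fill the nodes in consecutive order. The hint keys (`romio_cb_write`, `cb_config_list`, `cb_nodes`, `striping_factor`, `striping_unit`) are standard ROMIO/MPI-IO hints, but the abstract does not say which hints or values were actually used. Matching the Lustre stripe count to the node count is also an assumption.

```python
def aggregator_ranks(n_ranks, ranks_per_node):
    """Return the first rank on each node (assumes ranks fill nodes in order)."""
    if n_ranks <= 0 or ranks_per_node <= 0:
        raise ValueError("n_ranks and ranks_per_node must be positive")
    return list(range(0, n_ranks, ranks_per_node))


def collective_io_hints(n_ranks, ranks_per_node, stripe_size_bytes=1048576):
    """Build MPI-IO hint strings for one write aggregator per compute node.

    Assumption: the stripe count is set equal to the number of nodes so that
    every aggregator can write to the filesystem in parallel.
    """
    n_nodes = len(aggregator_ranks(n_ranks, ranks_per_node))
    return {
        "romio_cb_write": "enable",   # force collective buffering on writes
        "cb_config_list": "*:1",      # at most one aggregator per node
        "cb_nodes": str(n_nodes),     # total number of aggregators
        "striping_factor": str(n_nodes),          # Lustre stripe count (assumed)
        "striping_unit": str(stripe_size_bytes),  # Lustre stripe size in bytes
    }


if __name__ == "__main__":
    # Hypothetical layout: 240 ranks, 16 ranks per node.
    print(aggregator_ranks(240, 16))
    for key, value in collective_io_hints(240, 16).items():
        print(f"{key} = {value}")
```

In a real run, these key/value pairs would typically be set on an `MPI_Info` object and passed to the parallel netCDF file-creation call. Whether a given MPI-IO implementation honours each hint depends on the library and filesystem.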
