SIAM Journal on Computing

Analysis of delays caused by local synchronization


Abstract

Synchronization is often necessary in parallel computing, but it can create delays whenever the receiving processor is idle, waiting for the information to arrive. This is especially true for barrier, or global, synchronization, in which every processor must synchronize with every other processor. Nonetheless, barriers are the only form of synchronization explicitly supplied in OpenMP, and they occur whenever collective communication operations are used in MPI. Many applications do not actually require global synchronization; local synchronization, in which a processor synchronizes only with those processors from or to which information or resources are needed, is often adequate. However, when tasks take varying amounts of time, the behavior of a system under local synchronization is more difficult to analyze, since processors do not start tasks at the same time. We show that when the synchronization dependencies form a directed cycle and the task times are geometrically distributed with p = 0.5, then as the number of processors tends to infinity the processors are working a fraction 2 - √2 ≈ 0.59 of the time. Under global synchronization, however, the time to complete each task is unbounded, increasing logarithmically with the number of processors. Similar results apply for p ≠ 0.5. We also present some of the combinatorial properties of the synchronization problem with geometrically distributed tasks on an undirected cycle. Nondeterministic synchronization is also examined, where processors decide randomly at the beginning of each task which neighbor(s) to synchronize with. We show that the expected number of task dependencies for random synchronization on an undirected cycle is the same as for deterministic synchronization on a directed cycle. Simulations are included to extend the analytic results. They show that, for some random-neighbor synchronization models, more heavy-tailed distributions can actually create fewer delays than less heavy-tailed ones when the number of processors is small. The results also show the rate of convergence to the steady state for various task distributions and synchronization graphs.
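To make the model described in the abstract concrete, the following minimal simulation sketch (not from the paper) contrasts local and global synchronization. It assumes the natural recursion for local synchronization on a directed cycle, namely that processor i may start its k-th task only after both it and its predecessor (i-1 mod n) have finished task k-1, and it draws task times from a geometric(p) distribution on {1, 2, ...}. With p = 0.5 the measured working fraction should approach 2 - √2 ≈ 0.59 for large n and many tasks, while the mean barrier-synchronized round grows roughly logarithmically in n. Function names and default parameters are illustrative.

```python
import numpy as np

def local_sync_fraction(n=200, tasks=5000, p=0.5, seed=0):
    """Fraction of time processors spend working under local synchronization
    on a directed cycle: processor i starts task k only after both it and
    its predecessor (i-1 mod n) have finished task k-1."""
    rng = np.random.default_rng(seed)
    finish = np.zeros(n)            # finish time of the latest task per processor
    work = np.zeros(n)              # accumulated busy time per processor
    for _ in range(tasks):
        durations = rng.geometric(p, size=n)            # task times on {1, 2, ...}
        start = np.maximum(finish, np.roll(finish, 1))  # wait for self and predecessor
        finish = start + durations
        work += durations
    return float((work / finish).mean())

def global_sync_task_time(n=200, tasks=5000, p=0.5, seed=0):
    """Mean time per task under barrier (global) synchronization:
    every round lasts as long as the slowest of the n processors."""
    rng = np.random.default_rng(seed)
    rounds = rng.geometric(p, size=(tasks, n)).max(axis=1)
    return float(rounds.mean())

if __name__ == "__main__":
    print("local sync, working fraction:", round(local_sync_fraction(), 3))
    print("expected limit 2 - sqrt(2)  :", round(2 - 2 ** 0.5, 3))
    print("global sync, mean round time:", round(global_sync_task_time(), 3))
    # The global-sync round length keeps growing as n increases (roughly like
    # log2(n) for geometric(0.5) task times), matching the abstract's claim
    # that the per-task time under barriers is unbounded in the number of processors.
```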

