Journal of Industrial and Management Optimization

PERFORMANCE OPTIMIZATION OF PARALLEL-DISTRIBUTED PROCESSING WITH CHECKPOINTING FOR CLOUD ENVIRONMENT


Abstract

In cloud computing, the most successful application framework is parallel-distributed processing, in which an enormous task is split into a number of subtasks that are processed independently on a cluster of machines referred to as workers. Because of the huge system scale, worker failures occur frequently in cloud environments, and failed subtasks cause a large processing delay for the task. One scheme for alleviating the impact of failures is checkpointing, in which the progress of a subtask is recorded as a checkpoint and a failed subtask is resumed by another worker from the latest checkpoint. This method can reduce the processing delay of the task. However, frequent checkpointing adds system overhead, and hence the checkpoint interval must be set properly. In this paper, we consider the optimal number of checkpoints that minimizes the task-processing time. We construct a stochastic model of parallel-distributed processing with checkpointing and derive approximate explicit expressions for the mean task-processing time and the optimal number of checkpoints. Numerical experiments reveal that the proposed approximations are sufficiently accurate in typical cloud computing environments. Furthermore, the derived optimal number of checkpoints outperforms the result of a previous study in minimizing the task-processing time of parallel-distributed processing.
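The trade-off between checkpoint overhead and rework after a failure can be illustrated with a small sketch. This is not the stochastic model from the paper: it is a simplified, single-subtask illustration under assumptions introduced here, namely that failures arrive as a Poisson process with rate `failure_rate`, that checkpoints are equally spaced, that each checkpoint costs a fixed time `checkpoint_cost`, and that a failed segment restarts immediately from the latest checkpoint. All function and parameter names are hypothetical.

```python
import math


def mean_segment_time(length, failure_rate):
    """Mean time to complete `length` units of uninterrupted work.

    Assumption: failures form a Poisson process with rate `failure_rate`,
    and each failure restarts the segment from its beginning with no
    extra restart delay. Under that assumption the mean is
    (exp(rate * length) - 1) / rate.
    """
    if failure_rate == 0:
        return length
    return math.expm1(failure_rate * length) / failure_rate


def mean_subtask_time(work, num_checkpoints, checkpoint_cost, failure_rate):
    """Mean time for one subtask with equally spaced checkpoints.

    The work is cut into num_checkpoints + 1 segments. Every segment except
    the last ends with a checkpoint, and a failure during the checkpoint
    write also loses that segment (an assumption of this sketch).
    """
    segment = work / (num_checkpoints + 1)
    with_checkpoint = mean_segment_time(segment + checkpoint_cost, failure_rate)
    final = mean_segment_time(segment, failure_rate)
    return num_checkpoints * with_checkpoint + final


def best_num_checkpoints(work, checkpoint_cost, failure_rate, max_checkpoints=1000):
    """Brute-force search for the checkpoint count with the smallest mean time."""
    return min(
        range(max_checkpoints + 1),
        key=lambda n: mean_subtask_time(work, n, checkpoint_cost, failure_rate),
    )


if __name__ == "__main__":
    # Illustrative values only; they are not taken from the paper.
    n = best_num_checkpoints(work=100.0, checkpoint_cost=1.0, failure_rate=0.05)
    print(n, mean_subtask_time(100.0, n, 1.0, 0.05))
```

With no failures the search returns zero checkpoints, since every checkpoint is pure overhead. With a positive failure rate a nonzero count wins, which is the trade-off the abstract describes. The paper itself treats the harder case in which the task finishes only when all parallel subtasks have finished.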
