International Conference on Parallel and Distributed Processing Techniques and Applications (PDPTA'04), v.1; June 21-24, 2004; Las Vegas, NV, US

A New Non-Blocking Counter-Based Coordinated Checkpointing Algorithm as a Migration Tool in a High Performance Dynamic Grid Scheduler


Abstract

Scheduling computations and managing resources for Grid applications is challenging because the underlying resources are distributed, heterogeneous in nature, owned by different individuals or organizations with their own policies, subject to different access and cost models, and dynamically varying in load and availability. High-performance schedulers improve the performance of individual applications by optimizing measures such as execution time, resolution, or speed-up. A central focus of this paper is the problem of migrating task sets onto (and off of) the compute fabric in response to changes in the fabric makeup (such as dynamic resource availability) or to global priority changes within the fabric and/or task sets. The problem is more complicated when a task set to be migrated consists of dependent (messaging) subtasks. We present a new non-blocking counter-based coordinated checkpointing algorithm for a dependent/messaging task set. The algorithm obtains a globally consistent checkpoint that can be used to migrate these tasks to other available computing resources upon any performance degradation and to resume their execution from the last checkpoint image. The algorithm also places no restriction on the frequency of inter-task messages.
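The abstract does not spell out the algorithm itself, so the following is only a minimal sketch of the general idea behind counter-based consistency, not the authors' method. It assumes FIFO channels and that every task records per-peer sent and received message counts in its local checkpoint; a set of local checkpoints is then treated as globally consistent when no task has recorded more messages received from a peer than that peer recorded as sent (no orphan messages). The names `LocalCheckpoint`, `orphan_messages`, and `is_consistent` are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class LocalCheckpoint:
    """Hypothetical snapshot of one task: its message counters at checkpoint time."""
    task_id: int
    sent: dict = field(default_factory=dict)      # peer id -> messages sent to that peer
    received: dict = field(default_factory=dict)  # peer id -> messages received from that peer


def orphan_messages(checkpoints):
    """Return (sender, receiver, count) triples for messages that a receiver's
    checkpoint records as received but the sender's checkpoint does not record as sent.
    Assumes every peer named in a counter has a checkpoint in the given set."""
    by_id = {c.task_id: c for c in checkpoints}
    orphans = []
    for receiver in checkpoints:
        for sender_id, n_recv in receiver.received.items():
            n_sent = by_id[sender_id].sent.get(receiver.task_id, 0)
            if n_recv > n_sent:
                orphans.append((sender_id, receiver.task_id, n_recv - n_sent))
    return orphans


def is_consistent(checkpoints):
    """A checkpoint set is consistent (under the FIFO assumption) if it has no orphans."""
    return not orphan_messages(checkpoints)


if __name__ == "__main__":
    a = LocalCheckpoint(0, sent={1: 3}, received={1: 1})
    b = LocalCheckpoint(1, sent={0: 1}, received={0: 3})
    print(is_consistent([a, b]))
```

Messages counted as sent but not yet received are in transit rather than orphaned; a migration tool would additionally need to log or replay them, which this sketch leaves out.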
