Conference: IEEE International Conference on Distributed Computing Systems

AdaptiveConfig: Run-Time Configuration of Cluster Schedulers for Cloud Short-Running Jobs


Abstract

Cluster schedulers provide a flexible resource-sharing mechanism for short-running jobs, which make up the majority of cloud jobs. A scheduler's configuration determines how resources are allocated among jobs and is therefore crucial to their performance. Today's cloud platforms usually rely on cluster administrators to set this configuration, so it is difficult to configure the scheduler optimally and minimize the latencies of heterogeneous, dynamically changing jobs in the cloud. In this paper, we introduce AdaptiveConfig, a run-time configurator for cluster schedulers that automatically adapts to changing workload and resource status. It consists of: (1) an estimator that calculates jobs' performance under different configurations and scheduling scenarios. The key idea is to transform a scheduler's resource allocation mechanisms and their variable influence factors (configuration parameters, scheduling constraints, available resources, and workload status) into business rules and facts in a rule engine, so that these correlated factors can be reasoned about when estimating job performance; and (2) a run-time optimizer that efficiently searches the configuration space to find the optimal configuration for the current workload. We implemented AdaptiveConfig on the popular YARN Capacity and Fair schedulers and demonstrate its effectiveness using workloads of Facebook jobs: it reduces latencies by 2.22 times (and by up to 4.50 times) with low optimization overhead.
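The estimator-plus-optimizer structure described in the abstract can be illustrated with a deliberately simplified sketch. This is not the authors' implementation: the single "rule" below (a queue receives its capacity share of the slots and tasks run in waves), the two-queue setup, the exhaustive search, and all names (`Job`, `estimate_latency`, `best_config`) are assumptions made purely for illustration. The abstract says the paper uses a rule engine and an efficient search, neither of which is reproduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    queue: str           # hypothetical queue name, "a" or "b"
    tasks: int           # number of equally sized tasks
    task_seconds: float  # run time of a single task


def estimate_latency(job, capacity_share, total_slots):
    """Toy estimation rule (an assumption, not the paper's rule set):
    a queue gets capacity_share * total_slots slots, at least one,
    and the job's tasks run in consecutive waves on those slots."""
    slots = max(1, int(capacity_share * total_slots))
    waves = -(-job.tasks // slots)  # ceiling division
    return waves * job.task_seconds


def best_config(jobs, total_slots, shares=(0.25, 0.5, 0.75)):
    """Try each candidate capacity split between queues "a" and "b" and
    return (share of queue "a", mean estimated latency) for the best one.
    Ties keep the first candidate found."""
    best = None
    for share_a in shares:
        config = {"a": share_a, "b": 1.0 - share_a}
        mean = sum(
            estimate_latency(job, config[job.queue], total_slots) for job in jobs
        ) / len(jobs)
        if best is None or mean < best[1]:
            best = (share_a, mean)
    return best


if __name__ == "__main__":
    workload = [Job("a", 8, 10.0), Job("b", 2, 10.0)]
    print(best_config(workload, total_slots=8))
```

In a run-time setting, a loop of this shape would be re-evaluated whenever the workload or the available resources change, which is the adaptation the abstract describes. A real configurator would also need a much richer estimator and a search strategy that scales beyond a handful of candidates.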
