Published in: International conference on very large data bases

Stubby: A Transformation-based Optimizer for MapReduce Workflows


Abstract

There is a growing trend of analyzing large datasets using workflows composed of MapReduce jobs connected through data-based producer-consumer relationships. This trend has spurred the development of a number of interfaces, ranging from program-based to query-based, for generating MapReduce workflows. Studies have shown that the performance gap between optimized and unoptimized workflows can be quite large. However, automatic cost-based optimization of MapReduce workflows remains a challenge due to the multitude of interfaces, the large size of the execution plan space, and the frequent unavailability of all the types of information needed for optimization. We introduce a comprehensive plan space for MapReduce workflows generated by popular workflow generators. We then propose Stubby, a cost-based optimizer that searches selectively through the subspace of the full plan space that can be enumerated correctly and costed based on the information available in any given setting. Stubby enumerates the plan space using plan-to-plan transformations and an efficient search algorithm. Stubby is designed to be extensible to new interfaces and new types of optimizations, which is a desirable feature given how rapidly MapReduce systems are evolving. Stubby's efficiency and effectiveness have been evaluated using representative workflows from many domains.
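To make the idea of transformation-based, cost-based search concrete, the following is a minimal Python sketch. It is not Stubby's algorithm or its plan space. The `Job` type, the single `merge_map_only` transformation, the `STARTUP_OVERHEAD` constant, the cost model, and the greedy `optimize` loop are all hypothetical, chosen only to show how plan-to-plan transformations plus a cost function can drive a search over workflow plans.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Job:
    """Hypothetical model of one MapReduce job in a workflow."""
    name: str
    map_only: bool  # True if the job has no reduce phase
    cost: float     # assumed estimated running time (arbitrary units)


# A plan is a linear workflow: each job consumes the previous job's output.
Plan = Tuple[Job, ...]
Transformation = Callable[[Plan], List[Plan]]

STARTUP_OVERHEAD = 5.0  # assumed fixed cost paid once per job


def plan_cost(plan: Plan) -> float:
    """Toy cost model: per-job work plus a fixed per-job overhead."""
    return sum(job.cost + STARTUP_OVERHEAD for job in plan)


def merge_map_only(plan: Plan) -> List[Plan]:
    """Toy transformation: fold a map-only job into the job that consumes it."""
    results = []
    for i in range(len(plan) - 1):
        producer, consumer = plan[i], plan[i + 1]
        if producer.map_only:
            merged = Job(f"{producer.name}+{consumer.name}",
                         consumer.map_only,
                         producer.cost + consumer.cost)
            results.append(plan[:i] + (merged,) + plan[i + 2:])
    return results


def optimize(plan: Plan, transformations: List[Transformation]) -> Plan:
    """Greedy search: repeatedly move to the cheapest transformed plan."""
    best = plan
    while True:
        neighbors = [p for t in transformations for p in t(best)]
        if not neighbors:
            return best
        candidate = min(neighbors, key=plan_cost)
        if plan_cost(candidate) >= plan_cost(best):
            return best
        best = candidate


workflow: Plan = (
    Job("filter", True, 10.0),
    Job("join", False, 40.0),
    Job("project", True, 8.0),
    Job("aggregate", False, 30.0),
)
best = optimize(workflow, [merge_map_only])
print([job.name for job in best], plan_cost(workflow), plan_cost(best))
```

In this sketch, supporting a new kind of optimization only means appending another function to the `transformations` list, which loosely mirrors the extensibility goal stated in the abstract. A real optimizer would need a far richer plan representation and cost model.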
