Journal: Big Data Research

A Dynamic Data Placement Strategy for Hadoop in Heterogeneous Environments


Abstract

Cloud computing is a form of parallel distributed computing that has become widely used. MapReduce is an effective programming model for cloud computing and large-scale data-parallel applications. Hadoop is an open-source implementation of the MapReduce model and is commonly used for data-intensive applications such as data mining and web indexing. The current Hadoop implementation assumes that every node in a cluster has the same computing capacity and that tasks are data-local; in a heterogeneous cluster, these assumptions may add overhead and reduce MapReduce performance. This paper proposes a data placement algorithm to resolve the unbalanced node workload problem. The proposed method dynamically adapts and balances the data stored on each node according to that node's computing capacity in a heterogeneous Hadoop cluster, which reduces data transfer time and improves Hadoop performance. Experimental results show that the dynamic data placement policy decreases execution time and improves Hadoop performance in a heterogeneous cluster.
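The abstract does not give the algorithm itself, so the following is only a minimal sketch of the general idea of capacity-proportional placement, not the paper's method. The function name `allocate_blocks`, the node names, and the capacity scores are all hypothetical; the sketch assumes each node's computing capacity has already been measured as a positive number, and it uses largest-remainder rounding so that the block counts sum to the total.

```python
def allocate_blocks(total_blocks, capacities):
    """Split total_blocks across nodes in proportion to capacity.

    capacities maps a (hypothetical) node name to a positive capacity
    score. Returns a dict mapping each node to an integer block count.
    """
    if total_blocks < 0:
        raise ValueError("total_blocks must be non-negative")
    total_capacity = sum(capacities.values())
    if total_capacity <= 0:
        raise ValueError("capacities must sum to a positive value")

    # Ideal (fractional) share of blocks for each node.
    exact = {
        node: total_blocks * cap / total_capacity
        for node, cap in capacities.items()
    }
    # Start from the rounded-down share.
    shares = {node: int(value) for node, value in exact.items()}

    # Hand the blocks lost to rounding to the nodes with the
    # largest fractional remainders, one block each.
    leftover = total_blocks - sum(shares.values())
    by_remainder = sorted(
        capacities, key=lambda n: exact[n] - shares[n], reverse=True
    )
    for node in by_remainder[:leftover]:
        shares[node] += 1
    return shares


if __name__ == "__main__":
    # Assumed capacity ratio 3:2:1 for three illustrative nodes.
    print(allocate_blocks(12, {"fast": 3, "medium": 2, "slow": 1}))
```

A dynamic policy in the abstract's sense would presumably re-measure capacity as jobs run and rebalance stored data accordingly; this sketch covers only a single static split.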
