This paper analyzes why Hadoop's speculative execution algorithm performs poorly in heterogeneous environments and, based on a close study of the source code, proposes an improved algorithm. The algorithm automatically adjusts the execution of backup tasks according to system load in order to balance load across the system. It estimates each task's remaining time using the historical average remaining completion time proposed by Zaharia, and identifies stragglers as tasks whose remaining-time value exceeds 20% (0.2), which yields a more accurate straggler queue. The algorithm improves the performance of speculative execution in heterogeneous environments to a certain extent.
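The straggler test described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not say what the 20% is measured against, so the sketch assumes a task is a straggler when its estimated remaining time exceeds the mean estimate by more than 20%. The remaining-time estimate follows the familiar LATE-style form (remaining work divided by observed progress rate), and the load-aware step is reduced to a cap equal to the number of free slots. The names `TaskStatus`, `estimate_time_left`, `find_stragglers`, and `pick_backups` are hypothetical and do not come from Hadoop.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TaskStatus:
    task_id: str
    progress: float  # fraction of work completed, between 0 and 1
    elapsed: float   # seconds since the task started


def estimate_time_left(task: TaskStatus) -> Optional[float]:
    """LATE-style estimate: remaining work divided by observed progress rate."""
    if task.progress <= 0.0 or task.elapsed <= 0.0:
        return None  # no progress observed yet, so no estimate is possible
    rate = task.progress / task.elapsed
    return (1.0 - task.progress) / rate


def find_stragglers(tasks: List[TaskStatus], threshold: float = 0.2) -> List[str]:
    """Return ids of tasks whose estimate exceeds the mean by more than `threshold`.

    Assumption: the 20% is taken relative to the mean remaining time.
    The result is ordered with the longest remaining time first.
    """
    estimates = {}
    for t in tasks:
        left = estimate_time_left(t)
        if left is not None:
            estimates[t.task_id] = left
    if not estimates:
        return []
    mean_left = sum(estimates.values()) / len(estimates)
    cutoff = mean_left * (1.0 + threshold)
    slow = [tid for tid, left in estimates.items() if left > cutoff]
    slow.sort(key=lambda tid: estimates[tid], reverse=True)
    return slow


def pick_backups(stragglers: List[str], free_slots: int) -> List[str]:
    """Assumed load rule: launch at most one backup task per free slot."""
    return stragglers[:max(free_slots, 0)]


if __name__ == "__main__":
    tasks = [
        TaskStatus("a", progress=0.9, elapsed=90.0),
        TaskStatus("b", progress=0.8, elapsed=80.0),
        TaskStatus("c", progress=0.2, elapsed=100.0),
    ]
    print(pick_backups(find_stragglers(tasks), free_slots=1))
```

In this example task `c` progresses at a fifth of the rate of the other two, so it is the only one whose estimate lies above the cutoff. With zero free slots no backup would be launched at all, which is the simplest form of tying speculation to system load.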