
Efficient data scheduling for real-time large-scale data-intensive distributed applications.


Abstract

Data staging is a method for organizing data transfer in heterogeneous distributed computing environments. Real-time large-scale data-intensive applications are distributed applications that frequently generate, use, and access large-scale data sets under completion-time constraints. These applications are emerging in many areas of science and engineering. Their data-intensive operations, including transfers of large data sets (on the order of hundreds of gigabytes), can quickly consume network and compute resources and thereby degrade the overall performance of applications in these distributed networked environments. This research proposes several solutions that enable efficient data staging to address this problem in heterogeneous distributed computing. Our solutions mainly maximize the satisfiability of the applications by designing efficient data scheduling heuristics for such systems. Two optimization models are proposed for this maximization: one for optimizing overall satisfiability and the other for optimizing autonomous applications.

In this research we introduce and develop deterministic solutions for the problem. We propose three main algorithms: two for the dynamic setting and one for the static setting. Our main static data scheduling heuristic, called the Concurrent Scheduling over Extended Partial Path (CS/EPP) heuristic, is developed from an assumed data staging model for optimizing overall satisfiability. CS/EPP is based on EPP, an independent heuristic that computes schedules dynamically for online input data sets. The resulting schedules are adjusted as the various data sets are staged in the system. CS/EPP also allows concurrent staging of various data sets and hence improves the optimization procedure.

A static non-greedy heuristic called Blocking Analysis Concurrent Scheduling (BACS) is also proposed for improving data staging performance, at the cost of computational complexity, when applications arrive in batches. Two heuristic methods are proposed for BACS that allow the cost of the algorithm to be adjusted.

We conclude this dissertation by pointing out avenues for extending this research: investigating the stochastic dynamic approach to data staging and the task mapping problem for large-scale data-intensive applications. The stochastic dynamic approach assumes that the application arrival process can be modeled as a stochastic process. We propose a Stochastic Data Scheduling (SDS) algorithm that uses estimates of the delays on the paths or routes of the staged data sets. These estimates are produced by modeling the staging environment as a network of finite queues and analyzing the queues along these paths.
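
The abstract does not say which queueing model is used for the finite queues, so the following is only an illustrative sketch and not the SDS algorithm itself. It assumes each hop on a path behaves as an independent M/M/1/K queue (Poisson arrivals, exponential service, at most K requests in the system), estimates the mean delay per hop with Little's law, and sums the hops to compare against a completion-time constraint. The names `FiniteQueue`, `mean_delay`, `path_delay`, and `meets_deadline` are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class FiniteQueue:
    """One hop on a staging path, modeled (by assumption) as M/M/1/K."""
    arrival_rate: float  # lambda, offered requests per second (must be > 0)
    service_rate: float  # mu, requests served per second (must be > 0)
    capacity: int        # K, max requests in the system incl. the one in service (>= 1)


def state_probabilities(q: FiniteQueue) -> List[float]:
    """Steady-state probability of n = 0..K requests in the system.

    P_n is proportional to rho**n; normalizing the weights directly
    also covers the rho == 1 case without a special branch.
    """
    rho = q.arrival_rate / q.service_rate
    weights = [rho ** n for n in range(q.capacity + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def mean_delay(q: FiniteQueue) -> float:
    """Mean time an accepted request spends at this hop (Little's law)."""
    probs = state_probabilities(q)
    mean_in_system = sum(n * p for n, p in enumerate(probs))
    # Requests arriving when the queue is full (state K) are blocked.
    accepted_rate = q.arrival_rate * (1.0 - probs[-1])
    return mean_in_system / accepted_rate


def path_delay(path: List[FiniteQueue]) -> float:
    """Estimated delay along a path, treating hops as independent."""
    return sum(mean_delay(q) for q in path)


def meets_deadline(path: List[FiniteQueue], deadline: float) -> bool:
    """True if the estimated path delay fits the completion-time constraint."""
    return path_delay(path) <= deadline
```

A scheduler could call `meets_deadline` on each candidate route for a data set and keep only the routes whose estimated delay fits the deadline. Treating the hops as independent is a simplification; blocking between finite queues generally couples them.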

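Bibliographic record

  • Author

    Eltayeb, Mohammed Soleiman

  • Author affiliation

    The Ohio State University

  • Degree-granting institution: The Ohio State University
  • Discipline: Engineering Electronics and Electrical
  • Degree: Ph.D.
  • Year: 2004
  • Pages: 159 p.
  • Total pages: 159
  • Original format: PDF
  • Language: eng
  • Chinese Library Classification: Radio electronics, telecommunications technology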
