Compiler techniques for data parallel applications using very large multi-dimensional datasets.

Abstract

Processing and analyzing large volumes of data play an increasingly important role in many domains of scientific research. The goal of this work is to provide high-level language support for implementing such data-intensive applications. Our approach is based on the observation that a data-parallel framework provides a convenient interface to large multi-dimensional datasets resident on persistent storage. Programming these applications in a high-level environment simplifies development: programmers can concentrate on application issues and leave systems issues to the compiler and run-time support.

In this dissertation, we designed a set of compiler analysis techniques for extracting information from data-parallel loops written in high-level object-oriented programming languages. We also designed a set of execution strategies that use the extracted information to execute the loops on distributed-memory machines. Our techniques target code generation for regular and irregular applications with dense and sparse datasets. We also developed strategies for reducing the communication volume of these applications. Other techniques improve the efficiency of the compiler-generated code, for example a conditional motion framework that alleviates some of the overhead incurred by our execution strategies.

We built a prototype compiler using Titanium, a Java dialect with data-parallel extensions, as the front end. Our compiler analyzes the parallel loops and makes extensive use of the Active Data Repository (ADR), the target run-time system. We used this prototype to evaluate the efficacy of our techniques and the performance of the compiler-generated applications. We also studied further optimizations that can be applied to the compiler-generated code and evaluated their performance.

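Bibliographic record

  • Author affiliation: University of Maryland College Park
  • Degree-granting institution: University of Maryland College Park
  • Subject: Computer Science
  • Degree: Ph.D.
  • Year: 2001
  • Pages: 165 p.
  • Total pages: 165
  • Original format: PDF
  • Language: English
  • Chinese Library Classification: Automation technology, computer technology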