
Branch optimizations and instruction-level parallelism exploitation for dynamic superscalar and VLIW processors.


Abstract

Wide-issue superscalar and VLIW processors rely on instruction-level parallelism (ILP) to achieve high performance. If insufficient ILP is found, the performance potential of these processors drops drastically. Branch instructions are one of the main obstacles to achieving high ILP: they impose strict ordering constraints on a program, which impede efforts to overlap the execution of other instructions with them. To exploit ILP effectively in the presence of branches, the parallelism extraction phases must handle branch instructions efficiently and reduce the ill effects they impose.

This dissertation investigates two techniques, speculative execution and predicated execution, to extract (and enhance) ILP in the presence of branch instructions. Speculative execution is a technique by which instructions are executed before the branch condition that controls them has been evaluated. This can increase performance if otherwise idle CPU cycles are used to execute speculative instructions from the path that is actually taken. If instructions are speculated from the wrong path(s), performance can suffer.

Speculative execution by itself is not enough to achieve high ILP. The fundamental limitation caused by the presence of branch instructions in the instruction stream can still have detrimental effects. This motivates us to investigate the other technique, predicated execution. Predicated execution is an architectural technique that eliminates the branch instruction (and its associated penalty) at the expense of extra instructions that are dynamically executed. A predicated (or guarded) instruction executes conditionally, depending on the value of an extra predicate operand that encodes the condition that originally controlled the branch.

Excessive application of speculative and/or guarded execution can hurt ILP. The thesis describes new techniques that combine the individual merits of speculative and guarded execution and improve ILP beyond existing techniques. New profile-guided branch optimizations are proposed that help reduce branch overhead costs. Finally, we provide an experimental evaluation of predicated execution on dynamic superscalar architectures and propose different predication models as cost-effective alternatives to the full-predication approach.
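
To make the idea of a guarded instruction concrete, the following is a minimal sketch of if-conversion, written as a toy Python model and not taken from the dissertation. The names `GuardedOp`, `run_predicated`, `IF_CONVERTED`, and `branchy` are hypothetical, and the example assumes a full-predication model in which any instruction may carry a guard. It replaces the branch in `if a < b: x = a + 1 else: x = b - 1` with a compare that sets a predicate register `p` and two guarded assignments.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class GuardedOp:
    """One instruction with an optional guard (predicate) register. Hypothetical model."""
    dest: str                                   # destination register name
    compute: Callable[[Dict[str, int]], int]    # value computed from the register file
    guard: Optional[str] = None                 # predicate register, or None if unguarded
    negate: bool = False                        # if True, execute when the guard is false


def run_predicated(ops: List[GuardedOp], regs: Dict[str, int]) -> Tuple[Dict[str, int], int]:
    """Run straight-line guarded code; return the final registers and the issue count."""
    regs = dict(regs)
    issued = 0
    for op in ops:
        issued += 1  # a nullified instruction still occupies an issue slot
        if op.guard is not None and bool(regs[op.guard]) == op.negate:
            continue  # guard not satisfied: the instruction has no effect
        regs[op.dest] = op.compute(regs)
    return regs, issued


def branchy(a: int, b: int) -> int:
    """Reference version with an explicit branch."""
    if a < b:
        return a + 1
    return b - 1


# If-converted form: no branch, both arms are always issued.
IF_CONVERTED = [
    GuardedOp("p", lambda r: int(r["a"] < r["b"])),           # p = (a < b)
    GuardedOp("x", lambda r: r["a"] + 1, guard="p"),          # (p)  x = a + 1
    GuardedOp("x", lambda r: r["b"] - 1, guard="p", negate=True),  # (!p) x = b - 1
]

if __name__ == "__main__":
    for a, b in [(1, 2), (5, 3)]:
        regs, issued = run_predicated(IF_CONVERTED, {"a": a, "b": b})
        print(a, b, regs["x"], branchy(a, b), issued)
```

In this model the predicated sequence always issues all three instructions, whichever arm is needed. That is the trade-off the abstract describes: the branch and its penalty disappear, but extra instructions are dynamically executed.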


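Bibliographic record

Author: Mantripragada, Srinivas
Author affiliation: University of California, Irvine
Degree-granting institution: University of California, Irvine
Discipline: Computer Science
Degree: Ph.D.
Year: 2000
Pages: 150
Original format: PDF
Language: English
Chinese Library Classification: Automation technology, computer technology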

