Journal of Northwestern Polytechnical University (《西北工业大学学报》)

Design of a 32-bit Dual-Issue, Dual-Pipeline RISC Microprocessor

         

Abstract

"Longteng R2" (龙腾R2) is a 32-bit embedded RISC microprocessor developed independently at Northwestern Polytechnical University; it is pin-to-pin compatible with IBM's PowerPC 750 processor. Weighing area, power consumption, real-time response, and performance requirements, the paper proposes a dual-issue, dual-pipeline structure for embedded-processor microarchitecture design. The core idea is to detect the ordering dependences between adjacent instructions dynamically in the front-end stage of the instruction pipeline, so that the dual-issue decision is made in advance. The paper first introduces the Longteng R2 microarchitecture, then focuses on the instruction scheduling strategy built on the dual-issue, dual-pipeline structure, the coupling between adjacent instructions, hazard handling under dual issue, and precise-exception considerations. Performance was evaluated with the MiBench benchmark programs. The combined analysis shows that the structure clearly accelerates arithmetic-heavy instruction streams, that its circuit structure is clear and easy to design and verify, and that optimizing the memory system is the key to improving the processor's performance. The paper closes with a discussion of key techniques such as design for testability and silicon physical design. Longteng R2 has been taped out successfully: the processor is built in SMIC 180 nm CMOS technology, with a die area of 5.9 mm × 6.7 mm, a core frequency of 266 MHz, and a CBGA360 package.

A 32-bit embedded RISC processor compatible with the PowerPC architecture is presented. To balance area, power, performance, and real-time response, a micro-architecture is proposed whose distinguishing feature is dual issue with dual pipelines. The key point of the architecture is that the instruction dispatching engine dynamically scans the data dependences between two consecutive instructions in the instruction queue and decides in advance whether to perform dual issue. The engine parses the mutual dependences of the two instructions at the head of the queue. If the destinations of the first instruction are not the sources of the second, adjacent instruction, the two instructions may be dispatched in parallel; otherwise they must be dispatched in sequence. Two independent pipelines execute the two instructions, and dedicated data-forwarding buses resolve the dependences between instructions. An HDL model was built to evaluate how well the micro-architecture accelerates arithmetic instructions, and MiBench was used to evaluate the proposed architecture. The results show that the average IPC of the architecture reaches 1.5. Compared with a single-issue architecture, it effectively improves performance; compared with an out-of-order (OOO) architecture, it reduces area, power, and verification complexity. The experiments also show that memory latency is the performance bottleneck. Finally, the test circuit and the physical design are discussed briefly. The processor is fabricated in SMIC 1P6M 180 nm CMOS technology; the die area is 5.9 mm × 6.7 mm and the core frequency is 266 MHz.
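The dual-issue rule stated in the abstract (dispatch the two head-of-queue instructions together only when the first instruction's destinations are not sources of the second) can be illustrated with a minimal Python sketch. This is not the paper's HDL model: the `Instr` type, the register names, and the sample `PROGRAM` are assumptions made for illustration, and the sketch deliberately ignores everything else a real dispatcher must handle (structural hazards, branches, memory latency, forwarding, and exceptions).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    """Hypothetical instruction record: a name plus register sets."""
    name: str
    dests: frozenset = frozenset()  # registers written
    srcs: frozenset = frozenset()   # registers read


def can_dual_issue(first, second):
    """Rule from the abstract: the pair may issue together only if no
    destination of the first instruction is a source of the second."""
    return first.dests.isdisjoint(second.srcs)


def dispatch(queue):
    """Group an in-order instruction queue into dispatch slots.

    Each slot holds either the two head instructions (dual issue) or
    only the first one (the second waits for the next slot).
    """
    groups = []
    i = 0
    while i < len(queue):
        if i + 1 < len(queue) and can_dual_issue(queue[i], queue[i + 1]):
            groups.append((queue[i].name, queue[i + 1].name))
            i += 2
        else:
            groups.append((queue[i].name,))
            i += 1
    return groups


# Illustrative four-instruction stream (assumed, not from the paper).
PROGRAM = [
    Instr("add", frozenset({"r3"}), frozenset({"r1", "r2"})),
    Instr("sub", frozenset({"r4"}), frozenset({"r3", "r5"})),  # reads r3 from add
    Instr("or",  frozenset({"r6"}), frozenset({"r7", "r8"})),
    Instr("and", frozenset({"r9"}), frozenset({"r1", "r2"})),
]

if __name__ == "__main__":
    for slot, group in enumerate(dispatch(PROGRAM)):
        print(slot, group)
```

In this sample stream, `add` and `sub` cannot pair because `sub` reads `r3`, so `add` issues alone; `sub` then pairs with the independent `or`, and `and` issues last. The four instructions take three dispatch slots in this idealized model, which says nothing about the IPC measured in the paper.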
