
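Dynamic Implicit Predication Based on Lite Trace Cache

Cite this article:	TANG Yu-Xing, DENG Kun, DOU Yong, ZHOU Xing-Ming. Dynamic Implicit Predication Based on Lite Trace Cache [J]. Chinese Journal of Computers, 2007, 30(11): 1972-1981.

Authors:	TANG Yu-Xing, DENG Kun, DOU Yong, ZHOU Xing-Ming

Affiliation:	National Key Laboratory of Parallel and Distributed Processing, School of Computer, National University of Defense Technology, Changsha 410073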

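Abstract:	Branch instructions and branch mispredictions limit a processor's ability to exploit instruction-level parallelism (ILP). If-conversion, or predicated execution, converts control dependences in a program into data dependences and can substantially reduce the cost of branch misprediction. Earlier work focused mainly on compilers that explicitly generate static predicated instructions for wide-issue processors; this paper instead proposes Dynamic Implicit Predication (DIP), a mechanism built on a lite trace structure. Without help from the compiler or any other binary tool, DIP identifies instruction fragments suitable for predication while the program runs, converts and optimizes them, and uses the optimized instruction traces in subsequent executions. Simulations on SPEC2000 show that DIP effectively avoids branch mispredictions and increases parallelism: per-program IPC improves by 10.3% on average, and the average speedup over the benchmark programs reaches 7.59%.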
Keywords:	instruction-level parallelism; predication; dynamic implicit predication; trace cache; pipelining
Revised:	2005-04-29
Dynamic Implicit Predication Based on Lite Trace Cache

TANG Yu-Xing, DENG Kun, DOU Yong, ZHOU Xing-Ming. Dynamic Implicit Predication Based on Lite Trace Cache [J]. Chinese Journal of Computers, 2007, 30(11): 1972-1981.

Authors:	TANG Yu-Xing, DENG Kun, DOU Yong, ZHOU Xing-Ming

Affiliation:	National Key Laboratory of Parallel and Distributed Processing, School of Computer, National University of Defense Technology, Changsha 410073

Abstract:	To exploit instruction-level parallelism (ILP), modern microprocessors usually convert control dependences into data dependences. If-conversion and predicated execution are widely adopted to eliminate the branch misprediction penalty. This paper discusses a trace-based predication mechanism named DIP (Dynamic Implicit Predication). Previous predicated execution depends on the compiler to generate explicitly predicated instructions; in DIP, candidates for if-conversion are identified during dynamic execution. The classical trace cache is modified to store DIP traces, which include instructions from both the fall-through block and the target block following a conditional branch, and hardware adds predication to DIP traces automatically. With DIP, legacy applications can benefit from predication without recompiling source code. Simulations of DIP under various hardware configurations are presented, and the results show promising performance improvements. For the SPEC INT2000 benchmarks, the average IPC (instructions per cycle) improvement is 10.3%, and the average speedup in execution time is 7.59%.
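
The if-conversion idea behind the abstract can be illustrated with a toy software model. This is only a sketch under stated assumptions, not the paper's hardware mechanism: the names `Inst`, `run_branchy`, `build_predicated_trace`, and `run_trace` are hypothetical, and the "trace" here is just a Python list. It shows a two-way branch rewritten as one straight-line sequence that holds the instructions of both paths, each guarded by a predicate, so that a control dependence becomes a data dependence on the predicate register.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

Regs = Dict[str, int]


@dataclass
class Inst:
    """Hypothetical predicated instruction for this toy model."""
    dest: str                       # destination register name
    op: Callable[[Regs], int]       # computes the value from current registers
    pred: Optional[str] = None      # predicate register guarding the instruction
    pred_sense: bool = True         # instruction takes effect when predicate == pred_sense


def run_branchy(regs: Regs) -> Regs:
    """Original control flow: if (a < b) x = a + 1 else x = b - 1."""
    r = dict(regs)
    if r["a"] < r["b"]:             # conditional branch: a control dependence
        r["x"] = r["a"] + 1
    else:
        r["x"] = r["b"] - 1
    return r


def build_predicated_trace() -> List[Inst]:
    """Both paths in one straight-line sequence, guarded by predicate p."""
    return [
        Inst("p", lambda r: int(r["a"] < r["b"])),                 # compare sets the predicate
        Inst("x", lambda r: r["a"] + 1, pred="p", pred_sense=True),
        Inst("x", lambda r: r["b"] - 1, pred="p", pred_sense=False),
    ]


def run_trace(trace: List[Inst], regs: Regs) -> Regs:
    """Execute the sequence with no branching on program data paths."""
    r = dict(regs)
    for inst in trace:
        if inst.pred is not None and bool(r[inst.pred]) != inst.pred_sense:
            continue                # predicate does not match: instruction is nullified
        r[inst.dest] = inst.op(r)
    return r
```

For any inputs, `run_trace(build_predicated_trace(), regs)["x"]` matches `run_branchy(regs)["x"]`. The predicated form does extra work (both paths are fetched), which is the usual trade-off against removing a hard-to-predict branch.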

Keywords:	ILP; predication; dynamic implicit predication; trace cache; pipelining
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.
