
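A Pre-Execution Directed Data Prefetching Method for In-Order Processors
Cite this article:DANG Xiang-lei, WANG Xiao-yin, TONG Dong, LU Jun-lin, CHENG Xu, WANG Ke-yi. A Pre-Execution Directed Data Prefetching Method for In-Order Processors[J]. Acta Electronica Sinica, 2012, 40(11): 2145-2151.
Authors:DANG Xiang-lei  WANG Xiao-yin  TONG Dong  LU Jun-lin  CHENG Xu  WANG Ke-yi
Affiliation:Microprocessor Research & Development Center, Peking University, Beijing 100871; Engineering Research Center of Microprocessor & System, Ministry of Education, Peking University, Beijing 100871
Funding:National Science and Technology Major Project "Core Electronic Devices, High-end Generic Chips and Basic Software"; China Postdoctoral Science Foundation
Abstract:To improve the memory access performance of in-order processors, this paper proposes a pre-execution directed data prefetching method (PEDP). PEDP uses a stride prefetcher to prefetch regular access patterns and, after an L2 cache miss, pre-executes subsequent instructions to issue accurate prefetches for irregular access patterns, combining the strengths of both to raise prefetch coverage. PEDP also uses the actual memory access information captured ahead of time during pre-execution to guide the stride prefetcher. Under this guidance, the stride prefetcher can issue prefetch requests earlier for addresses that pre-execution can generate and that follow a stride pattern, which improves prefetch timeliness. To further optimize this guidance, PEDP uses an update filter to remove harmful updates to the stride prefetcher, which improves prefetch accuracy. Experimental results show that, on average, PEDP improves the performance of the baseline processor by 33.0%. Compared with using stride prefetching alone and pre-execution alone, PEDP improves performance by 16.2% and 7.3%, respectively.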

Keywords:data prefetching  pre-execution  memory latency tolerance  in-order processors
Received:2011-12-14

Pre-Execution Directed Prefetching for In-Order Processors
DANG Xiang-lei, WANG Xiao-yin, TONG Dong, LU Jun-lin, CHENG Xu, WANG Ke-yi. Pre-Execution Directed Prefetching for In-Order Processors[J]. Acta Electronica Sinica, 2012, 40(11): 2145-2151.
Authors:DANG Xiang-lei  WANG Xiao-yin  TONG Dong  LU Jun-lin  CHENG Xu  WANG Ke-yi
Affiliation:Microprocessor Research & Development Center, Peking University, Beijing 100871, China; Engineering Research Center of Microprocessor & System, Ministry of Education, Peking University, Beijing 100871, China
Abstract:This paper proposes a pre-execution directed prefetching (PEDP) method to improve the memory latency tolerance of in-order processors. PEDP uses stride prefetching to handle regular access patterns and, when an L2 cache miss occurs, uses pre-execution to generate accurate prefetches regardless of the regularity of the access pattern. Combining the two techniques improves prefetch coverage. PEDP also captures actual memory access patterns during pre-execution and uses them to guide the stride prefetcher's update process. Under this guidance, the stride prefetcher can issue prefetches earlier than pre-execution for addresses that both techniques can generate, which improves prefetch timeliness. In addition, PEDP improves prefetch accuracy with an update filter that eliminates harmful updates to the stride prefetcher during the guidance process. Experimental results show that PEDP improves performance by 33.0% over the baseline processor. Compared with stride prefetching alone and pre-execution alone, PEDP improves performance by 16.2% and 7.3%, respectively.
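
The interaction between the stride prefetcher and pre-execution described in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the paper's hardware design: the per-PC table layout, the confidence threshold, the class and function names (`StridePrefetcher`, `guided_update`), and in particular the filter rule (dropping pre-execution accesses whose address is flagged invalid) are all hypothetical, since the abstract does not specify how the update filter decides which updates are harmful.

```python
from dataclasses import dataclass


@dataclass
class StrideEntry:
    """State kept per load instruction (keyed by PC). Layout is an assumption."""
    last_addr: int
    stride: int = 0
    confidence: int = 0  # saturating counter


class StridePrefetcher:
    """Minimal per-PC stride prefetcher model (hypothetical parameters)."""

    def __init__(self, threshold=2, degree=1, max_conf=3):
        self.table = {}
        self.threshold = threshold  # confidence needed before prefetching
        self.degree = degree        # number of strides to prefetch ahead
        self.max_conf = max_conf

    def update(self, pc, addr):
        """Train on one (pc, addr) access and return addresses to prefetch."""
        entry = self.table.get(pc)
        if entry is None:
            self.table[pc] = StrideEntry(last_addr=addr)
            return []
        stride = addr - entry.last_addr
        if stride != 0 and stride == entry.stride:
            entry.confidence = min(entry.confidence + 1, self.max_conf)
        else:
            entry.stride = stride
            entry.confidence = 0
        entry.last_addr = addr
        if entry.stride != 0 and entry.confidence >= self.threshold:
            return [addr + entry.stride * i for i in range(1, self.degree + 1)]
        return []


def guided_update(prefetcher, pc, addr, addr_valid):
    """Feed an access observed during pre-execution into the stride prefetcher.

    Assumed filter rule: skip the update when the address is not trustworthy.
    The paper's real filter criteria are not given in the abstract.
    """
    if not addr_valid:
        return []
    return prefetcher.update(pc, addr)
```

In this model, accesses seen during normal execution call `update` directly, while accesses seen during pre-execution go through `guided_update`, so the table advances ahead of the main instruction stream and filtered accesses leave its state untouched.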
Keywords:prefetching  pre-execution  memory latency tolerance  in-order processors
This article is indexed in CNKI, Wanfang Data, and other databases.