
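Optimization of Memory Access Based on Loongson2F
Cite this article: SU Bo, LI Kai, XU Zhi-Guang, HE Song-Song. Optimization of Memory Access Based on Loongson2F[J]. Computer Systems & Applications, 2010, 19(1): 171-175
Authors: SU Bo, LI Kai, XU Zhi-Guang, HE Song-Song
Affiliation: Department of Computer Science and Technology, University of Science and Technology of China, Hefei, Anhui 230027
Funding: National High Technology Research and Development Program of China (863 Program) (2008AA010902)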
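Abstract: In typical data-processing programs, computation time often plays only a secondary role, so the efficiency of the memory access pattern has a large impact on program performance. ATLAS was installed on KD-50-I, a high-performance computer system built on the Loongson 2F processor, and testing showed that it reached only 30% of the Loongson 2F's theoretical peak. Three memory access optimizations were then applied: loop unrolling, to reduce the number of memory accesses and raise the ratio of computation to memory access; data blocking and partial copying, to strengthen access locality and reduce cache misses; and use of the non-blocking cache, to speed up memory access. Together they improved ATLAS performance by more than 50%.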

Keywords: ATLAS; KD-50-I; cache miss; non-blocking cache
Received: 2009-03-04

Optimization of Memory Access Based on Loongson2F
SU Bo, LI Kai, XU Zhi-Guang and HE Song-Song. Optimization of Memory Access Based on Loongson2F[J]. Computer Systems & Applications, 2010, 19(1): 171-175
Authors: SU Bo, LI Kai, XU Zhi-Guang, HE Song-Song
Affiliation: School of Computer Science and Technology, USTC, Hefei 230027, China
Abstract: In most data-processing programs, memory access time accounts for a much larger share of running time than computation does, so the memory access pattern can affect program performance significantly. Test results show that ATLAS ported to KD-50-I, a system based on the Loongson 2F, reaches only 30% of the processor's theoretical peak. This paper applies loop unrolling to reduce the number of memory accesses, improves temporal and spatial locality to reduce cache misses, and uses the non-blocking cache to pipeline memory accesses. With these optimizations, ATLAS performance improves by more than 50%.
Keywords: ATLAS; KD-50-I; cache miss; non-blocking cache
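
The abstract names loop unrolling and data blocking without showing them. The following is a minimal C sketch of how the two techniques are commonly combined in a matrix-multiply kernel. It is not the authors' code and not ATLAS's generated kernel: the function names, the row-major layout, and the `BLOCK` value of 32 are assumptions chosen for illustration, and partial copying and non-blocking-cache scheduling are not shown.

```c
#include <stddef.h>

/* Hypothetical tile size; a real value would be tuned to the target cache. */
#define BLOCK 32

static size_t min_sz(size_t a, size_t b) { return a < b ? a : b; }

/* Reference version: C += A * B for row-major n x n matrices.
 * Each multiply-add needs two loads (one from A, one from B). */
void matmul_naive(size_t n, const double *A, const double *B, double *C)
{
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++) {
            double sum = 0.0;
            for (size_t k = 0; k < n; k++)
                sum += A[i * n + k] * B[k * n + j];
            C[i * n + j] += sum;
        }
}

/* Blocked version: C += A * B, computed tile by tile so that the working
 * set of each tile product is small, with a 2x2 unrolled inner kernel. */
void matmul_blocked(size_t n, const double *A, const double *B, double *C)
{
    for (size_t ii = 0; ii < n; ii += BLOCK) {
        size_t i_end = min_sz(ii + BLOCK, n);
        for (size_t kk = 0; kk < n; kk += BLOCK) {
            size_t k_end = min_sz(kk + BLOCK, n);
            for (size_t jj = 0; jj < n; jj += BLOCK) {
                size_t j_end = min_sz(jj + BLOCK, n);
                size_t i = ii;
                /* Two rows of C at a time. */
                for (; i + 1 < i_end; i += 2) {
                    size_t j = jj;
                    /* 2x2 unrolling: four loads feed four multiply-adds
                     * per k step, instead of two loads per multiply-add. */
                    for (; j + 1 < j_end; j += 2) {
                        double c00 = 0.0, c01 = 0.0, c10 = 0.0, c11 = 0.0;
                        for (size_t k = kk; k < k_end; k++) {
                            double a0 = A[i * n + k];
                            double a1 = A[(i + 1) * n + k];
                            double b0 = B[k * n + j];
                            double b1 = B[k * n + j + 1];
                            c00 += a0 * b0; c01 += a0 * b1;
                            c10 += a1 * b0; c11 += a1 * b1;
                        }
                        C[i * n + j] += c00;
                        C[i * n + j + 1] += c01;
                        C[(i + 1) * n + j] += c10;
                        C[(i + 1) * n + j + 1] += c11;
                    }
                    /* Leftover column when the tile width is odd. */
                    for (; j < j_end; j++) {
                        double c0 = 0.0, c1 = 0.0;
                        for (size_t k = kk; k < k_end; k++) {
                            c0 += A[i * n + k] * B[k * n + j];
                            c1 += A[(i + 1) * n + k] * B[k * n + j];
                        }
                        C[i * n + j] += c0;
                        C[(i + 1) * n + j] += c1;
                    }
                }
                /* Leftover row when the tile height is odd. */
                for (; i < i_end; i++)
                    for (size_t j = jj; j < j_end; j++) {
                        double c = 0.0;
                        for (size_t k = kk; k < k_end; k++)
                            c += A[i * n + k] * B[k * n + j];
                        C[i * n + j] += c;
                    }
            }
        }
    }
}
```

Both functions accumulate into `C`, so the caller zeroes `C` first when a plain product is wanted. The blocked version sums in a different order from the naive one, so results on general floating-point data may differ in the last bits; on small integer-valued inputs they match exactly.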
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.