Found 20 similar documents; search took 46 ms
1.
2.
3.
4.
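The five years from the introduction of the Apple II in 1977 to the introduction of the IBM PC in 1981 laid the foundation for the next 15 years of microprocessor-based computer development. Semiconductor manufacturing technology continued to advance, stimulating microprocessor development and interest in VLSI design. Advances in microprocessor performance and capability in turn spurred interest in microprocessor architecture and in the design of high-end microprocessor-based computers.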
5.
6.
7.
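When I occasionally have some free time, I like to make the rounds of the big-name vendors, because that is where rare and interesting news turns up. Not long ago, quite by chance, I heard that nVIDIA's GeForce2 MX200 graphics chip is about to be discontinued. If this is true, the market share freed up by the GeForce2 MX200's departure will be very attractive. Among graphics chips with performance comparable to the MX200, there is the SIS315 in addition to ATI's Radeon VE, but for now the SIS315 cannot compete with the Radeon VE at all. Will the Radeon VE take this opportunity to dominate the low-end graphics card market? Or not? Not long ago, U-NIKA…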
8.
9.
10.
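This paper discusses in detail low-power design techniques for microprocessors based on the x86 and compatible architectures, examines the current state and development trends of the technology and its products, and anticipates broad application prospects for low-power microprocessors.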
11.
Wei-Wu Hu Fu-Xin Zhang Zu-Song Li 《Journal of Computer Science and Technology》2005,20(2):0-0
The Godson project is the first attempt to design high-performance general-purpose microprocessors in China. This paper introduces the microarchitecture of the Godson-2 processor, a 64-bit, 4-issue, out-of-order execution RISC processor that implements a 64-bit MIPS-like instruction set. The adoption of aggressive out-of-order execution techniques (such as register mapping, branch prediction, and dynamic scheduling) and cache techniques (such as non-blocking cache, load speculation, and dynamic memory disambiguation) helps the Godson-2 processor achieve high performance even at a moderate clock frequency. The Godson-2 processor has been physically implemented in a 6-metal 0.18 μm CMOS technology using an automatic placement and routing flow, with the help of some hand-crafted library cells and macros. The chip measures 6,700 micrometers by 6,200 micrometers, and the clock cycle at the typical corner is 2.3 ns.
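As a quick sanity check on the figures quoted in this abstract, the sketch below converts the stated die dimensions and clock cycle into an area in mm² and a frequency in MHz. Only the three input numbers come from the abstract; the function and variable names are illustrative.

```python
# Inputs quoted in the abstract; everything else is derived arithmetic.
DIE_WIDTH_UM = 6700     # micrometers
DIE_HEIGHT_UM = 6200    # micrometers
CLOCK_CYCLE_NS = 2.3    # nanoseconds, typical corner


def die_area_mm2(width_um, height_um):
    """Return the die area in square millimeters for dimensions in micrometers."""
    return (width_um / 1000) * (height_um / 1000)


def frequency_mhz(cycle_ns):
    """Return the clock frequency in MHz for a cycle time in nanoseconds."""
    return 1000 / cycle_ns


area = die_area_mm2(DIE_WIDTH_UM, DIE_HEIGHT_UM)
freq = frequency_mhz(CLOCK_CYCLE_NS)
print(f"die area  = {area:.2f} mm^2")
print(f"frequency = {freq:.0f} MHz")
```

The die comes out at roughly 41.5 mm², and a 2.3 ns cycle corresponds to roughly 435 MHz.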
12.
13.
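Design and performance analysis of the Godson-2 processor  Total citations: 16, self-citations: 4, citations by others: 16
This paper presents the design of the Godson-2 processor and its measured performance results. Godson-2 uses a four-issue superscalar, superpipelined architecture. The on-chip level-1 instruction and data caches are 64 KB each, and the off-chip level-2 cache can be as large as 8 MB. To make full use of the pipeline, Godson-2 implements advanced out-of-order execution techniques such as branch prediction, register renaming, and dynamic scheduling, together with dynamic memory access mechanisms such as non-blocking cache access and load speculation. The Godson-2 processor is implemented in a 0.18 μm CMOS process; its maximum operating frequency at nominal voltage is 500 MHz, and its measured power consumption at 500 MHz is 3 to 5 W. Godson-2 has a peak single-precision floating-point rate of 2 billion operations per second and a double-precision rate of 1 billion operations per second. Its measured SPEC CPU2000 performance is 8 to 10 times that of Godson-1, and its overall performance has reached the level of the Pentium III. The prototype chip currently runs a complete 64-bit Chinese Linux operating system smoothly, along with the full-featured Mozilla browser, a multimedia player, and the OpenOffice office suite, and can meet the requirements of the vast majority of desktop applications.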
14.
15.
Recovery requirements of branch prediction storage structures in the presence of mispredicted-path execution  Total citations: 1, self-citations: 0, citations by others: 1
Stéphan Jourdan Jared Stark Tse-Hao Hsing Yale N. Patt 《International journal of parallel programming》1997,25(5):363-383
Execution along mispredicted paths may or may not affect the accuracy of subsequent branch predictions if recovery mechanisms
are not provided to undo the erroneous information that is acquired by the branch prediction storage structures. In this paper,
we study four elements of the Two-Level Branch Predictor: the Branch Target Buffer (BTB), the Branch History Register (BHR),
the Pattern History Tables (PHTs), and the Return Address Stack (RAS). For each we determine whether a recovery mechanism
is needed, and, if so, show how to design a cost-effective one. Using five benchmarks from the SPECint92 suite, we show that
there is no need to provide recovery mechanisms for the BTB and the PHTs, but that performance is degraded by an average of
30% if recovery mechanisms are not provided for the BHR and RAS.
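To illustrate why the BHR in particular needs recovery, here is a minimal sketch of a global-history two-level predictor in which the BHR is updated speculatively and restored from a checkpoint after a misprediction. Checkpointing is one common recovery approach and is assumed here; it is not taken from the paper. The class name, method names, and table sizes are all hypothetical.

```python
class TwoLevelPredictor:
    """Global-history two-level predictor: the BHR indexes a table of 2-bit counters."""

    def __init__(self, history_bits=4):
        self.mask = (1 << history_bits) - 1
        self.bhr = 0                           # Branch History Register
        self.pht = [1] * (1 << history_bits)   # counters start weakly not-taken

    def predict(self):
        """Predict taken when the counter selected by the BHR is 2 or 3."""
        return self.pht[self.bhr] >= 2

    def speculative_update(self, predicted_taken):
        """Shift the predicted outcome into the BHR; return the old BHR as a checkpoint."""
        checkpoint = self.bhr
        self.bhr = ((self.bhr << 1) | int(predicted_taken)) & self.mask
        return checkpoint

    def recover(self, checkpoint, actual_taken):
        """Discard wrong-path history: restore the checkpoint, then shift in the real outcome."""
        self.bhr = ((checkpoint << 1) | int(actual_taken)) & self.mask

    def train(self, history, actual_taken):
        """Update the 2-bit saturating counter that was used for the prediction."""
        counter = self.pht[history]
        self.pht[history] = min(3, counter + 1) if actual_taken else max(0, counter - 1)


predictor = TwoLevelPredictor(history_bits=4)
predictor.bhr = 0b1010

# A branch is predicted taken, but it will turn out to be not taken.
checkpoint = predictor.speculative_update(predicted_taken=True)

# Wrong-path branches keep shifting bogus outcomes into the BHR.
predictor.speculative_update(True)
predictor.speculative_update(False)
polluted_bhr = predictor.bhr

# On detecting the misprediction, restore the history and train the counter.
predictor.recover(checkpoint, actual_taken=False)
predictor.train(checkpoint, actual_taken=False)
recovered_bhr = predictor.bhr

print(f"polluted BHR  = {polluted_bhr:04b}")
print(f"recovered BHR = {recovered_bhr:04b}")
```

Without the `recover` call, later predictions would index the PHT with history bits that never happened on the correct path, which is the kind of pollution the abstract reports as costly for the BHR.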
16.
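An accurate microprocessor model with branch prediction  Total citations: 3, self-citations: 0, citations by others: 3
In today's deeply pipelined, wide-issue microprocessors, accurate branch prediction is an indispensable key technique for achieving high performance. Branch mispredictions waste a large number of clock cycles and prevent out-of-order execution from delivering its benefit. The effective performance of a wide-issue microprocessor also depends on the size of the instruction window and on the instruction fetch width. This paper proposes a new, more accurate microprocessor model that accounts for branch prediction and for the cycle penalty of branch mispredictions. Based on the statistical rule that instruction execution bandwidth is the square root of the number of instructions available in the instruction window, it gives a more accurate algorithm describing the relationship among fetch bandwidth, branch prediction accuracy, misprediction cycle penalty, instruction window size, and IPC, and it discusses the trade-offs among these parameters and their effect on program IPC. From this, one can determine a fetch bandwidth threshold that depends on several microprocessor parameters, as well as suitable values for several key microprocessor parameters.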
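The abstract does not give the model's equations, so the following is only a rough first-order sketch of how the quantities it names could interact. It assumes the square-root rule for execution bandwidth, caps that bandwidth at the fetch width, and charges a fixed penalty per mispredicted branch. The function names, the parameter `branch_fraction`, and the way the terms are combined are assumptions for illustration, not the paper's algorithm.

```python
import math


def steady_ipc(window_size, fetch_width):
    """Execution bandwidth under the square-root rule, capped by the fetch width."""
    return min(fetch_width, math.sqrt(window_size))


def effective_ipc(window_size, fetch_width, branch_fraction, accuracy, penalty_cycles):
    """First-order IPC estimate with a fixed cycle penalty per mispredicted branch.

    branch_fraction: fraction of instructions that are branches (assumed input)
    accuracy:        branch prediction accuracy in [0, 1]
    penalty_cycles:  cycles lost per misprediction
    """
    base_cpi = 1.0 / steady_ipc(window_size, fetch_width)
    mispredicts_per_instruction = branch_fraction * (1.0 - accuracy)
    return 1.0 / (base_cpi + mispredicts_per_instruction * penalty_cycles)


# Sweep the prediction accuracy for one hypothetical configuration.
for accuracy in (0.90, 0.95, 0.99):
    ipc = effective_ipc(window_size=64, fetch_width=8,
                        branch_fraction=0.2, accuracy=accuracy,
                        penalty_cycles=10)
    print(f"accuracy = {accuracy:.2f}  estimated IPC = {ipc:.2f}")
```

Even in this simplified form, the sketch shows the trade-off the abstract describes: once the misprediction term dominates the cycles per instruction, a larger window or wider fetch yields little additional IPC.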
17.
In theory, branch predictors with more complicated algorithms and larger data structures provide more accurate predictions. Unfortunately, overly large structures and excessively complicated algorithms cannot be implemented because of their long access delay. To date, many strategies have been proposed to balance delay with accuracy, but none has completely solved the issue. The architecture for ahead branch prediction (A2BP) separates traditional predictors into two parts. The first is a small table located at the front end of the pipeline, which keeps prediction latency short enough even for some aggressive processors. The second comprises the operations on complicated algorithms and large data structures needed for accurate predictions, all of which are moved to the back end of the pipeline. An effective mechanism is introduced for ahead branch prediction in the back end and for updating the small table in the front end. To substantially improve prediction accuracy, an indirect branch prediction algorithm based on branch history and target path (BHTP) is implemented in A2BP. Experiments with the Standard Performance Evaluation Corporation (SPEC) benchmarks on gem5/SimpleScalar simulators demonstrate that A2BP improves average performance by 2.92% compared with a commonly used branch target buffer-based predictor. In addition, indirect branch misses with the BHTP algorithm are reduced by an average of 28.98% compared with the traditional algorithm.
18.
The fundamental requirement for hard real-time systems is that task deadlines never be missed. As a consequence, knowing the tasks' worst-case execution times (WCET) is crucial for such systems. Taking modern architectural features into account makes it possible to determine tighter WCET bounds than program analysis that ignores such features. While the effects of caches and pipelines on WCET analysis have been extensively studied, to our knowledge the effect of branch prediction on WCET evaluation has not yet been studied. This paper describes a method for statically bounding the number of timing penalties due to erroneous branch predictions. The proposed method is based on static program analysis and branch target buffer modelling. It consists of collecting information on branch target buffer evolution by considering all possible execution paths of a program. The collected information can then be used to classify control transfer instructions so that their worst-case branching cost can be estimated and incorporated into the program WCET. A method is also given to tightly predict the WCET of loops whose number of iterations depends on counter variables of outer loops. Experimental results show that the timing penalty due to wrong branch predictions estimated by the proposed technique is close to the real one, which demonstrates the practical applicability of our method.
19.
The schedulability analysis of real-time embedded systems requires worst case execution time (WCET) analysis for the individual tasks. Bounding WCET involves not only language-level program path analysis, but also modeling the performance impact of complex micro-architectural features present in modern processors. In this paper, we statically analyze the execution time of embedded software on processors with speculative execution. The speculation of conditional branch outcomes (branch prediction) significantly improves a program's execution time. Thus, accurate modeling of control speculation is important for calculating tight WCET estimates. We present a parameterized framework to model the different branch prediction schemes. We further consider the complex interaction between speculative execution and instruction cache performance, that is, the fact that speculatively executed blocks can generate additional cache hits/misses. We extend our modeling to capture this effect of branch prediction on cache performance. Starting with the control flow graph of a program, our technique uses integer linear programming to estimate the program's WCET. The accuracy of our method is demonstrated by tight estimates obtained on realistic benchmarks.