Similar literature
 Found 20 similar documents (search time: 15 ms)
1.
This paper describes three circuit technologies that have been developed for high-speed large-bandwidth on-chip DRAM secondary caches. They include a redundancy-array advanced activation scheme, a bus-assignment-exchangeable selector scheme and an address-zero access refresh scheme. By using these circuit technologies and new small subarray structures, a row-address access time of 12 ns and a row-address cycle time of 16 ns were obtained. An experimental chip made up of an 8-Mbyte DRAM and a 64-bit microprocessor was developed using 0.25-μm merged logic and DRAM process technology.

2.
In this paper, we present the characterization and design of energy-efficient, on-chip cache memories. The characterization of power dissipation in on-chip cache memories reveals that the memory peripheral interface circuits and bit array dissipate comparable power. To optimize performance and power in a processor's cache, a multidivided module (MDM) cache architecture is proposed to conserve energy in the bit array as well as the memory peripheral circuits. Compared to a conventional, nondivided, 16-kB cache, the latency and power of the MDM cache are reduced by a factor of 1.9 and 4.6, respectively. Based on the MDM cache architecture, the energy efficiency of the complete memory hierarchy is analyzed with respect to cache parameters in a multilevel processor cache design. This analysis was conducted by executing the SPECint92 benchmark programs with the miss ratios for reduced instruction set computer (RISC) and complex instruction set computer (CISC) machines.

3.
The system, circuit, layout and device levels of an integrated cache memory (ICM), which includes 32 kbyte DATA memory with typical address to HIT delay of 18 ns and address to DATA delay of 23 ns, are described. The ICM offers the largest memory size and the fastest speed ever reported in a cache memory. The device integrates a 32 kbyte DATA INSTRUCTION memory, a 34 kbit TAG memory, an 8 kbit VALID flag, a 2 kbit least recently used (LRU) flag, comparators, and CPU interface logic circuits on a chip. The inclusion of the DATA memory is crucial in improving system cycle time. The device uses several novel circuit design technologies, including a double-word-line scheme, low-noise flush clear, a low-power comparator, noise immunity, and directly testable memory design. Its newly proposed way-slice architecture increases both flexibility and expandability.

4.
An ultrahigh-speed 72-kb ECL-CMOS RAM macro for a 1-Mb SRAM with 0.65-ns address-access time, 0.80-ns write-pulse width, and 30.24-μm² memory cells has been developed using 0.3-μm BiCMOS technology. Two key techniques for achieving ultrahigh speed are an ECL decoder/driver circuit with a BiCMOS inverter and a write-pulse generator with a replica memory cell. These circuit techniques can reduce access time and write-pulse width of the 72-kb RAM macro to 71% and 58% of those of RAM macros with conventional circuits. In order to reduce crosstalk noise for CMOS memory-cell arrays driven at extremely high speeds, a twisted bit-line structure with a normally on MOS equalizer is proposed. These techniques are especially useful for realizing ultrahigh-speed, high-density SRAMs, which have been used as cache and control storages in mainframe computers.

5.
A 6-ns cycle, 7.7-ns access cache memory and memory management unit (CAMMU) chip has been developed. The circuit includes two 5-ns 128-kb cache memories, two 4-ns 64-entry fully associative translation lookaside buffers (TLBs), two 4-ns 64-line tag RAMs, comparators, registers, and control logic. The TLB design contains a line encoder and valid bits with flash clear. Timing control allows read, write, associative accesses, and invalid search accesses with identical timings. The two caches time-share data input and sense amplifier circuits for improved density, and they are pipelined to allow a new access to start before the previous access is complete.

6.
This paper describes a CMOS multiport static memory cell with which it is possible to use current-switching bipolar peripheral circuits to maintain small voltage swings throughout the read access path while retaining the high density of CMOS memory arrays. An experimental 32-word × 32-bit three-port register file has been designed and implemented using this cell. The register file was fabricated in a 0.6-μm BiCMOS technology and operates from a single -3.3-V power supply with ECL-compatible I/O circuits. Under nominal operating conditions at 20°C, the measured pin-to-pin access time is 1.3 ns. The minimum write enable pulse width required is less than 1 ns, and the power dissipation, excluding the output buffers, is 650 mW at a clock rate of 100 MHz.

7.
A 1-Mbit CMOS static RAM (SRAM) with a typical address access time of 9 ns has been developed. A high-speed sense amplifier circuit, consisting of a three-stage PMOS cross-coupled sense amplifier with a CMOS preamplifier, is the key to the fast access time. A parallel-word-access redundancy architecture, which causes no access time penalty, was also incorporated. A polysilicon PMOS load memory cell, which had a large on-current-to-off-current ratio, gave a much lower soft-error rate than a conventional high-resistance polysilicon load cell. The 1-Mbit SRAM, fabricated using a half-micrometer, triple-poly, and double-metal CMOS technology, operated at a single supply voltage of 5 V. An on-chip power supply converter was incorporated in the SRAM to supply a partial internal supply voltage of 4 V to the high-performance half-micrometer MOS transistors.

8.
A 256 K (32 K×8) CMOS static RAM (SRAM) which achieves an access time of 7.5 ns and 50-mA active current at 50-MHz operation is described. A 32-block architecture is used to achieve high-speed access and low power dissipation. To achieve faster access time, a double-activated-pulse circuit which generates the word-line-enable pulse and the sense-amplifier-enable pulse has been developed. The data-output reset circuit reduces the transition time and the noise generated by the output buffer. A self-aligned contact technology reduces the diffused region capacitance. This RAM has been fabricated in a twin-tub CMOS 0.8-μm technology with double-level polysilicon (the first level is polycide) and double-level metal. The memory cell size is 6.0×11.0 μm² and the chip size is 4.38×9.47 mm².

9.
A 1.8-V embedded 18-Mb DRAM macro with a 9-ns row-address-strobe access time and memory-cell area efficiency of 33% has been successfully developed with a single-side interface architecture, high-speed circuit design, and low-voltage design. In the high-speed circuit design, a multiword redundancy scheme and Y-select merged sense scheme are developed to achieve the performance goal. In the low-voltage design, a dual-complement charge-pump scheme and a decoupling capacitor utilizing a tantalum-oxide capacitor are developed to retain high performance at low supply voltage.

10.
Address base-plus-offset summing is merged into the decode structure of this 64-KByte (512-Kbit), four-way set-associative cache. This address adder avoids time-consuming carry propagation by using an A+B=K equality test. The combined add and access operations are implemented using delayed-reset logic and a 0.25-μm process. This wave-pipelined RAM achieves a 1.6-ns cycle time and 2.6-ns latency for the combined address add and cache access.
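
The A+B=K equality test mentioned above can be illustrated in software. The following is a minimal sketch of the commonly described carry-free identity, not the circuit from the paper: the function name `sum_equals` and the `width` parameter are assumptions made for illustration. For each bit it compares the carry-in that would be needed for the sum bit to match K with the carry-out the previous bit would produce in that case, so no carry has to ripple across the word.

```python
def sum_equals(a: int, b: int, k: int, width: int) -> bool:
    """Return True iff (a + b) mod 2**width == k, without a carry chain.

    Hypothetical helper for illustration; the name and signature are assumed.
    """
    mask = (1 << width) - 1
    a, b, k = a & mask, b & mask, k & mask
    # Carry-in that each bit position would need for its sum bit to equal k's bit.
    required = a ^ b ^ k
    # Carry-out that each position would produce if its sum bit equals k's bit,
    # shifted left so it lines up with the next position (bit 0 sees carry-in 0).
    produced = (((a & b) | ((a | b) & ~k)) << 1) & mask
    # The sum matches k exactly when every required carry equals the produced one.
    return required == produced
```

Every bit position is checked independently, which is why a decoder can evaluate one such test per candidate row K in parallel instead of waiting for a full addition.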

11.
Describes a 1-Mbit high-speed DRAM (HSDRAM), which has a nominal random access time of less than 27 ns and a column access time of 12 ns with address multiplexing. A double-polysilicon double-metal CMOS technology having PMOS arrays inside n-wells was developed with an average 1.3-μm feature size. The chip has also been fabricated in a 0.9× shrunken version with an area of 67 mm², showing a 22-ns access time. The chip power consumption is lower than 500 mW at 60-ns cycle time. This HSDRAM, which provides SRAM-like speed while retaining DRAM-like density, allows DRAMs to be used in a broad new range of applications.

12.
A high-speed 32×32-b parallel multiplier with an improved parallel structure using 0.8-μm CMOS triple-level-metal technology is discussed. A unit adder, a 4-2 compressor, enhances the parallelism of the multiplier array. A 25% reduction in the propagation delay time is achieved by using the compressor. The multiplier contains 27704 transistors with a 2.68 × 2.71-mm² die area. The multiplication time is 15 ns at 5 V with a power dissipation of 277 mW at 10-MHz operation. The triple-level-metal interconnection technology reduces the multiplier layout area. Compared with double-level-metal technology, a 27% chip size reduction is achieved.
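
As a rough illustration of the 4-2 compressor named above, the sketch below models the textbook form built from two cascaded full adders. The abstract does not give the cell's internal structure, so this arrangement and the names `full_adder` and `compressor_4_2` are assumptions. The cell reduces four partial-product bits plus a carry-in to one sum bit and two carry bits of double weight.

```python
from itertools import product


def full_adder(x: int, y: int, z: int) -> tuple[int, int]:
    """Single-bit full adder: returns (sum, carry)."""
    return x ^ y ^ z, (x & y) | (y & z) | (x & z)


def compressor_4_2(x1: int, x2: int, x3: int, x4: int, cin: int) -> tuple[int, int, int]:
    """Textbook 4-2 compressor (assumed structure): returns (sum, carry, cout).

    Satisfies x1 + x2 + x3 + x4 + cin == sum + 2 * (carry + cout).
    """
    # First full adder: cout depends only on x1..x3, never on cin,
    # so carries do not ripple along a row of compressors.
    s1, cout = full_adder(x1, x2, x3)
    # Second full adder folds in the fourth input and the neighbour's carry.
    total, carry = full_adder(s1, x4, cin)
    return total, carry, cout


# Exhaustive check of the arithmetic identity over all 32 input combinations.
for bits in product((0, 1), repeat=5):
    s, c, co = compressor_4_2(*bits)
    assert sum(bits) == s + 2 * (c + co)
```

Because `cout` is independent of `cin`, a row of these cells has a fixed delay regardless of word width, which is the property that makes the compressor useful in a multiplier array.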

13.
A 32-b RISC/DSP microprocessor with reduced complexity   Total citations: 2 (self-citations: 0; citations by others: 2)
This paper presents a new 32-b reduced instruction set computer/digital signal processor (RISC/DSP) architecture which can be used as a general purpose microprocessor and in parallel as a 16-/32-b fixed-point DSP. This has been achieved by using RISC design principles for the implementation of DSP functionality. A DSP unit operates in parallel to an arithmetic logic unit (ALU)/barrel shifter on the same register set. This architecture provides the fast loop processing, high data throughput, and deterministic program flow absolutely necessary in DSP applications. Besides offering a basis for general purpose and DSP processing, the RISC philosophy offers a higher degree of flexibility for the implementation of DSP algorithms and achieves higher clock frequencies compared to conventional DSP architectures. The integrated DSP unit provides instruction set support for highly specialized DSP algorithms. Subword processing optimized for DSP algorithms has been implemented to provide maximum performance for 16-b data types. While creating a unified base for both application areas, we also minimized transistor count and we reduced complexity by using a short instruction pipeline. A parallelism concept based on a varying number of instruction latency cycles made superscalar instruction execution superfluous.

14.
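This paper presents the design of a 32-bit embedded RISC microprocessor whose instruction set is compatible with MIPS32. It focuses on the processor's instruction set and overall architecture, gives the design of the core modules, and reports functional simulation using ModelSim from Mentor Graphics. Finally, using SOPC, the flexible and efficient system-on-chip design approach proposed by Altera, together with an Altera FPGA, a dedicated experimental circuit was designed to verify the correctness of the self-designed 32-bit embedded RISC microprocessor.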

15.
Describes circuit techniques for fabricating a high-speed adder using pass-transistor logic. Double pass-transistor logic (DPL) is shown to improve circuit performance at reduced supply voltage. Its symmetrical arrangement and double-transmission characteristics improve the gate speed without increasing the input capacitance. A carry propagation circuit technique called conditional carry selection (CCS) is shown to resolve the problem of series-connected pass transistors in the carry propagation path. By combining these techniques, the addition time of a 32-b ALU can be reduced by 30% from that of an ordinary CMOS ALU. A 32-b ALU test chip is fabricated in 0.25-μm CMOS technology using these circuit techniques and is capable of an addition time of 1.5 ns at a supply voltage of 2.5 V.
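
The abstract does not describe how conditional carry selection works internally. As loosely related background only, the sketch below models a textbook carry-select adder, which is a different and simpler technique: each block precomputes its result for both possible carry-in values and then selects one once the real carry arrives. The function name, the 4-bit block size, and the requirement that `width` be a multiple of `block` are assumptions for illustration.

```python
def carry_select_add(a: int, b: int, width: int = 32, block: int = 4) -> tuple[int, int]:
    """Textbook carry-select addition (not the paper's CCS circuit).

    Returns (sum mod 2**width, carry_out). Assumes width is a multiple of block.
    """
    block_mask = (1 << block) - 1
    result, carry = 0, 0
    for shift in range(0, width, block):
        x = (a >> shift) & block_mask
        y = (b >> shift) & block_mask
        # In hardware both candidates are computed in parallel, before the carry is known.
        sum_if_0 = x + y
        sum_if_1 = x + y + 1
        # The incoming carry only drives a selection, not another addition.
        chosen = sum_if_1 if carry else sum_if_0
        result |= (chosen & block_mask) << shift
        carry = chosen >> block
    return result, carry
```

For example, `carry_select_add(0xFFFFFFFF, 1)` selects the carry-in-1 candidate in every block after the first and returns `(0, 1)`.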

16.
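Design of a 32-bit RISC microprocessor   Total citations: 1 (self-citations: 0; citations by others: 1)
杨光 (Yang Guang), 齐家月 (Qi Jiayue), 《微电子学》 (Microelectronics), 2001, 31(1): 58-61
Describes the design of a 32-bit RISC microprocessor core compatible with Motorola-Mcore. The paper covers the design in some detail, from the partitioning of the processor's overall architecture to the design of its internal units. It then presents the synthesis results and reports software simulation and hardware verification of the design.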

17.
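Design of a JTAG ICE debug circuit with a pipeline tracer   Total citations: 1 (self-citations: 1; citations by others: 0)
A JTAG debug circuit (in-circuit emulator, ICE) was designed for a 32-bit RISC CPU developed in-house at Fudan University. To solve the problem of spurious breakpoint stops caused by the CPU's five-stage pipeline, a novel circuit structure with branch-prediction capability, the "pipeline tracer", is proposed. The JTAG debug circuit is compatible with the IEEE 1149.1 standard and supports setting breakpoints, single-stepping, viewing or modifying CPU registers and memory space, and in-system flash programming, among other functions.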

18.
The SOC implementation of a capacitive fingerprint sensor, which embeds a 32-bit microcontroller for performing an identification algorithm, is described for user authentication on small, thin, and portable equipment. The SOC is composed of a 160×192 pixel array with a sensor detection circuit and the embedded 32-bit RISC microcontroller. The proposed sensor detection circuit increases the voltage difference between a ridge and a valley by about 180% relative to a conventional detection circuit and minimizes any electrostatic discharge influence by applying an effective isolation structure. The test chip was fabricated on a 0.35-μm standard CMOS 1-poly 4-metal process.

19.
A 125 megabyte/s synchronous 32-bank 256-Mb DRAM has been developed by a bank-interleaving oriented multibank architecture including a shared-sense amplifier cache with an overlapped bank control for hidden precharge, phase-aligned timing pulse transmission, and voltage controlled negative conductance (VCNC) data-bus current sense amplifier.

20.
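Describes the five-stage pipeline of a 32-bit embedded RISC microprocessor (named MoonCore): instruction fetch and decode (IF&ID), register-file read (RF), execute (EXEC), data-memory access (DMEM), and write-back (WB). The design of the main components of each pipeline stage is described in detail, and pipeline hazards and their solutions are analyzed.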
