Similar Documents
20 similar documents retrieved (search time: 31 ms)
1.
An intelligent cache based on a distributed architecture consisting of a hierarchy of three memory sections, DRAM (dynamic RAM), SRAM (static RAM), and CAM (content-addressable memory) serving as an on-chip tag, is reported. The test device of the memory core is fabricated in a 0.6 μm double-metal CMOS standard DRAM process, with the CAM matrix and control logic embedded in the array. The array architecture can be applied to a 16-Mb DRAM with less than 12% chip-area overhead. In addition to acting as the tag, the array-embedded CAM matrix supports a write-back function that provides a short read/write cycle time. The cache DRAM is also pin-compatible with address-nonmultiplexed memories. At a realistic hit ratio of 90%, this cache DRAM provides a high-performance intelligent main memory with a 12 ns (hit) / 34 ns (average) cycle time and a 55 mA operating current at 25 MHz.
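For context, the hit and average cycle times quoted above are conventionally related through the standard effective-cycle-time model; the abstract does not state the miss cycle time, so this is the generic relation rather than the authors' exact accounting:

$$t_{\text{avg}} = h \cdot t_{\text{hit}} + (1 - h)\,t_{\text{miss}}$$

where $h$ is the hit ratio (0.90 here) and $t_{\text{hit}} = 12$ ns, with the DRAM miss penalty folded into the remaining 10% of accesses to give the 34 ns average.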

2.
A 4-Mb cache dynamic random access memory (CDRAM), which integrates a 16-kb SRAM as cache memory and a 4-Mb DRAM in a monolithic circuit, is described. The CDRAM has a 100-MHz operating cache, a newly proposed fast copy-back (FCB) scheme that achieves a miss access time three times faster than the conventional copy-back method, and maximized mapping flexibility. The process technology is a quad-polysilicon, double-metal 0.7-μm CMOS process, the same as used in a conventional 4-Mb DRAM. The chip size of 82.9 mm² is only a 7% increase over the conventional 4-Mb DRAM. Simulated system performance is better than that of a conventional cache system with eight times the cache capacity.

3.
A 256K-word × 16-bit dynamic RAM with a concurrent 16-bit error-correcting code (ECC) has been built in 0.8-μm CMOS technology with double-level metal and a surrounding high-capacitance cell. The cell measures 10.12 μm² with a 90-fF storage capacitance. A duplex bit-line architecture provides multiple-bit operations and the potential for high-speed data processing in ASIC memories. The ECC concurrently checks 16-bit data and corrects a 1-bit error, and the method can be extended to wider ECC without expanding the memory array. The ratio of ECC area to the whole chip is 7.5%. The cell structure and architecture allow expansion to a 16-Mb DRAM. The 4-Mb DRAM has a 70-ns RAS access time without ECC and a 90-ns RAS access time with ECC.
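To make the single-bit-correction idea concrete, the sketch below shows a generic Hamming single-error-correcting code over a 16-bit data word (five check bits at power-of-two positions, 21 bits total). It illustrates the class of code only; the chip's concurrent ECC logic and its actual code construction are not detailed in the abstract.

```python
# Generic single-error-correcting Hamming code over a 16-bit word.
# Illustrative only: not the concurrent ECC circuit of this chip.

DATA_POSITIONS = [p for p in range(1, 22) if p & (p - 1)]  # non-power-of-two slots
CHECK_POSITIONS = (1, 2, 4, 8, 16)                         # parity-bit slots

def hamming_encode(data: int) -> int:
    """Encode a 16-bit value into a 21-bit codeword (position i stored in bit i-1)."""
    assert 0 <= data < (1 << 16)
    code = [0] * 22                                   # index 0 unused; positions 1..21
    for i, pos in enumerate(DATA_POSITIONS):
        code[pos] = (data >> i) & 1
    for p in CHECK_POSITIONS:                         # even parity over covered positions
        for pos in range(1, 22):
            if pos & p and pos != p:
                code[p] ^= code[pos]
    return sum(code[pos] << (pos - 1) for pos in range(1, 22))

def hamming_decode(word: int) -> int:
    """Correct at most one flipped bit and return the 16 data bits."""
    code = [0] + [(word >> (pos - 1)) & 1 for pos in range(1, 22)]
    syndrome = 0
    for p in CHECK_POSITIONS:
        parity = 0
        for pos in range(1, 22):
            if pos & p:
                parity ^= code[pos]
        if parity:
            syndrome |= p
    if 0 < syndrome <= 21:                            # syndrome points at the flipped bit
        code[syndrome] ^= 1
    return sum(code[pos] << i for i, pos in enumerate(DATA_POSITIONS))

if __name__ == "__main__":
    word = hamming_encode(0xBEEF)
    assert hamming_decode(word ^ (1 << 6)) == 0xBEEF  # recovers from one flipped bit
```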

4.
A single-ended static memory scheme combining the advantages of a one-transistor dynamic RAM (DRAM) cell and a six-transistor static RAM (SRAM) cell is proposed in this article. For the first time, optical bias is introduced, converting the classical complementary metal-oxide-semiconductor (CMOS) RAM into an optoelectronic device. The cell structure is highly scalable and cost-effective. Various approaches and schemes have been applied to combine the advantages of static and dynamic RAM, striving to shorten access times, lower power dissipation, and decrease cell area; this is particularly true for system-on-a-chip (SoC) and embedded memories. Here, a novel approach toward the same goal is proposed and simulated in standard CMOS technology. A single-ended, three-transistor, fully static RAM cell is demonstrated.

5.
Ng, R. IEEE Spectrum, 1992, 29(10): 36-39.
The search for new dynamic RAM (DRAM) technologies that reduce memory access time and thereby unleash computer performance is discussed. The typical memory hierarchy of internal registers, cache, main memory, and mass storage is described. DRAM technologies that aim to simplify this hierarchy by speeding up main memory to the point where a separate, external cache is no longer needed are examined.

6.
A submicron CMOS 1-Mb RAM with a built-in error checking and correcting (ECC) circuit is described. An advanced bidirectional parity code with a self-checking function is proposed to reduce the soft-error rate. A distributed sense circuit, combined with a trench-capacitor technique, makes it possible to implement a small memory cell size of 20 μm². The 1M-word × 1-bit device was fabricated on a 6.4 × 8.2 mm chip. The additional 98-kb parity cells and the built-in ECC circuit occupy about 12% of the whole chip area. The measured access time is 140 ns, including 20 ns for the ECC operation.

7.
This paper describes a system-integrated memory with a direct CPU interface, integrating an SRAM, a DRAM, and control circuitry including a tag memory (TAG). This memory realizes a computer system without glue chips, enabling a system that is low cost, low power, and compact while still delivering sufficient performance. Fast clock cycle and access times are achieved using a newly proposed clock driver and internal signal generator. The memory is fabricated with a quad-polysilicon, double-metal 0.55-μm CMOS process, the same as used in a conventional 16-Mb DRAM. The chip size of 145.3 mm² is only a 12% increase over the conventional 16-Mb DRAM. The maximum operating frequency is 90 MHz and the operating current on a cache hit is 156 mA. This memory is suitable for various computer systems such as personal digital assistants (PDAs), personal computers, and embedded controller applications.

8.
A BiCMOS circuit technology for a high-speed, large-capacity, ECL-compatible static RAM (SRAM) is described. To obtain high-speed, low-power operation, a decoder with a pre/main-decode configuration having an ECL interface circuit and a word driver with a BiCMOS inverter are proposed. A BiCMOS multiplexer with a single emitter-follower driver is also proposed. An optimization method for the memory-cell-array configuration is presented that minimizes the total delay time and total power dissipation of the SRAM. Circuit simulation results show that a 64-kbit ECL-compatible SRAM with an access time of less than 7 ns and a power dissipation of less than 1 W is attainable.

9.
A high-speed data transfer scheme between the DRAM and logic parts in merged DRAM-logic (MDL) designs is proposed, based on logically divided DRAM row-address mapping. The proposed scheme results in 20% faster write access and 40% faster read access. It can be used as a general design framework to maximize DRAM access speed in various MDL designs. A test chip has been fabricated in 0.16 μm DRAM technology, and the scheme has been verified in the design of a DRAM L2 cache memory.

10.
This paper describes three circuit technologies developed for high-speed, large-bandwidth, on-chip DRAM secondary caches: a redundancy-array advanced activation scheme, a bus-assignment-exchangeable selector scheme, and an address-zero access refresh scheme. Using these circuit technologies together with new small subarray structures, a row-address access time of 12 ns and a row-address cycle time of 16 ns were obtained. An experimental chip comprising an 8-Mbyte DRAM and a 64-bit microprocessor was developed using 0.25-μm merged logic and DRAM process technology.

11.
A 1-Mbit CMOS static RAM (SRAM) with a typical address access time of 9 ns has been developed. A high-speed sense-amplifier circuit, consisting of a three-stage PMOS cross-coupled sense amplifier with a CMOS preamplifier, is the key to the fast access time. A parallel-word-access redundancy architecture, which incurs no access-time penalty, was also incorporated. A polysilicon PMOS-load memory cell, with a large on-current-to-off-current ratio, gave a much lower soft-error rate than a conventional high-resistance polysilicon-load cell. The 1-Mbit SRAM, fabricated using a half-micrometer, triple-polysilicon, double-metal CMOS technology, operates from a single 5-V supply. An on-chip power supply converter supplies a reduced internal voltage of 4 V to the high-performance half-micrometer MOS transistors.

12.
The feasibility of realizing an emitter-coupled-logic (ECL) interface 4-Mb dynamic RAM (DRAM) with an access time under 10 ns using 0.3-μm technology is explored, and a deep-submicrometer BiCMOS VLSI based on this technology is proposed. Five aspects of such a DRAM are covered: an internal power-supply scheme using on-chip voltage limiters, an ECL DRAM address buffer with a reset function and level converter, a current source for the address buffers compensated for device-parameter fluctuation, an overdrive rewrite amplifier for realizing a fast cycle time, and double-stage current sensing for the main amplifier and output buffer. Using these circuit techniques, an access time of 7.8 ns is expected with a supply current of 198 mA at a 16-ns cycle time.

13.
An ultrahigh-speed 72-kb ECL-CMOS RAM macro for a 1-Mb SRAM, with a 0.65-ns address access time, a 0.80-ns write-pulse width, and 30.24-μm² memory cells, has been developed using 0.3-μm BiCMOS technology. Two key techniques for achieving ultrahigh speed are an ECL decoder/driver circuit with a BiCMOS inverter and a write-pulse generator with a replica memory cell. These circuit techniques reduce the access time and write-pulse width of the 72-kb RAM macro to 71% and 58%, respectively, of those of macros built with conventional circuits. To reduce crosstalk noise in CMOS memory-cell arrays driven at extremely high speed, a twisted bit-line structure with a normally-on MOS equalizer is proposed. These techniques are especially useful for realizing ultrahigh-speed, high-density SRAMs, which have been used as cache and control storage in mainframe computers.

14.
The capacitance-resistance (CR) delay circuit technology assures full asynchronicity between the memory cell array and peripheral circuits over a wide range of operating and process conditions and thus realizes a fast access time. A noise-compensation scheme is used to generate a constant delay even in the presence of power-supply-line noise. The circuit was applied to the peripheral circuits of a 4-Mbit dynamic RAM (DRAM). As a result, timing loss as well as malfunction could be avoided, and a 7 ns faster access time and a 39 ns shorter cycle time were achieved compared with a conventional design using normal inverter chains.

15.
For every packet an IP router receives, it makes a routing decision based on the packet's destination address. The router's forwarding rate is usually limited by the rate at which it can make these decisions. We describe a new method for implementing route lookups in hardware. Our method can be implemented in the forwarding engine of a network processor or router using a small on-chip SRAM and an off-chip DRAM, and it achieves a rate of one lookup per DRAM random access time. We present our method and discuss an implementation that uses a DRAM with 64 ns random access time to give over 15 million lookups per second. Our tests show that the method performs well for realistic routing tables while using only modest amounts of memory.
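As a rough illustration of the throughput claim and of one generic way to bound a lookup to a single off-chip access, here is a minimal sketch; the /24-indexed table layout below is a hypothetical example, not the data structure described in the paper.

```python
# Throughput implied by one lookup per DRAM random access.
DRAM_RANDOM_ACCESS_NS = 64
print(1e9 / DRAM_RANDOM_ACCESS_NS)            # ~15.6 million lookups per second

class TwoStageLookup:
    """Hypothetical two-stage layout: a small on-chip index narrows each lookup
    to exactly one read of an off-chip next-hop table."""

    def __init__(self):
        self.on_chip_index = {}                # stands in for the small on-chip SRAM
        self.dram_table = {}                   # stands in for the off-chip DRAM

    def add_route(self, prefix24, next_hop):
        slot = len(self.dram_table)
        self.dram_table[slot] = next_hop       # one flat next-hop entry per /24 prefix
        self.on_chip_index[prefix24] = slot

    def lookup(self, dst_ip):
        slot = self.on_chip_index.get(dst_ip >> 8)   # on-chip SRAM access (fast)
        if slot is None:
            return None                              # no matching route
        return self.dram_table[slot]                 # the single DRAM random access
```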

16.
A 12 MHz data-cycle 4 Mb DRAM (dynamic RAM) with pipeline operation was designed and fabricated using 0.8 μm twin-tub CMOS technology. The pipelined DRAM outputs data corresponding to addresses that were accepted in the previous inverted row address strobe (RAS) input cycle. The latter half of the previous read operation and the first half of the next read operation take place simultaneously, so the inverted RAS input cycle time is reduced. This pipeline DRAM technology needs no additional chip area and no process modification. A 95 ns inverted RAS input cycle time was obtained under worst-case conditions, compared with 125 ns for conventional DRAMs.

17.
Recently, the level of realism in PC graphics applications has been approaching that of high-end graphics workstations, necessitating a more sophisticated texture-data cache memory to overcome the finite bandwidth of the AGP or PCI bus. This paper proposes a multilevel parallel texture cache memory to reduce the required data bandwidth on the AGP or PCI bus and to accelerate the operation of parallel graphics pipelines in PC graphics cards. The proposed cache memory is fabricated in a 0.16-μm DRAM-based SoC technology. It is composed of four components: an 8-MB DRAM L2 cache, 8-way parallel SRAM L1 caches, pipelined texture-data filters, and a serial-to-parallel loader. For high-speed parallel L1 cache data replacement, the internal bus bandwidth has been maximized up to 75 GB/s with a newly proposed hidden double data transfer scheme. In addition, the cache memory has a reconfigurable line size for optimal caching performance in various graphics applications, from three-dimensional (3-D) games to high-quality 3-D movies.

18.
The dual-sensing-latch circuit proposed here solves the synchronization problem of the conventional wave-pipelined SRAM, and the proposed source-biased self-resetting circuit reduces both the cycle time and the access time of cache SRAMs. A 16-kb SRAM using these circuit techniques was designed and fabricated in 0.25-μm CMOS technology. Simulation results indicate that this SRAM has a typical clock access time of 2.6 ns at a 2.5-V supply voltage and a worst-case minimum cycle time of 2.6 ns.

19.
Microelectronics Journal, 2015, 46(11): 1020-1032.
This paper describes a new memristor crossbar architecture proposed for use in a high-density cache design. The design consumes less than 10% of the write energy of a simple memristor crossbar. It also provides up to 3 times the bit density of an STT-MRAM system and up to 11 times the bit density of an SRAM architecture. The proposed architecture is analyzed using a detailed SPICE analysis that accounts for the resistance of the wires in the memristor structure. Additionally, the memristor model used in this work has been matched to specific device characterization data to provide accurate results in terms of energy, area, and timing. The proposed memory system was analyzed by modeling two different devices that vary in resistance range and switching time. The system does not require that the memristor devices have inherent diode effects to limit alternate current paths, so it can exploit a much broader class of devices. An architectural analysis has also been completed that shows how the memory system may perform as a cache memory. A hybrid cache structure was used to alleviate the long write latencies of memristor devices: the tag array is made of SRAM cells while the data array uses the proposed memristor circuit. This hybrid scheme allows multiple reads and writes to concurrently access different sub-arrays within a cache. The performance of these novel memristor-based caches was compared to SRAM and STT-MRAM based caches through detailed simulations. The results show that the memristor caches are denser and offer better performance and lower system power than the STT-MRAM and SRAM caches.
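A minimal sketch of the hybrid idea described above: tags are answered immediately (SRAM-style) while data sits in banked memristor sub-arrays, so accesses to different sub-arrays can overlap. The sub-array count, address-to-bank mapping, and latency handling here are invented for illustration and are not the paper's parameters.

```python
# Hypothetical timing model of a hybrid cache: SRAM tag check up front,
# memristor data banked into sub-arrays that can be busy independently.
class HybridCacheModel:
    NUM_SUBARRAYS = 16                           # assumed banking factor

    def __init__(self):
        self.tags = {}                           # set index -> stored tag (SRAM tag array)
        self.busy_until = [0.0] * self.NUM_SUBARRAYS   # per-sub-array busy time (ns)

    def access(self, set_index, tag, now_ns, latency_ns):
        if self.tags.get(set_index) != tag:
            return "miss"                        # tag check alone decides hit/miss
        bank = set_index % self.NUM_SUBARRAYS    # assumed mapping of sets to sub-arrays
        if self.busy_until[bank] > now_ns:
            return "hit (waits for busy sub-array)"
        self.busy_until[bank] = now_ns + latency_ns   # occupies only this sub-array
        return "hit (proceeds concurrently with other sub-arrays)"
```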

20.
Spin-torque transfer RAM (STT-RAM) is a promising candidate to replace SRAM for larger last-level caches (LLCs). However, its long write latency and high write energy diminish the benefit of adopting STT-RAM caches. A common observation for the LLC is that a large number of cache blocks are never referenced again before they are evicted. The write operations for these blocks, which we call dead writes, can be eliminated without incurring subsequent cache misses. To address this issue, a quantitative scheme called feedback-learning-based dead write termination (FLDWT) is proposed to improve the energy efficiency and performance of STT-RAM based LLCs. FLDWT dynamically learns block access behavior using data reuse distance and data access frequency, and then classifies blocks into dead blocks and live blocks. FLDWT terminates dead-write block requests and improves estimation accuracy via feedback information. Compared with the STT-RAM baseline in the last-level cache, experimental results show that the scheme achieves an energy reduction of 44.6% and a performance improvement of 12% on average with negligible overhead.
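To make the classification idea concrete, here is a hypothetical sketch of a dead-write predictor driven by reuse distance and access frequency with a simple feedback adjustment; the thresholds and the update rule are invented for illustration and are not the FLDWT parameters.

```python
# Illustrative dead-write classification for an LLC block, loosely following
# the ingredients named in the abstract (reuse distance, access frequency,
# feedback correction). All constants below are assumptions.
from dataclasses import dataclass

@dataclass
class BlockStats:
    reuse_distance: int      # accesses observed between two touches of this block
    access_count: int        # how often the block has been referenced so far

class DeadWritePredictor:
    def __init__(self, distance_limit=256, min_accesses=2):
        self.distance_limit = distance_limit
        self.min_accesses = min_accesses

    def is_dead(self, s: BlockStats) -> bool:
        """Predict that the block will not be referenced again before eviction."""
        return s.reuse_distance > self.distance_limit or s.access_count < self.min_accesses

    def feedback(self, predicted_dead: bool, was_rereferenced: bool) -> None:
        """Adjust the threshold when a prediction turns out to be wrong."""
        if predicted_dead and was_rereferenced:           # false positive: be less aggressive
            self.distance_limit += 32
        elif not predicted_dead and not was_rereferenced:  # missed a dead block
            self.distance_limit = max(32, self.distance_limit - 32)
```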
