首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 15 毫秒
1.
This paper describes three circuit technologies that have been developed for high-speed large-bandwidth on-chip DRAM secondary caches. They include a redundancy-array advanced activation scheme, a bus-assignment-exchangeable selector scheme and an address-zero access refresh scheme. By using these circuit technologies and new small subarray structures, a row-address access time of 12 ns and a row-address cycle time of 16 ns were obtained. An experimental chip made up of an 8-Mbyte DRAM and a 64-bit microprocessor was developed using 0.25-μm merged logic and DRAM process technology  相似文献   

2.
This third-generation 1.1-GHz 64-bit UltraSPARC microprocessor provides 1-MB on-chip level-2 cache, 4-Gb/s off chip memory bandwidth, and a new 200 MHz JBus interface that supports one to four processors. The 87.5-million transistor chip is implemented in a seven-layer-metal copper 0.13-/spl mu/m CMOS process and dissipates 53 W at 1.3 V and 1.1 GHz.  相似文献   

3.
SOAR (Smalltalk on a RISC), a 32-bit microprocessor designed for the efficient execution of compiled Smalltalk, is described. The chip, implemented in 4-/spl mu/m single-level metal NMOS technologies, has a cycle time of 400 ns. Pipelining allows an instruction to start each cycle with the exception of loads and stores. The processor contains 35700 transistors, is 320/spl times/432 mil, dissipates 3 W, and is assembled in an 84-lead pin grid array package. A design methodology that included a large CAD effort and provided functioning chips on first silicon was used. The SOAR hardware environment is a SUN workstation that includes a custom SOAR board and extra memory.  相似文献   

4.
In this paper, we describe the design, interface, and architectural features of a 12-bit single-chip microprocessor, the IM6100, implemented with a silicon gate complementary MOS (SiG CMOS) process. The features of CMOS circuits for operation in difficult noise, temperature, and power environments are noted. The microprocessor recognizes the instruction set of the PDP-8/E minicomputer. It is also programmed I/O interface compatible with the PDP-8/E. A unique feature of the IM6100 is the provision for a "transparent" operator console. An all-CMOS processor system with 256 × 12 RAM, 1K × 12 ROM, and a programmable asynchronous serial interface port can be built with seven LSI devices. The system dissipates 60 mW at 5 V.  相似文献   

5.
A 4-MB L2 data cache was implemented for a 64-bit 1.6-GHz SPARC(r) RISC microprocessor. Static sense amplifiers were used in the SRAM arrays and for global data repeaters, resulting in robust and flexible timing operation. Elimination of the global clock grid over the SRAM array saves power, enabled by combining the clock information with array select signals. Redundancy was implemented flexibly, with shift circuits outside the main data array for area efficiency. The chip integrates 315 million transistors and uses an 8-metal-layer 90-nm CMOS process.  相似文献   

6.
The first two members in a family of 64-bit superscalar microprocessors are presented. The 130-nm processor, which was introduced first, offers 5-way instruction dispatch, support for 4-way integer and floating-point single-instruction multiple-data (SIMD) operations, a 512-kB second level (L2) cache, and a high-speed external bus. The 90-nm processor is a technology remap of the 130-nm design. It retains the features of the 130-nm processor and adds others, including a new power management facility. The architecture, device characteristics, power management, and thermal details of these two processors are described. In addition, the dataflow layout, aspects of the circuit design, clocking, and timing are discussed.  相似文献   

7.
A 32-bit microprocessor which supports demand paged virtual memory is described. The processor is implemented on a 290 mil/SUP 2/ chip containing approximately 60K transistors with a 3.5 /spl mu/m gate NMOS technology. The effect of software architecture and performance requirements on chip design is discussed, with particular emphasis on the special problems of supporting virtual memory. The methodology used to develop and verify the design is also discussed.  相似文献   

8.
A shared n-well layout technique is developed for the design of dual-supply-voltage logic blocks. It is demonstrated on a design of a 64-bit arithmetic logic unit (ALU) module in domino logic. The second supply voltage is used to lower the power of noncritical paths in the sparse, radix-4 64-bit carry-lookahead adder and in the loopback bus. A 3 mm/sup 2/ test chip in 0.18-/spl mu/m 1.8-V five-metal with local interconnect CMOS technology that contains six ALUs and test circuitry operates at 1.16 GHz at the nominal supply. For target delay increase of 2.8% energy savings are 25.3% using dual supplies, while for 8.3% increase in delay, 33.3% can be saved.  相似文献   

9.
Telemedicine applications, in some cases, need multipoint-to-multipoint communication. To meet the requirement of telemedicine communication, the development of a medical communication server is proposed in this paper. To make a working, as well as cost-effective, communication platform for the telemedicine applications, a specially designed communication server model is proposed in this work. This server is able to provide an effective multipoint-to-multipoint communication service for any level applications in telemedicine. The implementation program of this server is developed in a Windows '95 environment by using a winsocket. The trial application testing in a telemonitoring system is also presented to demonstrate the feasibility of taking such a structure with the server. By using the architecture of such a communication server in telemedicine applications, the multipoint-to-multipoint communication is easily managed and the communication processes are simplified and well controlled by the server  相似文献   

10.
A 1.3-GHz fifth-generation SPARC64 microprocessor   总被引:1,自引:0,他引:1  
A fifth-generation SPARC64 processor is fabricated in 130-nm partially depleted silicon-on-insulator CMOS with eight layers of Cu metallization. At V/sub dd/ = 1.2 V and T/sub a/ = 25/spl deg/C, it runs at 1.3 GHz and dissipates 34.7 W. The chip contains 191 M transistors with 19 M logic circuits in an area of 18.14 mm /spl times/ 15.99 mm and is covered with 5858 bumps, of which 269 are for I/O signals. It is mounted in a 1360-pin land-grid-array package. The 16-byte-wide system bus operates with a 260-MHz clock in single-data-rate or double-data-rate modes. This processor implements an error-detection mechanism for execution units and data path logic circuits in addition to on-chip arrays to detect data corruption. Intermittent errors detected in execution units and data paths are recovered via instruction retry. A soft barrier clocking scheme allows amortization of the clock skew and jitter over multiple cycles and helps to achieve high clock frequency. Tunability of the clock timing makes timing closure easier. A relatively small amount of custom circuit design and the use of mostly static circuits contributes to achieve short development time.  相似文献   

11.
The HP-PA8000 is a 180-MHz quad-issue custom VLSI implementation of the HP-PA 2.0 64-b architecture delivering 11.84 SPECint95 and 20.18 SPECfp95 with 3.8 million transistors integrated on a 17.68 mm×19.1 mm die in a 3.3-V, 0.5-μm CMOS process. Specialized clock circuits and extensive use of dynamic logic are key factors in this microprocessor's performance. Attention to clock analysis and distribution resulted in a 170 ps clock skew between any two clock nodes. This microprocessor utilizes a 56-entry instruction reorder buffer (IRE), register renaming, and dual functional units to fully exploit instruction level parallelism  相似文献   

12.
A 167 MHz 64 b VLSI CPU chip is described. The chip executes a 333-MFLOPS (peak) with an estimated system performance of 270SPECint92/380SPECfp92 (@167 MHz, 2 MB E-cache). The 17.7×17.8 mm die is fabricated with a 0.5 micron CMOS technology with four metal layers and contains 5.2 M transistors. The superscalar processor is capable of sustaining an execution rate of four instructions per cycle even in the presence of conditional branches and cache misses. Four fully pipelined 8×16 b multipliers and four single-cycle latency 16 b adders combine to speed up image processing, 2-D, 3-D graphics, video compression/decompression by up to an order of magnitude. High clock speed was obtained by the use of delayed reset logic, a new register file design; and novel comparators. Strict design methodology allowed fully functional first silicon which met all speed targets. The power dissipation of the chip is 28 W  相似文献   

13.
Two scaled versions of a 32-bit NMOS reduced-instruction-set computer CPU, called RISC II, have been implemented on two different processing lines using the simple layout rules of C.A. Mead and L.A. Conway (1980). The lambda values are 2 and 1.5 /spl mu/m, corresponding to drawn gate lengths of 4 and 3 /spl mu/m, respectively. The design utilizes a small set of simple instructions in conjunction with a large register file in order to provide high performance. This approach has resulted in two surprisingly powerful single-chip processors.  相似文献   

14.
This paper describes the architecture and design concepts of the AM29116 microprogrammable bipolar microprocessor with 100ns cycle time. Systems throughput and design for tradeoffs are analysed and a typical application - a high speed disc controller is described. Present and future parts for microprogrammed controller design are also reviewed.  相似文献   

15.
Sound engineering methodology, which has long been valued in hardware design, has been slower to develop in software design. This paper uses a case study of a small real-time system to discuss software design philosophies, with particular emphasis on the abstract machine view of systems. It demonstrates how the currently popular software design axioms of generality and modularity can be used to produce a software system that meets severe space constraints while remaining relatively portable across a family of microcomputers. These sorts of constraints have often been used to justify ad hoc design approaches in the past. The results of the project suggest that the use of such techniques actually make the meeting of many constraints easier than would a less organized approach. In addition, the reliability and maintainability of the resultant product is likely to be better.  相似文献   

16.
This quad-issue processor achieves 1-GHz operation through improved dynamic circuit techniques in critical paths and a more extensive on-chip memory system which scales in both bandwidth and latency. Critical logic paths use domino, delayed clocked domino, and logic embedded in dynamic flip-flops for minimum delay. A 64-KB sum-addressed memory data cache combines the address offset add with the cache decode, allowing the average memory latency to scale by more than the clock ratio. Memory bandwidth is improved by using wave pipelined SRAM designs for on-chip caches and a write cache for store traffic. Memory power is controlled without increased latency by use of delayed-reset logic decoders. The chip operates at 1000 MHz and dissipates less than 80 W from a 1.6-V supply. It contains 23 million transistors (12 million in RAM cells) on a 244 mm2 die  相似文献   

17.
A quad-issue custom VLSI microprocessor is described. This microprocessor implements the Alpha architecture and achieves an estimated performance of 13.3 SPECint9S and 18.4 SPECfp95 at 433 MHz. The 9.6 million transistor die measures 14.4 mm×14.5 mm, and is fabricated in a 0.35-μm, four-metal layer CMOS process. This chip dissipates less than 25 W at 433 MHz using a 2.0 V internal power supply. The design was leveraged from a prior 300-MHz, 3.3-V, 0.50-μm CMOS design. It includes several significant architectural enhancements and required circuit solutions for operation at 2.0 V. The chip will operate at nominal internal power supply voltages up to 2.5 V allowing improved performance at the cost of increased power consumption. At 2.5 V, the chip operates at 500 MHz and delivers 15.4 SPECint95 (est) and 21.1 SPECfp95 (est). This paper describes the chip implementation details and the strategy for efficiently migrating the existing design to the 0.35-μm technology  相似文献   

18.
A 400-MIPS/200-MFLOPS (peak) custom 64-b VLSI CPU is described. The chip is fabricated in a 0.75-μm CMOS technology utilizing three levels of metalization and optimized for 3.3-V operation. The die size is 16.8 mm×13.9 mm and contains 1.68 M transistors. The chip includes separate 8-kbyte instruction and data caches and a fully pipelined floating-point unit (FPU) that can handle both IEEE and VAX standard floating-point data types. It is designed to execute two instructions per cycle among scoreboarded integer, floating-point, address, and branch execution units. Power dissipation is 30 W at 200-MHz operation  相似文献   

19.
The first implementation of the IA-64 architecture achieves high performance by using a highly parallel execution core, while maintaining binary compatibility with the IA-32 instruction set. Explicitly parallel instruction computing (EPIC) design maximizes performance through hardware and software synergy. The processor contains 25.4 million transistors and operates at 800 MHz. The chip is fabricated in a 0.18-μm CMOS process with six metal layers and packaged in a 1012-pad organic land grid array using C4 (flip chip) assembly technology. A core speed back-side bus connects the processor to a 4-MB L3 cache  相似文献   

20.
Datapaths for media signal processing are typically built using programmable computational elements such as adders and multipliers, which can be run-time reconfigured to operate on simple integers with 8, 16, or 32 bits of precision. In this brief, a new high-speed energy-efficient reconfigurable adder for media signal processing is presented. The proposed circuit is based on carry-propagation schemes and can be partitioned to perform one 64-, two 32-, four 16-, and eight 8-bit additions. When the Austria Mikro System (AMS) 0.35 /spl mu/m 2-poly 3-metal 3.3 V CMOS (CSD) process is used to produce layout, a worst propagation delay of about 4.9 ns and an average energy dissipation of about 181 /spl mu/W/MHz are obtained.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号