共查询到20条相似文献,搜索用时 640 毫秒
1.
Shyh-Jye Jou I-Yao Chung 《Electronics letters》1997,33(2):110-111
An implementation of self-timed circuits whose hardware and control signals are significantly reduced is proposed. A globally asynchronous locally synchronous design using the proposed self-timed circuits is also demonstrated. A design example shows that in this implementation less power is consumed with only a small circuit overhead 相似文献
2.
This paper describes the design and hardware results of a scannable pulse-to-static conversion register array for self-timed circuits. The circuits include a self-timed control circuit and a 64-bit register array, both designed utilizing self-resetting CMOS (SRCMOS) circuit techniques. The self-timed feature of the control block allows it to require only one system clock input. The evaluation, reset, and write-enable controls are all generated within the control macro. The register array is a level-sensitive scan design, which is compatible and complies with SRCMOS test modes. This type of register array can facilitate the synchronous/asynchronous interfaces, pipelined operation, power management, and testing of advanced digital systems employing a mixture of static and dynamic circuits to achieve low power and high performance 相似文献
3.
4.
《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2009,17(7):883-892
5.
This paper presents a novel variable-latency multiplier architecture, suitable for implementation as a self-timed multiplier core or as a fully synchronous multicycle multiplier core. The architecture combines a second-order Booth algorithm with a split carry save array pipelined organization, incorporating multiple row skipping and completion-predicting carry-select dual adder. The paper reports the architecture and logic design, CMOS circuit design and performance evaluation. In 0.35 μm CMOS, the expected sustainable cycle time for a 32-bit synchronous implementation is 2.25 ns. Instruction level simulations estimate 54% single-cycle and 46% two-cycle operations in SPEC95 execution. Using the same CMOS process, the 32-bit asynchronous implementation is expected to reach an average 1.76 ns throughput and 3.48 ns latency in SPEC95 execution 相似文献
6.
Sterling R. Whitaker Gary K. Maki Manjunath Shamanna 《International Journal of Electronics》2013,100(4):609-620
A VLSI architecture for synchronous sequential controllers is resented that has attractive qualities for roducing reliable circuits. In these circuits, one hardware implementation can realize any flow table with a maximum of 2n internal states and m inputs. A real time fault detection means is resented along with a strategy for verifying the correctness of the checking hardware. This self-check feature can be employed with no increase in hardware. The architecture can be modified to achieve fail-safe designs. With no increase in hardware, an adaptable circuit can be realized that allows replacement of faulty transitions with fault-free transitions 相似文献
7.
Deng Z.J. Yoshikawa N. Whiteley S.R. Van Duzer T. 《Applied Superconductivity, IEEE Transactions on》1999,9(1):7-17
As the operating speed of rapid single flux quantum (RSFQ) integrated circuits and systems increases, timing uncertainty from fabrication process variations makes global synchronization very hard. In this paper, the authors present a globally asynchronous, locally synchronous timing methodology for RSFQ digital design, which can solve the global synchronization problem. They also demonstrate the recent experimental results of some asynchronous circuits and systems implemented in RSFQ technology. Key components such as a self-timed shift register, a self-timed demultiplexor, a Muller-C element, a completion detector, and a clock generator have been designed and tested. High-speed operation has been confirmed up to 20 Gb/s for a prototype data buffer system, which consists of two self-timed shift registers and an on-chip 8-28-GHz clock generator 相似文献
8.
Gschwind M. Salapura V. Maurer D. 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2001,9(2):241-250
Application-specific processors offer an attractive option in the design of embedded systems by providing high performance for a specific application domain. In this work, we describe the use of a reconfigurable processor core based on an RISC architecture as starting point for application-specific processor design. By using a common base instruction set, development cost can be reduced and design space exploration is focused on the application-specific aspects of performance. An important aspect of deploying any new architecture is verification which usually requires lengthy software simulation of a design model. We show how hardware emulation based on programmable logic can be integrated into the hardware/software codesign flow. While previously hardware emulation required massive investment in design effort and special purpose emulators, an emulation approach based on high-density field-programmable gate array (FPGA) devices now makes hardware emulation practical and cost effective for embedded processor designs. To reduce development cost and avoid duplication of design effort, FPGA prototypes and ASIC implementations are derived from a common source: We show how to perform targeted optimizations to fully exploit the capabilities of the target technology while maintaining a common source base 相似文献
9.
Chih-Ming Chang Shih-Lien Lu 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》1995,3(3):370-378
Control-flow machines are sequential in nature, executing instructions in sequence through control of program counters, whereas data-flow machines execute instructions only as input operands are made available, a process directed at the parallelism inherent within programs. At the architecture level, data-flow machines execute instructions asynchronously. In contrast, at the implementation level, the synchronous design framework of computer systems which employs globally clocked timing discipline has reached its design limits owing to problems of clock distribution. Therefore, renewed interest has been expressed in the design of computer systems based upon an asynchronous (or self-timed) approach free of the discipline imposed by the global clock. Thus, the design of a static MIMD data-flow processor using micropipelines is presented. The implemented processor, or the micro data-flow processor, differs from processors previously reported insofar as the micro data-flow processor is wholly asynchronous at both the architectural and the implementation levels 相似文献
10.
As transistor switching speed improves, synchronizing a global clock increasingly degrades system performance. Therefore, self-timed asynchronous logic becomes potentially faster than synchronous logic. To do so, however, it must exploit the techniques used in fast synchronous designs, including redundant logic, inverting logic, transistor size optimization, dynamic logic, and phase alignment. Most techniques can be applied equally well to asynchronous logic-indeed phase alignment is easier-but combining dynamic and asynchronous logic is more difficult. Minimum refresh intervals together with race- and hazard-free operation must be guaranteed. An initial chip implementation that combines dynamic and asynchronous logic running at 500 MHz in 2-μm CMOS is described. With the addition of transistor size optimization, simulations show the same circuit running in the same technology at 800 MHz 相似文献
11.
Borriello G. Ebeling C. Hauck S.A. Burns S. 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》1995,3(4):491-501
Field-programmable gate arrays (FPGAs) are an important implementation medium for digital logic. Unfortunately, they currently suffer from poor silicon area utilization due to routing constraints. In this paper we present Triptych, an FPGA architecture designed to achieve improved logic density with competitive performance. This is done by allowing a per-mapping tradeoff between logic and routing resources, and with a routing scheme designed to match the structure of typical circuits. We show that, using manual placement, this architecture yields a logic density improvement of up to a factor of 3.5 over commercial FPGAs, with comparable performance. We also describe Montage, the first FPGA architecture to fully support asynchronous and synchronous interface circuits 相似文献
12.
13.
This paper presents a simple implementation method of pipelined asynchronous circuits, suitable for commercial field programmable gate arrays (FPGAs). Contrary to other existing asynchronous design techniques, the presented method does not require the application of additional user actions such as constraining or building hard macros. As a design example, an architecture of the asynchronous PicoBlaze compatible microcontroller and 12-bit pipelined fast array multiplier have been considered. The developed synchronous and asynchronous versions of the microcontroller as well as fast array multiplier have been implemented and tested using Xilinx FPGAs, and then compared in terms of the area requirement, power consumption and performance. 相似文献
14.
Ilham Hassoune Francois Mace Denis Flandre Jean-Didier Legat 《Microelectronics Journal》2006,37(9):997-1006
A new logic style called low-swing current mode logic (LSCML) is presented. It features a dynamic and differential structure and a low-swing current mode operation. The LSCML logic style may be used for hardware implementation of secure smart cards against differential power analysis (DPA) attacks but also for implementation of self-timed circuits thanks to its self-timed operation. Electrical simulations of the Khazad S-box have been carried out in 0.13 μm PD (partially depleted) SOI CMOS technology. For comparison purpose, the Khazad S-box was implemented with the LSCML logic and two other dynamic differential logic styles previously reported. Simulation results have shown an improved reduction of the data-dependent power signature when using LSCML circuits. Indeed the LSCML based Khazad S-box has shown a power consumption standard deviation more than two times smaller than the one in DyCML and almost two times smaller than the one in DDCVSL. 相似文献
15.
16.
17.
Defect-oriented testability for asynchronous ICs 总被引:1,自引:0,他引:1
Roncken M. 《Proceedings of the IEEE. Institute of Electrical and Electronics Engineers》1999,87(2):363-375
For a CMOS manufacturing process, asynchronous ICs are similar to synchronous ICs. The defect density distributions are similar, and hence, so are the fault models and fault-detection methods. So, what makes us think that asynchronous circuits are much harder to test than synchronous circuits? Because the effectiveness of best known test methods for synchronous circuits drops when applied to asynchronous circuits? They may very well be a temporal hurdle. Many test methods have already been reevaluated and successfully adapted from the synchronous to the asynchronous test domain. The paper addresses one of the final hurdles: IDDQ testing. This type of test method, based on measuring the quiescent power supply current, is very effective for detecting (resistive) bridging faults in CMOS circuits. Detection of bridging faults is crucial, because they model the majority of today's manufacturing defects. IDDQ fault effects are sensitized in a particular state or set of states and can only be detected if we stop the circuit operation right there. This is a problem for asynchronous circuits, because their operation is self-timed. In the paper, we quantify the impact of self timing on the effectiveness of IDDQ -based test methods for bridging faults, and propose a Design-for-Test (DfT) approach to develop a low-cost DfT solution. For comparison, we do the same for logic voltage testing and stuck-at faults. The approach is illustrated on circuits from Tangram, the asynchronous design-style employed at Philips Research, but it is applicable to asynchronous circuits in general 相似文献
18.
19.
Johnson D. Akella V. Stott B. 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》1998,6(4):731-740
We describe the design and implementation of an asynchronous discrete cosine transform/inverse discrete cosine transform (DCT/IDCT) processor core compliant with the CCITT recommendation H.261. First, a micropipelined implementation with level-sensitive latches is shown. This is improved by replacing the level-sensitive latches with dual-edge triggered flip-flops to save power and using completion-detection adders in the critical stage of the pipeline to exploit the data-dependent processing delay. Gate-level simulation of extracted layouts indicates that the performance of asynchronous implementations is comparable with that of a synchronous implementation based on an identical architecture. This is because part of the penalty introduced by handshaking circuitry in an asynchronous pipeline can be recovered by exploiting data-dependent processing delays with completion-detection circuitry. In pipelines with significant arithmetic processing such as the DCT/IDCT processor, this is easily accomplished. Our results are encouraging because asynchronous designs do not employ global clocking. In the near future when clock generation, clock distribution, and the power consumed in the clock circuitry become limiting factors in the design of large synchronous application specific integrated circuits (ASICs), asynchronous implementation methodology could be pursued as a real alternative 相似文献
20.
The conflictual demand of faster and larger designs is increasingly difficult to answer by the advances of solid state technology alone. At some point, it is expected that designers and manufacturers will have to give up the traditional synchronous design methodology for a Globally Asynchronous Locally Synchronous (GALS) one. Such changes imply more synchronization constraints, but also more flexibility. Consequently, this paper proposes a novel Field-Programmable Gate Arrays (FPGA) architecture that is compatible with existing devices and that can also support GALS designs. The main objective is simple: the proposed architecture must appear unchanged for synchronous design, but it must also include a minimal amount of basic components to prevent metastability for efficient asynchronous communications. Thus, the paper presents the constraint equations required to implement such a circuit. It also presents a pausible clock generator application and simulation results for the proposed architecture. All results demonstrate that with a few additional customized circuits, a standard FPGA cell can become appropriate for GALS methodologies. 相似文献