期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

A novel and efficient routing architecture for multi-FPGA systems

Khalid M.A.S. Rose J. 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2000,8(1):30-39

Multi-FPGA systems (MFSs) are used as custom computing machines, logic emulators and rapid prototyping vehicles. A key aspect of these systems is their programmable routing architecture which is the manner in which wires, FPGAs and field-programmable interconnect devices (FPIDs) are connected. Several routing architectures for MFSs have been proposed, and previous research has shown that the partial crossbar is one of the best existing architectures. In this paper, we propose a new routing architecture, called the hybrid complete-graph and partial-crossbar (HCGP) which has superior speed and cost compared to a partial crossbar. The new architecture uses both hard-wired and programmable connections between the FPGAs. We compare the performance and cost of the HCGP and partial crossbar architectures experimentally, by mapping a set of 15 large benchmark circuits into each architecture. A customized set of partitioning and interchip routing tools were developed, with particular attention paid to architecture-appropriate interchip routing algorithms. We show that the cost of the partial crossbar (as measured by the number of pins on all FPGAs and FPIDs required to fit a design), is on average 20% more than the new HCGP architecture and as much as 25% more. Furthermore, the critical path delay for designs implemented on the partial crossbar were on average 20% more than the HCGP architecture and up to 43% more. Using our experimental approach, we also explore a key architecture parameter associated with the HCGP architecture-the proportion of hard-wired connections versus programmable connections-to determine its best value 相似文献

2.

The design of an SRAM-based field-programmable gate array. I.Architecture

Chow P. Soon Ong Seo Rose J. Chung K. Paez-Monzon G. Rahardja I. 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》1999,7(2):191-197

相似文献

3.

Using bus-based connections to improve field-programmable gate-array density for implementing datapath circuits

Ye A. Rose J. 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2006,14(5):462-473

As the logic capacity of field-programmable gate arrays (FPGAs) increases, they are increasingly being used to implement large arithmetic-intensive applications, which often contain a large proportion of datapath circuits. Since datapath circuits usually consist of regularly structured components (called bit-slices) which are connected together by regularly structured signals (called buses), it is possible to utilize datapath regularity in order to achieve significant area savings through FPGA architectural innovations. This paper describes such an FPGA routing architecture, called the multibit routing architecture, which employs bus-based connections in order to exploit datapath regularity. It is experimentally shown that, compared to conventional FPGA routing architectures, the multibit routing architecture can achieve 14% routing area reduction for implementing datapath circuits, which represents an overall FPGA area savings of 10%. This paper also empirically determines the best values of several important architectural parameters for the new routing architecture including the most area efficient granularity values and the most area efficient proportion of bus-based connections. 相似文献

4.

Placement and routing tools for the Triptych FPGA

Ebeling C. McMurchie L. Hauck S.A. Burns S. 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》1995,3(4):473-482

Field-programmable gate arrays (FPGAs) are becoming an increasingly important implementation medium for digital logic. One of the most important keys to using FPGAs effectively is a complete, automated software system for mapping onto the FPGA architecture. Unfortunately, many of the tools necessary require different techniques than traditional circuit implementation options, and these techniques are often developed specifically for only a single FPGA architecture. In this paper we describe automatic mapping tools for Triptych, an FPGA architecture with improved logic density and performance over commercial FPGAs. These tools include a simulated-annealing placement algorithm that handles the routability issues of fine-grained FPGAs, and an architecture-adaptive routing algorithm that can easily be retargeted to other FPGAs. We also describe extensions to these algorithms for mapping asynchronous circuits to Montage, the first FPGA architecture to completely support asynchronous and synchronous interface applications 相似文献

5.

A Novel Switched-Capacitor Based Field-Programmable Analog Array Architecture 总被引：1，自引：0，他引：1

Edward K. F. Lee Wai L. Hui 《Analog Integrated Circuits and Signal Processing》1998,17(1-2):35-50

A novel field-programmable analog array (FPAA) architecture based on switched-capacitor techniques is proposed. Each configurable analog block (CAB) in the proposed architecture is an opamp with feedback switches which are controlled by configuration bits. Interconnection networks are used to connect programmable capacitor arrays (PCAs) and the CABs. The routing switches in the interconnection networks not only function as interconnection elements but also switches for the charge transfer required in switched-capacitor circuits. This scheme minimizes the number of connecting switches between CABs and PCAs, thereby, it reduces the settling time of the resultant SC circuits and thus achieving high speed operation. The architecture is highly flexible and provides for the implementation of various A/D and D/A converters when the FPAA is connected with external digital circuits or field-programmable gate arrays (FPGAs). 相似文献

6.

Routing on field-programmable switch matrices

Ejnioui A. Ranganathan N. 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2003,11(2):283-287

In this paper, we address the problem of routing nets on field programmable gate arrays (FPGAs) interconnected by a switch matrix. We extend the switch matrix architecture proposed by Zhu et al. (1993) to route nets between FPGA chips in a multi-FPGA system. Given a limited number of routing resources in the form of programmable connection points within a two-dimensional switch matrix, this problem examines the issue of how to route a given net traffic through the switch matrix structure. First, we define the problem as a general undirected graph in which each vertex has one single color among six possible colors and formulate it as a constraint satisfaction problem. This is further modeled as a 0-1 multidimensional knapsack problem for which a fast approximate solution is applied. Experimental results show that the accuracy of our proposed heuristic is quite high for moderately large switch matrices. 相似文献

7.

A novel robust FPGA routing switch box design for ultra low power applications

S.D. Pable Mohd. Hasan 《International Journal of Electronics》2013,100(1):15-27

Fabrication cost of application-specific integrated circuits (ASICs) is exponentially rising in deep submicron region due to rapidly rising non-recurring engineering cost. Field programmable gate arrays (FPGAs) provide an attractive alternative to ASICs but consume an order of magnitude higher power. There is a need to explore ways of reducing FPGA power consumption so that they can also be employed in ultra low power (ULP) applications instead of ASICs. Subthreshold region of operation is an ideal choice for ULP low-throughput FPGAs. The routing of an FPGA consumes most of the chip area and primarily determines the circuit delay and power consumption. There is a need to design moderate-speed ULP routing switches for subthreshold FPGA. This article proposes a novel subthreshold FPGA routing switch box (SB) that utilises the leakage voltage through transistor as biasing voltage which shows 69%, 61.2% and 30% improvement in delay, power delay product and delay variation, respectively, over conventional routing SB. 相似文献

8.

ATOMi: An Algorithm for Circuit Partitioning Into Multiple FPGAs Using Time-Multiplexed, Off-Chip, Multicasting Interconnection Architecture

《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2005,13(7):861-864

Logic emulation is so far the fastest method to verify the system functionality in the gate level before chip fabrication. Field-programmable gate array (FPGA)-based logic emulator with large gate capacity generally comprises a large number of FPGAs or special processors connected in mesh or crossbar topology. However, gate utilization of FPGAs and speed of emulation are limited by the number of signal pins among FPGAs and the interconnection architecture of the logic emulator. This paper first describes a new interconnection architecture called TOMi (Time-multiplexed, Off-chip, Multicasting interconnection) and proposes a circuit partitioning algorithm called ATOMi (Algorithm for TOMi) for multi-FPGA system incorporating four to eight FPGAs where FPGAs are interconnected through TOMi. ATOMi reduces the number of off-chip signal transfers to optimize the performance for multi-FPGA system implemented by TOMi. Experimental results using Partitioning93 benchmarks show that, by adopting the proposed TOMi interconnection architecture along with ATOMi, the pin count is reduced to 14.4%–88.6% while the critical path delay is reduced to 66.1%–90.1% compared to traditional architectures including mesh, crossbar, and VirtualWire architecture. 相似文献

9.

An efficient logic emulation system

Varghese J. Butts M. Batcheller J. 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》1993,1(2):171-174

The Realizer, is a logic emulation system that automatically configures a network of field-programmable gate arrays (FPGAs) to implement large digital logic designs, is presented. Logic and interconnect are separated to achieve optimum FPGA utilization. Its interconnection architecture, called the partial crossbar, greatly reduces system-level placement and routing complexity, achieves bounded interconnect delay, scales linearly with pin count, and allows hierarchical expansion to systems with hundreds of thousands of FPGA devices in a fast and uniform way. An actual multiboard system has been built, using 42 Xilinx XC3090 FPGAs for logic. Several designs, including a 32-b CPU datapath, have been automatically realized and operated at speed. They demonstrate very good FPGA utilization. The Realizer has applications in logic verification and prototyping, simulation, architecture development, and special-purpose execution 相似文献

10.

Programmable interconnects speed system verification

《Circuits and Devices Magazine, IEEE》1993,9(3):37-42

A family of CMOS integrated circuits called field programmable interconnect components (FPICs) that can provide designers with the high-density interconnect architectures for making programmable hardware a reality is discussed. The FPIC devices address a broad spectrum of interconnect needs, including system prototypes and breadboards, user-specific/configurable printed circuit boards (PCBs), application configurable processors, test interfaces, and programmable connector and switching matrix applications. Using FPIC devices for system prototyping, in conjunction with other programmable components (programmable logic devices (PLDs), field programmable gate arrays (FPGAs), microprocessors, microcontrollers, DSP, and programmable memory) enhance the design verification process, allowing faster, more flexible, and thorough product integration. Field programmable circuit boards (FPCBs) designed to take advantage of the high density interconnect and observability of FPIC devices and a FPIC/FPCB development environment are described 相似文献

11.

Antifuse field programmable gate arrays 总被引：1，自引：0，他引：1

Greene J. Hamdy E. Beal S. 《Proceedings of the IEEE. Institute of Electrical and Electronics Engineers》1993,81(7):1042-1056

An antifuse is an electrically programmable two-terminal device with small area and low parasitic resistance and capacitance. Field-programmable gate arrays (FPGAs) using antifuses in a segmented channel routing architecture now offer the digital logic capabilities of an 8000-gate conventional gate array and system speeds of 40-60 MHz. A brief survey of antifuse technologies is provided. the antifuse technology, routing architecture, logic module, design automation, programming, testing and use of ACT antifuse FPGAs are described. Some inherent tradeoffs involving the antifuse characteristics, routing architecture and logic module are illustrated 相似文献

12.

The Triptych FPGA architecture

Borriello G. Ebeling C. Hauck S.A. Burns S. 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》1995,3(4):491-501

Field-programmable gate arrays (FPGAs) are an important implementation medium for digital logic. Unfortunately, they currently suffer from poor silicon area utilization due to routing constraints. In this paper we present Triptych, an FPGA architecture designed to achieve improved logic density with competitive performance. This is done by allowing a per-mapping tradeoff between logic and routing resources, and with a routing scheme designed to match the structure of typical circuits. We show that, using manual placement, this architecture yields a logic density improvement of up to a factor of 3.5 over commercial FPGAs, with comparable performance. We also describe Montage, the first FPGA architecture to fully support asynchronous and synchronous interface circuits 相似文献

13.

Block-oriented programmable design with switching networkinterconnect

Ching-Wei Yeh Lung-Tien Liu Chung-Kuan Cheng Hu T.C. Ahmed S. Liddel M. 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》1994,2(1):45-53

A block-oriented programmable design with switching network interconnect is proposed for fast turn-around, low manufacturing cost, and layout-independent high-speed systems. We introduce the architecture and investigate the constraints and properties originated from the architecture. We show that routability is the most crucial concern for a successful design, and propose objective functions as well as algorithms for switching network optimization. The mapping for the circuits is performed by partitioning, placement, and routing using a maximum matching method. The integration of the whole system demonstrates excellent results in terms of circuit usage 相似文献

14.

Resource requirements and layouts for field programmableinterconnection chips

Bhatia D. Haralambides J. 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2000,8(3):346-355

Field-programmable interconnection chips (FPIC's) provide the capability of realizing user programmable interconnection for any desired permutation. Such an interconnection is very much desired for supporting rapid prototyping of hardware systems and for providing programmable communication networks for parallel and distributed computing. An FPIC should realize any possible permutation of input to output pins via a set of programmable switches. In this paper, we show that any such architecture requires a minimum of Ω(n log n) switches, where Ω is the number of I/O pins. The result stems from an analysis of the underlying permutation network. In addition, for networks of bounded degree d, we prove an Ω(log_d-1 n) bound on the routing delay (maximum length of routing paths for specific I/O permutations) and an Ω(n log_d-1 n) bound on the average utilization of programmable switches used by the FPIC to implement a specific permutation. For the same type of networks, we prove an Ω(n log_d-1 n) bound on the number of nodes of the network. Furthermore, we design efficient architectures for FPIC's offering a wide variety of routing delays, high average programmable resource utilization, and O(n²)-area two-layer layouts. The proposed structures are called hybrid Benes-Crossbar (HBC) architectures and clearly exhibit a tradeoff between performance (routing delay utilization) and area of the layout 相似文献

15.

The effect of logic block architecture on FPGA performance

Singh S. Rose J. Chow P. Lewis D. 《Solid-State Circuits, IEEE Journal of》1992,27(3):281-287

This authors explore the effect of logic block architecture on the speed of a field-programmable gate array (FPGA). Four classes of logic block architecture are investigated: NAND gates, multiplexer configurations, lookup tables, and wide-input AND-OR gates. An experimental approach is taken, in which each of a set of benchmark logic circuits is synthesized into FPGAs that use different logic blocks. The speed of the resulting FPGA implementations using each logic block is measured. While the results depend on the delay of the programmable routing, experiments indicate that five- and six-input lookup tables and certain multiplexer configurations produce the lowest total delay over realistic values of routing delay. The fine grain blocks, such as the two-input NAND gate, exhibit poor performance because these gates require many levels of logic block to implement the circuits and hence require a large routing delay 相似文献

16.

A routing algorithm for FPGAs with time-multiplexed interconnects

Ruiqi Luo Xiaolei Chen Yajun Ha 《半导体学报》2020,(2):73-82

Previous studies show that interconnects occupy a large portion of the timing budget and area in FPGAs.In this work,we propose a time-multiplexing technique on FPGA interconnects.In order to fully exploit this interconnect architecture,we propose a time-multiplexed routing algorithm that can actively identify qualified nets and schedule them to multiplexable wires.We validate the algorithm by using the router to implement 20 benchmark circuits to time-multiplexed FPGAs.We achieve a 38%smaller minimum channel width and 3.8%smaller circuit critical path delay compared with the state-of-the-art architecture router when a wire can be time-multiplexed six times in a cycle. 相似文献

17.

Multiterminal net routing for partial crossbar-based multi-FPGA systems

Ejnioui A. Ranganathan N. 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2003,11(1):71-78

Multi-FPGA (field-programmable gate arrays) systems are used as custom computing machines to solve compute-intensive problems and also in the verification and prototyping of large circuits. In this paper, we address the problem of routing multiterminal nets in a multi-FPGA system that uses partial crossbars as interconnect structures. First, we model the multiterminal routing problem as a partitioned bin-packing problem and formulate it as an integer linear programming problem where the number of variables is exponential. A fast heuristic is applied to compute an upper bound on the routing solution. Then, a column generation technique is used to solve the linear relaxation of the initial master problem in order to obtain a lower bound on the routing solution. This is followed by an iterative branch-and-price procedure that attempts to find a routing solution somewhere between the two established bounds. In this regard, the proposed algorithm guarantees an exact-routing solution by searching a branch-and-price tree. Due to the tightness of the bounds, the branch-and-price tree is small resulting in shorter execution times. Experimental results are provided for different netlists and board configurations in order to demonstrate the algorithms performance. The obtained results show that the algorithm finds an exact routing solution in a very short time. 相似文献

18.

Routability of Network Topologies in FPGAs

Saldana M. Shannon L. Jia Shuo Yue Sikang Bian Craig J. Chow P. 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2007,15(8):948-951

A fundamental difference between application-specific integrated circuits (ASICs) and field-programmable gate arrays (FPGAs) is that the wires in ASICs are designed to match the requirements of a particular design. Conversely, in an FPGA, the area is fixed and the routing resources exist whether or not they are used. In this paper, we investigate how well several common network topologies map onto a modern FPGA routing fabric. Different multiprocessor network topologies with between 8 and 64 nodes are mapped to a single large FPGA. Except for the fully-connected networks, it is observed that the difference in logic resources used and routing overhead among these topologies is insignificant for the systems tested. Fully-connected networks up to about 22 nodes are also feasible on the same FPGA although the logic and routing utilization clearly grows much faster. The conclusion is that a modern FPGA fabric is very rich in resources and capable of supporting highly interconnected topologies. For systems with a modest number of nodes implemented on current large FPGAs, it is not necessary to use the connectivity-limited topologies typically used for networks-on-chip. Rather, direct point-to-point connections between all communicating nodes can be considered. 相似文献

19.

Probabilistic optimization for FPGA board level routing problems

Fei He Xiaoyu Song Ming Gu Guowu Yang Hung W.N.N. Jiaguang Sun 《Circuits and Systems II: Express Briefs, IEEE Transactions on》2006,53(4):264-268

Field programmable gate arrays (FPGAs) are an enabling technology in circuit designs. We consider the board-level multi-terminal net assignment in the FPGA-based logic emulation. A novel probabilistic optimization method is devised for solving the net assignment problem. The approach incorporates randomized rounding, genetic algorithm, and solution-improvement strategies. Experimental results demonstrate promising performance. 相似文献

20.

Application of nanojunction-based RRAM to reconfigurable IC

Liu M. Wang W. 《Micro & Nano Letters, IET》2008,3(3):101-105

A novel reconfigurable architecture, rFPGA, is developed by utilising high-density resistive memory (RRAM) circuits as FPGA components. Different from the existing CMOS-nano hybrid FPGAs that use crossbars, the rFPGA mainly consists of 1T1R RRAM structures (one CMOS transistor is integrated with a two-terminal resistive nanojunction) that can be fabricated using an efficient CMOS-compatible process. These 1T1R structures can significantly improve the FPGA memory and routing circuits, and enable the rFPGA to achieve at least a 2x density enhancement along with a 10% reduction of delay and power, compared with the corresponding CMOS FPGA. 相似文献