1.

The Architecture and Development Flow of the S5 Software Configurable Processor

Jeffrey M. Arnold 《The Journal of VLSI Signal Processing》2007,47(1):3-14

A software configurable processor (SCP) is a hybrid device that couples a conventional processor datapath with programmable logic to allow application programs to dynamically customize the instruction set. SCP architectures can offer significant performance gains by exploiting data parallelism, operator specialization and deep pipelines. The S5000 is a family of high performance software configurable processors for embedded applications. The S5000 consists of a conventional 32-bit RISC processor coupled with a programmable Instruction Set Extension Fabric (ISEF). To develop an application for the S5 the programmer identifies critical sections to be accelerated, writes one or more extension instructions as functions in a variant of the C programming language, and accesses those functions from the application program. Performance gains of more than an order of magnitude over the unaccelerated processor can be achieved.


2.

An architecture for a DSP field-programmable gate array

Agarwala M. Balsara P.T. 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》1995,3(1):136-141

This paper describes an application specific architecture for field-programmable gate arrays (FPGAs). Emphasis is placed on the logic module architecture and channel segmentation for the FPGAs targeted for application areas related to digital signal processing (DSP). The proposed logic module architecture is well-suited for efficient implementation of frequently used logic functions in the DSP application area. This is mainly because it is possible to implement most of these functions using one logic module, which results in a reduction in both the net lengths and the number of antifuses used. The performance improvements are achieved by customizing the logic module architecture and the programmable interconnect to suit the requirements of DSP applications.

3.

A single-chip programmable platform based on a multithreaded processor and configurable logic clusters

Young-Don Bae Seong-Il Park In-Cheol Park 《Solid-State Circuits, IEEE Journal of》2003,38(10):1703-1711

This paper presents a single-chip programmable platform that integrates most of the hardware blocks required in the design of embedded system chips. The platform includes a 32-bit multithreaded RISC processor (MT-RISC), configurable logic clusters (CLCs), programmable first-in-first-out (FIFO) memories, control circuitry, and on-chip memories. For rapid thread switch, a multithreaded processor equipped with a hardware thread scheduling unit is adopted, and configurable logics are grouped into clusters for IP-based design. By integrating both the multithreaded processor and the configurable logic on a single chip, high-level language-based designs can be easily accommodated by performing the complex and concurrent functions of a target chip on the multithreaded processor and implementing the external interface functions into the configurable logic clusters. A 64-mm² prototype chip integrating a four-threaded MT-RISC, three CLCs, programmable FIFOs, and 8-kB on-chip memories is fabricated in a 0.35-μm CMOS technology with four metal layers, which operates at 100-MHz clock frequency and consumes 370 mW at 3.3-V power supply.

4.

Co-Synthesis to a Hybrid RISC/FPGA Architecture

Maya B. Gokhale Janice M. Stone Edson Gomersall 《The Journal of VLSI Signal Processing》2000,24(2-3):165-180

Hybrid architectures combining conventional processors with configurable logic resources enable efficient coordination of control with datapath computation. With integration of the two components on a single device, housekeeping tasks and, optionally, loop control and data-dependent branching, can be handled by the conventional processor, while regular datapath computation occurs on the configurable hardware. This paper describes a novel approach to programming such hybrid devices that gives the programmer control over mapping of data and computation between conventional processor and configurable logic. With a simple set of pragma and intrinsic function directives, the NAPA C language provides for manual control over perhaps the most important aspect of programming such hybrid devices. Alternatively, as experience is gained about tradeoffs between the two computational resources, mapping directives may eventually be generated by an external tool. The paper further describes a research prototype compiler that targets the hybrid processor model, with a concrete implementation for the National Semiconductor NAPA1000 chip. The NAPA C compiler parses the mapping directives, performs semantic analysis, and co-synthesizes a conventional processor executable combined with a configuration bit stream for the configurable logic. Two major compiler phases, the synthesis of pipelined loops and the datapath synthesis, are described in detail.

5.

Programmable multiplierless digital filter array for embedded SoC applications

Hounsell B.I. Arslan T. 《Electronics letters》2001,37(12):735-737

The authors present a novel, programmable logic array for implementing high performance filter functions within embedded system-on-chip platforms. The novelty of the architecture is demonstrated through its specially tailored configurable logic units, and hierarchical routing scheme. The architecture and routing hierarchy are described using a filter example and results are provided demonstrating scalability, speed, and array utilisation using a typical SoC bus specification.

6.

A family of user-programmable peripherals with a functional unit architecture

Shubat A.S. Trinh C.Q. Zaliznyak A. Ziklik A. Roy A. Kazerounian R. Cedar Y. Eitan B. 《Solid-State Circuits, IEEE Journal of》1992,27(4):515-529

A family of user-programmable peripherals, utilizing an integration strategy based on a programmable system device (PSD) concept, is described. Specifically, PSD is an efficient and highly configurable integration of high-density memory and LSI level logic blocks. The configurability is derived by providing programmable logic and programmable interconnect. PSDX is the first PSD family of programmable microcontroller peripherals; it integrates 256 kb to 1 Mb of EPROM, 16 kb of SRAM, a 28-input by 42-product term programmable logic device (PLD), and flexible I/O ports. This family is primarily targeted for embedded microcontroller applications. Using one PSD device it is possible to replace all the core peripherals in the system and, as a result, achieve a reduction in components, power dissipation, and overall system cost. The flexible architecture is achieved by providing 46 configuration options, which allows the PSD to interface with virtually any 8- or 16-b microcontroller. The integration is made possible by developing a special configurability and testability scheme. These parts are realized on a 1.2-μm CMOS EPROM process.

7.

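A Study of γ Dose-Rate Radiation Effects in Antifuse FPGA Devices

赵洪超 朱小锋 杜川华 《太赫兹科学与电子信息学报》2010,8(1):84-86

Selecting devices that tolerate γ dose-rate radiation for FPGA system circuits is very difficult. To guide the selection of FPGA devices by their γ dose-rate hardness, the γ dose-rate irradiation effects of three types of antifuse FPGA devices were studied experimentally. All samples exhibited upset at a low γ dose-rate threshold, but none exhibited latch-up at high γ dose rates. The low-threshold γ dose-rate failure of the FPGA devices is mainly a failure of sequential logic functions caused by transient photocurrent disturbance, while the antifuse switch resistance between modules protects against the large radiation surge currents that would otherwise produce latch-up. The experimental results show that hardening at the level of system circuit design is the most effective way to achieve γ dose-rate tolerance.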

8.

Using bus-based connections to improve field-programmable gate-array density for implementing datapath circuits

Ye A. Rose J. 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2006,14(5):462-473

As the logic capacity of field-programmable gate arrays (FPGAs) increases, they are increasingly being used to implement large arithmetic-intensive applications, which often contain a large proportion of datapath circuits. Since datapath circuits usually consist of regularly structured components (called bit-slices) which are connected together by regularly structured signals (called buses), it is possible to utilize datapath regularity in order to achieve significant area savings through FPGA architectural innovations. This paper describes such an FPGA routing architecture, called the multibit routing architecture, which employs bus-based connections in order to exploit datapath regularity. It is experimentally shown that, compared to conventional FPGA routing architectures, the multibit routing architecture can achieve 14% routing area reduction for implementing datapath circuits, which represents an overall FPGA area savings of 10%. This paper also empirically determines the best values of several important architectural parameters for the new routing architecture including the most area efficient granularity values and the most area efficient proportion of bus-based connections.

9.

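A Brief Overview of SRAM-Based FPGA Interconnect Architectures

刘丽 樊宇 柴常春 《电子科技》2008,21(2):28-32

FPGAs provide a programmable platform for fields such as signal processing, cryptography, and storage systems. Different configuration data can be loaded onto the same chip to implement the corresponding logic functions. Programmable interconnect resources are an important functional block of an FPGA. This article explains why the interconnect structure arose and describes the hierarchical interconnect structure as a reasonable, flexible, and optimized way of wiring that plays an important role both in implementing circuit functions and in improving circuit performance.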

10.

Implementation of a High-Speed MIMO Soft-Output Symbol Detector for Software Defined Radio

Di Wu Johan Eilert Dake Liu 《Journal of Signal Processing Systems》2011,63(1):27-37

This paper presents a programmable MMSE soft-output MIMO symbol detector that supports 600 Mbps data rate defined in 802.11n. The detector is implemented using a multi-core floating-point processor and configurable soft-bit demapper. Owing to the dynamic range supplied by the floating-point SIMD datapath, special algorithms can be adopted to reduce the computational latency of channel processing with sufficient numerical stability for large channel matrices. When compared to several existing fixed-functional solutions, the detector proposed in this paper is smaller and faster. More important, it is programmable and configurable so that it can support various MIMO transmission schemes defined by different standards.

11.

A circuit-driven design methodology for video signal-processing datapath elements

Dutta S. Wolf W. 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》1999,7(2):229-240

The programmable video signal processor (VSP) is an important category of processors for multimedia systems. Programmable video processors combine the flexibility of programmability with special architectural features that improve performance on video processing applications. VSPs are typically multiple processors with several processing elements (PEs) and a parallel memory system. This paper focuses on the architectural design of the PEs in a video processor and shows how technology and circuit parameters influence the structure of the datapath and, hence, the overall architecture of a programmable VSP. We emphasize the need to consider technological and circuit-level issues during the design of a system architecture and present a method whereby the conceptual organization of the PEs (the number of PEs, pipelining of the datapath, size of the register file, and number of register ports) can be evaluated in terms of a target set of applications before a detailed design is undertaken. We use motion-estimation and discrete cosine transform as example applications to illustrate how various technology parameters affect the architectural design choices. We show that the design of the register file and the datapath-pipeline depth can drastically affect PE utilization and, therefore, the number of PEs required for different applications. Our results demonstrate that pursuing the fastest cycle time can greatly increase the silicon area which must be devoted to PEs, due to both increased pipeline latency and reduced register file bandwidth.

12.

Antifuse field programmable gate arrays

Greene J. Hamdy E. Beal S. 《Proceedings of the IEEE. Institute of Electrical and Electronics Engineers》1993,81(7):1042-1056

An antifuse is an electrically programmable two-terminal device with small area and low parasitic resistance and capacitance. Field-programmable gate arrays (FPGAs) using antifuses in a segmented channel routing architecture now offer the digital logic capabilities of an 8000-gate conventional gate array and system speeds of 40-60 MHz. A brief survey of antifuse technologies is provided. The antifuse technology, routing architecture, logic module, design automation, programming, testing and use of ACT antifuse FPGAs are described. Some inherent tradeoffs involving the antifuse characteristics, routing architecture and logic module are illustrated.

13.

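A Study of an Architecture for a Handheld Software Radio Terminal

颜彪 许宗泽 《通信技术》2002,(6):40-41

This paper discusses an architecture for a handheld software radio terminal based on a DSP subsystem. It uses a DSP as the core processing and control unit and implements communication-algorithm preprocessing in ASIC technology. In addition, it can be configured by programming, has low power consumption, and can support multiple services.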

14.

Programmable interconnects speed system verification

《Circuits and Devices Magazine, IEEE》1993,9(3):37-42

A family of CMOS integrated circuits called field programmable interconnect components (FPICs) that can provide designers with the high-density interconnect architectures for making programmable hardware a reality is discussed. The FPIC devices address a broad spectrum of interconnect needs, including system prototypes and breadboards, user-specific/configurable printed circuit boards (PCBs), application configurable processors, test interfaces, and programmable connector and switching matrix applications. Using FPIC devices for system prototyping, in conjunction with other programmable components (programmable logic devices (PLDs), field programmable gate arrays (FPGAs), microprocessors, microcontrollers, DSP, and programmable memory), enhances the design verification process, allowing faster, more flexible, and thorough product integration. Field programmable circuit boards (FPCBs) designed to take advantage of the high density interconnect and observability of FPIC devices and an FPIC/FPCB development environment are described.

15.

Efficient Realization of Parity Prediction Functions in FPGAs

Seok-Bum Ko Jien-Chung Lo 《Journal of Electronic Testing》2004,20(5):489-499

In this paper, we propose an AND/XOR-based technology mapping method for efficient realization of parity prediction functions in field programmable gate arrays (FPGAs). Due to the fixed size of the programmable blocks in an FPGA, decomposing a circuit into sub-circuits with an appropriate number of inputs can achieve an excellent implementation efficiency. Specifically, the proposed technology mapping method is based on the Davio expansion theorem. The AND/XOR nature of the proposed method allows it to operate on XOR intensive circuits, such as parity prediction functions, efficiently. We conduct experiments using the parity prediction functions with respect to MCNC benchmark circuits. With the proposed approach, the number of configurable logic blocks (CLBs) is reduced by 67.6% (compared to speed-optimized results) and 57.7% (compared to area-optimized results), respectively. The total equivalent gate counts are reduced by 65.5%, maximum combinational path delay is reduced by 56.7%, and maximum net delay is reduced by 80.5% compared to conventional methods.
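
The Davio expansion named in this abstract can be illustrated with a minimal Python sketch. It shows only the textbook positive Davio identity, f = f0 XOR (x AND (f0 XOR f1)), where f0 and f1 are the cofactors of f with x fixed to 0 and 1. It is not the paper's technology mapping method, and the helper names (`cofactor`, `positive_davio`, `equivalent`) are hypothetical.

```python
from itertools import product


def cofactor(f, var, value):
    """Return f with input number `var` fixed to `value` (0 or 1)."""
    def fixed(*bits):
        bits = list(bits)
        bits[var] = value
        return f(*bits)
    return fixed


def positive_davio(f, var):
    """Rebuild f from its positive Davio expansion around input `var`."""
    f0 = cofactor(f, var, 0)
    f1 = cofactor(f, var, 1)

    def expanded(*bits):
        difference = f0(*bits) ^ f1(*bits)  # Boolean difference of f w.r.t. var
        return f0(*bits) ^ (bits[var] & difference)
    return expanded


def equivalent(f, g, fan_in):
    """Compare two functions on every input combination."""
    return all(f(*b) == g(*b) for b in product((0, 1), repeat=fan_in))


def parity(*bits):
    """XOR of all inputs, the kind of function parity prediction relies on."""
    result = 0
    for bit in bits:
        result ^= bit
    return result


def majority(a, b, c):
    """A non-XOR function, included to show the identity is general."""
    return (a & b) | (a & c) | (b & c)


print(equivalent(parity, positive_davio(parity, 0), 4))      # True
print(equivalent(majority, positive_davio(majority, 1), 3))  # True
```

For parity the Boolean difference is the constant 1, so the expansion collapses to f0 XOR x, which is one reason XOR-heavy circuits suit an AND/XOR decomposition.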

16.

An FIR processor with programmable dynamic data ranges

Chen O.T.-C. Wei-Lung Liu 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2000,8(4):440-446

This work developed a modified direct form based on the radix-4 Booth algorithm to realize a finite impulse response (FIR) architecture with programmable dynamic ranges of input data and filter coefficients. This architecture comprises a preprocessing unit, data latches, configurable connection units, double Booth decoders, coefficient registers, a path control unit, and a postprocessing unit. Programmable dynamic ranges of input data and filter coefficients can be any positive even numbers or multiple of a word length of coefficient registers, using configurable connection units or a path control unit, respectively. In particular, the proposed architecture employs only data-path controls to accomplish programmable operations, without changing word lengths and components of data latches and filter taps. A practical 8-bit and 16-bit FIR processor has also been implemented by using the TSMC 5 V 0.6 μm CMOS technology. It is suitable for operations of asymmetric, symmetric, and anti-symmetric filters at 64, 63, 32, 31, and 16 taps, and is well explored to optimize its functional units. The proposed processor has throughput rates of 50 M and 25 M samples/s for 8-bit and 16-bit input data of various filter applications, respectively.
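
As background for the radix-4 Booth algorithm mentioned in this abstract, the following Python sketch shows the standard recoding of a two's-complement multiplier into digits from {-2, -1, 0, 1, 2}, which halves the number of partial products. It is a software illustration under the assumption of an even word length. It does not model the paper's FIR datapath, and the function names are hypothetical.

```python
def booth_radix4_digits(multiplier, bits):
    """Recode a `bits`-wide two's-complement multiplier into radix-4 Booth
    digits, least significant digit first. `bits` must be even."""
    if bits <= 0 or bits % 2:
        raise ValueError("bits must be a positive even number")
    y = (multiplier & ((1 << bits) - 1)) << 1  # append the implicit y[-1] = 0
    digits = []
    for i in range(0, bits, 2):
        low = (y >> i) & 1         # y[i-1]
        mid = (y >> (i + 1)) & 1   # y[i]
        high = (y >> (i + 2)) & 1  # y[i+1]
        digits.append(low + mid - 2 * high)
    return digits


def booth_multiply(multiplicand, multiplier, bits):
    """Sum one partial product per Booth digit; digit k has weight 4**k."""
    total = 0
    for k, digit in enumerate(booth_radix4_digits(multiplier, bits)):
        total += digit * multiplicand * 4 ** k
    return total


print(booth_radix4_digits(6, 4))  # [-2, 2], since -2 + 2*4 = 6
print(booth_multiply(7, -3, 8))   # -21
```

An 8-bit multiplier yields four digits rather than eight single-bit partial products, and each digit needs only a shift and an optional negation of the multiplicand.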

17.

A reconfigurable system featuring dynamically extensible embedded microprocessor, FPGA, and customizable I/O

Borgatti M. Lertora F. Foret B. Cali L. 《Solid-State Circuits, IEEE Journal of》2003,38(3):521-529

A system chip targeting image and voice processing and recognition application domains is implemented as a representative of the potential of using programmable logic in system design. It features an embedded reconfigurable processor built by joining a configurable and extensible processor core and an SRAM-based embedded field-programmable gate array (FPGA). Application-specific bus-mapped coprocessors and flexible input/output peripherals and interfaces can also be added and dynamically modified by reconfiguring the embedded FPGA. The architecture of the system is discussed as well as the design flows for pre- and post-silicon design and customization. The silicon area required by the system is 20 mm² in a 0.18-μm CMOS technology. The embedded FPGA accounts for about 40% of the system area.

18.

FPGA Implementation of Carrier Synchronization for QAM Receivers

Chris Dick Fred Harris Michael Rice 《The Journal of VLSI Signal Processing》2004,36(1):57-71

Software defined radios (SDR) are highly configurable hardware platforms that provide the technology for realizing the rapidly expanding third (and future) generation digital wireless communication infrastructure. While there are a number of silicon alternatives available for implementing the various functions in a SDR, field programmable gate arrays (FPGAs) are an attractive option for many of these tasks for reasons of performance, power consumption and flexibility. Amongst the more complex tasks performed in a high data rate wireless system is synchronization. This paper examines carrier synchronization in SDRs using FPGA based signal processors. We provide a tutorial style overview of carrier recovery techniques for QPSK and QAM modulation schemes and report on the design and FPGA implementation of a carrier recovery loop for a 16-QAM modem. Two design alternatives are presented to highlight the rich design space accessible using configurable logic. The FPGA device utilization and performance for a carrier recovery circuit using a look-up table approach and CORDIC arithmetic are presented. The simulation and FPGA implementation process using a recent system level design tool called System Generator for DSP is described.
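
The CORDIC arithmetic mentioned in this abstract can be sketched in its vectoring mode, which estimates the phase of an I/Q sample using only shift-and-add style steps. The Python below is a generic textbook sketch, not the paper's carrier recovery circuit. It uses floating point for readability (an FPGA would use fixed-point shifts and a small arctangent table), assumes a positive in-phase component, and the name `cordic_phase` is hypothetical.

```python
import math


def cordic_phase(i, q, iterations=16):
    """Vectoring-mode CORDIC estimate of atan2(q, i) in radians.

    Assumes i > 0 so the angle lies within the basic convergence range.
    """
    x, y, angle = float(i), float(q), 0.0
    for k in range(iterations):
        step = 2.0 ** -k                # a right shift by k in hardware
        direction = 1 if y < 0 else -1  # rotate the vector toward the x axis
        x, y = x - direction * y * step, y + direction * x * step
        angle -= direction * math.atan(step)  # table lookup in hardware
    return angle


estimate = cordic_phase(1.0, 1.0)
print(abs(estimate - math.pi / 4) < 1e-3)  # True
```

Each iteration drives the quadrature component toward zero while accumulating the rotation applied, so the accumulated angle converges to the phase of the input sample.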

19.

Functional Demonstration of a Memristive Arithmetic Logic Unit (MemALU) for In‐Memory Computing

Long Cheng Yi Li Kang‐Sheng Yin Si‐Yu Hu Yu‐Ting Su Miao‐Miao Jin Zhuo‐Rui Wang Ting‐Chang Chang Xiang‐Shui Miao 《Advanced functional materials》2019,29(49)

The development of in‐memory computing has opened up possibilities to build next‐generation non‐von‐Neumann computing architecture. Implementation of logic functions within the memristors can significantly improve the energy efficiency and alleviate the bandwidth congestion issue. In this work, the demonstration of arithmetic logic unit functions is presented in a memristive crossbar with implemented non‐volatile Boolean logic and arithmetic computing. For logic implementation, a standard operating voltage mode is proposed for executing reconfigurable stateful IMP, destructive OR, NOR, and non‐destructive OR logic on both the word and bit lines. No additional voltages are needed beyond “V_P” and its negative component. With these basic logic functions, other Boolean functions are constructed within five devices in at most five steps. For arithmetic computing, the fundamental functions including an n‐bit full adder with high parallelism as well as efficient increment, decrement, and shift operations are demonstrated. Other arithmetic blocks, such as subtraction, multiplication, and division are further designed. This work provides solid evidence that memristors can be used as the building block for in‐memory computing, targeting various low‐power edge computing applications.
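
To make the idea of stateful IMP logic concrete, the sketch below models memristor states as bits and composes a NAND from one reset and two material-implication steps. This is a widely used textbook construction, shown here as an assumption-laden illustration. It does not reproduce the voltage scheme, device count, or step sequences of the paper, and the cell names `p`, `q`, and `s` are hypothetical.

```python
def imp(p, q):
    """Material implication: p IMP q = (NOT p) OR q, on bits 0 and 1."""
    return (1 - p) | q


def stateful_nand(p, q):
    """Compute NAND(p, q) by overwriting a work cell `s`, as stateful logic does.

    Each assignment stands for one crossbar operation whose result stays
    stored in the target device.
    """
    cells = {"p": p, "q": q, "s": 1}                # s holds an arbitrary old state
    cells["s"] = 0                                  # step 1: reset s (FALSE)
    cells["s"] = imp(cells["p"], cells["s"])        # step 2: s = NOT p
    cells["s"] = imp(cells["q"], cells["s"])        # step 3: s = (NOT q) OR (NOT p)
    return cells["s"]


for p in (0, 1):
    for q in (0, 1):
        print(p, q, stateful_nand(p, q))
```

Because NAND is functionally complete, any Boolean function can in principle be built from such steps, at the cost of more devices and cycles.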

20.

FUNCTIONALITY FAULT MODEL: A BASIS FOR TECHNOLOGY-SPECIFIC TEST GENERATION

Andrej Žemva Baldomir Zajc 《Microelectronics Reliability》1998,38(4):597-604

In this paper, we present the functionality fault model and demonstrate its feasibility and advantages. In current designs, the fan-in of the modules implemented in CMOS standard cell, mask programmable or field-programmable gate array technologies rarely exceeds 4 on average. A functionality fault model, based on the complete enumeration of the truth table of each logic module, is thus entirely feasible and enhances the quality of the test significantly. Tests based on this model provide complete coverage of module behavior and interior faults as well as input stuck-at and bridging faults of any multiplicity, reducing the need for technology and implementation-specific fault models. We have implemented the prototype software test-dc and demonstrated its application to generate high-quality test patterns.
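
The feasibility argument in this abstract (small fan-in makes full truth-table enumeration cheap) can be sketched in a few lines of Python. The sketch treats a single module in isolation and ignores how patterns are justified at, and propagated from, the module inside a larger circuit. It is not the test-dc tool, and all function names are hypothetical.

```python
from itertools import product


def truth_table(func, fan_in):
    """Enumerate the module's output for all 2**fan_in input combinations."""
    return tuple(func(*bits) for bits in product((0, 1), repeat=fan_in))


def detecting_patterns(good, faulty, fan_in):
    """Input patterns on which the faulty module differs from the good one."""
    return [bits for bits in product((0, 1), repeat=fan_in)
            if good(*bits) != faulty(*bits)]


def count_functional_faults(fan_in):
    """Number of wrong truth tables a module with this fan-in could exhibit."""
    return 2 ** (2 ** fan_in) - 1


def and2(a, b):
    return a & b


def or2(a, b):
    return a | b  # stands in for an AND gate that misbehaves as an OR


print(truth_table(and2, 2))              # (0, 0, 0, 1)
print(detecting_patterns(and2, or2, 2))  # [(0, 1), (1, 0)]
print(count_functional_faults(4))        # 65535
```

Applying all 2**fan_in patterns to a module exposes any change to its truth table, and for a fan-in of 4 that is only 16 patterns per module even though 65535 distinct faulty behaviors are possible.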