Similar literature
 20 similar articles found.
1.
In most applications, FPGAs are used to implement "glue logic", providing the advantages of high integration levels without the expense and risk of custom ASIC development. However, as SRAM-based FPGA devices have increased in capability, their use as in-system-configurable computing elements is receiving considerable attention. Indeed, reconfigurable FPGA technology holds the potential for reshaping the future of computing by providing the capability to dynamically alter a computer's hardware resources to optimally service immediate computational needs. Computing circuits built from SRAM-based FPGAs can meet the true goal of parallel processing: executing algorithms in circuitry with the inherent parallelism of hardware, while avoiding the instruction fetch and load/store bottlenecks of traditional von Neumann architectures. There are many computationally intensive algorithms that can benefit from being partially or wholly implemented in hardware. Typically, these algorithms are too specialized to justify the expense of manufacturing custom IC devices.

2.
In this paper, we propose a methodology for accelerating application segments by partitioning them between reconfigurable hardware blocks of different granularity. Critical parts are sped up on the coarse-grain reconfigurable hardware to meet the timing requirements of application code mapped on the reconfigurable logic. The reconfigurable processing units are embedded in a generic hybrid system architecture that can model a large number of existing heterogeneous reconfigurable platforms. The fine-grain reconfigurable logic is realized by an FPGA unit, and the coarse-grain reconfigurable hardware by our high-performance data-path. The methodology consists mainly of three stages: the analysis, the mapping of the application parts onto fine- and coarse-grain reconfigurable hardware, and the partitioning engine. A prototype software framework realizes the partitioning flow. In this work, the methodology is validated using five real-life applications. Analytical partitioning experiments show that the speedup relative to the all-FPGA mapping solution ranges from 1.5 to 4.0, while the specified timing constraints are satisfied for all the applications.

3.
In this paper, we propose a method for speeding up Digital Signal Processing applications by partitioning them between reconfigurable hardware blocks of different granularity and mapping critical parts of the applications onto coarse-grain reconfigurable hardware. The reconfigurable hardware blocks are embedded in a heterogeneous reconfigurable system architecture. The fine-grain part is implemented by an embedded FPGA unit, while our high-performance coarse-grain data-path is used for the coarse-grain reconfigurable hardware. The design flow consists mainly of three steps: the analysis procedure, the mapping onto coarse-grain blocks, and the mapping onto the fine-grain hardware. In this work, the methodology is validated using five real-life applications: an OFDM transmitter, a medical imaging technique, a wavelet-based image compressor, a video compression scheme and a JPEG encoder. The experimental results show that the speedup, relative to an all-FPGA solution, ranges from 1.55 to 4.17 for the considered applications.

4.
Two-dimensional (2D) convolution is a basic operation in digital signal processing, especially in image and video applications. Although its computation is conceptually simple (a sum of products of constants and variables), its implementation is highly demanding in terms of computational power, especially when targeted at real-time embedded systems. This work presents an innovative approach oriented to dynamically reconfigurable hardware. A flexible 2D convolver is deployed on an SRAM-based FPGA split into two parts: a static region and a partially reconfigurable region (PRR). To provide a universal solution, all the configurable aspects of the convolver (kernel dimensions, operand resolution, constant coefficients, pipeline stages, etc.) are allocated in the PRR. In this way, the convolver can adapt its structure on the fly, according to the characteristics of the image to be processed each time. Although many research articles in the literature cover the design of 2D convolvers, to the best of the authors' knowledge this is the first work that implements a 2D convolver based on run-time reconfigurable hardware, while other approaches synthesize it either directly in software or in hardware as fully static designs. This pioneering alternative, which exploits key implementation aspects such as parallelism, pipelining, flexibility and functional density, surpasses both the computational performance of software solutions and the cost-effectiveness of static hardware designs, while delivering an outstanding level of adaptability. The balanced time-area trade-off achieved with this technology makes it appropriate for high-performance, low-cost embedded systems.
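The "sum of products of constants and variables" mentioned in this abstract can be illustrated with a minimal software sketch. This is not the authors' hardware convolver; the function name `convolve2d` and the list-of-lists image representation are assumptions made purely for illustration, and only the "valid" output region (no padding) is computed.

```python
def convolve2d(image, kernel):
    """Valid-mode 2D convolution on lists of lists (illustrative sketch).

    Each output pixel is a sum of products of constant kernel
    coefficients and the image pixels under the kernel window.
    """
    kh, kw = len(kernel), len(kernel[0])
    ih, iw = len(image), len(image[0])
    # Flip the kernel in both axes: true convolution, not correlation.
    flipped = [row[::-1] for row in kernel[::-1]]
    out = []
    for r in range(ih - kh + 1):
        out_row = []
        for c in range(iw - kw + 1):
            acc = 0
            for i in range(kh):
                for j in range(kw):
                    acc += flipped[i][j] * image[r + i][c + j]
            out_row.append(acc)
        out.append(out_row)
    return out
```

The four nested loops show why the operation is demanding in software: every output pixel costs kh × kw multiply-accumulate operations, all of which are independent and could, in principle, run in parallel in hardware.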

5.
A system chip targeting image and voice processing and recognition application domains is implemented as a representative of the potential of using programmable logic in system design. It features an embedded reconfigurable processor built by joining a configurable and extensible processor core and an SRAM-based embedded field-programmable gate array (FPGA). Application-specific bus-mapped coprocessors and flexible input/output peripherals and interfaces can also be added and dynamically modified by reconfiguring the embedded FPGA. The architecture of the system is discussed as well as the design flows for pre- and post-silicon design and customization. The silicon area required by the system is 20 mm² in a 0.18-µm CMOS technology. The embedded FPGA accounts for about 40% of the system area.

6.
Dynamically reconfigurable SRAM-based field-programmable gate arrays (FPGAs) enable the implementation of reconfigurable computing systems in which several applications may run simultaneously, sharing the available resources according to their own immediate functional requirements. To rule out malfunctions caused by faulty elements, the reliability of all FPGA resources must be guaranteed. Since resource allocation takes place asynchronously, an online structural test scheme is the only way of ensuring reliable system operation. At the same time, this test scheme should not disturb the operation of the circuit; otherwise availability would be compromised. System performance is also influenced by the efficiency of the management strategies, which must be able to dynamically allocate enough resources when requested by each application. As those resources are allocated and later released, many small free resource blocks are created, which are left unused due to performance and routing restrictions. To avoid wasting logic resources, the FPGA logic space must be defragmented regularly. This paper presents a non-intrusive active replication procedure that supports the proposed test methodology and the implementation of defragmentation strategies, assuring both the availability of resources and their perfect working condition, without disturbing system operation.

7.
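Wu Jiang, Zhu Zhiyu, Shen Shu. 《电视技术》, 2014, 38(7): 50-53, 44
To address the long configuration times and limited flexibility of existing reconfigurable computing hardware platforms, this paper presents a design method for a PC-based FPGA reconfigurable hardware platform. The structure allows fast reconfiguration over the PCI bus, and the system hardware is designed in two parts: a fixed part and a reconfigurable part. Verification on FPGA resources shows that the design effectively achieves FPGA hardware reconfiguration and that its physical hardware design is simple.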

8.
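To address the inability of current PC-based algorithms to process images in real time, and the difficulty of modifying or upgrading algorithms on fixed hardware platforms, a reconfigurable SOPC-based image acquisition and processing system is designed. It achieves real-time on-chip processing of image data and allows algorithms to be modified or upgraded without changing the hardware circuit structure. The system is built around two Xilinx FPGA chips: the hardware circuit is implemented with the FPGAs, their MicroBlaze 32-bit soft-core processor, and the related interface modules, while the software is developed using the ISE and EDK tools of the FPGA development environment. Because it adopts SOPC and reconfigurable technology, the design is flexible, processes data quickly, and allows algorithms to be upgraded easily.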

9.
The ever-increasing adoption of field programmable devices in various application domains for building complex embedded systems based on FPGA processors, along with the reliability issues that have emerged for FPGA devices built with the latest nanometer technologies, has raised the need for new fault-tolerant techniques to improve dependability and extend system lifetime. In addition, runtime partial reconfiguration technology, now highly mature in modern FPGA families, together with the availability of unused programmable resources in most FPGA designs, provides new and interesting opportunities to build advanced fault tolerance mechanisms. In this paper, we exploit the dynamic reconfiguration potential of today's FPGA architectures and the advances in the related design support tools, and we propose a fault-tolerant approach for FPGA embedded processors based on runtime partial reconfiguration. According to the proposed methodology, the processor core is partitioned into reconfigurable modules and each module is duplicated to implement a concurrent error detection mechanism. Precompiled configurations containing spare resources are generated for each duplicated module and are used to repair the defective modules at runtime. A fault tolerance scheme is also proposed for the proxy logic of the reconfigurable modules, which cannot move in the alternative configurations along with the rest of the logic. Moreover, a compression method for the alternative partial bitstreams is presented, which significantly reduces the high storage space requirements of the proposed approach. Two different hardware decompression schemes have been implemented in a Virtex-5 device and compared in terms of area overhead and decompression latency. Furthermore, a thorough examination has been performed of how the percentage of spare resources and their allocation in the reconfigurable regions affect the compression efficiency and the processor performance. Finally, the proposed approach has been demonstrated in three different components (ALU, multiplier-accumulator, and instruction-fetch unit) of an open-source embedded processor.

10.
The current technological age demands the deployment of biometric security systems not only in stringent, high-reliability fields (forensic, government, banking, etc.) but also in a wide range of everyday consumer applications (internet access, border control, health monitoring, mobile phones, laptops, etc.) accessible worldwide to any user. For biometric applications to be exploited successfully around the world, research is needed on power-efficient and cost-effective computational platforms able to handle the demanding image and signal operations carried out in biometric processing. The present work evaluates system architectures that are alternatives to existing PC (personal computer), HPC (high-performance computing) or GPU-based (graphics processing unit) platforms in one specific scenario: the physical implementation of an AFAS (automatic fingerprint-based authentication system) application. Developing automated fingerprint-based personal recognition systems as compute-intensive, real-time embedded systems on SoPC (system-on-programmable-chip) devices featuring one general-purpose MPU (microprocessor unit) and one run-time reconfigurable FPGA (field programmable gate array) proves to be an efficient and cost-effective solution. The flexibility provided, in terms of both software and hardware thanks to the programmability and run-time reconfigurability of the suggested FPGA device, makes it possible to build any application by means of hardware-software co-design techniques. The parallelism and acceleration inherent to the hardware design, and the ability to reuse hardware resources during application execution, are key factors in improving the performance of existing systems.

11.
In this paper, we propose a configuration-aware data-partitioning approach for reconfigurable computing. We show how the reconfiguration overhead impacts the data-partitioning process. Moreover, we explore the system-level power-performance tradeoffs available when implementing streaming embedded applications on fine-grained reconfigurable architectures. For a certain group of streaming applications, we show that an efficient hardware/software partitioning algorithm is required when targeting low power. However, if the application objective is performance, then we propose the use of dynamically reconfigurable architectures. We propose a design methodology that adapts the architecture and algorithms to the application requirements. The methodology has been proven to work on a real research platform based on Xilinx devices. Finally, we have applied our methodology and algorithms to the case study of image sharpening, which is required nowadays in digital cameras and mobile phones.

12.
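Zhong Yu, Wu Mingqin. 《电讯技术》, 2019, 59(7): 829-835
To address the low efficiency of traditional field-programmable gate array (FPGA) development methods and their inability to make full use of chip logic resources, a high-performance parallel computing architecture is proposed. A unified software/hardware programming model is designed, with support at the FPGA operating system level, and partial reconfiguration is applied to the development of hardware threads, giving the architecture the ability to manage and reuse resources. A software/hardware co-development flow is also designed. The design is verified on the ZC702 development board, the additional resource consumption of the architecture is evaluated, and a sorting algorithm is used as an example to demonstrate the flexibility of the architecture's multithreaded design.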

13.
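To minimize the number of configurations (the number of partitions) in reconfigurable computer systems, a hardware task partitioning algorithm that combines area estimation with multi-objective optimization is proposed. The algorithm estimates the hardware resource area at every partitioning step and constructs a new probing function, prior_assigned(), by taking full account of factors such as the use of reconfigurable resources, the total execution latency of all partitions of a data flow graph, and the number of edges between partitions. The function computes a priority value for each ready node, which the new algorithm uses to dynamically adjust the scheduling order of the task nodes in the ready list. Experimental results show that, compared with five existing partitioning algorithms (level-based partitioning, cluster-based partitioning, enhanced static list, multi-objective temporal partitioning, and cluster-level-sensitive partitioning), the algorithm obtains the fewest modules, and as the area of the reconfigurable processing unit grows, its mean execution latency is also the smallest among all algorithms except level-based partitioning.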

14.
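SRAM-based field-programmable gate arrays (FPGAs) are susceptible to single-event effects in space radiation environments, which cause soft errors. Triple modular redundancy (TMR) is currently the most widely used circuit hardening technique for mitigating FPGA soft errors. This paper first reviews the state of TMR research, then summarizes four key techniques commonly used in TMR tools (fine-grained TMR, system hierarchy, configuration scrubbing, and state synchronization) and their implementation principles. As high-level synthesis for FPGAs matures, TMR tools based on high-level synthesis are becoming a new research branch. The paper classifies and introduces the current mainstream TMR tools based on the register-transfer level, TMR tools based on important soft-core resources, and the emerging TMR tools based on high-level synthesis. Finally, it summarizes and looks ahead to future trends in FPGA TMR tools.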
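The core idea of triple modular redundancy can be shown with a minimal software sketch of a bitwise majority voter. It is not taken from any of the tools surveyed in this abstract; the function name `tmr_vote` and the use of Python integers as bit vectors are assumptions made for illustration only.

```python
def tmr_vote(a, b, c):
    """Bitwise majority vote over three redundant copies of a value.

    For every bit position the output takes the value held by at least
    two of the three inputs, so a bit flip in any single copy is masked.
    """
    return (a & b) | (b & c) | (a & c)


# Hypothetical example: the third copy has suffered two bit flips.
golden = 0b1010
upset = 0b0110
voted = tmr_vote(golden, golden, upset)
```

Here `voted` equals `golden`, because the two intact copies outvote the corrupted one. The voter only masks an error; the corrupted copy still has to be repaired (for example by configuration scrubbing) before a second upset hits another copy.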

15.
Field programmable gate arrays (FPGAs) are a promising technology for developing high-performance embedded systems. The density and performance of FPGAs have drastically improved over the past few years. Consequently, the size of the configuration bit-streams has also increased considerably. As a result, the cost-effectiveness of FPGA-based embedded systems is significantly affected by the memory required for storing various FPGA configurations. This paper proposes a novel compression technique that reduces the memory required for storing FPGA configurations and results in high decompression efficiency. Decompression efficiency corresponds to the decompression hardware cost as well as the decompression rate. The proposed technique is applicable to any SRAM-based FPGA device since configuration bit-streams are processed as raw data. The required decompression hardware is simple and the decompression rate scales with the speed of the memory used for storing the configuration bit-streams. Moreover, the time to configure the device is not affected by our compression technique. Using our technique, we demonstrate up to 41% savings in memory for configuration bit-streams of several real-world applications.

16.
Currently, most FPGAs use SRAM-based technology, which is susceptible to faults caused by external electromagnetic radiation or by prolonged internal overload operation. Dynamic partial reconfiguration (DPR), an emerging technology, provides a promising way to solve this problem by reallocating the tasks in damaged resource areas to non-faulty regions at runtime. Based on this idea, an infrastructure for coordinated execution of specialized hardware tasks on a reconfigurable FPGA is presented, providing the flexibility to tolerate faults that occur at runtime. Moreover, a method named MER-3D-Contact, which combines the maximum empty rectangles (MER) technique with an adjacency heuristic, is proposed to allocate tasks in the dynamic partial reconfiguration system for higher resource utilization, a higher task acceptance ratio and a lower fragmentation ratio. Finally, experiments are carried out to evaluate the performance of the proposed system. The results show that, in terms of task acceptance ratio, the proposed system achieves an improvement of up to 36% without damaged areas and up to 58% with damaged resources. The proposed system is therefore expected to find wide application in the field of more reliable FPGAs.

17.
This paper proposes a series of laboratory projects related to image processing using reconfigurable integrated circuits such as FPGAs (Field Programmable Gate Arrays). By implementing these projects, students will not only develop skills in electronic design but also broaden their knowledge as engineers, integrating electronic engineering and computer science in the design of reconfigurable hardware with FPGAs. The image processing algorithms proposed in these laboratory projects are coded in C++ and implemented on the embedded MicroBlaze microcontroller.

18.
Due to its potential to greatly accelerate a wide variety of applications, reconfigurable computing has become the subject of a great deal of research. By mapping the compute-intensive sections of an application to reconfigurable hardware, custom computing systems exhibit significant speedups over traditional microprocessors. However, this potential acceleration is limited by the requirement that the speedups provided must outweigh the considerable cost of reconfiguration. The ability to relocate and defragment configurations on field programmable gate arrays (FPGAs) can dramatically decrease the overall reconfiguration overhead incurred by the use of the reconfigurable hardware. We therefore present hardware solutions to provide relocation and defragmentation support with a negligible area increase over a generic partially reconfigurable FPGA, as well as software algorithms for controlling this hardware. This results in improvements by factors of 8 to 12 over the configuration overheads of traditional serially programmed FPGAs.
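The requirement that speedup must outweigh reconfiguration cost can be made concrete with a small back-of-the-envelope model. This sketch is not from the paper: the function `net_speedup`, its parameters, and the simple additive timing model (one reconfiguration amortized over a number of runs) are all assumptions chosen for illustration.

```python
def net_speedup(t_software, t_hardware, t_reconfig, runs_per_config=1):
    """Effective speedup of a hardware mapping once reconfiguration is counted.

    t_software      -- time of one run on the microprocessor
    t_hardware      -- time of one run on the reconfigurable hardware
    t_reconfig      -- time to load the configuration (paid once)
    runs_per_config -- runs executed before the hardware is reconfigured again
    """
    total_software = t_software * runs_per_config
    total_hardware = t_hardware * runs_per_config + t_reconfig
    return total_software / total_hardware
```

Under this model a kernel that is 10 times faster in hardware only breaks even (net speedup of 1.0) if loading its configuration costs 9 times the hardware run time and the configuration is used once. Reducing `t_reconfig`, or reusing a configuration for more runs, moves the result back toward the raw hardware speedup.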

19.
Embedded systems present significant security challenges due to their limited resources and power constraints. This paper focuses on the issues of building secure embedded systems on reconfigurable hardware and proposes a security architecture for embedded systems (SAFES). SAFES leverages the capabilities of reconfigurable hardware to provide efficient and flexible architectural support for security standards and defenses against a range of hardware attacks. The SAFES architecture is based on three main ideas: (1) reconfigurable security primitives; (2) reconfigurable hardware monitors; and (3) a hierarchy of security controllers at the primitive, system and executive levels. Results are presented for reconfigurable AES and RC6 security primitives and highlight the value of such an architecture. This paper also emphasizes that reconfigurable hardware is not just a technology for hardware accelerators dedicated to security primitives, which has been the focus of most studies, but a real solution for providing both high security and high performance in a system.

20.
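Based on embedded and data acquisition technology, this article designs and implements a practical new intelligent controller built on an ARM+FPGA architecture for high-speed real-time data acquisition applications. It describes the implementation of the hardware system, including the minimum system for the ARM main processor, the minimum system for the FPGA coprocessor, and the communication interface between the ARM and the FPGA, as well as the development of the Linux FPGA character device driver, the FPGA coprocessor control program, and the ARM main-processor application program. The intelligent controller exploits the FPGA's parallel processing structure to control the ADC for high-speed data acquisition. The FPGA can also be configured as a soft-core processor (the Nios II embedded processor), forming a dual-core processor system together with the ARM. Through the ARM, the intelligent controller manages and controls the FPGA, acquires data in real time, and communicates with a rich set of peripheral interfaces.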
