Similar literature
 Found 20 similar documents (search time: 15 ms)
1.
In this paper, we propose a methodology for accelerating application segments by partitioning them between reconfigurable hardware blocks of different granularity. Critical parts are sped up on the coarse-grain reconfigurable hardware to meet the timing requirements of application code mapped on the reconfigurable logic. The reconfigurable processing units are embedded in a generic hybrid system architecture that can model a large number of existing heterogeneous reconfigurable platforms. The fine-grain reconfigurable logic is realized by an FPGA unit, and the coarse-grain reconfigurable hardware by a high-performance data-path that we developed. The methodology consists mainly of three stages: the analysis, the mapping of the application parts onto fine- and coarse-grain reconfigurable hardware, and the partitioning engine. A prototype software framework realizes the partitioning flow. The methodology is validated using five real-life applications. Analytical partitioning experiments show that the speedup relative to the all-FPGA mapping solution ranges from 1.5 to 4.0, while the specified timing constraints are satisfied for all the applications.

2.
Motivated by theoretical results on multi-antenna signal processing techniques that promise substantial performance gains, this paper presents a feasible reconfigurable hardware architecture for OFDM-based Wireless LANs. The ultimate objective of the platform is to support single-antenna links, as well as antenna arrays at the receiver and/or at the transmitter, taking into account the limitations imposed by a real-time implementation. After a brief overview of the implemented multi-antenna algorithms, we present the hardware platform, which is built on both fixed-point and floating-point DSPs from Texas Instruments, together with an evaluation of the complexity of the operations and their scheduling. The measured performance indicates that a multi-antenna architecture supporting up to four antennas at the receiver side might meet the real-time requirements.

3.
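A new video decoding scheme based on an embedded reconfigurable system-on-chip is proposed, using a hardware/software co-verification method. A corresponding hardware verification platform is designed, and the feasibility of implementing the H.264 decoding algorithm on a reconfigurable processor is verified.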

4.
赵维  黄开臣  罗永红 《电子科技》2013,26(6):131-133
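Hummingbird is a lightweight encryption algorithm for hardware-constrained systems such as RFID tags, and it has been validated on a variety of platforms. This paper proposes a hardware architecture for the Hummingbird algorithm that, compared with other current approaches, requires fewer hardware resources at roughly the same response time. A low-end Xilinx Spartan-3 series FPGA serves as the verification platform. Experimental results show that the architecture can be embedded well into hardware-constrained systems, especially embedded systems.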

5.
Many radar sensor systems demand high-performance front-end signal processing. The high processing throughput is driven by the fast analog-to-digital conversion sampling rate, the large number of sensor channels, and stringent requirements on the filter design, which lead to a large number of filter taps. The computational demands range from tens to hundreds of billions of operations per second (GOPS). Fortunately, this processing is very regular, highly parallel, and well suited to VLSI hardware. We recently fielded a 100 GOPS system designed using custom VLSI chips. The system can adapt to different filter coefficients as a function of changes in the transmitted radar pulse. Although the computation is performed on custom VLSI chips, there are important reasons to attempt to solve this problem using adaptive computing devices. As feature size shrinks and field programmable gate arrays become more capable, the same filtering operation will be feasible using reconfigurable electronics. In this paper we describe the hardware architecture of this high-performance radar signal processor, discuss technology trends in reconfigurable computing, and present an alternate implementation using emerging reconfigurable technologies. We investigate the suitability of a Xilinx Virtex chip (XCV1000) for this application. Results of simulating and implementing the application on the Xilinx chip are also discussed.

6.
This work proposes a new FPGA architecture to meet the signal processing and testing requirements of current system-on-chip designs. The proposed architecture provides the hardware reuse and reconfigurability advantages of an FPGA, not only for the system functionality but also for system testing, while keeping the performance level required by current signal processing applications. This paper presents the new FPGA model, along with preliminary experimental results that show the possible system-level advantages of merging design and test in a reconfigurable device.

7.
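FPGA Circuit Design Based on Reconfigurable Cores   Total citations: 4 (self-citations: 0, citations by others: 4)
The adaptability, compactness, and low cost of circuit systems have promoted hardware/software co-design in embedded systems. Online-reconfigurable FPGAs not only meet these requirements but are also well suited to the verification and reliability of programmable application-specific circuit system designs. This paper introduces implementation structures and evaluation methods for reconfigurable FPGAs, proposes a research model that characterizes reconfigurable FPGAs and their reconfigurable cores by linear vectors together with a modular design approach based on reconfigurable cores, and argues that class-oriented, application-specific reconfigurable FPGAs should be the main research topic for reconfigurable FPGAs at the present stage.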

8.
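This paper proposes a high-speed reconfigurable FFT processor architecture based on an FPGA. Using a simplified control algorithm [1], the architecture can perform FFTs on digital signals of different lengths, from 32 to 1024 points, and it has been synthesized and post-simulated on a Xilinx Virtex2p series FPGA. Results show that, compared with the Xilinx IP core, the reconfigurable architecture reduces resource usage by 16% to 21% (slices), raises the maximum clock frequency by 10% to 30%, and shortens the input-to-output latency by 56 to 116 clock cycles, giving a clear improvement in computational efficiency at comparable power consumption. It is suitable for low-cost, high-speed digital signal processing systems.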
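The abstract does not describe the processor's internal structure or its control algorithm. As a generic illustration only, the following Python sketch shows a textbook recursive radix-2 decimation-in-time FFT that accepts any power-of-two length, which covers the 32- to 1024-point range mentioned above. The function name `fft_radix2` is hypothetical and is not taken from the paper.

```python
import cmath


def fft_radix2(x):
    """Textbook recursive radix-2 decimation-in-time FFT.

    Illustrative sketch only; this is NOT the hardware architecture
    from the abstract. The length of x must be a power of two.
    """
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return [complex(x[0])]
    even = fft_radix2(x[0::2])  # FFT of even-indexed samples
    odd = fft_radix2(x[1::2])   # FFT of odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        # Butterfly: combine the two half-length transforms.
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


if __name__ == "__main__":
    # The same routine handles every size in the 32..1024 range.
    for size in (32, 64, 128, 256, 512, 1024):
        impulse = [1] + [0] * (size - 1)
        spectrum = fft_radix2(impulse)
        print(size, len(spectrum))
```

A software routine like this is reconfigurable only in the trivial sense that the length is a run-time argument; a hardware design has to reuse butterfly and memory resources across sizes, which is where the reported resource and latency savings would come from.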

9.
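With the arrival of the big-data era, servers based on general-purpose processors find it hard to raise performance while lowering power consumption because of the limits of their hardware structure. A new server architecture based on a TCP/IP hardware stack is proposed: the TCP/IP processing flow is separated from the general-purpose CPU and implemented in dedicated hardware circuits, greatly reducing power consumption at the same performance as a general-purpose server. Finally, a prototype was built using an FPGA and tested against an IBM general-purpose server.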

10.
With a huge increase in demand for various kinds of compute-intensive applications in electronic systems, researchers have focused on coarse-grained reconfigurable architectures because of their advantages: high performance and flexibility. This paper presents FloRA, a coarse-grained reconfigurable architecture with floating-point support. A two-dimensional array of integer processing elements in FloRA is configured at run-time to perform floating-point operations as well as integer operations. The chip is fabricated in a 130 nm process, and the total area overhead due to additional hardware for floating-point operations is about 7.4% compared to the previous architecture, which does not support floating-point operations. The fabricated chip runs at a 125 MHz clock frequency with a 1.2 V power supply. Experiments show an 11.6× speedup on average compared to an ARM9 with a vector-floating-point unit for integer-only benchmark programs as well as programs containing floating-point operations. Compared with other similar approaches including XPP and Butter, the proposed architecture shows much higher performance for integer applications, while maintaining about half the performance of Butter for floating-point applications.

11.
Architecture for Dynamically Reconfigurable Embedded Systems (ADRES) is a templatized coarse-grained reconfigurable processor architecture. It targets embedded applications that demand high performance, low power, and high-level language programmability. Compared with a typical very long instruction word-based digital signal processor, ADRES can exploit higher parallelism by using more scalable hardware with the support of novel compilation techniques. We developed a complete tool-chain, including a compiler, a simulator, and an HDL generator. This paper describes the design case of a media processor targeting an H.264 decoder and other video tasks based on the ADRES template. The whole processor design, hardware implementation, and application mapping are done in a relatively short period. Yet we obtain C-programmed real-time H.264/AVC CIF decoding at 50 MHz. The die size, clock speed, and power consumption are also very competitive compared with other processors.
S. Dupont

12.
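Based on the principle of the linear-prediction sinusoidal excitation algorithm model, a high-quality multi-rate speech-specific processor chip is designed. The chip uses a reconfigurable architecture and a very long instruction word (VLIW) system design method, and optimizes high-complexity subroutines, which significantly improves instruction-level parallelism. Simulation results show that speech compression coding implemented on this chip runs more efficiently than on a general-purpose DSP of the same process technology while preserving the original coding quality. The processor can implement many types of speech compression algorithms, offering high confidentiality, low complexity, and ease of development for speech algorithms.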

13.
袁子昂  倪伟  冉敬楠 《电子科技》2022,35(12):35-42
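Neural networks are widely used in pattern recognition, predictive analysis, data fitting, and other areas, and are an important foundation of artificial intelligence. Neural-network convolution is computationally heavy and the networks have many parameters, which leads to long computation times and heavy memory-access pressure. To address these problems, this paper accelerates convolution based on the Winograd algorithm and designs an optimized hardware computing structure that improves data reuse and computational parallelism. Compared with sliding-window convolution, the proposed accelerator improves computational efficiency by a factor of 4.352. For convolution-kernel gradient computation, the accelerator uses an optimized data-allocation scheme that reduces data movement and meets the data demands of multiple PEs computing in parallel, yielding a 23-fold performance improvement over a CPU. Experiments show that the accelerator reaches a convolution throughput of 192.55 GFLOPS on the VGG-9 network model and, after training, a recognition accuracy of 76.54% on the CIFAR-10 dataset.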
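The abstract does not say which Winograd tile size the accelerator uses. As a minimal sketch under that caveat, the Python code below shows the smallest standard case, the 1-D Winograd F(2,3) transform, which produces two outputs of a 3-tap filter with four multiplications instead of the six needed by a sliding window. The function names `winograd_f23` and `sliding_window` are hypothetical and are not taken from the paper.

```python
def winograd_f23(d, g):
    """1-D Winograd F(2,3): two outputs of a 3-tap filter from 4 inputs.

    Illustrative sketch only; the tile size used by the accelerator in
    the abstract is not stated. d has 4 input samples, g has 3 taps.
    """
    d0, d1, d2, d3 = d
    g0, g1, g2 = g
    # Filter transform (can be precomputed once per kernel).
    u0 = g0
    u1 = (g0 + g1 + g2) / 2
    u2 = (g0 - g1 + g2) / 2
    u3 = g2
    # Four element-wise multiplications on the transformed input.
    m1 = (d0 - d2) * u0
    m2 = (d1 + d2) * u1
    m3 = (d2 - d1) * u2
    m4 = (d1 - d3) * u3
    # Output transform.
    return [m1 + m2 + m3, m2 - m3 - m4]


def sliding_window(d, g):
    """Reference result: direct sliding-window correlation (6 multiplies)."""
    return [sum(d[i + k] * g[k] for k in range(3)) for i in range(2)]


if __name__ == "__main__":
    d = [1, 2, 3, 4]
    g = [2, 1, 3]
    print(winograd_f23(d, g), sliding_window(d, g))
```

The saving comes from trading multiplications for additions, and the filter-side terms depend only on the kernel, so they can be computed once and reused across all input tiles.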

14.
This paper presents an investigation of a dynamically reconfigurable mixed-signal circuit constructed using a digital control system and the new technology of Field Programmable Analog Arrays (FPAA). The Motorola FPAA described in this paper can be used to build filters for analog signals as well as other kinds of analog applications implemented in switched capacitor technology (S/C-technology). The experimental studies described take advantage of the performance and programmability of the FPAA for filtering an analog signal. The circuit structure is based on 2 parallel FPAA chips, an analog multiplexer, and the multiplexer's control logic, controlled by a digital system such as a PC or a Field Programmable Gate Array (FPGA). Dynamic reconfiguration is used in this system for adaptive filtering, or adaptive processing in general. Modeling and measurements of the transition behavior of the switching process between the 2 FPAA chips, and an analysis of the limitations imposed by hardware imperfections, are presented. The experimental system assembled in this work is an excellent vehicle for learning about the intricacies of mixed-signal circuit performance and is used for verification of theoretical predictions and model validation/modification.

15.
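Design of a Reconfigurable Speed-Measurement Module Based on FPGA   Total citations: 2 (self-citations: 0, citations by others: 2)
Photoelectric encoders are widely used for displacement and angle measurement because of their high precision and reliability, and many measurement methods have already appeared. This paper proposes a reconfigurable design method for embedded systems in which encoder-based speed measurement is embedded in the system as a module. A control system is designed with this method, making full use of the high-speed reconfigurability of the FPGA. Finally, some FPGA simulation results are given for verification.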

16.
乔双  宋建中 《电子器件》2002,25(2):139-142
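This paper proposes an FPGA structural model that takes programmable integer processing units as the unit of evolution, presents the concept of function-level hardware evolution, and introduces the corresponding genetic algorithm.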

17.
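Research on Hardware Alarm Technology for Ultrasonic Flaw-Detection Systems   Total citations: 1 (self-citations: 1, citations by others: 0)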
黄元谦 《电声技术》2010,34(3):37-39,58
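Gate alarm technology is a key technology for automated flaw-detection systems. Based on an analysis of the strengths and weaknesses of gate alarm techniques in traditional flaw-detection equipment, an FPGA-based hardware alarm technique is proposed, with emphasis on its working principle and implementation flow. System tests show that the gate shape can be set flexibly and in varied ways, matches the waveform attenuation well, and meets the real-time alarm requirement at high repetition frequencies.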

18.
Reconfigurable Computing for Digital Signal Processing: A Survey   Total citations: 6 (self-citations: 0, citations by others: 6)
Steady advances in VLSI technology and design tools have extensively expanded the application domain of digital signal processing over the past decade. While application-specific integrated circuits (ASICs) and programmable digital signal processors (PDSPs) remain the implementation mechanisms of choice for many DSP applications, new system implementations based on reconfigurable computing are increasingly being considered. These flexible platforms, which offer the functional efficiency of hardware and the programmability of software, are quickly maturing as the logic capacity of programmable devices follows Moore's Law and advanced automated design techniques become available. As initial reconfigurable technologies have emerged, new academic and commercial efforts have been initiated to support power optimization, cost reduction, and enhanced run-time performance. This paper presents a survey of academic research and commercial development in reconfigurable computing for DSP systems over the past fifteen years. This work is placed in the context of other available DSP implementation media, including ASICs and PDSPs, to fully document the range of design choices available to system engineers. It is shown that while contemporary reconfigurable computing can be applied to a variety of DSP applications including video, audio, speech, and control, much work remains to realize its full potential. While individual implementations of PDSP, ASIC, and reconfigurable resources each offer distinct advantages, it is likely that integrated combinations of these technologies will provide more complete solutions.

19.
Dynamically reconfigurable hardware has already been deployed for accelerating computationally demanding applications. Some of these hardware architectures allow run-time reconfiguration, but this usually leads to a large reconfiguration overhead. The advantage of run-time reconfiguration is that it allows new algorithmic solutions for many applications. To study the potential of frequent run-time reconfiguration, it is interesting to investigate its costs and benefits from an abstract point of view and to develop new architectural concepts. Multi-level reconfigurable architectures are one such concept; they introduce several levels of reconfiguration. This paper deals with new types of multi-level reconfigurable architectures. The corresponding problem of finding the best granularity for different reconfiguration levels is formulated and investigated. Although this problem is shown to be NP-complete, an interesting restricted subcase is solved optimally in polynomial time. For the general case, a good heuristic is proposed that is based on solutions for the restricted case. Results on three example applications show that the reconfiguration cost can be reduced with the new architectures. Based on a proposed measure of relative efficiency, it is also shown that the new architectures are more efficient, in that they obtain a larger reconfiguration cost reduction with less additional hardware.
Martin Middendorf

20.