20 similar articles found (search time: 265 ms)
1.
2.
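In LTE systems, the uplink uses single-carrier frequency-division multiple access (SC-FDMA) and the downlink uses orthogonal frequency-division multiple access (OFDMA); the fast Fourier transform (FFT) plays an important role in implementing both. To improve the computational efficiency of the FFT, and thereby the performance of the LTE system, this paper proposes an FFT algorithm with a configurable number of points based on multi-core parallel processing. The algorithm is then simulated on a PC using OpenMP parallel programming directives, taking the characteristics of the hardware implementation platform into account, and the design is finally implemented on an FPGA using the configurable MicroBlaze soft core and logic resources. Simulation and implementation results show that computational efficiency improves markedly in a multi-core environment, especially for large point counts, which is significant for improving the performance of the overall LTE system.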
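As an illustration of why an FFT with a configurable point count lends itself to multi-core execution, here is a minimal sequential sketch of an iterative radix-2 FFT. It is not the authors' algorithm; the function names `bit_reverse` and `fft_radix2` are hypothetical, and the comment marks the loop where a parallel-for (for example an OpenMP directive in a C version) could plausibly be applied.

```python
import cmath


def bit_reverse(i, bits):
    """Reverse the lowest `bits` bits of the integer i."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (i & 1)
        i >>= 1
    return r


def fft_radix2(x):
    """Iterative radix-2 decimation-in-time FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    bits = n.bit_length() - 1
    # Reorder the input into bit-reversed index order.
    a = [0j] * n
    for i in range(n):
        a[bit_reverse(i, bits)] = complex(x[i])
    size = 2
    while size <= n:
        half = size // 2
        w_step = cmath.exp(-2j * cmath.pi / size)
        # Within one stage, each block of `size` points is independent of the
        # other blocks, so this loop is a natural candidate for a parallel-for.
        for start in range(0, n, size):
            w = 1 + 0j
            for k in range(half):
                u = a[start + k]
                v = a[start + k + half] * w
                a[start + k] = u + v
                a[start + k + half] = u - v
                w *= w_step
        size *= 2
    return a
```

The point count is simply the input length, so the same routine serves any power-of-two size.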
3.
A system chip targeting image and voice processing and recognition application domains is implemented as a representative of the potential of using programmable logic in system design. It features an embedded reconfigurable processor built by joining a configurable and extensible processor core and an SRAM-based embedded field-programmable gate array (FPGA). Application-specific bus-mapped coprocessors and flexible input/output peripherals and interfaces can also be added and dynamically modified by reconfiguring the embedded FPGA. The architecture of the system is discussed as well as the design flows for pre- and post-silicon design and customization. The silicon area required by the system is 20 mm² in a 0.18-μm CMOS technology. The embedded FPGA accounts for about 40% of the system area.
4.
Kisun You Young-kyu Choi Jungwook Choi Wonyong Sung 《Journal of Signal Processing Systems》2011,63(1):95-105
We have developed a memory access reduced VLSI chip for 5,000 word speaker-independent continuous speech recognition. This
chip employs a context-dependent HMM (hidden Markov model) based speech recognition algorithm, and contains parallel and pipelined
hardware units for emission probability computation and Viterbi beam search. To maximize the performance, we adopted several
memory access reduction techniques such as sub-vector clustering and multi-block processing for the emission probability computation.
We also employed a custom DRAM controller for efficient access of consecutive data. Moreover, we analyzed the access pattern
of data to minimize the internal SRAM size while maintaining high performance. The experimental results show that the implemented
system performs speech recognition 2.4 and 1.8 times faster than real-time utilizing 32-bit DDR SDRAM and SDR SDRAM, respectively.
5.
6.
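The basic principle of the cylindrical projection algorithm is introduced, and a system design based on a hardware-logic implementation is presented. The design is based on an Altera Cyclone III FPGA: external data is first written into DDR2 SDRAM, and the cylindrical projection algorithm is then implemented in hardware logic. Camera Link timing is generated automatically for the back end, and the output is converted by a display module and sent to the DVI interface of an LCD for real-time display. The displayed images show that the hardware-logic cylindrical projection achieves the expected result.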
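The abstract does not give the projection formulas, so the following is only a sketch of the commonly used forward cylindrical mapping, assuming a pinhole model with focal length `f` in pixels and a projection centred on the image centre. The function name and parameters are hypothetical.

```python
import math


def cylindrical_project(x, y, width, height, f):
    """Map planar pixel (x, y) onto a cylinder of radius f (focal length in pixels)."""
    cx, cy = width / 2.0, height / 2.0
    dx, dy = x - cx, y - cy
    # The horizontal position becomes an arc length on the cylinder.
    theta = math.atan2(dx, f)
    xc = f * theta + cx
    # The vertical position is scaled by the distance from the axis to the pixel.
    yc = f * dy / math.hypot(dx, f) + cy
    return xc, yc
```

A hardware version would typically precompute this mapping into a lookup table rather than evaluate `atan2` per pixel; that is a design assumption, not something the abstract states.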
7.
The FPGA is an appealing platform for accelerating DNNs. We survey a range of FPGA chip designs for AI. For the DSP module, one type of design supports low-precision operations, such as 9-bit or 4-bit multiplication. The other type supports floating-point multiply-accumulates (MACs), which guarantee high DNN accuracy. For the ALM (adaptive logic module), one type of design supports low-precision MACs; three modifications of the ALM are an extra carry chain, a 4-bit adder, or shadow multipliers, which increase the density of on-chip MAC operations. The other enhancement of the ALM or CLB (configurable logic block) is support for BNNs (binarized neural networks), an ultra-reduced-precision version of DNNs. For memory modules, which store the weights and activations of DNNs, three types of memory are proposed: embedded memory, in-package HBM (high bandwidth memory), and off-chip memory interfaces such as DDR4/5. Other designs are new architectures and specialized AI engines. Xilinx ACAP in 7 nm is the industry's first adaptive compute acceleration platform. Its AI engine can provide up to 8X silicon compute density. Intel AgileX in 10 nm works coherently with Intel's own CPUs, which increases computation performance and reduces overhead and latency.
8.
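In the design of high-speed data transceiver systems, the first problem to solve is high-speed buffering of real-time data, yet the limited on-chip memory resources of an FPGA cannot meet the requirements of buffering massive amounts of data. To solve this problem, a novel ping-pong double-buffering scheme based on DDR2 SDRAM is proposed. The scheme implements two large-capacity asynchronous FIFOs based on DDR2 SDRAM, and selection logic inside the FPGA performs ping-pong operation between the two paths, thereby achieving high-speed data buffering. Experimental results show that the DDR2 SDRAM-based data transceiver system achieves 512 Mbit of buffer space per path and a bus rate of 200 MHz, solving the problem of high-speed buffering of massive data.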
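The ping-pong idea can be shown with a small software model: the writer fills one bank while the reader drains the other, and the roles swap when a bank fills. This is a behavioural sketch only; the class `PingPongBuffer` and its methods are hypothetical and do not model DDR2 timing or clock-domain crossing.

```python
class PingPongBuffer:
    """Two fixed-size banks: the writer fills one while the reader drains the other."""

    def __init__(self, bank_size):
        self.bank_size = bank_size
        self.banks = [[], []]
        self.write_bank = 0   # index of the bank currently being filled
        self.ready = None     # index of a full bank awaiting readout, if any

    def _hand_off_if_full(self):
        # Swap roles once the write bank is full and the reader side is free.
        if len(self.banks[self.write_bank]) == self.bank_size and self.ready is None:
            self.ready = self.write_bank
            self.write_bank ^= 1

    def write(self, word):
        """Store one word; return False (back-pressure) when both banks are occupied."""
        bank = self.banks[self.write_bank]
        if len(bank) == self.bank_size:
            return False
        bank.append(word)
        self._hand_off_if_full()
        return True

    def read_bank(self):
        """Return the contents of the full bank, or None if no bank is ready."""
        if self.ready is None:
            return None
        data = self.banks[self.ready]
        self.banks[self.ready] = []
        self.ready = None
        self._hand_off_if_full()
        return data
```

As long as the reader drains a bank faster than the writer fills the other one, `write` never returns False, which is the condition a hardware design would need to guarantee.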
9.
10.
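A scaler is an image scaling engine widely used in flat-panel display systems; it scales input images of different resolutions and outputs them to the flat-panel display at a fixed resolution. Based on an analysis of the scaler system architecture, this paper first proposes three timing constraints and derives the corresponding formulas. When the three constraints are satisfied, the FIFO and line buffers in the scaler neither overflow nor underflow and the display frame is synchronized with the input frame, which resolves the scaler's timing problem. The design of an image scaling engine based on the bilinear interpolation algorithm is then described; the engine is implemented on an FPGA, a test environment is built to verify the logic functionality of the entire scaler, and the verification results are given.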
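For readers unfamiliar with bilinear interpolation, the following is a minimal software sketch of image scaling with it. It assumes corner-aligned sampling and floating-point weights, which may differ from the paper's fixed-point hardware engine; `bilinear_resize` is a hypothetical name.

```python
def bilinear_resize(src, out_h, out_w):
    """Resize a 2-D list of numbers with bilinear interpolation (corner-aligned)."""
    in_h, in_w = len(src), len(src[0])
    y_scale = (in_h - 1) / (out_h - 1) if out_h > 1 else 0.0
    x_scale = (in_w - 1) / (out_w - 1) if out_w > 1 else 0.0
    out = []
    for oy in range(out_h):
        fy = oy * y_scale
        y0 = int(fy)
        y1 = min(y0 + 1, in_h - 1)
        wy = fy - y0  # vertical weight of the lower row
        row = []
        for ox in range(out_w):
            fx = ox * x_scale
            x0 = int(fx)
            x1 = min(x0 + 1, in_w - 1)
            wx = fx - x0  # horizontal weight of the right column
            # Interpolate along x on the two source rows, then along y.
            top = src[y0][x0] * (1 - wx) + src[y0][x1] * wx
            bottom = src[y1][x0] * (1 - wx) + src[y1][x1] * wx
            row.append(top * (1 - wy) + bottom * wy)
        out.append(row)
    return out
```

Each output pixel needs two adjacent source rows, which is why a hardware scaler keeps line buffers and why their fill level drives the timing constraints the abstract mentions.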
11.
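To address the complexity and low reusability of SDRAM controller designs, a simple SDRAM controller implementation with flexibly customizable asynchronous FIFOs is proposed based on Verilog HDL. SDRAM is often used as a buffer in image preprocessing. Because SDRAM runs at a high frequency, asynchronous FIFOs are generally used to buffer data and match its frequency, but redesigning the FIFO control every time is clearly too tedious. Taking advantage of the characteristics of FPGAs, this design on the one hand simplifies the SDRAM control timing and improves system performance, and on the other hand embeds multiple asynchronous FIFOs in the controller, so that for a different design one only needs to supply the data, clock, depth, and address for the asynchronous FIFOs of interest. This saves logic resources and achieves reuse, saving time in subsequent designs.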
12.
Hardware design of an SDRAM-based Bayer-format image interpolation algorithm (total citations: 1; self-citations: 0; citations by others: 1)
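Because the traditional bilinear color-recovery algorithm for Bayer-format images gives unsatisfactory results, a new algorithm is proposed and a hardware scheme for implementing it on an FPGA is designed. The improved algorithm uses gradient changes to increase the correlation between channels and improve on linear interpolation. Because the detector's data output format does not meet the requirements of the Bayer interpolation algorithm, a new SDRAM-based hardware processing mechanism using techniques such as ping-pong operation and pipelining is designed. The whole system uses a single field-programmable gate array (FPGA) as the hardware platform and is described in the Verilog HDL hardware description language with a top-down modular design. Experiments show that the algorithm works correctly in hardware and that the real-time color output is better than the bilinear interpolation algorithm in both PSNR and visual quality, largely meeting the system's requirements for real-time performance and image quality.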
13.
A new architecture of field programmable gate array for high-speed datapath applications is presented. Its implementation is facilitated by a configurable interconnect technology based on a one-time, two-terminal programmable, very low-impedance anti-fuse and by a configurable logic module optimized for datapath applications. The configurable logic module can effectively implement diverse logic functions including sequential elements such as latches and flip-flops, and arithmetic functions such as a one-bit full adder and a two-bit comparator. A novel programming architecture is designed for supplying large current through the anti-fuse element, which drops the on-resistance of the anti-fuse below 20 Ω. The chip has been fabricated using a 0.8-μm n-well complementary metal oxide semiconductor technology with two layers of metallization.
14.
15.
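To address the poor flexibility and poor real-time performance of traditional industrial digital cameras, an FPGA-based industrial digital camera system is designed. The FPGA controls the image sensor through an I2C bus interface controller to acquire image data, and the Bayer-format image is then converted to an RGB-format image. A DDR2 SDRAM interface is designed using the Altera IP core DDRII SDRAM controller with ALTMEMPHY and FIFO memories, and the image data is buffered in DDR2 memory. Finally, the image is displayed on an LCD screen through an SPI bus interface at up to 53 frames/s. The system code requires about 5,000 logic elements, 3,704 registers, and 117 pins. After the design is downloaded to the system chip, the system clearly displays the captured picture. The results show that the FPGA-based industrial digital camera design is flexible, easy to port, and capable of high-speed image acquisition and transmission.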
16.
17.
18.
Software defined radios (SDR) are highly configurable hardware platforms that provide the technology for realizing the rapidly expanding third (and future) generation digital wireless communication infrastructure. While there are a number of silicon alternatives available for implementing the various functions in a SDR, field programmable gate arrays (FPGAs) are an attractive option for many of these tasks for reasons of performance, power consumption and flexibility. Amongst the more complex tasks performed in a high data rate wireless system is synchronization. This paper examines carrier synchronization in SDRs using FPGA based signal processors. We provide a tutorial style overview of carrier recovery techniques for QPSK and QAM modulation schemes and report on the design and FPGA implementation of a carrier recovery loop for a 16-QAM modem. Two design alternatives are presented to highlight the rich design space accessible using configurable logic. The FPGA device utilization and performance for a carrier recovery circuit using a look-up table approach and CORDIC arithmetic are presented. The simulation and FPGA implementation process using a recent system level design tool called System Generator for DSP is described.
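To make the CORDIC alternative concrete, here is a minimal floating-point sketch of CORDIC in rotation mode, the kind of operation a carrier recovery loop could use to derotate a received symbol by the estimated phase. It is not the paper's circuit; `cordic_rotate` and its parameters are hypothetical, and a hardware version would use fixed-point shifts and a small arctangent table instead of `math.atan`.

```python
import math


def cordic_rotate(x, y, angle, iterations=16):
    """Rotate (x, y) by `angle` radians using CORDIC micro-rotations.

    Converges for |angle| up to roughly 1.74 rad.
    """
    # Accumulated magnitude gain of the micro-rotations, compensated at the end.
    gain = 1.0
    for i in range(iterations):
        gain *= math.sqrt(1.0 + 2.0 ** (-2 * i))
    z = angle  # residual angle still to be rotated
    for i in range(iterations):
        d = 1.0 if z >= 0 else -1.0
        shift = 2.0 ** -i  # a right shift by i bits in fixed-point hardware
        x, y = x - d * y * shift, y + d * x * shift
        z -= d * math.atan(shift)
    return x / gain, y / gain
```

For example, `cordic_rotate(1.0, 0.0, phi)` approximates `(cos(phi), sin(phi))`, which is the same quantity a look-up table approach would read from memory.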
19.
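By analyzing the detailed structure of the FPGA configurable logic block, a fine-grained FPGA mapping method is proposed and used to efficiently implement a systolic array for large-number modular multiplication. While retaining high-speed computation, the resource consumption of the modular-multiplication systolic array is reduced to one third of the original. A 1024-bit modular multiplier can be implemented in a low-cost 200,000-gate FPGA device. This implementation performs 20 RSA signatures per second; with a high-performance FPGA, the signing rate can be raised to 40 per second.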
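The abstract does not say which modular multiplication algorithm the systolic array implements. Assuming it is Montgomery multiplication, a common choice for such arrays, the bit-serial recurrence that each array cell would evaluate can be sketched as follows. The function name and parameters are hypothetical.

```python
def montgomery_multiply(a, b, n, bits):
    """Return a * b * 2**(-bits) mod n for odd n < 2**bits and 0 <= a, b < n.

    One bit of `a` is consumed per iteration, using only additions and a
    one-bit right shift.
    """
    s = 0
    for i in range(bits):
        if (a >> i) & 1:
            s += b
        if s & 1:
            s += n  # make s even so the shift below divides exactly
        s >>= 1
    # The loop keeps s below 2n, so a single conditional subtraction suffices.
    return s - n if s >= n else s
```

To recover an ordinary product, the result can be multiplied once more by `2**(2*bits) % n`, for example `montgomery_multiply(montgomery_multiply(a, b, n, bits), (1 << (2 * bits)) % n, n, bits)`, which yields `a * b % n`.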