Found 20 similar documents; search time: 31 ms
1.
2.
3.
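This paper proposes a hard intellectual property (IP) core architecture for an embedded digital signal processing (DSP) block in field-programmable gate arrays (FPGAs) that supports efficient variable-bit-width addition. Compared with the Altera Stratix-III DSP architecture, the proposed optimized architecture implements addition, multiply-add, accumulation, and other applications more efficiently. Input data of different types and bit widths are preprocessed in software, which reduces hardware resource overhead and further improves circuit performance. In addition, a multiplier bypass and an adder circuit with two-level sign extension are added to the DSP architecture; these reduce the DSP implementation area while supporting very wide, high-speed pipelined addition, which broadens the range of applications of the DSP block. The proposed DSP IP core was designed and implemented in a TSMC 55 nm standard CMOS process and supports nine operating modes, including 72-bit variable-bit-width addition and 36-bit variable-bit-width multiplication.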
4.
Kai Sun Meng Wang Zili Shao Hui Liu Hongxing Wei Tianmiao Wang 《Journal of Signal Processing Systems》2010,59(1):71-83
MPSoC (Multi-Processor System-on-Chip) architecture is increasingly used because it gives designers many more
opportunities to meet specific performance and power goals. In this paper, we propose an MPSoC architecture for implementing
real-time signal processing in a gamma camera. Based on a full analysis of the characteristics of the application, we design
several algorithms to optimize the system in terms of processing speed, power consumption, and area cost. Two types
of DSP core have been designed for the integral algorithm and the coordinate algorithm, the key parts of signal processing
in a gamma camera. An interconnection synthesis algorithm is proposed to reduce the area cost of the Network-on-Chip. We implement
our MPSoC architecture on an FPGA, and synthesize the DSP cores and Network-on-Chip using Synopsys Design Compiler with a UMC 0.18
μm standard cell library. The results show that our technique can effectively accelerate the processing and satisfy the requirements
of real-time signal processing for 256 × 256 image construction.
5.
6.
Seyed Mohammad Ali Zeinolabedin Nader Karimi Shadrokh Samavi Tony Tae-Hyoung Kim 《Circuits, Systems, and Signal Processing》2016,35(3):953-976
Contourlet transform (CT) is a powerful image processing tool. Even though many promising applications have been proposed, no hardware implementation of CT has been reported. This paper analyzes CT to form a structure which is hardware implementable. CT consists of two main parts, the Laplacian pyramid (LP) and the directional filter bank (DFB). In both parts, novel algorithmic changes are proposed for realizing an efficient hardware architecture. In the proposed LP structure, the number of arithmetic operations is reduced by 50% and it operates twice as fast as the existing implementations. To the best of our knowledge, DFB has not been comprehensively studied for hardware implementation so far. Thus, we first analyze DFB to work out its hardware-oriented structure and then propose a DFB architecture. Finally, analysis and simulation results demonstrate that the proposed CT architecture achieves real-time performance (40 frames/s) operating at 76 MHz, which is verified through FPGA implementation. Moreover, since all stages use fixed-point arithmetic operations, a comprehensive quantization analysis is performed to keep the MSE and PSNR values in an acceptable range.
7.
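To address the heavy computational load and long run times of most synthetic aperture radar (SAR) algorithms in real-time processing, a new hardware solution based on a multi-core digital signal processor (DSP) and the Serial RapidIO (SRIO) high-speed serial interconnect is proposed. The paper discusses the main flow for implementing SAR algorithms with a multi-core DSP and SRIO in a field-programmable gate array (FPGA) + DSP architecture, and uses pipelining on the multi-core DSP to optimize the fast Fourier transform (FFT) algorithm. Together, the multi-core DSP, pipelining, and SRIO speed up both computation and data transfer, thereby shortening the processing time.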
8.
FPGA technology for multi-axis control systems  Cited by: 1 (self-citations: 0, citations by others: 1)
Armando Astarloa Jesús Lázaro Unai Bidarte Jaime Jiménez Aitzol Zuloaga 《Mechatronics》2009,19(2):258-268
The research presented in this article applies the newest Field-Programmable Gate Arrays to implement motor controller devices in accordance with current core-based design. The flexibility of System-on-a-Programmable-Chip (SoPC) devices in multi-axis motor control systems enables the most computation-intensive operations to be processed in hardware (PID IP cores) and the trajectory computation in software in the same device. In those systems, the trajectory generation software may run on powerful microprocessors embedded in the FPGA. In this paper, we present a high-performance PID IP core controller described in VHDL, the design flow that has been followed in its design, and how the simulation and the tuning of the PID constants have been approached. The reusability of this module is demonstrated with the design of a 4-axis SoPC controller. Additionally, an experimental self-reconfigurable SoPC design using Run-Time Reconfiguration is presented. In this case, the control IP core can be replaced dynamically by another module with different features.
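As a rough software illustration of what a PID core computes, the following is a minimal sketch of a textbook positional PID update. It is not the VHDL core described in the abstract: the class name `PID`, its fields, and the floating-point arithmetic are all assumptions made for illustration (a hardware core would typically use fixed-point arithmetic).

```python
from dataclasses import dataclass


@dataclass
class PID:
    """Hypothetical textbook PID controller, not the paper's IP core."""
    kp: float  # proportional gain
    ki: float  # integral gain
    kd: float  # derivative gain
    dt: float  # sample period in seconds
    integral: float = 0.0
    prev_error: float = 0.0

    def step(self, setpoint: float, measured: float) -> float:
        """Return the control output for one sample period."""
        error = setpoint - measured
        # Rectangular (backward Euler) integration of the error.
        self.integral += error * self.dt
        # Backward-difference estimate of the error derivative.
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative
```

In a multi-axis setting, one such object per axis would be stepped once per sample period, which loosely mirrors instantiating one IP core per axis.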
9.
10.
B. Mohan Kumar R. Vidhya Lavanya E.P. Sumesh 《International Journal of Electronics》2013,100(3):288-301
Wavelet transform is considered one of the efficient transforms of this decade for real-time signal processing. Due to implementation constraints, scalar wavelets do not possess properties such as compact support, regularity, orthogonality and symmetry, which are desirable for providing a good signal-to-noise ratio (SNR) in signal denoising. This has led to the evolution of a new dimension of wavelets called ‘multiwavelets’, which possess more than one scaling and wavelet filter. The architecture implementation of multiwavelets is an emerging area of research. In real time, the signals are in scalar form, which demands that the processing architecture be scalar. But the conventional Donovan Geronimo Hardin Massopust (DGHM) and Chui-Lian (CL) multiwavelets are vectored and are also unbalanced. In this article, the vectored multiwavelet transforms are converted into a scalar form and the resulting architecture is implemented in an FPGA (Field Programmable Gate Array) for a signal denoising application. The architecture is compared with the DGHM multiwavelets architecture in terms of several objective and performance measures. The CL multiwavelets architecture is further optimised for best performance by using DSP48Es. The results show that the CL multiwavelet architecture is better suited to the signal denoising application.
11.
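A design method is proposed for a parallel-array VLSI architecture that implements the two-dimensional discrete wavelet transform (DWT) of a JPEG2000 coding system using the lifting scheme. The resulting architecture consists of two row processors, one column processor, and a small number of row buffers. The row and column processors are built from parallel arrays of processing elements, so row and column filtering can proceed simultaneously, and multiplications are replaced by optimized shift-and-add operations. The whole architecture is pipelined. At the same precision, it greatly reduces the amount of computation, raises hardware utilization to nearly 100%, speeds up the transform, and reduces circuit size. For an N×N image, processing takes O(N²/2) clock cycles. The 2-D DWT filter architecture has been verified on an FPGA and can be used as a standalone IP core in a JPEG2000 image codec chip under development.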
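To make the lifting and shift-and-add ideas concrete, here is a minimal 1-D sketch. It assumes the reversible 5/3 filter used in JPEG2000 lossless coding; the abstract does not say which filter the architecture targets, so that choice, the function names `fwd53` and `inv53`, and the restriction to even-length integer inputs are all assumptions.

```python
def fwd53(x):
    """One level of the 1-D reversible 5/3 lifting transform.

    x: list of ints with even length. Returns (lowpass, highpass).
    Divisions by 2 and 4 are done with right shifts (shift-and-add only).
    """
    if len(x) < 2 or len(x) % 2:
        raise ValueError("expected an even-length input")
    even, odd = x[0::2], x[1::2]
    half = len(even)
    d = []
    for i in range(half):
        # Symmetric extension at the right edge: reuse the last even sample.
        right = even[i + 1] if i + 1 < half else even[i]
        d.append(odd[i] - ((even[i] + right) >> 1))      # predict step
    s = []
    for i in range(half):
        # Symmetric extension at the left edge: d[-1] mirrors d[0].
        left = d[i - 1] if i > 0 else d[0]
        s.append(even[i] + ((left + d[i] + 2) >> 2))     # update step
    return s, d


def inv53(s, d):
    """Inverse of fwd53: undo the update step, then the predict step."""
    half = len(s)
    even = []
    for i in range(half):
        left = d[i - 1] if i > 0 else d[0]
        even.append(s[i] - ((left + d[i] + 2) >> 2))
    x = []
    for i in range(half):
        right = even[i + 1] if i + 1 < half else even[i]
        x.extend([even[i], d[i] + ((even[i] + right) >> 1)])
    return x
```

A 2-D transform applies the same 1-D step along rows and then along columns; the architecture in the abstract overlaps those two passes in hardware, which this sequential sketch does not attempt to model.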
12.
Tuomo Hänninen Janne Janhunen Markku Juntti 《Analog Integrated Circuits and Signal Processing》2014,78(3):645-655
We summarize our recent state-of-the-art programmable and reconfigurable detector and QR decomposition (QRD) implementations targeting 3G long term evolution (LTE) downlink and uplink requirements. The downlink transmission is based on orthogonal frequency division multiplexing, whereas the uplink transmission uses single-carrier frequency-division multiple access. The downlink implementations are based on the programmable transport triggered architecture (TTA), which provides a flexible and energy-efficient architecture template. In the TTA detector implementation, the LTE detection rate requirements up to 20 MHz bandwidth and a 4 × 4 antenna system with 64-QAM are achieved by using 1–6 programmable cores in parallel. Each core runs at a 277 MHz clock frequency and consumes 55.5–64.0 mW depending on the detector configuration. The downlink detector is based on the selective spanning with fast enumeration algorithm. The uplink field-programmable gate array (FPGA) detector implementation targets a 4 × 4 antenna system with 64-QAM and achieves the detection rate requirement for 20 MHz bandwidth. The FPGA board used for the uplink implementation is a Xilinx Virtex-6, and the implementation has been carried out using the Xilinx Vivado high-level synthesis tool. Two different detector architectures are implemented. The first achieves the detection rate requirement with a single processing block running at 231 MHz and the second with four blocks in parallel, each running at 247 MHz. The implemented detector is based on the K-best algorithm. A multiple-input multiple-output receiver requires QRD to produce valid inputs for the detector. In addition to the detector implementations, QRD is also implemented on both TTA and FPGA. The modified Gram–Schmidt algorithm is used in both QRD implementations.
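For readers unfamiliar with the QRD step, the following is a minimal sketch of the modified Gram–Schmidt algorithm. It assumes a real-valued matrix with full column rank, stored as a list of rows; LTE channel matrices are complex, and the function name `mgs_qr` is hypothetical, so this shows only the order of operations, not the TTA or FPGA implementations.

```python
import math


def mgs_qr(a):
    """Modified Gram-Schmidt QR of a real m x n matrix (list of rows).

    Assumes full column rank. Returns (Q, R) with Q of size m x n having
    orthonormal columns and R upper triangular of size n x n.
    """
    m, n = len(a), len(a[0])
    # Work on a copy of the columns; each one is updated in place.
    v = [[float(a[i][j]) for i in range(m)] for j in range(n)]
    q_cols = []
    r = [[0.0] * n for _ in range(n)]
    for j in range(n):
        r[j][j] = math.sqrt(sum(x * x for x in v[j]))
        qj = [x / r[j][j] for x in v[j]]
        q_cols.append(qj)
        # Unlike classical Gram-Schmidt, the projection onto qj is removed
        # immediately from every remaining column.
        for k in range(j + 1, n):
            r[j][k] = sum(qx * vx for qx, vx in zip(qj, v[k]))
            v[k] = [vx - r[j][k] * qx for vx, qx in zip(v[k], qj)]
    q = [[q_cols[j][i] for j in range(n)] for i in range(m)]
    return q, r
```

Removing each projection from the remaining columns as soon as it is known is what distinguishes the modified variant from the classical one, and it generally behaves better in finite-precision arithmetic.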
13.
Crookes D. Benkrid K. Bouridane A. Alotaibi K. Benkrid A. 《Vision, Image and Signal Processing, IEE Proceedings -》2000,147(4):377-384
Reconfigurable hardware in the form of field programmable gate arrays (FPGAs) has been proposed as a way of obtaining high performance for computationally intensive DSP applications such as image processing (IP), even under real-time requirements. The inherent reprogrammability of FPGAs gives them some of the flexibility of software while keeping the performance advantages of an application-specific solution. However, a major disadvantage of FPGAs is their low-level programming model. To bridge the gap between these two levels, the authors present a high-level software environment for FPGA-based image processing, which aims to hide hardware details as much as possible from the user. Their approach is to provide a very high-level image processing coprocessor (IPC) with a core instruction set based on the operations of image algebra. The environment includes a generator which generates optimised architectures for specific user-defined operations.
14.
《Journal of Visual Communication and Image Representation》2008,19(1):1-11
This paper presents a novel hardware implementation of a disparity estimation scheme targeted at real-time Integral Photography (IP) image and video sequence compression. The software developed for IP image compression achieves high quality ratios over classic methodologies by exploiting the inherent redundancy that is present in IP images. However, there are certain time constraints on the software approach that must be confronted in order to address real-time applications. Our main effort is to achieve real-time performance by implementing in hardware the most time-consuming parts of the compression algorithm. The proposed novel digital architecture features minimized memory read operations and extensive simultaneous processing, while taking into account the memory and data bandwidth limitations of a single-FPGA implementation. Our results demonstrate that the implemented hardware system can successfully process high-resolution IP video sequences in real time, addressing a vast range of applications, from mobile systems to demanding desktop displays.
15.
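To improve the target-tracking accuracy of image processors and their adaptability to complex environments, a multi-sensor fusion image tracking system based on the CPCI architecture was designed. For the acquisition of high-resolution, high-frame-rate images and for target search and tracking, an embedded image processing platform built around an FPGA + DSP was designed and implemented. The standard CPCI bus serves as the data-sharing channel; the embedded image processing platform performs the real-time computation, and the data center performs data fusion of the tracking results and overall system control. Experimental results show that the system can acquire multiple CameraLink or HD-SDI image streams of different resolutions and perform fusion tracking on them. The architecture offers strong processing capability, high scalability, and a compact structure, providing a reliable solution for image fusion tracking systems.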
16.
Real-time image processing usually requires an enormous throughput rate and a huge number of operations. Parallel processing, in the form of specialized hardware, or multiprocessing is therefore indispensable. This paper describes a flexible programmable image processing system using the field programmable gate array (FPGA). The logic-cell nature of currently available FPGAs is most suitable for performing real-time bit-level image processing operations using the bit-level systolic concept. Here, we propose a novel architecture, the programmable image processing system (PIPS), for the integration of this programmable hardware and digital signal processors (DSPs) to handle the bit-level as well as the arithmetic operations found in many image processing applications. The versatility of the system is demonstrated by the implementation of a 1-D median filter.
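As a behavioural reference for the demonstration filter, the following is a minimal software sketch of a 1-D median filter. It is not the bit-level systolic design of the abstract: the function name `median_filter_1d`, the default window of 3, and the edge-replication boundary handling are assumptions.

```python
def median_filter_1d(samples, window=3):
    """Sliding-window median of a 1-D sequence.

    The window must be odd. Samples beyond either end are replaced by the
    nearest edge sample (an assumed boundary rule).
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    n = len(samples)
    out = []
    for i in range(n):
        # Clamp indices so the window never reads outside the sequence.
        neighbourhood = [
            samples[min(max(j, 0), n - 1)]
            for j in range(i - half, i + half + 1)
        ]
        out.append(sorted(neighbourhood)[half])
    return out
```

A model like this is mainly useful as a golden reference when checking a hardware implementation against test vectors.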
17.
Sanjay Singh Anil K Saini Ravi Saini A.S. Mandal Chandra Shekhar Anil Vohra 《International Journal of Electronics》2013,100(12):1705-1715
A new resource-efficient FPGA-based hardware architecture for real-time edge detection using the Sobel operator for video surveillance applications is proposed. The Sobel operator is chosen for its ability to counteract the noise sensitivity of the simple gradient operator. An FPGA is chosen for this implementation because it allows algorithmic changes at a later stage of system development and provides real-time performance that is hard to achieve with a general-purpose processor or digital signal processor, while limiting the extensive design work, time and cost required for an application-specific integrated circuit. The proposed architecture uses a single processing element for both the horizontal and vertical gradient computations of the Sobel operator and uses approximately 38% fewer FPGA resources than the standard Sobel edge detection architecture while maintaining real-time frame rates for high-definition video (1920 × 1080 image size). The complete system is implemented on a Xilinx ML510 (Virtex-5 FX130T) FPGA board.
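For reference, the Sobel operator itself can be sketched in a few lines of software. This is a plain behavioural model, not the shared-processing-element architecture of the abstract; the function name `sobel`, the |Gx| + |Gy| magnitude approximation, the saturation to 255, and leaving border pixels at zero are all assumptions.

```python
# Standard 3x3 Sobel kernels for horizontal and vertical gradients.
GX = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))
GY = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))


def sobel(image):
    """Return the Sobel edge magnitude of a greyscale image.

    image: list of equal-length rows of ints in 0..255.
    The magnitude is approximated as |Gx| + |Gy| and saturated to 255.
    Border pixels are left at 0.
    """
    h, w = len(image), len(image[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = gy = 0
            # Both gradients are accumulated from the same 3x3 window.
            for dy in range(3):
                for dx in range(3):
                    p = image[y + dy - 1][x + dx - 1]
                    gx += GX[dy][dx] * p
                    gy += GY[dy][dx] * p
            out[y][x] = min(255, abs(gx) + abs(gy))
    return out
```

The absolute-sum magnitude avoids a square root, which is a common simplification in hardware-oriented implementations.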
18.
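Based on the design concept and requirements of a miniature airborne imaging and tracking system, a real-time image tracking and processing platform was designed and implemented with the high-performance DSP chip TMS320-DM642 as the core processor, combined with CPLD and FPGA programmable logic devices. The platform uses a particle-filter-based target tracking algorithm to track targets in real time. A Kalman filter is used to improve particle utilization, which improves the real-time performance of the algorithm, resolves the latency problem of the image tracking system, and improves tracking stability. Algorithm simulation results show that, compared with the traditional correlation matching algorithm, the particle-filter-based tracking algorithm is more robust and has better real-time performance, and that it meets the real-time image tracking requirements of airborne imaging and tracking systems.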
19.