Similar literature
1.
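A design method is proposed for a parallel-array VLSI architecture that implements the two-dimensional discrete wavelet transform (DWT) of the JPEG2000 coding system using the lifting scheme. The architecture consists of one row processor and one column processor, which filter simultaneously through time-division multiplexing. Optimized shift-add operations replace multiplications, and an embedded data-extension algorithm handles boundary extension. The whole architecture is pipelined, which reduces the amount of computation and improves hardware resource utilization. The architecture can be used in JPEG2000 image coding chips.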

2.
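VLSI architecture design of a parallel-array wavelet filter for JPEG2000   Total citations: 2 (self-citations: 0, citations by others: 2)
Lan Xuguang, Zheng Nanning, Mei Kuizhi, Liu Yuehu. Acta Electronica Sinica (《电子学报》), 2004, 32(11): 1806-1809
A design method is proposed for a parallel-array VLSI architecture that implements the two-dimensional discrete wavelet transform (DWT) of the JPEG2000 coding system using the lifting scheme. The resulting architecture consists of two row processors, one column processor, and a small number of line buffers. The row and column processors are built internally from parallel arrays of processing elements, so that the row and column filters can operate simultaneously, and optimized shift-add operations replace multiplications. The whole architecture is pipelined. At the same precision, it greatly reduces the amount of computation, raises hardware resource utilization to nearly 100%, speeds up the transform, and reduces circuit size. For an N×N image, the processing time is O(N^2/2) clock cycles. The 2-D DWT filter architecture has been verified on an FPGA and can be used as a standalone IP core in a JPEG2000 image codec chip under development.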

3.
A novel decomposed lifting scheme (DLS) is presented to perform the one-dimensional (1D) discrete wavelet transform (DWT) with consistent data flow in both the row and column dimensions. Based on the proposed DLS, intermediate data can be transferred seamlessly between the column processor and the row processor in the hardware implementation of the two-dimensional (2D) DWT, reducing on-chip memory, output latency, and control complexity. Moreover, the implementation of the 2D DWT can be easily extended to achieve higher processing speed with a controlled increase in hardware cost. Two memory-efficient, high-speed architectures are proposed to implement the 2D DWT for JPEG2000: the fast architecture (FA) and the high-speed architecture (HA). FA and HA perform the 2D DWT in N^2/2 and N^2/4 clock cycles for an N×N image, respectively, while the required internal memory is only 4N for the 9/7 DWT and 2N for the 5/3 DWT. Compared with previously reported designs, the proposed designs perform well in hardware cost, control complexity, output latency, and computing time. The proposed designs were implemented to process the 2D 9/7 DWT in an SMIC 0.18 μm CMOS logic process with 4 KB of internal memory for a 512 × 512 image. The areas are only 999137 μm^2 and 1333054 μm^2 for FA and HA, respectively, and the operating frequency can reach 150 MHz.

4.
We propose an architecture that performs the forward and inverse discrete wavelet transform (DWT) using a lifting-based scheme for the set of seven filters proposed in JPEG2000. The architecture consists of two row processors, two column processors, and two memory modules. Each processor contains two adders, one multiplier, and one shifter. The precision of the multipliers and adders has been determined using extensive simulation. Each memory module consists of four banks in order to support the high computational bandwidth. The architecture has been designed to generate an output every cycle for the JPEG2000 default filters. The schedules have been generated by hand and the corresponding timings listed. Finally, the architecture has been implemented in behavioral VHDL. The estimated area of the proposed architecture in 0.18-μm technology is 2.8 mm^2, and the estimated frequency of operation is 200 MHz.

5.
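A design method is proposed for an efficient parallel VLSI architecture that implements the two-dimensional discrete 5/3 wavelet transform (DWT) using the lifting scheme. The row and column filters operate simultaneously and the design is pipelined. At the same precision, it greatly reduces the amount of computation, speeds up the transform, and saves hardware resources. The method has passed Verilog HDL behavioral-level simulation and can be used as a standalone IP core in JPEG2000 image encoder and decoder chips. The architecture can be extended to the 9/7 wavelet lifting structure.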
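
To make the lifting steps mentioned in this abstract concrete, the following is a minimal Python sketch of a one-level, one-dimensional reversible 5/3 lifting transform using the predict and update equations of the JPEG2000 reversible filter. It is not the architecture from the paper. The function names `forward_53` and `inverse_53`, the restriction to even-length input, and the way the two ends are mirrored are assumptions made for this example. Both steps use only additions and arithmetic shifts.

```python
def forward_53(x):
    """One level of reversible 5/3 lifting on an even-length list of ints.

    Returns (low, high). Edges are mirrored (x[N] = x[N-2], d[-1] = d[0]),
    which is an assumption of this sketch.
    """
    if len(x) < 2 or len(x) % 2:
        raise ValueError("this sketch expects an even length >= 2")
    even, odd = x[0::2], x[1::2]
    half = len(even)
    # Predict: d[i] = odd[i] - floor((even[i] + even[i+1]) / 2)
    high = []
    for i in range(half):
        right = even[i + 1] if i + 1 < half else even[i]
        high.append(odd[i] - ((even[i] + right) >> 1))
    # Update: s[i] = even[i] + floor((d[i-1] + d[i] + 2) / 4)
    low = []
    for i in range(half):
        left = high[i - 1] if i > 0 else high[0]
        low.append(even[i] + ((left + high[i] + 2) >> 2))
    return low, high


def inverse_53(low, high):
    """Undo forward_53 by running the lifting steps in reverse order."""
    half = len(low)
    even = []
    for i in range(half):
        left = high[i - 1] if i > 0 else high[0]
        even.append(low[i] - ((left + high[i] + 2) >> 2))
    odd = []
    for i in range(half):
        right = even[i + 1] if i + 1 < half else even[i]
        odd.append(high[i] + ((even[i] + right) >> 1))
    x = [0] * (2 * half)
    x[0::2] = even
    x[1::2] = odd
    return x
```

Because each lifting step is undone by subtracting exactly what was added, the round trip is lossless for integer input. A 2-D transform would apply the same routine along rows and then along columns.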

6.
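Hardware design of the wavelet lifting algorithm for JPEG2000   Total citations: 7 (self-citations: 1, citations by others: 6)
Dong Wenhui, Liu Mingye. Acta Electronica Sinica (《电子学报》), 2003, 31(11): 1674-1677
The discrete wavelet transform underlies many current image processing and compression techniques and has been adopted by JPEG2000, the latest ISO/IEC still-image compression standard. The lifting-based discrete wavelet transform requires less computation than the traditional convolution-based one. We propose a hardware architecture for the wavelet lifting algorithm in JPEG2000 that offers high overall speed, low memory requirements, and low hardware cost. We propose performing boundary extension outside the datapath, which reduces datapath complexity and improves computational efficiency. Pipelining further improves the computational efficiency of the hardware design.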

7.
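The lifting-based discrete wavelet transform requires less computation than the traditional convolution-based one and is easier to implement in VLSI. This paper proposes a lifting-based VLSI architecture design method for efficient, real-time implementation of the 9/7 biorthogonal discrete wavelet transform filter in JPEG2000. At the same precision, the resulting architecture reduces the amount of computation, offers high overall speed, low hardware cost, and low memory requirements, and reaches 100% hardware utilization. The system is described in Verilog HDL and was synthesized for the Xilinx XCV50e-cs144-8 device in the ISE 4.1 environment.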
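
For readers unfamiliar with the 9/7 lifting structure this abstract builds on, here is a small floating-point Python sketch of one level of the 1-D 9/7 transform as four lifting steps followed by scaling. The coefficient values are the commonly quoted JPEG2000 irreversible 9/7 lifting constants rounded to nine decimals. The function names, the even-length restriction, and the edge mirroring are assumptions of this sketch, and nothing here reflects the fixed-point VLSI datapath of the paper.

```python
# Lifting constants of the 9/7 filter (rounded; treat as illustrative values).
ALPHA = -1.586134342
BETA = -0.052980118
GAMMA = 0.882911075
DELTA = 0.443506852
K = 1.230174105


def _predict(even, odd, c):
    # odd[i] += c * (even[i] + even[i+1]), mirroring at the right edge.
    n = len(odd)
    return [odd[i] + c * (even[i] + even[min(i + 1, n - 1)]) for i in range(n)]


def _update(even, odd, c):
    # even[i] += c * (odd[i-1] + odd[i]), mirroring at the left edge.
    return [even[i] + c * (odd[max(i - 1, 0)] + odd[i]) for i in range(len(even))]


def forward_97(x):
    """One level of 9/7 lifting on an even-length sequence; returns (low, high)."""
    even, odd = list(x[0::2]), list(x[1::2])
    odd = _predict(even, odd, ALPHA)
    even = _update(even, odd, BETA)
    odd = _predict(even, odd, GAMMA)
    even = _update(even, odd, DELTA)
    return [e / K for e in even], [o * K for o in odd]


def inverse_97(low, high):
    """Undo forward_97: remove the scaling, then reverse each lifting step."""
    even = [v * K for v in low]
    odd = [v / K for v in high]
    even = _update(even, odd, -DELTA)
    odd = _predict(even, odd, -GAMMA)
    even = _update(even, odd, -BETA)
    odd = _predict(even, odd, -ALPHA)
    x = [0.0] * (2 * len(even))
    x[0::2] = even
    x[1::2] = odd
    return x
```

With this scaling a constant input maps to a low-pass band of approximately the same constant and a high-pass band of approximately zero, which is a quick sanity check on the constants.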

8.
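VLSI architecture design of a wavelet transformer for JPEG2000   Total citations: 3 (self-citations: 1, citations by others: 2)
JPEG2000, the new-generation still-image compression standard, adopts the discrete wavelet transform (DWT) as its core transform and recommends the fast lifting algorithm for its implementation. The spatial combinative lifting algorithm (SCLA) greatly reduces the amount of computation required by lifting. With the 9/7 wavelet filter, SCLA needs only 7/12 of the multiplications that lifting needs. This paper proposes a VLSI architecture that implements the SCLA, which requires less computation than a lifting-based implementation, speeds up the transform, and reduces circuit size. The two-dimensional forward and inverse wavelet transformer described here has been used as a standalone IP core in the JPEG2000 image codec chip that we are currently developing.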

9.
Novel architectures for the 1-D and 2-D discrete wavelet transform (DWT) using lifting schemes are presented in this paper. An embedded decimation technique is exploited to optimize the architecture for the 1-D DWT, which is designed to receive an input and generate an output with the low- and high-frequency components of the original data available alternately. Based on this 1-D DWT architecture, an efficient line-based architecture for the 2-D DWT is further proposed by employing parallel and pipeline techniques. It is mainly composed of two horizontal filter modules and one vertical filter module, working in a parallel and pipelined fashion with 100% hardware utilization. This 2-D architecture is called the fast architecture (FA) and can perform J levels of decomposition for an N × N image in approximately 2N^2(1 - 4^(-J))/3 internal clock cycles. Moreover, another efficient generic line-based 2-D architecture is proposed by exploiting the parallelism among the four subband transforms in the lifting-based 2-D DWT. It can perform J levels of decomposition for an N × N image in approximately N^2(1 - 4^(-J))/3 internal clock cycles and is therefore called the high-speed architecture. The throughput of the latter is twice that of the former 2-D architecture, with only a small additional hardware cost. Compared with previously reported designs, the proposed architectures for the 2-D DWT are efficient alternatives in the tradeoff among hardware cost, throughput, output latency, and control complexity.
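
The two cycle-count expressions quoted in this abstract can be checked numerically. The Python sketch below assumes, for illustration only, that level j of the decomposition operates on an (N / 2^(j-1)) × (N / 2^(j-1)) image and that the two architectures consume 2 and 4 samples per clock cycle respectively. The function names and the `samples_per_cycle` parameter are hypothetical and are not taken from the paper.

```python
def cycles_by_level(n, levels, samples_per_cycle):
    """Sum the per-level cycle counts, halving the image side at each level."""
    total = 0
    side = n
    for _ in range(levels):
        total += (side * side) // samples_per_cycle
        side //= 2
    return total


def cycles_closed_form(n, levels, samples_per_cycle):
    """Geometric-series form: 4 * N^2 * (1 - 4^(-J)) / (3 * samples_per_cycle)."""
    return 4 * n * n * (1 - 4.0 ** -levels) / (3 * samples_per_cycle)


if __name__ == "__main__":
    for levels in (1, 3, 5):
        fa = cycles_by_level(512, levels, samples_per_cycle=2)
        ha = cycles_by_level(512, levels, samples_per_cycle=4)
        print(levels, fa, ha)
```

Under these assumptions, 2 samples per cycle gives 2N^2(1 - 4^(-J))/3 and 4 samples per cycle gives N^2(1 - 4^(-J))/3. For J = 1 the two expressions reduce to N^2/2 and N^2/4.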

10.
This paper presents a novel unified and programmable 2-D Discrete Wavelet Transform (DWT) system architecture, which was implemented using a Field Programmable Gate Array (FPGA)-based Nios II soft-core processor working in combination with custom hardware accelerators generated through high-level synthesis. The proposed system architecture, synthesized on an Altera DE3 Stratix III FPGA board, was developed through an iterative design space exploration methodology using Altera's C2H compiler. Experimental results show that the proposed system architecture is capable of real-time video processing performance for grayscale image resolutions of up to 1920 × 1080 (1080p) when run on the Altera DE3 board, and that it outperforms the existing 2-D DWT architecture implementations known in the literature by a considerable margin in terms of throughput. While the proposed 2-D DWT system architecture satisfies real-time performance constraints, it can also perform both forward and inverse DWT, support a number of popular DWT filters used for image and video compression, and provide architecture programmability in terms of the number of levels of decomposition as well as image width and height. Based on the design principles used to implement the proposed 2-D DWT system architecture, a system design guideline can be formulated for SOC designs that plan to incorporate dedicated 2-D DWT hardware acceleration.

11.
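A finite-precision analysis for hardware implementation is carried out for the 5/3 integer filter and the 9/7 real filter recommended in JPEG2000. The optimal data width of each parameter in the wavelet transform is determined, as is the data width of the datapath of the whole transform system. Combining the characteristics of the lifting-based wavelet transform with an embedded extension algorithm, two wavelet transform architectures are proposed, a folded architecture and a long-pipeline architecture, and the two are analyzed and compared. Finally, the folded architecture is compared with other related architectures in terms of the number of memory cells required, the number of memory accesses, processing capability, and power consumption. The comparison shows that the proposed architecture has clear performance advantages.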

12.
Efficient architectures for 1-D and 2-D lifting-based wavelet transforms   Total citations: 4 (self-citations: 0, citations by others: 4)
The lifting scheme reduces the computational complexity of the discrete wavelet transform (DWT) by factoring the wavelet filters into cascades of simple lifting steps that process the input samples in pairs. We propose four compact and efficient hardware architectures for implementing lifting-based DWTs, namely, one-dimensional (1-D) and two-dimensional (2-D) versions of what we call recursive and dual scan architectures. The 1-D recursive architecture exploits interdependencies among the wavelet coefficients by interleaving, on alternate clock cycles using the same datapath hardware, the calculation of higher order coefficients along with that of the first-stage coefficients. The resulting hardware utilization exceeds 90% in the typical case of a five-stage 1-D DWT operating on 1024 samples. The 1-D dual scan architecture achieves 100% datapath hardware utilization by processing two independent data streams together using shared functional blocks. The recursive and dual scan architectures can be readily extended to the 2-D case. The 2-D recursive architecture is roughly 25% faster than conventional implementations, and it requires a buffer that stores only a few rows of the data array instead of a fixed fraction (typically 25% or more) of the entire array. The 2-D dual scan architecture processes the column and row transforms simultaneously, and the memory buffer size is comparable to existing architectures.

13.
Memory requirements and critical path are essential concerns for the 2-D Discrete Wavelet Transform (DWT). In this paper, we address this problem and develop a memory-efficient high-speed architecture for the multi-level two-dimensional DWT. First, a dual data scanning technique is adopted in the 2-D 9/7 DWT processing unit to perform lifting operations, which doubles the throughput per cycle. Second, for the 2-D DWT architecture, the proposed Row Transform Unit and Column Transform Unit take advantage of input sample availability and provision computing resources accordingly to optimize the processing speed, and the number of processors is further optimized to significantly reduce the hardware cost. Third, to address the high memory cost of the intermediate results from each level and the growth of computation time as the resolution level increases, multiple proposed 2-D DWT units are combined to build a parallel multi-level architecture, which can perform up to six levels of 2-D DWT in a resolution-level-parallel way on any image size at competitive hardware cost. Experimental results demonstrate that the proposed scheme achieves improved hardware performance with significantly reduced on-chip memory resources and computation time, outperforming state-of-the-art schemes and making it desirable in memory-constrained real-time application systems.

14.
1 Introduction. JPEG2000, which is a new image compression standard, enables the achievement of higher image compression ratios than JPEG and also has superior features such as lower tile boundary noise and higher image quality. Furthermore, it also features various powerful functions, such as highly hierarchical encoding, region-of-interest (ROI), lossless compression, etc. [1]. As a result, it is highly expected that it will replace the existing JPEG for applications such as surveillance network camera…

15.
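A parallel pipelined architecture for a high-speed Tier-1 encoder in a JPEG2000 coding chip is proposed and implemented. The encoder uses dual-bit-plane parallel coding, pipelined control of coding-pass scanning, a circuit that generates state variables on the fly, and parallel context generation within a column, yielding a multi-parallel pipelined bit-plane coder that needs no state memory. A parallel, synchronously pipelined arithmetic coder with multi-symbol input, together with a multi-input synchronous read circuit that copes with a variable number of arithmetic coding cycles, brings the average arithmetic coding speed to 1.3 context-decision pairs per clock cycle. Storage of the compressed bitstream produced by the arithmetic coder is organized as an efficient macro-pipeline. At a 100 MHz clock, the encoder reaches a peak coding speed of 85 M wavelet coefficients/s. Synthesized with an SMIC 0.25 μm process library, the design occupies 63,000 gates and 26 kb of on-chip memory (code-block size 32×32), with a critical path of 5.2 ns.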

16.
This paper proposes two JPEG 2000 compliant architectures: one for DWT (Discrete Wavelet Transform) and one for IWT (Integer Wavelet Transform) implementation. First, some theoretical issues about the DWT and IWT are discussed; then, starting from the characteristics of the transforms, the architectures are presented, showing both performance and cost. Many DWT architectures have been proposed in the literature; our implementation is a new architecture that computes the DWT using filters of interest for the forthcoming JPEG 2000 standard. Moreover, we propose a Lifting Scheme based architecture for the IWT that is also JPEG 2000 compliant. The proposed architectures are able to support real-time streams: the DWT architecture consists of 20,000 cells, with an input throughput of 160 Msamples per second and a clock frequency of 160 MHz; the IWT architecture consists of 50,000 cells, with an input throughput of 4.5 Msamples per second and an internal clock frequency of 108 MHz.

17.
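A fast and efficient hardware implementation of the two-dimensional single-level wavelet transform   Total citations: 2 (self-citations: 1, citations by others: 1)
A hardware platform is proposed for the two-dimensional single-level wavelet transform with the 9/7 wavelet filter. The overall structure is pipelined and data are input in groups. The column transform uses multiple wavelet transform units, the row transform module is a reconfigurable hardware structure, and no on-chip memory is needed between the row and column transforms. Compared with existing architectures, this one achieves higher processing speed with fewer hardware resources.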

18.
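Gao Tao, Bai Lin. Electronic Design Engineering (《电子设计工程》), 2012, 20(14): 120-122
Based on an in-depth study of the core algorithm of the three-dimensional discrete wavelet transform (3D DWT) and on the characteristics of image sequence coding, this paper designs and implements an efficient 3-D wavelet transform VLSI architecture suited to hardware implementation. A corresponding Verilog model was written, simulated, and logic-synthesized. Simulation results show that processing the row and column filters in parallel, together with a pipelined design, speeds up computation and effectively reduces on-chip memory capacity.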

19.
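Hardware design of a multiplierless high-performance 9/7 discrete wavelet transform filter   Total citations: 1 (self-citations: 0, citations by others: 1)
Ma Yanping, Wang Jianfeng, Liu Yun. Telecommunication Engineering (《电讯技术》), 2006, 46(5): 200-204
A lifting-based VLSI architecture design method is proposed for efficient, real-time implementation of the 9/7 biorthogonal discrete wavelet transform filter in JPEG2000. At the same precision, the resulting architecture greatly reduces the amount of computation, offers high overall speed, low hardware cost, and low memory requirements, and reaches 100% hardware utilization. The system is described in Verilog HDL and was synthesized for the Xilinx xcv50e-cs144-8 device in the ISE 4.1 environment.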
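
The title of this entry refers to a multiplierless filter. One common way to avoid multipliers is to quantize each constant to a fixed-point value and replace the product with a sum of shifted copies of the input. The Python sketch below illustrates that general idea only. The helper name `shift_add_multiply`, the choice of 8 fractional bits, and the plain binary expansion of the constant are assumptions of this example, and the abstract does not state how the paper encodes its coefficients.

```python
def shift_add_multiply(x, coef, frac_bits=8):
    """Approximate x * coef for integer x using only shifts and additions.

    The constant is quantized to frac_bits fractional bits (an assumed
    precision); each set bit of the quantized magnitude contributes one
    shifted copy of x to the accumulator.
    """
    q = round(abs(coef) * (1 << frac_bits))  # quantized magnitude
    acc = 0
    bit = 0
    while q:
        if q & 1:
            acc += x << bit  # adds x * 2**bit
        q >>= 1
        bit += 1
    product = acc >> frac_bits  # drop the fractional bits (floor)
    return -product if coef < 0 else product
```

For example, the first 9/7 lifting constant, about -1.586134342, quantizes to 406/256 at 8 fractional bits, so multiplying by it costs five shifted additions in this plain binary form. Hardware designs often reduce the count further with signed-digit encodings, which this sketch does not attempt.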

20.
Many VLSI architectures for computing the discrete wavelet transform (DWT) have been presented, but the parallel input data sequence and the programmability of the 2-D DWT have rarely been mentioned. In this paper, we present a parallel-processing VLSI architecture to compute the programmable 2-D DWT, including various wavelet filter lengths and various wavelet transform levels. The proposed architecture is very regular and easy to extend. To eliminate high frequency components, the pixel values outside the boundary of the image are mirror-extended as in the symmetric wavelet transform (SWT), and the mirror extension is realized via the routing network. Owing to the parallel processing, we adopt the row-based recursive pyramid algorithm (RPA), similar to the 1-D RPA, for data scheduling. This design has been implemented and fabricated in a 0.35 μm 1P4M CMOS technology, and the working frequency is 50 MHz. The chip size is about 5200 μm × 2500 μm. For a 256 × 256 image, the chip can process 30 frames per second with the filter length varying from 2 to 20 and with various levels. The proposed architecture is suitable for real-time applications such as JPEG 2000.
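
The mirror extension mentioned in this abstract can be expressed as an index mapping. The Python sketch below assumes whole-sample symmetric extension, in which the edge sample itself is not repeated. The abstract does not say which symmetric variant the chip uses, and the function name `mirror_index` is hypothetical.

```python
def mirror_index(i, n):
    """Map any integer index i onto [0, n) by whole-sample mirroring.

    The extended signal is periodic with period 2 * (n - 1); inside one
    period, indices past the last sample reflect back toward the start.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if n == 1:
        return 0
    period = 2 * (n - 1)
    i %= period  # Python's % is non-negative for a positive modulus
    return i if i < n else period - i


def extend(samples, left, right):
    """Return samples padded with `left` and `right` mirrored values."""
    n = len(samples)
    return [samples[mirror_index(i, n)] for i in range(-left, n + right)]
```

In software this is a lookup performed before filtering. The abstract describes the hardware realizing the same mapping through its routing network.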
