Similar literature
1.
《Real》2000,6(4):297-312
This paper presents a VLSI implementation of the one-dimensional direct discrete wavelet transform (1-D DWT). The DWT can be viewed as a multi-resolution decomposition of a signal: it decomposes the signal into its components in different frequency bands (octave bands). We propose a new architecture using parallel filters and consider the implementation of a three-level 1-D DWT. The proposed architecture is simple and offers 16-bit precision on input and output data. It consists of three basic units: one register bank, four filters, and a control unit. The filters have different lengths and use new coefficients derived from the Daubechies filter coefficients. The designed processor architecture requires no interface circuitry for connection to a standard communication bus. The architecture can compute the DWT at a data rate of 12×10^6 samples/s, corresponding to a typical clock speed of 12 MHz. The architecture is simulated at the gate level in VLSI.
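
As a software illustration of the three-level octave-band decomposition this abstract describes, here is a minimal sketch. It assumes the standard Daubechies-4 orthonormal filter pair and periodic signal extension; the paper's own derived coefficients and filter lengths are not given in the abstract, and the function names are hypothetical.

```python
import math

# Assumption: standard Daubechies-4 orthonormal low-pass coefficients.
# The paper derives its own coefficients, which the abstract does not list.
S3 = math.sqrt(3.0)
NORM = 4.0 * math.sqrt(2.0)
LOW = [(1 + S3) / NORM, (3 + S3) / NORM, (3 - S3) / NORM, (1 - S3) / NORM]
# Quadrature-mirror high-pass filter: g[k] = (-1)^k * h[N-1-k]
HIGH = [((-1) ** k) * LOW[len(LOW) - 1 - k] for k in range(len(LOW))]


def dwt_level(signal):
    """One analysis level: filter and keep every second output (periodic extension)."""
    n = len(signal)
    approx, detail = [], []
    for i in range(0, n, 2):
        approx.append(sum(LOW[k] * signal[(i + k) % n] for k in range(len(LOW))))
        detail.append(sum(HIGH[k] * signal[(i + k) % n] for k in range(len(HIGH))))
    return approx, detail


def dwt_three_levels(signal):
    """Return [d1, d2, d3, a3]: three detail bands plus the final approximation."""
    bands = []
    current = list(signal)
    for _ in range(3):
        current, detail = dwt_level(current)
        bands.append(detail)
    bands.append(current)
    return bands
```

For a 16-sample input the bands have lengths 8, 4, 2 and 2. Because the assumed filter pair is orthonormal, the total energy of the bands equals the energy of the input, which gives a convenient sanity check.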

2.
《Real》2004,10(1):31-39
This paper presents a new hardware design for neural-network-based colour image compression. The compressed image consists of a colour palette containing a few best colours and the coded image. A Kohonen map neural network is applied to construct both the colour palette and the coded image, which together form the compressed image. Kohonen-map-based compression has linear time complexity in the size of the image. It has advantages over traditional JPEG in colour quantization applications and in compressing images with limited colours. The architecture of the hardware unit is based on the single instruction, multiple data methodology. The architecture has been implemented in an application-specific integrated circuit, and results show that the proposed design achieves high speed, accepting inputs at video rate for compression of images up to 512×512 in size with a low area requirement.

3.
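When processing massive data, software implementations of the Zstandard (Zstd) lossless compression algorithm struggle to meet the compression-speed requirements of certain application domains. Hardware acceleration of Zstd is an effective solution to this problem, especially acceleration of Zstd's finite state entropy (FSE) coding. This paper therefore proposes a hardware architecture for FSE compression and decompression suited to Zstd. It uses fixed compression tables to obtain the best hardware acceleration steps, adds a hardware module for sequence mapping to reduce storage and increase transfer speed, and adopts a hardware/software co-design scheme with a 7-stage pipeline for the hardware architecture. The design is verified on a joint Visual Studio and Modelsim verification platform. Experimental results show that in a TSMC 55 nm process the system reaches a maximum frequency of 750 MHz. Compared with the software implementation, overall compression speed improves by more than 9 times and overall decompression speed by about 100 times.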

4.
《Real》1998,4(3):171-180
Scene matching is the problem of matching regions of two images of the same scene taken by different sensors at different times or under different viewing conditions. In this paper, we describe an efficient architecture for scene matching called SMAC (Scene Matching ArChitecture). The architecture achieves a significant speedup by exploiting a large amount of parallelism and pipelining. Such an architecture can be used to compute the exhaustive search task of hierarchical scene matching, a technique used to reduce the amount of computation involved in scene matching applications. A prototype very large scale integration (VLSI) chip implementing a scaled-down version of the proposed architecture has been designed and built. The prototype chip has been tested to be fully functional at a frequency of 50 MHz with a clock cycle of 20 ns. Based on the prototype design, it is estimated that the proposed architecture can process a 512 × 512 image with a 128 × 128 template in about 15.36 μs, which corresponds to a rate of 65K frames per second.
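
The timing figures in this abstract can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers quoted above; the cycle count additionally assumes that the estimated 15.36 μs is measured at the prototype's 20 ns clock, which the abstract does not state explicitly. The function names are hypothetical.

```python
CLOCK_NS = 20            # clock period quoted for the prototype chip
MATCH_TIME_NS = 15_360   # 15.36 microseconds per 512 x 512 image


def clock_frequency_hz(clock_ns):
    """Clock frequency implied by a clock period given in nanoseconds."""
    return 1e9 / clock_ns


def frames_per_second(match_time_ns):
    """Frame rate implied by the time needed to match one image."""
    return 1e9 / match_time_ns


def clock_cycles(match_time_ns, clock_ns):
    """Cycles per image, assuming the estimate uses the prototype clock."""
    return match_time_ns // clock_ns
```

With these inputs the frame rate comes out slightly above 65,000 frames per second, consistent with the rounded "65K" in the abstract, and under the stated assumption one image would take 768 clock cycles.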

5.
郭艳飞  李占才  王沁 《计算机工程》2006,32(16):11-13,2
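This paper proposes an area-optimized implementation of a Reed-Solomon (RS) decoder that uses a folded structure to implement the circuits that solve the vector operations in the decoding process. The method raises the reuse rate of the decoder's main arithmetic units and shrinks the circuit size. Implementation results based on the TSMC 0.25 standard cell library show that the decoder designed here occupies about 27,000 gates, up to 39% smaller than comparable designs. The design has been integrated into an HDTV channel demodulation chip compliant with the DVB-C standard and has passed field testing.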

6.
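This paper proposes a low-power, parallel VLSI architecture design method, based on line-based processing and the lifting algorithm, for the forward and inverse discrete wavelet transform in a JPEG2000 coding system. The resulting architecture processes two rows of data at a time and time-multiplexes the row processor, so that processing is parallel both within the row processor and between the row and column processors, and the row buffer is minimized. Symmetric extension is implemented by embedded circuitry. The whole architecture is optimized with pipelining, which speeds up the transform, raises hardware resource utilization, and lowers power consumption; efficiency reaches almost 100%. The forward and inverse wavelet filter architectures have been verified on an FPGA and can be used as a standalone IP core in a JPEG2000 image codec chip under development.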

7.
In this paper, an efficient image codec is proposed for Magnetic Resonance Images (MRI). During the past few years, frequency-domain analyses such as the Discrete Cosine Transform and the Discrete Wavelet Transform (DWT) have been widely used in image compression because their coefficients are well localized in both the frequency and space domains. This work also deals with image compression based on frequency-domain transformation. Because medical images are very important for diagnosis, they must be stored with lossless compression. However, the coefficients of the DWT are real numbers, so lossless compression cannot be achieved. To overcome this limitation, a variant of the DWT named the Lifting Wavelet Transform (LWT) is used in the proposed system. The proposed codec is applied to the decomposed image. The codec has also been synthesized on an FPGA, and the results are compared with simulation results and verified.

8.
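VLSI design and application of an efficient configurable FFT processor   Total citations: 2 (self-citations: 0, citations by others: 2)
For the hardware implementation of the fast Fourier transform (FFT) processor in orthogonal frequency-division multiplexing communication systems, an efficient, configurable VLSI architecture is proposed. Building on a memory-based FFT architecture, it adopts a dual-path parallel data path and an effective control scheme, which saves hardware area and improves computational efficiency. In addition, the FFT butterfly unit is optimized to handle multiple operation modes. An FFT processor based on this architecture has been used in a DVB-T/H system, and logic synthesis, layout, and power analysis were carried out in a SMIC 0.18 μm process. The equivalent gate count is 56 k, and power consumption is about 33.5 mW at a 20 MHz operating frequency. Compared with other FFT architectures, this architecture effectively reduces hardware area and power consumption.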

9.
The fractal coding algorithm has many applications, including image compression. In this paper a classification scheme is presented that allows hardware implementation of the fractal coder. High speed and low power consumption are the goals of the suggested design. The introduced method is based on binary classification of domain and range blocks. The proposed technique increases processing speed and reduces power consumption, while the quality of the reconstructed images is comparable with that of the available software techniques. To show the functionality of the proposed algorithm, the architecture was implemented on an FPGA chip. The application of the proposed hardware is shown in image compression. The resulting compression ratios, PSNR error, gate count, compression speed and power consumption are compared with existing designs. Other applications of the proposed design are feasible in fields such as mass–volume database coding and block matching schemes in video coders.

10.
11.
High-performance EBCOT coding and its VLSI architecture   Total citations: 1 (self-citations: 0, citations by others: 1)
刘凯  李云松  吴成柯 《软件学报》2006,17(7):1553-1560
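An EBCOT (embedded block coding with optimized truncation) coding architecture is proposed in which bit planes and coding passes are processed fully in parallel. By analysing JPEG2000 and the EBCOT coding architectures proposed in the literature, the paper points out that the coding information can be obtained simultaneously not only for every bit plane but also for the corresponding coding passes. On this basis it gives a block coding method with fully parallel processing of bit planes and coding passes, and describes in detail the VLSI architecture that implements it. Theoretical analysis and experimental results show that fully parallel processing of bit planes and coding passes requires the fewest clock cycles. The FPGA prototype reaches a maximum clock frequency of 65 MHz and processes 512×512 grayscale images at up to 30 fps, which is sufficient for real-time processing, with image quality matching that of the published JPEG2000 standard.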

12.
The Euclidean Distance Transform (EDT) is an important tool in image analysis and machine vision. This paper provides an area-efficient hardware solution to the computation of the EDT on a binary image. An O(n) hardware algorithm for computing the EDT of an n×n image is presented. A pipelined 2D array architecture for hardware implementation is designed. The architecture has a regular structure with locally connected identical processing elements. Further, pipelining reduces hardware resources. Such an array architecture is easily scalable to handle images of different sizes and is suitable for implementation on reconfigurable devices like FPGAs. Results of an FPGA-based implementation show that the hardware can process about 6000 images of size 512×512 per second, which is much higher than the video rate of 30 frames per second.
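
To make the quantity being computed concrete, here is a small software reference for the squared EDT of a binary image. It is a plain two-pass (row scan, then column minimum) sketch, not the O(n) hardware algorithm or the array architecture of the paper, whose details the abstract does not give. The function name and the convention that 1 marks a feature pixel are assumptions.

```python
def squared_edt(image):
    """Squared Euclidean distance from each pixel to the nearest feature pixel.

    image is a list of equal-length rows holding 0 or 1, where 1 marks a
    feature pixel (an assumed convention). At least one feature pixel is
    expected; otherwise the returned values are meaningless.
    """
    rows, cols = len(image), len(image[0])
    inf = rows + cols  # larger than any distance inside the grid

    # Pass 1: horizontal distance to the nearest feature pixel in the same row.
    g = [[inf] * cols for _ in range(rows)]
    for r in range(rows):
        dist = inf
        for c in range(cols):                      # left-to-right scan
            dist = 0 if image[r][c] else min(dist + 1, inf)
            g[r][c] = dist
        dist = inf
        for c in range(cols - 1, -1, -1):          # right-to-left scan
            dist = 0 if image[r][c] else min(dist + 1, inf)
            g[r][c] = min(g[r][c], dist)

    # Pass 2: combine row distances along each column.
    out = [[0] * cols for _ in range(rows)]
    for c in range(cols):
        for r in range(rows):
            out[r][c] = min(g[k][c] ** 2 + (r - k) ** 2 for k in range(rows))
    return out
```

Returning squared distances keeps the arithmetic in integers, which is also a common choice in hardware; taking the square root of each entry gives the EDT itself.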

13.
This paper presents a compact, unified hardware architecture implementing the SHA-1 and SHA-256 algorithms that is suitable for the mobile trusted module (MTM), which must satisfy small-area and low-power constraints. The built-in hardware hash engine in an MTM is one of the most important circuit blocks and dominates the performance of the whole platform, because it is used as a key primitive to support most MTM commands concerning platform integrity and command authentication. Unlike the general trusted platform module (TPM) for PCs, the MTM, which is to be employed in mobile devices, has very stringent limitations with respect to available power, circuit area, and so on. Therefore, the MTM needs a spatially optimized architecture and design method for constructing compact SHA hardware. The proposed unified SHA-1 and SHA-256 hardware can process a sequence of 512-bit data blocks and has been implemented in 12,400 gates in a 0.25 μm CMOS process. Furthermore, it outperforms commercial TPM chips and software-only implementations in processing speed and power consumption. The highest operating frequency and throughput of the proposed architecture are 137 MHz and 197.6 Mbps, respectively, which satisfy the processing requirements of mobile applications.

14.
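FPGA-based high-performance discrete wavelet transform design   Total citations: 1 (self-citations: 1, citations by others: 0)
High-speed forward and inverse transform systems are designed for the db8 (Daubechies 8) wavelet and verified on a DE2 development board. The maximum clock frequencies of the forward and inverse transforms reach 217.72 MHz and 217.58 MHz, respectively. Compared with designs in similar publications, this design has a clear advantage in maximum processing speed. Building on this and with generality in mind, a general-purpose wavelet transform FPGA architecture is also designed; it is highly general and can implement many wavelet transforms with high performance. The design is optimized using the DA algorithm, LUT structures, and pipelining.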

15.
《Real》2000,6(6):461-470
The design and VLSI implementation of a Cellular Automaton processor for the detection of lines and corners in gray-scale images is presented in this paper. The behavior of a number of different Cellular Automaton rules was investigated and it was found that certain rules result in transitions in the Cellular Automaton state-transition diagram that correspond to the masks required for the line and corner detection. More specifically, the one-dimensional Cellular Automaton of length 8, operating under rule 56 with periodic boundary conditions, is capable of generating different sets of mask operators for line detection, corner detection and dominant point detection (and, thus, for arbitrarily-shaped curve detection), depending only on the initial state of the Cellular Automaton, without any additional hardware cost for the implementation or the reconfiguration of different masks. The proposed architecture was designed and implemented on a single VLSI chip using 0.7 μm double-layer metal (DLM) CMOS technology. The behavior of the chip was successfully verified for all sets of masks for line detection, corner detection and dominant point detection.
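
The automaton named in this abstract, a one-dimensional cellular automaton of length 8 under rule 56 with periodic boundaries, can be simulated in a few lines. The sketch below uses the usual elementary-rule numbering, in which the neighbourhood (left, centre, right) read as a 3-bit number selects one bit of the rule; how the resulting states map to detection masks is specific to the paper and is not modelled here. The function names are hypothetical.

```python
def ca_step(state, rule=56):
    """Advance a 1-D binary cellular automaton by one step (periodic boundary)."""
    n = len(state)
    nxt = []
    for i in range(n):
        left, centre, right = state[(i - 1) % n], state[i], state[(i + 1) % n]
        index = 4 * left + 2 * centre + right   # neighbourhood as a 3-bit number
        nxt.append((rule >> index) & 1)         # look up that bit of the rule
    return nxt


def ca_trajectory(initial, steps, rule=56):
    """Return the list of states visited, starting with the initial state."""
    states = [list(initial)]
    for _ in range(steps):
        states.append(ca_step(states[-1], rule))
    return states
```

Different initial states trace different paths through the state-transition diagram, which is the property the abstract says the processor exploits to obtain different mask sets from the same hardware.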

16.
董文辉  刘明业 《计算机工程》2004,30(15):24-25,96
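The discrete wavelet transform underlies many current image processing and compression techniques and has been adopted by JPEG2000, the latest ISO/IEC still image compression standard. In JPEG2000 the 5/3 wavelet lifting scheme is used mainly for lossless image compression. This paper proposes a hardware architecture for that algorithm and implements it in simulation on an FPGA.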
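
The reason the 5/3 lifting scheme suits lossless compression is that every step maps integers to integers and can be undone exactly. The following is a minimal one-dimensional sketch of the reversible 5/3 predict and update steps with symmetric extension at the borders, restricted to even-length inputs for brevity; it illustrates the algorithm only and says nothing about the paper's hardware structure. The function names are hypothetical.

```python
def lift53_forward(x):
    """Forward reversible 5/3 lifting on an even-length list of integers.

    Returns (s, d): low-pass (approximation) and high-pass (detail) samples.
    """
    assert len(x) >= 2 and len(x) % 2 == 0
    even, odd = x[0::2], x[1::2]
    half = len(even)
    # Predict: detail = odd sample minus the floor-average of its even neighbours.
    d = []
    for i in range(half):
        right = even[i + 1] if i + 1 < half else even[i]   # mirror at right edge
        d.append(odd[i] - ((even[i] + right) >> 1))
    # Update: approximation = even sample plus a rounded quarter of nearby details.
    s = []
    for i in range(half):
        left = d[i - 1] if i > 0 else d[0]                 # mirror at left edge
        s.append(even[i] + ((left + d[i] + 2) >> 2))
    return s, d


def lift53_inverse(s, d):
    """Undo lift53_forward exactly by running the steps in reverse order."""
    half = len(s)
    even = []
    for i in range(half):
        left = d[i - 1] if i > 0 else d[0]
        even.append(s[i] - ((left + d[i] + 2) >> 2))
    x = []
    for i in range(half):
        right = even[i + 1] if i + 1 < half else even[i]
        x.append(even[i])
        x.append(d[i] + ((even[i] + right) >> 1))
    return x
```

The shifts act as floor divisions, including for negative values, so the inverse subtracts exactly what the forward pass added and the round trip is bit-exact.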

17.
With advances in multimedia technology, demand for high-speed real-time image compression systems has also increased. The JPEG 2000 still image compression standard was developed to accommodate such application requirements. Embedded block coding with optimal truncation (EBCOT) is an essential and computationally very demanding part of the JPEG 2000 compression process. Various applications, such as satellite imagery, medical imaging, digital cinema, and others, require a high-speed, high-performance EBCOT architecture. In the JPEG 2000 standard, the context formation block of EBCOT tier-1 involves highly complex computation and becomes the bottleneck of the system. In this paper, we propose a fast and efficient VLSI hardware architecture for context formation in EBCOT tier-1. A high-speed parallel bit-plane coding (BPC) hardware architecture for the EBCOT module in JPEG 2000 is proposed and implemented. Experimental results show that our design outperforms well-known techniques with respect to processing time. It can reach a 70% reduction when compared to bit-plane sequential processing.

18.
冯燕  刘肃  谢朝辉 《计算机工程》2007,33(7):217-219
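A hardware architecture is proposed for the deblocking loop filter in a decoder chip supporting both the H.264/AVC and AVS video coding standards. By using appropriate on-chip buffer management and a pipelined design, the architecture overcomes the low speed of hardware loop-filter implementations and improves efficiency. Sharing hardware between the two standards effectively saves area.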

19.
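The multiple signal classification (MUSIC) algorithm has high computational complexity and is hard to run in real time. To address this, a real-valued preprocessing algorithm for uniform linear arrays and a practical definition of the spatial spectrum are given, an eigenvalue decomposition algorithm suited to FPGA hardware implementation is selected, and an overall architecture for an FPGA implementation of MUSIC is presented. Simulation results show that the FPGA implementation computes the MUSIC algorithm accurately and quickly.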

20.
This paper proposes a systematic design of a digit-serial-in-serial-out systolic multiplier for the efficient implementation of the Montgomery algorithm in an RSA cryptosystem. For processing speed, the proposed multiplier can also accommodate bit-level pipelining, thereby achieving sample speeds comparable to bit-parallel multipliers with a lower area. If the appropriate digit-size is chosen, the proposed architecture can meet the throughput requirement of a specific application with minimum hardware. The new digit-serial systolic multiplier is highly regular, nearest-neighbor connected, and thus well suited for VLSI implementation.
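
The idea of consuming one operand a digit at a time can be illustrated in software. The sketch below is a generic digit-serial Montgomery multiplication, assuming an odd modulus n below 2^k, operands below n, and a digit size d that divides k; it is not the systolic array of the paper, and the function and parameter names are hypothetical. It relies on Python 3.8 or later for the modular inverse via the built-in `pow`.

```python
def mont_mul_digit_serial(a, b, n, k, d):
    """Return a * b * 2**(-k) mod n, consuming d bits of a per iteration.

    Assumptions: n is odd, n < 2**k, 0 <= a, b < n, and d divides k.
    """
    assert n % 2 == 1 and k % d == 0
    mask = (1 << d) - 1
    # n_prime = -n^(-1) mod 2^d, so that s + q*n is divisible by 2^d below.
    n_prime = (-pow(n, -1, 1 << d)) & mask
    s = 0
    for i in range(0, k, d):
        a_digit = (a >> i) & mask          # next d-bit digit of a
        s += a_digit * b                   # accumulate partial product
        q = ((s & mask) * n_prime) & mask  # quotient digit that clears the low d bits
        s = (s + q * n) >> d               # exact division by 2^d
    if s >= n:                             # one conditional subtraction suffices
        s -= n
    return s
```

A larger digit size d means fewer iterations but a wider multiplier in each one, which mirrors the area and throughput trade-off the abstract attributes to the choice of digit size.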
