共查询到11条相似文献,搜索用时 7 毫秒
1.
Scalable Parallel Memory Architectures for Video Coding 总被引:1,自引:0,他引:1
Current video compression standards, which process frames macroblock by macroblock, employ several processing functions to achieve the compression. These functions refer to data memory address space in different ways. E.g., performing motion estimation and motion compensation functions requires many times data accesses unaligned to word boundaries. On the other hand, Discrete Cosine Transformation (DCT) and inverse of it (IDCT) for 8 × 8 block can be performed first for rows and then for columns. Thus, transposition is needed between these two stages. Among other things, parallel memory architecture can provide a solution for these tasks. In our other paper, we shortly surveyed parallel memory architectures and proposed parallel memory architecture designs for different data path widths for video coding applications. In this paper, we construct video coding function examples by using the proposed parallel data memory efficiently. Furthermore, performance and implementation cost of the parallel memory architecture are estimated and compared to more conventional memory architectures. The examples are given for different data bus widths (16, 32, 64, and 128 bits). We show that the parallel memory can keep the data path fully utilized in many video coding function implementations. This ensures high-speed operation and full utilization of the processing resources. 相似文献
2.
Lode Nachtergaele Toon Gijbels Jan Bormans Francky Catthoor Ivo Bolsens 《The Journal of VLSI Signal Processing》2001,27(1-2):161-169
Upcoming multi-media compression applications will require high memory bandwidth. In this paper, we estimate that a software reference implementation of an MPEG-4 video decoder typically requires 200 Mtransfers/s to memory to decode 1 CIF (352×288) Video Object Plane (VOP) at 30 frames/s. This imposes a high penalty in terms of power but also performance.However, we also show that we can heavily improve on the memory transfers, without sacrificing speed (even gaining about 10% on cache misses and cycles for a DEC Alpha), by aggressive code transformations. For this purpose, we have manually applied an extended version of our data transfer and storage exploration (DTSE) methodology, which was originally developed for custom hardware implementations. 相似文献
3.
In modern multimedia applications, memory bottleneck can be alleviated with special stride data accesses. Data elements in stride access can be retrieved in parallel with parallel memories, in which the idea is to increase memory bandwidth with several memory modules working in parallel and feed the processor with only necessary data. Arbitrary stride access capability with interleaved memories is described in previous research where the skewing scheme is changed at run time according to the currently used stride. This paper presents the improved schemes which are adapted to parallel memories. The proposed novel parallel memory implementation allows conflict free accesses with all the constant strides which has not been possible in prior application specific parallel memories. Moreover, the possible access locations are unrestricted and the accessed data element count equals to the number of memory modules. Timing and area estimates are given for Altera Stratix FPGA and 0.18 micrometer CMOS process with memory module count from 2 to 32. The FPGA results show 129 MHz clock frequency for a system with 16 memory modules when read and write latencies are 3 and 2 clock cycles, respectively. The complexity of the proposed system is shown to be a trade-off between application specific and highly configurable parallel memory system. 相似文献
4.
AVS中的音视频编码压缩技术 总被引:6,自引:1,他引:6
介绍了音视频编码标准AVS中的主要技术特点,对AVS标准所采用的主要技术进行了综述,给出了AVS视频标准与MPEG-4 AVC/H.264编码器性能的比较和分析,讨论了AVS的发展前景. 相似文献
5.
本文在分析标准遗传算法的优越性与存在不足的基础上,借鉴生命科学中免疫的概念与理论,提出了一种新的算法——免疫算法.该算法的核心在于免疫算子的构造,而免疫算子又是通过接种疫苗和免疫选择两个步骤来完成的.理论证明免疫算法是收敛的,并结合TSP问题,提出了免疫疫苗的选取与免疫算子的构造方法.最后,用免疫算法对75城市的TSP问题进行了仿真计算,并将其计算过程与标准遗传算法进行了对比,结果表明该算法对减轻遗传算法后期的波动现象具有明显的效果,同时使收敛的速度有较大的提高. 相似文献
6.
随着SIMD(Single Instruction Multiple Data stream)结构DSP(Digital Signal Processor)片上集成了越来越多的处理单元,并行访存的灵活性及带宽效率对实际运算性能的影响越来越大.本文详细分析了一般SIMD结构DSP中基2 FFT(Fast Fourier Transform)并行算法面临的访存问题,采用简单的部分地址异或逻辑完成SIMD并行访存地址转换,实现了FFT运算的无冲突SIMD并行访存;提出了几种带特殊混洗模式的向量访存指令,可完全消除SIMD结构下基2 FFT运算时需要的额外混洗指令操作.最后将其应用于某16路SIMD数字信号处理器YHFT-Matrix2中向量存储器VM的优化设计.测试结果表明,采用该SIMD并行存储结构优化的VM以增加18%的硬件开销实现了FFT运算全流水无冲突并行访存和100%并行访存带宽利用率;相比优化前的设计,不同点数FFT运算可获得1.32~2.66的加速比. 相似文献
7.
8.
9.
10.
11.
A Software-Hardware Co-Implementation of MPEG-4 Advanced Video Coding (AVC) Decoder with Block Level Pipelining 总被引:3,自引:0,他引:3
Shih-Hao Wang Wen-Hsiao Peng Yuwen He Guan-Yi Lin Cheng-Yi Lin Shih-Chien Chang Chung-Neng Wang Tihao Chiang 《The Journal of VLSI Signal Processing》2005,41(1):93-110
We present a baseline MPEG-4 Advanced Video Coding (AVC) decoder based on the methodology of joint optimization of software and hardware. The software is first optimized with algorithm improvements for frame buffer management, boundary padding, content-aware inverse transform and context-based entropy decoding. The overall decoding throughput is further enhanced by pipelining the software and the dedicated hardware at macroblock level. The decoder is partitioned into the software and hardware modules according to the target frame rate and complexity profiles. The hardware acceleration modules include motion compensation, inverse transform and loop filtering. By comparing the optimized decoder with the committee reference decoder of Joint Video Team (JVT), the experimental results show improvement on the decoding throughput by 7 to 8 times. On an ARM966 board, the optimized software without hardware acceleration can achieve a decoding rate up to 5.9 frames per second (fps) for QCIF video source. The overall throughput is improved by another 27% to 7.4 fps on the average and up to 11.5 fps for slow motion video sequences. Finally, we provide a theoretical analysis of the ideal performance of the proposed decoder.Shih-Hao Wang was born in Tainan, Taiwan, R.O.C. in 1977. He received the M.S. degree in Electrical and Control Engineering from National Chiao Tung University, Hsinchu, Taiwan, in 2001, where he is currently working toward the Ph.D. degree in the Institute of Electronics.His research interests are video compression and VLSI implementation.Wen-Hsiao Peng was born in Hsin-Chu, Taiwan, Republic of China, in 1975. He received the B.S. and the M.S. degrees in Electrics Engineering from National Chiao-Tung University, Hsin-Chu, Taiwan, in 1997 and 1999respectively. During 2000–2001, he was an intern in Intel Microprocessor Research Lab, U.S.A. In 2002, he joined the Institute of Electronics of National Chiao-Tung University, where he is currently a Ph.D candidate. His major research interests include scalable video coding, video codec optimization and platform based architecture design for video compression applications. Since 2000, he has been working with video coding development and implementation. He has actively contributed to the development of MPEG-4 Fine Granularity Scalability (FGS) and MPEG-21 Scalable Video Coding (Now, MPEG-4 Part 10 AVC Amd.1).Yu-Wen Hereceived his Ph.D. degree in computer application from Tsinghua University in 2002. He was a lecture of the Department of Computer Science and Technology from 2002 to 2003 in Tsinghua University. In 2004, he joined Internet Media group of Microsoft Research Asia.His research interests include video coding, transmission and embedded multimedia application systems.Guan-yi Lin was born in Kaohsiung, Taiwan in 1981. He received the B.S. degree in Electronics Engineering from National Chiao Tung University, Hsinchu, Taiwan, in 2003, where he is currently working toward the M.S. degree in the Institute of Electronics.His research interests are video compression and communication systems design.Cheng-Yi Lin was born in Tainan, Taiwan in 1981. He received the B.S. degree in Electronics Engineering from National Chiao Tung University, Hsinchu, Taiwan, in 2003, where he is currently working toward the M.S. degree in the Institute of Electronics.His research interests are on-chip communication and testing.Shih-Chien Chang was born in Taichung, Taiwan in 1981. He received the B.S. degree in Electronics Engineering from National Chiao Tung University, Hsinchu, Taiwan, in 2003, where he is currently working toward the M.S. degree in the Institute of Electronics.His research interests are video compression and VLSI implementation.Chung-Neng Wang was born in PingTung, Taiwan, in 1972. He received the B.S. degree and Ph.D degree in computer science and information engineering from National Chiao-Tung University (NCTU), HsinChu, Taiwan in 1994 and 2003, respectively. He joined the faculty at National Chiao-Tung University in Taiwan, R.O.C in January 2003.Since 2001 he has actively participated in ISO’s Moving Picture Experts Group (MPEG) digital video coding standardization process. He has made more than 18 contributions to the MPEG committee over the past 4 years. He published over 23 technical journal and conference papers in the field of video and signal processing. His current research interests are video/image compression, motion estimation, video transcoding, and streaming.Tihao Chiangwas born in Cha-Yi, Taiwan, Republic of China, 1965. He received the B.S. degree in electrical engineering from the National Taiwan University, Taipei, Taiwan, in 1987, and the M.S. degree in electrical engineering from Columbia University in 1991. He received his Ph.D. degree in electrical engineering from Columbia University in 1995. In 1995, he joined David Sarnoff Research Center as a Member of Technical Staff. Later, he was promoted as a technology leader and a program manager at Sarnoff. While at Sarnoff, he led a team of researchers and developed an optimized MPEG-2 software encoder. For his work in the encoder and MPEG-4 areas, he received two Sarnoff achievement awards and three Sarnoff team awards.Since 1992 he has actively participated in ISO’s Moving Picture Experts Group (MPEG) digital video coding standardization process with particular focus on the scalability/compatibility issue. He is currently the co-editor of the part 7 on the MPEG-4 committee. He has made more than 90 contributions to the MPEG committee over the past 10 years. His main research interests are compatible/scalable video compression, stereoscopic video coding, and motion estimation. In September 1999, he joined the faculty at National Chiao-Tung University in Taiwan, R.O.C. Dr. Chiang is currently a senior member of IEEE and holder of 13 US patents and 30 European and worldwide patents. He was a co-recipient of the 2001 best paper award from the IEEE Transactions on Circuits and Systems for Video Technology. He published over 50 technical journal and conference papers in the field of video and signal processing. 相似文献