Similar literature
1.
Turbo codes have become popular for next-generation wireless communication systems because of their remarkable coding performance. One problem in decoding turbo codes at the receiver is the complexity and high power consumption, since multiple iterations of the Soft Output Viterbi Algorithm (SOVA) or maximum a posteriori (MAP) decoding must be carried out to decode a data frame. To reduce the complexity of the turbo decoder, adaptive iteration based on cyclic redundancy checking (CRC) and output-convergence approaches has been proposed to reduce the average number of iterations required to decode a data frame. This results in a system with a variable workload, since the amount of computation required to decode each data frame differs. In this work, we propose a dynamic voltage scaling approach to further reduce the power consumption. Unlike in other variable-workload systems, the workload here is not known at the time the data is being decoded. Thus, optimum voltage assignment is not feasible. We propose several heuristic algorithms to assign the supply voltage for different decoding iterations. Simulation results show that a significant reduction in power consumption is achieved compared with a system using a fixed supply voltage.
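The trade-off behind per-iteration voltage assignment can be illustrated with a toy model. The sketch below is not one of the heuristics from the paper: it assumes energy per iteration scales with V² and clock frequency scales linearly with V, and the iteration limit and voltage values are hypothetical.

```python
# Toy model (assumption): energy per iteration scales with V^2 and the clock
# frequency scales linearly with V, so one iteration takes 1/V time units.

MAX_ITERS = 8  # hypothetical worst-case iteration count


def frame_energy(schedule, iters_used):
    """Energy spent when decoding stops after `iters_used` iterations."""
    return sum(v * v for v in schedule[:iters_used])


def worst_case_time(schedule):
    """Time needed if every iteration in the schedule is executed."""
    return sum(1.0 / v for v in schedule)


# Baseline: the same normalized supply voltage for every iteration.
fixed = [1.0] * MAX_ITERS
# Hypothetical heuristic: start low, speed up only if decoding has not converged.
ramped = [0.8] * 4 + [4.0 / 3.0] * 4

# Both schedules must meet the same worst-case deadline.
deadline = worst_case_time(fixed)
assert worst_case_time(ramped) <= deadline + 1e-9

for iters in (3, 8):
    print(iters, frame_energy(fixed, iters), round(frame_energy(ramped, iters), 3))
```

In this toy model a frame that converges after three iterations costs less energy under the ramped schedule, while a frame that needs all eight iterations costs more, so the benefit depends on how often early convergence occurs.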

2.
In this paper, we propose a low-complexity decoder architecture for low-density parity-check (LDPC) codes using a variable quantization scheme as well as an efficient highly parallel decoding scheme. In the sum-product algorithm for decoding LDPC codes, finite-precision implementations face an important tradeoff between decoding performance and hardware complexity caused by two dominant area-consuming factors: one is the memory for storing updated messages and the other is the look-up table (LUT) that implements the nonlinear function Ψ(x). The proposed variable quantization schemes offer a large reduction in the hardware complexity of the LUT and memory. Also, an efficient highly parallel decoder architecture for quasi-cyclic (QC) LDPC codes can be implemented with reduced hardware complexity by using the partially block-overlapped decoding scheme, and with minimized power consumption by reducing the total number of memory accesses for updated messages. For (3, 6) QC LDPC codes, the proposed schemes reduce the implementation area of the highly parallel decoder by 33% for memory and by approximately 28% for the check node and variable node computation units, without significant performance degradation. Also, the memory accesses are reduced by 20%.

3.
In this paper, we propose a hardware architecture for a high-speed context-adaptive variable length coding (CAVLC) decoder in H.264. In the CAVLC decoder, the codeword length of the current decoding block is used to determine the next input bitstream (valid bits). Since the computation of valid bits increases the total processing time of CAVLC, we propose two techniques to reduce processing time: one reduces the number of decoding steps by introducing a lookup table, and the other reduces the cycles needed to calculate the valid bits. The proposed CAVLC decoder can decode 1920×1088 video at 30 fps in real time at a 30.8 MHz clock.
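The real-time figure can be turned into a per-macroblock cycle budget with simple arithmetic. The sketch below uses only the numbers quoted in the abstract plus the standard 16×16 macroblock size; the function name is illustrative.

```python
MB_SIZE = 16  # H.264 macroblocks cover 16x16 luma samples


def cycles_per_macroblock(width, height, fps, clock_hz):
    """Average clock cycles available to decode one macroblock."""
    mbs_per_frame = (width // MB_SIZE) * (height // MB_SIZE)
    return clock_hz / (mbs_per_frame * fps)


# 1920x1088 at 30 fps with a 30.8 MHz clock, as quoted in the abstract.
budget = cycles_per_macroblock(1920, 1088, 30, 30.8e6)
print(f"{budget:.1f} cycles per macroblock")
```

With 8160 macroblocks per frame, the quoted clock leaves roughly 126 cycles per macroblock on average.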

4.
This paper presents an MPEG-4 video codec, called MoVa, for video coding applications that adopt 3G-324M. We designed MoVa to be optimal by embedding a cost-effective ARM7TDMI core and partitioning it into hardwired blocks and firmware blocks to provide a reasonable tradeoff between computational requirements, power consumption, and programmability. Typical hardwired blocks are motion estimation and motion compensation, discrete cosine transform and quantization, and variable length coding and decoding, while intra refresh, rate control, error resilience, error concealment, etc. are implemented in software. MoVa has a pipeline structure and operates in four stages when encoding and in three stages when decoding. It meets the requirements of MPEG-4 SP@L2 and can process either 30 frames/s (fps) of QCIF or SQCIF, or 7.5 fps (in codec mode) to 15 fps (in encode/decode mode) of CIF, at a maximum clock rate of 27 MHz for 128 kbps or 144 kbps. MoVa can be applied to many video systems requiring a high bit rate and various video formats, such as videophone, videoconferencing, surveillance, news, and entertainment.

5.
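To address the long table lookup time of TLSS, the standard decoding method for CAVLC in H.264/AVC, this paper proposes a new CAVLC decoding lookup optimization based on fast hash-table queries. Introducing hash-table lookup into CAVLC decoding speeds up the table lookup and reduces the time needed to obtain codewords from the irregular variable-length code tables (UVLCT), thereby reducing the CAVLC decoding lookup time. Simulation results show that, with no loss in decoded video quality, the new algorithm reduces table lookup time by about 18% to 22% compared with the standard TLSS method.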
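A hash-based variable-length code lookup can be sketched as follows. This is an illustration of the general idea only: the code table is a hypothetical toy prefix code, not an H.264 CAVLC table, and the key layout (code length, code value) is an assumption rather than the scheme from the paper.

```python
# Toy prefix-free code table (hypothetical, not an H.264 table).
TOY_TABLE = {"1": "A", "01": "B", "001": "C", "0001": "D"}

# Hash table keyed by (code length, code value): each probe is one dict lookup.
HASHED = {(len(bits), int(bits, 2)): sym for bits, sym in TOY_TABLE.items()}
MAX_LEN = max(length for length, _ in HASHED)


def decode(bitstring):
    """Decode a string of '0'/'1' characters into a list of symbols."""
    symbols, pos = [], 0
    while pos < len(bitstring):
        for length in range(1, MAX_LEN + 1):
            chunk = bitstring[pos:pos + length]
            if len(chunk) < length:
                raise ValueError("truncated bitstream")
            sym = HASHED.get((length, int(chunk, 2)))
            if sym is not None:
                symbols.append(sym)
                pos += length
                break
        else:
            # No codeword of any supported length matched.
            raise ValueError("no codeword matches")
    return symbols


print(decode("1010011"))
```

Keying on the length as well as the value keeps codewords with leading zeros distinct, since "01" and "001" have the same integer value.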

6.
袁建国, 汪哲, 何昌伟, 王永. 《半导体光电》 (Semiconductor Optoelectronics), 2016, 37(4): 532-535, 591
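When low-density parity-check (LDPC) codes in optical communication systems are decoded with the log-likelihood ratio belief propagation (LLR-BP) algorithm, the extrinsic information of variable nodes can oscillate without converging during iterative decoding in the high signal-to-noise ratio region, which degrades error-correction performance. To meet the requirements of optical communication systems, an improved LLR-BP decoding algorithm that weakens the oscillation of extrinsic messages is proposed. The algorithm introduces a weighting coefficient to balance the extrinsic information passed by variable nodes in two successive iterations, which markedly reduces the oscillation. Simulation results show that, compared with the conventional LLR-BP decoding algorithm, the improved algorithm achieves better bit-error performance while reducing the oscillation of variable-node extrinsic information and accelerating decoding convergence.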
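The weighting step can be pictured as a blend of the messages from two successive iterations. The abstract does not give the update rule, so the convex combination below, the name `alpha`, and the sample values are all assumptions made for illustration.

```python
def damp(previous, current, alpha):
    """Blend variable-node extrinsic LLRs from two successive iterations.

    alpha = 1.0 keeps only the current iteration (no damping);
    smaller values give more weight to the previous iteration.
    """
    return [alpha * c + (1.0 - alpha) * p for p, c in zip(previous, current)]


# Hypothetical messages: the first two flip sign between iterations.
prev_msg = [2.0, -1.5, 0.5]
curr_msg = [-2.0, 1.5, 0.7]
print(damp(prev_msg, curr_msg, alpha=0.5))
```

With equal weights, the two messages that flipped sign cancel toward zero, while the stable third message stays close to its value in both iterations.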

7.
A low-power dual-standard video decoder has been developed for mobile applications. It supports MPEG-2 SP@ML and H.264/AVC BL@L4 video decoding in a single chip and features a scalable architecture to reach area/power efficiency. This chip integrates diverse algorithms of MPEG-2 and H.264/AVC to reduce silicon area. Three low-power techniques are proposed. First, a domain-pipelined scalability (DPS) technique is used to optimize the pipelined structure according to the number of processing cycles. Second, bandwidth scalability is implemented via a line-pixel-lookahead (LPL) scheme to improve the external bandwidth and reduce the internal memory size, leading to a 51% reduction in memory power compared to a conventional design. Third, low-power motion compensation and a deblocking filter are designed to reduce the operating frequency without degrading system performance. A test chip is fabricated in a 0.18-μm one-poly six-metal CMOS technology with an area of 15.21 mm². For mobile applications, H.264/AVC and MPEG-2 video decoding of quarter-common intermediate format (QCIF) sequences at 15 frames per second is achieved at a 1.15 MHz clock frequency with power dissipation of 125 μW and 108 μW, respectively, at a 1 V supply voltage.

8.
VHDL Description and Verification of MPEG-2 Video Bitstream Parsing
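This paper presents a hardware design for bitstream parsing in MPEG-2 video decoding, comprising decoding control and variable-length code decoding. Several new hardware design choices, such as treating macroblock and block control as the main states, decoding variable-length codes in parallel with a barrel-shifter buffer, separating the code-length computation from decoding, and splitting the code table into several small tables, guarantee real-time decoding of MPEG-2 MP@ML and leave room for extension to more complex applications. The design is part of an ASIC VLSI design effort for MPEG-2 decoding.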

9.
A single-chip system for real-time MPEG-2 decoding can be created by integrating a general-purpose dual-issue RISC processor with a small dedicated hardware block for the variable length decoding (VLD) and block loading processes, a 32 KB instruction RAM, and a 32 KB data RAM. The VLD hardware performs Huffman decoding on the input data. The block loader performs the half-sample prediction for motion compensation and acts as a direct memory access (DMA) controller for the RISC processor by transferring data between an external 2 MB DRAM and the internal 32 KB data RAM. The dual-issue RISC processor, running at 250 MHz, is enhanced with a set of key sub-word and multimedia instructions for a sustained peak performance of 1000 MOPS. With this setup for MPEG-2 decoding applications, bi-directionally predicted non-intra video blocks are decoded in less than 800 cycles, leading to a single-chip, real-time MPEG-2 decoding system.
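The link between the 800-cycle block figure and real-time operation can be checked with a rough budget. The abstract does not state the picture format, so the 720×480 at 30 fps format and 4:2:0 sampling used below are assumptions, and treating every block as a worst-case block is a deliberate simplification.

```python
CLOCK_HZ = 250e6                    # processor clock from the abstract
CYCLES_PER_BLOCK = 800              # upper bound quoted in the abstract
WIDTH, HEIGHT, FPS = 720, 480, 30   # assumed picture format (not in the abstract)
BLOCKS_PER_MB = 6                   # assumed 4:2:0: four luma + two chroma 8x8 blocks

macroblocks = (WIDTH // 16) * (HEIGHT // 16)
blocks_per_second = macroblocks * BLOCKS_PER_MB * FPS
cycles_needed = blocks_per_second * CYCLES_PER_BLOCK
utilization = cycles_needed / CLOCK_HZ

print(f"{cycles_needed / 1e6:.1f} M cycles/s, {utilization:.0%} of the clock")
```

Under these assumptions the block decoding load stays below the available 250 M cycles per second, which is consistent with the real-time claim.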

10.
We have designed a microprocessor that is based on a single instruction multiple data stream (SIMD) architecture. It features a two-way superscalar architecture for multimedia embedded systems, which especially need to support MPEG2 video decoding/encoding and 3DCG image processing. This microprocessor meets all requirements of embedded systems, including (a) MPEG2 (MP@ML) decoding and graphic processing capabilities for three-dimensional images, (b) programming flexibility, and (c) low power consumption and low manufacturing cost. High performance was achieved through enhanced parallel processing capabilities, by adopting a SIMD architecture and a two-way superscalar architecture. Programming flexibility was increased by providing 170 dedicated multimedia instructions. Low power consumption was achieved by utilizing advanced process technology and power-saving circuits. The processor supports a general-purpose RISC instruction set. This feature is important, as the processor will have to work as a controller of various target systems. The processor has been fabricated in a 0.21-μm CMOS four-metal technology on a 9.84×10.12 mm die. It performs 2.16 GOPS/720 MFLOPS at an operating frequency of 180 MHz, with a power consumption of 1.2 W and a power supply of 1.8 V.

11.
The H.264/AVC video coding standard can deliver high compression efficiency at a cost of increased complexity and power. The increasing popularity of video capture and playback on portable devices requires that the power of the video codec be kept to a minimum. This work implements several architecture optimizations, such as increased parallelism, pipelining with FIFOs, multiple voltage/frequency domains, and custom voltage-scalable SRAMs that enable low-voltage operation, to reduce the power of a high-definition decoder. Dynamic voltage and frequency scaling can efficiently adapt to the varying workloads by leveraging the low-voltage capabilities and domain partitioning of the decoder. An H.264/AVC Baseline Level 3.2 decoder ASIC was fabricated in 65-nm CMOS and verified. For high-definition 720p video decoding at 30 frames per second (fps), it operates down to 0.7 V with a measured power of 1.8 mW, which is significantly lower than previously published results. The highly scalable decoder is capable of operating down to 0.5 V for decoding QCIF at 15 fps with a measured power of 29 μW.
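The two operating points can be compared on an energy-per-frame basis. The power and frame-rate figures come from the abstract; the pixel counts are the standard 720p (1280×720) and QCIF (176×144) dimensions, and the per-pixel figure is only a rough normalization added here.

```python
def energy_per_frame_uj(power_w, fps):
    """Energy per decoded frame in microjoules."""
    return power_w / fps * 1e6


def energy_per_pixel_pj(power_w, fps, width, height):
    """Energy per decoded pixel in picojoules."""
    return power_w / (fps * width * height) * 1e12


hd_frame = energy_per_frame_uj(1.8e-3, 30)      # 720p at 0.7 V
qcif_frame = energy_per_frame_uj(29e-6, 15)     # QCIF at 0.5 V
hd_pixel = energy_per_pixel_pj(1.8e-3, 30, 1280, 720)
qcif_pixel = energy_per_pixel_pj(29e-6, 15, 176, 144)

print(f"720p: {hd_frame:.1f} uJ/frame, {hd_pixel:.1f} pJ/pixel")
print(f"QCIF: {qcif_frame:.2f} uJ/frame, {qcif_pixel:.1f} pJ/pixel")
```

The 720p point works out to about 60 μJ per frame and the QCIF point to about 1.9 μJ per frame.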

12.
This paper describes a lookup-table (LUT)-based digital predistortion system usable for Enhanced Data rates for GSM Evolution (EDGE) handset transmitters. The system is memoryless and capable of improving average efficiency as well as performance in terms of leakage power at offset frequencies and error vector magnitude. The obtainable efficiency at maximum linear output power is comparable to that of commercial EDGE power amplifiers (PAs), and superior at backoff. Minimum system requirements on word length and LUT size have been investigated, showing that a LUT with approximately 500 coefficients and a system word length of 13 bits are sufficient for EDGE. The proposed system is simple compared to basestation implementations that include PA memory compensation and can easily be implemented in handsets to improve overall system performance. The effects of antenna mismatch on system performance have been investigated.

13.
The current forward error correction (FEC) scheme for very high bit-rate digital subscriber line (VDSL) systems in the ANSI standard employs a 16-state four-dimensional (4D) Wei code as the inner code and the Reed-Solomon (RS) code as the outer code. The major drawback of this scheme is that further improvement cannot be achieved without a substantial increase in the complexity and power penalty. Also, a VDSL system employing the 4D Wei-RS scheme operates far below the channel capacity. In 1993, powerful turbo codes were introduced whose performance closely approaches the Shannon limit. In this paper, we propose a bandwidth- and power-efficient turbo coding scheme for VDSL modems in order to obtain high data rates, extended loop reach and increased transmission robustness. We also propose a pipelined decoding scheme to reduce the latency at the receiver end. The objective of the proposed scheme is to provide a higher coding gain than that given by the 4D Wei-RS scheme, resulting in improved performance of VDSL modems in terms of bit rate, loop length and transmitting power. The scheme is investigated for various values of transmitting power, signaling frequencies and numbers of crosstalkers for a targeted bit error rate of 10⁻⁵ and is implemented in a system with quadrature amplitude modulation in which a mixed set partitioning mapping is employed to reduce the decoding complexity. The effects of code complexity, interleaver length, the number of decoding iterations and the level of modulation on the performance of VDSL modems are explored. Simulation results are presented and compared to those of the 4D Wei-RS scheme. The results show that the choice of turbo codes not only provides a significant coding gain over the standard FEC scheme but also efficiently maximizes the loop length and bit rate at a very low transmitting power in the presence of dominant far-end crosstalk and intersymbol interference.
In order to compare the hardware complexity, we synthesize the proposed and 4D Wei-RS schemes using SYNOPSYS with the target technology of Xilinx 4020e-3. The Xilinx field programmable gate array statistics of the proposed scheme are compared with those of the 4D Wei-RS scheme.

14.
叶文伟. 《半导体光电》 (Semiconductor Optoelectronics), 2014, 35(5): 877-880
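Based on the structural characteristics of SCG-LDPC codes, an efficient hierarchical reliable belief propagation (HRBP) decoding algorithm is proposed. The algorithm combines layered iteration with reliability-judgment measurement to effectively reduce the number of variable nodes in subsequent iterations while accelerating convergence. Simulations were run for the SCG-LDPC(3969,3720) code suited to optical transmission systems. The results show that, compared with the conventional BP algorithm, HRBP greatly reduces the computational load while preserving performance. At a threshold of 15, the bit error rate of HRBP is comparable to that of BP decoding, but the number of variable nodes in subsequent iterations is about 69% lower than with BP at high signal-to-noise ratio. As the threshold increases further, HRBP gradually degenerates into the layered belief propagation (Layered-BP) decoding algorithm.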

15.
In this work, we propose a novel entropy coding mode decision algorithm to balance the tradeoff between the rate-distortion (R-D) performance and the entropy decoding complexity for the H.264/AVC video coding standard. Context-based adaptive binary arithmetic coding (CABAC), context-based adaptive variable length coding (CAVLC), and universal variable length coding (UVLC) are three entropy coding tools adopted by H.264/AVC. CABAC can be used to encode both the texture and the header data, while CAVLC and UVLC are employed to encode the texture and the header data, respectively. Although CABAC can provide better R-D performance than CAVLC/UVLC, its decoding complexity is higher. Thus, once the entropy decoding complexity is taken into account, CABAC may not be the best tool, which motivates us to examine the entropy coding mode decision problem in depth. It will be shown experimentally that the proposed mode decision algorithm can help the encoder generate bit streams that can be decoded at much lower complexity with little R-D performance loss.

16.
An effective hierarchical reliable belief propagation (HRBP) decoding algorithm is proposed according to the structural characteristics of systematically constructed Gallager low-density parity-check (SCG-LDPC) codes. The novel decoding algorithm combines layered iteration with reliability judgment, and can greatly reduce the number of variable nodes involved in the subsequent iteration process and accelerate convergence. Simulation results for the SCG-LDPC(3969,3720) code show that the novel HRBP decoding algorithm can greatly reduce the amount of computation while preserving performance compared with the traditional belief propagation (BP) algorithm. The bit error rate (BER) of the HRBP algorithm is comparable to that of the BP algorithm at a threshold value of 15, but in the subsequent iteration process the number of variable nodes for the HRBP algorithm can be reduced by about 70% at high signal-to-noise ratio (SNR) compared with the BP algorithm. When the threshold value is further increased, the HRBP algorithm gradually degenerates into the layered-BP algorithm, but at a BER of 10⁻⁷ and a maximum of 30 iterations, the net coding gain (NCG) of the HRBP algorithm is 0.2 dB higher than that of the BP algorithm, and the average number of iterations can be reduced by about 40% at high SNR. Therefore, the novel HRBP decoding algorithm is more suitable for optical communication systems.
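One way to picture the reliability judgment is a threshold test on the magnitude of each variable node's log-likelihood ratio (LLR). The abstract does not spell out the test, so the rule below (nodes with |LLR| at or above the threshold are excluded from later iterations) and the sample values are assumptions made for illustration.

```python
def split_by_reliability(llrs, threshold):
    """Return (active, frozen) variable-node indices.

    Assumed rule: a node whose |LLR| reaches the threshold is treated as
    reliable and left out of subsequent iterations.
    """
    active = [i for i, llr in enumerate(llrs) if abs(llr) < threshold]
    frozen = [i for i, llr in enumerate(llrs) if abs(llr) >= threshold]
    return active, frozen


# Hypothetical LLR values after some iteration.
llrs = [18.2, -3.1, 25.0, 0.4, -16.7]
print(split_by_reliability(llrs, threshold=15))
print(split_by_reliability(llrs, threshold=30))
```

With the larger threshold no node is frozen, which matches the remark that raising the threshold makes the algorithm behave like plain layered BP.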

17.
The Viterbi decoder is a common module in communication systems and must offer both low power and low decoding latency. The conventional register exchange (RE) algorithm and the memory-based trace-back (TB) algorithm cannot meet both the power and the decoding latency constraints. In this paper, we propose a new Survivor Memory Unit (SMU) algorithm, named the State Exchange (SE) algorithm. The SE algorithm uses a trace-forward unit (TFU) to run the decoding operation for low decoding latency. In addition, we enhance the SE algorithm with the concept of trace-back (TB). Based on this enhancement, we propose two types of SE-SMU. The proposed type-I SE-SMU has a lower register requirement but a long critical path. The proposed type-II SE-SMU can support high-speed requirements at the cost of additional TFUs and latency. Both proposed SE-SMUs have a decoding latency slightly higher than that of the RE-SMU. We synthesized the proposed architecture in TSMC 0.13-μm technology. Both approaches have fewer active registers during decoding. According to the power analysis, the proposed SE-SMUs give a 70% power reduction compared with the RE-SMU at 100 MHz with a decoding length of 96. The power saving ratio increases further with longer decoding lengths.

18.
Over severely unreliable channels, decoding of error-correcting codes frequently fails, which wastes a great deal of computation, especially in iterative decoding algorithms. In hybrid automatic repeat request systems, most of the computation power is wasted on failed decoding if a codeword is retransmitted many times. Therefore, early stopping of iterative decoding needs to be adopted. In this paper, we propose a new stopping algorithm for iterative belief propagation decoding of low-density parity-check codes, which is effective in both high and low signal-to-noise ratio ranges and scalable to variable code rates and lengths. The proposed stopping algorithm combines several good stopping criteria. Each criterion is extremely simple and will not be a burden on the overall system. With the proposed stopping algorithm, it is shown via numerical analysis that the decoding complexity of a hybrid automatic repeat request system with an adaptive modulation and coding scheme can be reduced considerably. Copyright © 2012 John Wiley & Sons, Ltd.
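Combining simple stopping criteria might look like the sketch below. The abstract does not list the criteria it combines, so the three used here (zero syndrome, an iteration cap, and hard decisions that stop changing) and the names `should_stop` and `stall_window` are assumptions, not the paper's algorithm.

```python
def syndrome_is_zero(H, bits):
    """True if every parity check in H (list of 0/1 rows) is satisfied."""
    return all(sum(h * b for h, b in zip(row, bits)) % 2 == 0 for row in H)


def should_stop(H, history, max_iters, stall_window=3):
    """Return a reason to stop, or None to keep iterating.

    history: hard-decision vectors, one per completed iteration.
    """
    if syndrome_is_zero(H, history[-1]):
        return "converged"          # valid codeword found
    if len(history) >= max_iters:
        return "max_iters"          # iteration budget exhausted
    recent = history[-stall_window:]
    if len(recent) == stall_window and all(v == recent[0] for v in recent):
        return "stalled"            # decisions frozen on a non-codeword
    return None


# Tiny hypothetical parity-check matrix for demonstration only.
H = [[1, 1, 0], [0, 1, 1]]
print(should_stop(H, [[1, 1, 1]], max_iters=10))
print(should_stop(H, [[1, 0, 0]] * 3, max_iters=10))
```

The "stalled" exit is what saves work at low signal-to-noise ratio, where decoding is likely to fail and further iterations would be wasted before a retransmission.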

19.
Techniques using Reed-Solomon (RS) codes to recover lost packets in digital video/audio broadcasting and packet-switched network communications are reviewed. Usually, different RS codes and their corresponding encoders/decoders are designed and utilized to meet different requirements for different systems and applications. We incorporate these techniques into a variable RS code and present encoding and decoding algorithms suitable for the variable RS code. A mother RS code can be used to produce a variety of RS codes, and the same encoder/decoder can be used for all the derivative codes by adding/detecting zeros, removing some parity symbols and adding erasures. A VLSI implementation for erasure decoding of the variable RS code is described and the achievable performance is quantitatively analyzed. A typical example shows that the signal processing speed is up to 2.5 Gbits/second and the processing delay is less than one millisecond when the decoder is integrated on a single chip. Therefore, the proposed algorithm and the encoder/decoder can be used universally for different applications with various requirements, such as transmission data rate, packet length, packet loss protection capacity, as well as layered protection and adaptive redundancy protection in DVB/DAB, Internet and mobile Internet communications.
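The parameters of codes derived from a mother RS code follow from simple bookkeeping. The sketch below only tracks code parameters (it does not encode or decode); the class and field names are hypothetical, and it relies on the standard property that an (n, k) RS code can fill in up to n − k erased symbols, with punctured parity symbols counted against that budget.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DerivedRS:
    mother_n: int
    mother_k: int
    shortened: int = 0   # information symbols fixed to zero and not sent
    punctured: int = 0   # parity symbols removed before transmission

    def __post_init__(self):
        parity = self.mother_n - self.mother_k
        if not 0 <= self.shortened < self.mother_k:
            raise ValueError("shortening must leave at least one data symbol")
        if not 0 <= self.punctured <= parity:
            raise ValueError("cannot puncture more than the parity symbols")

    @property
    def n(self):
        """Transmitted block length."""
        return self.mother_n - self.shortened - self.punctured

    @property
    def k(self):
        """Information symbols actually carried."""
        return self.mother_k - self.shortened

    @property
    def loss_capacity(self):
        """Lost symbols recoverable when punctured parity counts as erasures."""
        return self.mother_n - self.mother_k - self.punctured


dvb = DerivedRS(255, 239, shortened=51)
print(dvb.n, dvb.k, dvb.loss_capacity)
```

Shortening RS(255,239) by 51 symbols gives the RS(204,188) code used in DVB, which shows how one mother code can serve several block lengths.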

20.
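In Internet of Things (IoT) monitoring systems, video images occupy a large amount of storage and demand high communication bandwidth, which makes transmission and storage difficult, so the video must be compressed. This paper introduces a video compression method suited to IoT monitoring systems that encodes, and correspondingly decodes, video by combining frame prediction, the DCT, and variable-length coding. Experiments show that the method compresses video effectively and, while raising the video transmission rate, reduces transmission time.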
