Similar Literature
A total of 20 similar documents were retrieved.
1.
A high-throughput memory-efficient decoder architecture for low-density parity-check (LDPC) codes is proposed based on a novel turbo decoding algorithm. The architecture benefits from various optimizations performed at three levels of abstraction in system design, namely LDPC code design, decoding algorithm, and decoder architecture. First, the interconnect complexity problem of current decoder implementations is mitigated by designing architecture-aware LDPC codes having embedded structural regularity features that result in a regular and scalable message-transport network with reduced control overhead. Second, the memory overhead problem in current-day decoders is reduced by more than 75% by employing a new turbo decoding algorithm for LDPC codes that removes the multiple check-to-bit message update bottleneck of the current algorithm. A new merged-schedule merge-passing algorithm is also proposed that reduces the memory overhead of the current algorithm for low- to moderate-throughput decoders. Moreover, a parallel soft-input-soft-output (SISO) message update mechanism is proposed that implements the recursions of the Bahl-Cocke-Jelinek-Raviv (BCJR) algorithm in terms of simple "max-quartet" operations that do not require lookup tables and incur negligible loss in performance compared to the ideal case. Finally, an efficient programmable architecture coupled with a scalable and dynamic transport network for storing and routing messages is proposed, and a full-decoder architecture is presented. Simulations demonstrate that the proposed architecture attains a throughput of 1.92 Gb/s for a frame length of 2304 bits, and achieves savings of 89.13% and 69.83% in power consumption and silicon area over the state of the art, with a reduction of 60.5% in interconnect length.
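To make the memory argument above concrete, the following is a minimal, illustrative Python sketch of a layered (turbo-schedule) LDPC update in which posteriors are refreshed after every check row, so only one row's check-to-bit messages need to be held at a time. The min-sum check update, the matrix `H`, and all function and variable names are assumptions made here for illustration; they are not the paper's exact algorithm.

```python
import numpy as np

def layered_min_sum(H, llr_ch, n_iters=10):
    """Illustrative layered (turbo-schedule) LDPC decoding sketch.

    Posteriors are updated in place after each check row, so only the
    messages of the row currently being processed are alive, which is
    where the memory saving of a layered schedule comes from.
    """
    m, n = H.shape
    rows = [np.flatnonzero(H[i]) for i in range(m)]
    post = llr_ch.astype(float).copy()            # running posterior LLRs
    r_msg = [np.zeros(len(idx)) for idx in rows]  # check-to-bit messages

    for _ in range(n_iters):
        for i, idx in enumerate(rows):
            # bit-to-check messages: remove this row's previous contribution
            q = post[idx] - r_msg[i]
            sgn = np.prod(np.sign(q + 1e-12))
            mags = np.abs(q)
            m1 = mags.min()
            m2 = np.partition(mags, 1)[1] if len(mags) > 1 else m1
            # min-sum check update: product of the other signs times the
            # minimum of the other magnitudes
            new_r = np.where(mags == m1, m2, m1) * sgn * np.sign(q + 1e-12)
            post[idx] = q + new_r                 # fold messages back in
            r_msg[i] = new_r
        hard = (post < 0).astype(int)
        if not np.any((H @ hard) % 2):            # all checks satisfied
            break
    return (post < 0).astype(int)
```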

2.
Low-density parity-check (LDPC) codes, proposed by Gallager, emerged as a class of codes which can yield very good performance on the additive white Gaussian noise channel as well as on the binary symmetric channel. LDPC codes have gained considerable importance due to their capacity-achieving property and excellent performance on noisy channels. The belief propagation (BP) algorithm and its approximations, most notably min-sum, are popular iterative decoding algorithms used for LDPC and turbo codes. The trade-off between hardware complexity and decoding throughput is a critical factor in the implementation of a practical decoder. This article presents an introduction to LDPC codes and their various decoding algorithms, followed by the realisation of an LDPC decoder using a simplified message-passing algorithm and a partially parallel decoder architecture. The simplified message-passing algorithm is proposed as a trade-off between low decoding complexity and decoder performance; it greatly reduces the routing and check-node complexity of the decoder. The partially parallel decoder architecture offers high speed and reduced complexity. The improved decoder design achieves a maximum symbol throughput of 92.95 Mbps with a maximum of 18 decoding iterations. The article presents the implementation of a 9216-bit, rate-1/2, (3, 6) LDPC decoder on a Xilinx XC3D3400A device from the Spartan-3A DSP family.
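As a pointer to what "simplified message passing" trades away, here is a small illustrative sketch contrasting the exact belief-propagation (tanh-rule) check-node update with the min-sum approximation mentioned above. The function names and the example vector are assumptions made for illustration; the article's specific simplified algorithm may differ in its details.

```python
import numpy as np

def check_update_bp(q):
    """Exact BP (tanh-rule) check-to-bit messages for one check node.
    q holds the LLRs arriving from the connected bit nodes."""
    t = np.tanh(q / 2.0)
    out = np.empty_like(q, dtype=float)
    for j in range(len(q)):
        prod = np.clip(np.prod(np.delete(t, j)), -0.999999, 0.999999)
        out[j] = 2.0 * np.arctanh(prod)
    return out

def check_update_min_sum(q):
    """Min-sum approximation of the same update: product of the other
    signs times the minimum of the other magnitudes."""
    out = np.empty_like(q, dtype=float)
    for j in range(len(q)):
        others = np.delete(q, j)
        out[j] = np.prod(np.sign(others)) * np.min(np.abs(others))
    return out

# The two rules agree in sign; the min-sum magnitude is slightly
# optimistic, which is why scaled or offset variants are often used.
q = np.array([+1.8, -0.6, +2.4, -3.1])
print(check_update_bp(q))
print(check_update_min_sum(q))
```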

3.
Highly parallel decoders for convolutional turbo codes are studied by proposing two parallel decoding architectures and a design approach for parallel interleavers. To solve the memory conflict problem of extrinsic information in a parallel decoder, a block-like approach in which data is written row-by-row and read diagonal-wise is proposed for designing collision-free parallel interleavers. Furthermore, a warm-up-free parallel sliding-window architecture is proposed for long turbo codes to maximize the decoding speed of parallel decoders. The proposed architecture increases decoding speed by 6%-34% at the cost of a 1% storage increase for an eight-parallel decoder. For short turbo codes (e.g., a length of 512 bits), a warm-up-free parallel window architecture is proposed to double the speed at the cost of a 12% hardware increase.

4.
Iterative turbo decoder analysis based on density evolution
We track the density of extrinsic information in iterative turbo decoders by actual density evolution, and also approximate it by symmetric Gaussian density functions. The approximate model is verified by experimental measurements. We view the evolution of these density functions through an iterative decoder as a nonlinear dynamical system with feedback. Iterative decoding of turbo codes and of serially concatenated codes is analyzed by examining whether a signal-to-noise ratio (SNR) for the extrinsic information keeps growing with iterations. We define a "noise figure" for the iterative decoder, such that the turbo decoder will converge to the correct codeword if the noise figure is bounded by a number below zero dB. By decomposing the code's noise figure into individual curves of output SNR versus input SNR corresponding to the individual constituent codes, we gain many new insights into the performance of the iterative decoder for different constituents. Many mysteries of turbo codes are explained based on this analysis. For example, we show why certain codes converge better with iterative decoding than more powerful codes which are only suitable for maximum likelihood decoding. The roles of systematic bits and of recursive convolutional codes as constituents of turbo codes are crystallized. The analysis is generalized to serial concatenations of mixtures of complementary outer and inner constituent codes. Design examples are given to optimize mixture codes to achieve low iterative decoding thresholds on the signal-to-noise ratio of the channel.
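One compact way to restate the convergence criterion described above, writing the combined per-iteration mapping of extrinsic SNR as a function G (the notation here is introduced only for illustration and is not necessarily the paper's):

```latex
\mathrm{SNR}_{\mathrm{out}} = G\!\left(\mathrm{SNR}_{\mathrm{in}}\right), \qquad
F \;\triangleq\; \frac{\mathrm{SNR}_{\mathrm{in}}}{\mathrm{SNR}_{\mathrm{out}}} < 1
\quad\Longleftrightarrow\quad 10\log_{10} F < 0~\mathrm{dB}
```

so iterations keep improving as long as the output SNR exceeds the input SNR.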

5.
We present a bandwidth-efficient channel coding scheme that has an overall structure similar to binary turbo codes, but employs trellis-coded modulation (TCM) codes (including multidimensional codes) as component codes. The combination of turbo codes with powerful bandwidth-efficient component codes leads to a straightforward encoder structure, and allows iterative decoding in analogy to the binary turbo decoder. However, certain special conditions may need to be met at the encoder, and the iterative decoder needs to be adapted to the decoding of the component TCM codes. The scheme has been investigated for 8-PSK, 16-QAM, and 64-QAM modulation schemes with varying overall bandwidth efficiencies. A simple code choice based on the minimal distance of the punctured component code has also been performed. The interset distances of the partitioning tree can be used to fix the number of coded and uncoded bits. We derive the symbol-by-symbol MAP component decoder operating in the log domain, and apply methods of reducing decoder complexity. Simulation results are presented and compare the scheme with traditional TCM as well as turbo codes with Gray mapping. The results show that the novel scheme is very powerful, yet of modest complexity since simple component codes are used.

6.
List decoding of turbo codes is analyzed under the assumption of a maximum-likelihood (ML) list decoder. It is shown that large asymptotic gains can be achieved on both the additive white Gaussian noise (AWGN) and fully interleaved flat Rayleigh-fading channels. It is also shown that the relative asymptotic gains for turbo codes are larger than those for convolutional codes. Finally, a practical list decoding algorithm based on the list output Viterbi algorithm (LOVA) is proposed as an approximation to the ML list decoder. Simulation results show that the proposed algorithm provides significant gains, corroborating the analytical results. The asymptotic gain manifests itself as a reduction in the bit-error rate (BER) and frame-error rate (FER) floor of turbo codes.

7.
In this paper, we propose a flexible turbo decoding algorithm for a high-order modulation scheme that uses a standard half-rate turbo decoder designed for binary quadrature phase-shift keying (B/QPSK) modulation. A transformation applied to the incoming I-channel and Q-channel symbols allows the use of an off-the-shelf B/QPSK turbo decoder without any modifications. Iterative codes such as turbo codes process the received symbols recursively to improve performance. As the number of iterations increases, the execution time and power consumption also increase. The proposed algorithm reduces the latency and power consumption by a combination of radix-4, dual-path processing, parallel decoding, and early-stop algorithms. We implement the proposed scheme on a field-programmable gate array and compare its decoding speed with that of a conventional decoder. The results show that the proposed flexible decoding algorithm is 6.4 times faster than the conventional scheme.

8.
A 1024-b, rate-1/2, soft-decision low-density parity-check (LDPC) code decoder has been implemented that matches the coding gain of equivalent turbo codes. The decoder features a parallel architecture that supports a maximum throughput of 1 Gb/s while performing 64 decoder iterations. The parallel architecture enables rapid convergence in the decoding algorithm to be translated into low decoder switching activity, resulting in a power dissipation of only 690 mW from a 1.5-V supply.

9.
In this letter we present a low-complexity architecture designed for the decoding of block turbo codes. In particular, we simplify the implementation of Pyndiah's algorithm by not storing any of the concurrent codewords generated by the list decoder.

10.
Near-optimum decoding of product codes: block turbo codes
This paper describes an iterative decoding algorithm for any product code built using linear block codes. It is based on soft-input/soft-output decoders for decoding the component codes so that near-optimum performance is obtained at each iteration. This soft-input/soft-output decoder is a Chase decoder which delivers soft outputs instead of binary decisions. The soft output of the decoder is an estimation of the log-likelihood ratio (LLR) of the binary decisions given by the Chase decoder. The theoretical justifications of this algorithm are developed and the method used for computing the soft output is fully described. The iterative decoding of product codes is also known as the block turbo code (BTC) because the concept is quite similar to turbo codes based on iterative decoding of concatenated recursive convolutional codes. The performance of different Bose-Chaudhuri-Hocquenghem (BCH) BTCs is given for the Gaussian and the Rayleigh channel. Performance on the Gaussian channel indicates that data transmission within 0.8 dB of Shannon's limit, or at more than 98% (R/C>0.98) of channel capacity, can be achieved with high-code-rate BTCs using only four iterations. For the Rayleigh channel, the slope of the bit-error rate (BER) curve is as steep as for the Gaussian channel without using channel state information.
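A minimal sketch of how a Chase decoder can deliver the soft outputs described above; the 1/4 scaling and the β fallback follow the commonly cited Pyndiah-style formulation and, like all names below, are assumptions for illustration rather than the paper's exact procedure.

```python
import numpy as np

def chase_soft_output(r, decision, candidates, beta=0.5):
    """Illustrative soft output for a Chase component decoder.

    r          : received real-valued vector (one sample per code bit)
    decision   : decided codeword mapped to +/-1
    candidates : other candidate codewords (also +/-1) found by the
                 Chase search
    Returns per-bit soft outputs; the extrinsic value is soft - r.
    """
    r = np.asarray(r, dtype=float)
    decision = np.asarray(decision, dtype=float)
    m_dec = np.sum((r - decision) ** 2)   # Euclidean metric of the decision
    soft = np.empty_like(r)
    for j in range(len(r)):
        # best competing codeword that disagrees with the decision in bit j
        comp = [np.sum((r - np.asarray(c, dtype=float)) ** 2)
                for c in candidates if c[j] != decision[j]]
        if comp:
            soft[j] = decision[j] * (min(comp) - m_dec) / 4.0
        else:
            # no competitor found: fall back to an assumed reliability beta
            soft[j] = decision[j] * beta
    return soft
```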

11.
万国春, 陈岚. 《电视技术》, 2007, 31(3): 28-31.
For the double-binary turbo codes used in DVB-RCS, a new improved decoding method is proposed. Based on this improved algorithm and a pipelined design approach, a hardware decoder architecture is designed and its timing diagram is given. The results show that, for both ATM and MPEG frames, the bit-error performance is improved by 0.2 dB over the max-log-MAP algorithm.

12.
This paper is devoted to the finite-length analysis of turbo decoding over the binary erasure channel (BEC). The performance of iterative belief-propagation decoding of low-density parity-check (LDPC) codes over the BEC can be characterized in terms of stopping sets. We describe turbo decoding on the BEC, which is simpler than turbo decoding on other channels. We then adapt the concept of stopping sets to turbo decoding and state an exact condition for decoding failure: if turbo decoding is applied until the transmitted codeword has been recovered or the decoder fails to progress further, the set of positions that remain erased when the decoder stops is equal to the unique maximum-size turbo stopping set that is contained in the set of erased positions. Furthermore, we present some improvements of the basic turbo decoding algorithm on the BEC. The proposed improved turbo decoding algorithm has substantially better error performance, as illustrated by the given simulation results. Finally, we give an expression for the turbo stopping set size enumerating function under the uniform interleaver assumption, and an efficient enumeration algorithm for small-size turbo stopping sets for a particular interleaver. The solution is based on the algorithm proposed by Garello et al. in 2001 to compute an exhaustive list of all low-weight codewords in a turbo code.
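To make the stopping-set picture referenced above concrete for the LDPC/BEC case (this is an illustrative sketch of the classical LDPC setting, not the paper's turbo adaptation), iterative erasure decoding can be written as a peeling process that halts exactly on the largest stopping set contained in the erased positions:

```python
import numpy as np

def bec_peel(H, bits, erased):
    """Peeling (iterative) erasure decoding of an LDPC code on the BEC.

    H      : parity-check matrix (0/1 numpy array)
    bits   : received bits (0/1); values at erased positions are ignored
    erased : iterable of erased positions
    Returns the set of positions still erased when decoding stops; this
    residual set is the maximum-size stopping set contained in `erased`.
    """
    erased = set(erased)
    bits = bits.copy()
    progress = True
    while progress and erased:
        progress = False
        for i in range(H.shape[0]):
            idx = np.flatnonzero(H[i])
            unknown = [j for j in idx if j in erased]
            if len(unknown) == 1:
                # a check with a single erased participant recovers that bit
                j = unknown[0]
                known = [k for k in idx if k not in erased]
                bits[j] = int(np.sum(bits[known]) % 2)
                erased.remove(j)
                progress = True
    return erased
```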

13.
This work proposes a VLSI decoding architecture for concatenated convolutional codes. The novelty of this architecture is twofold: 1) the possibility to switch on-the-fly from the Universal Mobile Telecommunication System turbo decoder to the WiMAX duo-binary turbo decoder with a limited resources overhead compared to a single-mode WiMAX architecture; and 2) the design of a parallel, collision-free WiMAX decoder architecture. Compared to two single-mode solutions, the proposed architecture achieves a complexity reduction of 17.1% and 27.3% in terms of logic and memory, respectively. The proposed, flexible architecture has been characterized in terms of performance and complexity on a 0.13-μm standard-cell technology, and sustains a maximum throughput of more than 70 Mb/s.

14.
A channel decoder chip compliant with the 3GPP mobile wireless standard is described. It supports both data and voice calls simultaneously in a unified turbo/Viterbi decoder architecture. For voice services, the decoder can process over 128 voice channels encoded with rate 1/2 or 1/3, constraint length 9 convolutional codes. For data services, the turbo decoder is capable of processing any mix of rate 1/3, constraint length 4 turbo encoded data streams with an aggregate data rate of up to 2.5 Mb/s with 10 iterations per block (or 4.1 Mb/s with six iterations). The turbo decoder uses the logMAP algorithm with a programmable logsum correction table. It features an interleaver address processor that computes the 3GPP interleaver addresses for all block sizes, enabling it to quickly switch context to support different data services for several users. The decoder also contains the 3GPP first channel de-interleaving function and a post-decoder bit error rate estimation unit. The chip is fabricated in a 0.18-μm six-layer metal CMOS technology, has an active area of 9 mm², and has a peak clock frequency of 110.8 MHz at 1.8 V (nominal). The power consumption is 306 mW when turbo decoding a 2-Mb/s data stream with ten iterations per block and eight voice calls simultaneously.

15.
Motivated by the practical importance of hardware implementation of turbo decoders, an improved Log-MAP decoding algorithm suited to parallel computation is proposed. In the computation of the intermediate decoding parameters, the max* operation with n input variables is simplified into a plain max operation plus the evaluation of a correction function, which reduces storage and efficiently realizes a low-complexity hardware structure for the turbo decoder. The improved algorithm is applied to the CCSDS and WiMAX standards. Simulation results show that, compared with the traditional Log-MAP algorithm, the proposed simplified approximation effectively reduces decoding complexity and latency while its error-correction performance remains close to that of Log-MAP, making it convenient for practical engineering applications.
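For reference, a small sketch of the max* (Jacobian logarithm) operation that the algorithm above simplifies: the exact pairwise recursion, and the max-only approximation used by max-log-MAP-style decoders. The correction-function computation proposed in the paper itself is not reproduced here.

```python
import math
from functools import reduce

def max_star(a, b):
    """Exact Jacobian logarithm: ln(e^a + e^b) = max(a, b) + ln(1 + e^-|a-b|)."""
    return max(a, b) + math.log1p(math.exp(-abs(a - b)))

def max_star_n(values):
    """n-input max* evaluated by the pairwise recursion (exact log-sum-exp)."""
    return reduce(max_star, values)

def max_only(values):
    """Max-log-MAP style approximation: keep the maximum, drop the correction."""
    return max(values)

vals = [0.3, -1.2, 2.1, 0.8]
print(max_star_n(vals))   # exact value
print(max_only(vals))     # lower bound; the gap is what a correction function restores
```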

16.
A low-complexity design architecture for implementing the Successive Cancellation (SC) decoding algorithm for polar codes is presented. Hardware design of polar decoders is commonly accomplished using SC decoding due to the reduced intricacy of the algorithm. The merged processing element (MPE) block is the primary area-occupying factor of the SC decoder, as it incorporates numerous sign and magnitude conversions. The two's complement method is typically used in the MPE block of an SC decoder. In this paper, a low-complexity MPE architecture with minimal two's complement conversion is proposed. A reformulation is also applied to the merged processing elements at the final stage of the SC decoder to generate two output bits at a time. The proposed merged processing element thereby reduces the hardware complexity of the SC decoder and also reduces latency by an average of 64%. An SC decoder with code length 1024 and code rate 1/2 was designed and synthesized using 45-nm CMOS technology. The implementation results of the proposed decoder show a significant improvement in the Technology Scaled Normalized Throughput (TSNT) value and an average 48% reduction in hardware complexity compared to prevalent SC decoder architectures. Compared to the conventional SC decoder, the presented method showed a 23% reduction in area.
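As background for the merged processing element discussed above, an SC decoder's processing elements evaluate two standard LLR kernels, often approximated in min-sum form as below. This is a generic illustrative sketch with assumed names, not the proposed low-complexity MPE itself.

```python
import numpy as np

def f_node(a, b):
    """Upper-branch LLR update (min-sum form): sign(a)*sign(b)*min(|a|,|b|)."""
    return np.sign(a) * np.sign(b) * np.minimum(np.abs(a), np.abs(b))

def g_node(a, b, u):
    """Lower-branch LLR update given the already-decided partial sum u in {0,1}:
    b + (1 - 2u) * a."""
    return b + (1 - 2 * u) * a

# A merged processing element typically evaluates f and g on the same operand
# pair; the sign/magnitude and two's-complement conversions concentrate here.
a = np.array([1.5, -0.7])
b = np.array([-2.0, 0.4])
print(f_node(a, b))
print(g_node(a, b, u=np.array([0, 1])))
```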

17.
Random linear network coding is an efficient technique for disseminating information in networks, but it is highly susceptible to errors. Kötter-Kschischang (KK) codes and Mahdavifar-Vardy (MV) codes are two important families of subspace codes that provide error control in noncoherent random linear network coding. List decoding has been used to decode MV codes beyond half distance. Existing hardware implementations of the rank metric decoder for KK codes suffer from limited throughput, long latency, and high area complexity. The interpolation-based list decoding algorithm for MV codes still has high computational complexity, and its feasibility for hardware implementations has not been investigated. In this paper we propose efficient decoder architectures for both KK and MV codes and present their hardware implementations. Two serial architectures are proposed for KK and MV codes, respectively. An unfolded decoder architecture, which offers high throughput, is also proposed for KK codes. The synthesis results show that the proposed architectures for KK codes are much more efficient than rank metric decoder architectures, and demonstrate that the proposed decoder architecture for MV codes is affordable.

18.
The use of "turbo codes" has been proposed for several applications, including the development of wireless systems, where highly reliable transmission is required at very low signal-to-noise ratios (SNR). The problem of extracting the best coding gains from these kind of codes has been deeply investigated in the last years. Also the hardware implementation of turbo codes is a very challenging topic, mainly due to the iterative nature of the decoding process, which demands an operating frequency much higher than the data rate; in the case of wireless applications, the design constraints became even more strict due to the low-cost and low-power requirements. This paper first presents a new architecture for the decoder core with improved area and power dissipation properties; then partitioning techniques are proposed to reduce the power consumption of the decoder memories. It is proven that most of the power is dissipated by the large RAM units required by the decoder, so the described technique is very efficient: an average power saving of 70% with an area overhead of 23% has been obtained on a set of analyzed architectures.  相似文献   

19.
The encoding principle of multi-dimensional turbo codes is introduced, and a decoder structure for multi-dimensional turbo codes based on the MAP algorithm is proposed. The performance and advantages of multi-dimensional turbo codes are studied and compared with those of two-dimensional turbo codes through simulation. The analysis shows that multi-dimensional turbo codes can achieve better performance.

20.
We consider the iterative decoding of generalized low-density (GLD) parity-check codes where, rather than employ an optimal subcode decoder, a Chase (1972) algorithm decoder more commonly associated with "turbo product codes" is used. GLD codes are low-density graph codes in which the constraint nodes are other than single parity-checks. For extended Hamming-based GLD codes, we use bit error rates derived by simulation to demonstrate this new strategy to be successful at higher code rates. For long block lengths, good performance close to capacity is possible, with decoding costs reduced further since the Chase decoder employed is an efficient implementation.
