20 similar documents found (search time: 15 ms)
1.
One of the most significant impediments to the use of LDPC codes in many communication and storage systems is the error-rate floor phenomenon associated with their iterative decoders. The error floor has been attributed to certain subgraphs of an LDPC code's Tanner graph induced by so-called trapping sets. We show in this paper that once we identify the trapping sets of an LDPC code of interest, a sum-product algorithm (SPA) decoder can be custom-designed to yield floors that are orders of magnitude lower than the floors of the conventional SPA decoder. We present three classes of such decoders: (1) a bi-mode decoder, (2) a bit-pinning decoder which utilizes one or more outer algebraic codes, and (3) three generalized-LDPC decoders. We demonstrate the effectiveness of these decoders for two codes: the rate-1/2 (2640,1320) Margulis code, which is notorious for its floors, and a rate-0.3 (640,192) quasi-cyclic code devised for this study. Although the paper focuses on these two codes, the decoder design techniques presented are fully generalizable to any LDPC code.
2.
IEEE Communications Letters, 2008, 12(12): 888-890
It has been observed that irregular decoders for low-density parity-check (LDPC) codes can be more robust to channel estimation errors than conventional decoders. In this work, by introducing a robustness measure, we propose a method for the joint optimization of irregular LDPC code-decoder pairs to achieve the widest convergence region when channel estimation errors exist.
3.
In this letter we analyze the performance of a low-density parity-check iterative decoder which makes use of an additional buffer at its input, whose function is to decrease the overall complexity of the decoding circuit. We propose a semi-analytical technique that permits the evaluation of the optimum buffer length for each analyzed code.
4.
Achieving high image quality is an important aspect of an increasing number of wireless multimedia applications. These applications require resource-efficient error-correction hardware to detect and correct errors introduced by the communication channel. This paper presents an innovative flexible architecture for error correction using Low-Density Parity-Check (LDPC) codes. The proposed partially-parallel decoder architecture utilizes a novel code construction technique based on a multi-level Hierarchical Quasi-Cyclic (HQC) matrix. The proposed architecture is resource efficient, provides scalable throughput, and requires substantially less power than other decoders reported to date. The decoder has been implemented on a Xilinx FPGA suitable for WiMAX applications and achieves a throughput of 548 Mbps. Performance evaluation of the decoder has been carried out by transmitting JPEG images over a noisy wireless channel and comparing the quality of the reconstructed images with those from other similar decoders.
5.
Stefan Grönroos, Kristian Nybom, Jerker Björkqvist, Analog Integrated Circuits and Signal Processing, 2012, 73(2): 583-595
The next-generation DVB-T2, DVB-S2, and DVB-C2 standards for digital television broadcasting specify the use of low-density parity-check (LDPC) codes with codeword lengths of up to 64800 bits. Real-time decoding of these codes on general-purpose computing hardware is useful for completely software-defined receivers, as well as for testing and simulation purposes. Modern graphics processing units (GPUs) are capable of massively parallel computation and can in some cases, given carefully designed algorithms, outperform general-purpose CPUs (central processing units) by an order of magnitude or more. The main problem in decoding LDPC codes on GPU hardware is that LDPC decoding generates irregular memory accesses, which tend to carry heavy performance penalties on GPUs. Memory accesses can be efficiently parallelized by decoding several codewords in parallel, as well as by using appropriate data structures. In this article we present the algorithms and data structures used to make log-domain decoding of the long LDPC codes specified by the DVB-T2 standard, at the high data rates required for television broadcasting, possible on a modern GPU. Furthermore, we describe a similar decoder implemented on a general-purpose CPU, and show that high-performance LDPC decoders are also possible on modern multi-core CPUs.
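The codeword-parallel layout described above can be illustrated with a small NumPy sketch (my own illustration, not code from the paper): messages for the E Tanner-graph edges are stored as an (E, N) array, so the same edge of N codewords occupies one contiguous row, which maps to coalesced accesses on a GPU and to SIMD lanes on a CPU. The toy check node and all sizes are hypothetical.

```python
import numpy as np

E, N = 8, 64                    # E Tanner-graph edges, N codewords in parallel
rng = np.random.default_rng(0)
v2c = rng.normal(size=(E, N))   # log-domain variable-to-check messages

# The irregular (graph-dependent) indexing touches only the small edge
# axis; each gathered row is contiguous across the N codewords.
edges_of_check = np.array([0, 3, 5])        # hypothetical degree-3 check node
m = v2c[edges_of_check]                     # shape (3, N)

# Min-sum check-node update, vectorized over all N codewords at once
# (sign handling assumes no exact zeros; ties between minima ignored).
sign = np.prod(np.sign(m), axis=0) * np.sign(m)
mag = np.abs(m)
mins = np.sort(mag, axis=0)                 # smallest / second-smallest per codeword
extr = np.where(mag == mins[0], mins[1], mins[0])
c2v = sign * extr                           # one extrinsic message per edge per codeword
print(c2v.shape)                            # (3, N)
```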
6.
The need for circularly shifting an array of data is a distinguishing feature of decoders for structured low-density parity-check (LDPC) codes, as a result of an efficient trade-off between performance and parallelisation of the computations, i.e. throughput. Since the decoder must typically cope with blocks of data of different sizes, an efficient architecture for a reconfigurable multi-size circular shifting network, used to circularly shift an array of arbitrary size, is described.
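As a software model of what such a network does (a sketch under my own conventions, not the letter's circuit), a log-stage rotator decomposes the shift amount into powers of two; in hardware each stage is a row of 2:1 multiplexers, and making the array size n configurable is precisely the reconfigurability the letter addresses.

```python
def circular_shift(data, s, n=None):
    """Rotate the first n elements of data left by s positions,
    modelling a log2(n)-stage barrel-shifter network."""
    n = len(data) if n is None else n
    s %= n
    out = list(data[:n])
    for k in range(n.bit_length()):     # one mux stage per bit of s
        if (s >> k) & 1:
            r = (1 << k) % n            # this stage's fixed rotation, mod n
            out = out[r:] + out[:r]
    return out + list(data[n:])

print(circular_shift(list(range(5)), 7))   # shift 7 mod 5 = 2 -> [2, 3, 4, 0, 1]
```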
7.
AEUE - International Journal of Electronics and Communications, 2014, 68(5): 379-383
Quasi-cyclic (QC) low-density parity-check (LDPC) codes have parity-check matrices consisting of circulant matrices. Because it is difficult for QC LDPC codes whose parity-check matrices consist only of circulant permutation matrices to support layered decoding while also achieving a good degree distribution with respect to error-correcting performance, adopting multi-weight circulant matrices in the parity-check matrices is useful, but this has not been much researched. In this paper, we propose a new code structure for QC LDPC codes with multi-weight circulant matrices by introducing overlapping matrices. This structure enables a system to operate in dual mode in an efficient manner: a standard QC LDPC code is used when the channel is relatively good, and an enhanced QC LDPC code adopting an overlapping matrix is used otherwise. We also propose a new dual-mode parallel decoder which supports layered decoding both for the standard QC LDPC codes and for the enhanced QC LDPC codes. Simulation results show that QC LDPC codes with the proposed structure have considerably improved error-correcting performance and decoding throughput.
8.
This paper presents a high-throughput decoder design for Quasi-Cyclic (QC) Low-Density Parity-Check (LDPC) codes. Two new techniques are proposed: parallel layered decoding architecture (PLDA) and critical path splitting. PLDA enables parallel processing for all layers by establishing dedicated message-passing paths among them, allowing the decoder to avoid a large crossbar-based interconnect network. The critical path splitting technique is based on careful adjustment of the starting point of each layer to maximize the time intervals between adjacent layers, such that the critical path delay can be split into pipeline stages. Furthermore, min-sum and loosely coupled algorithms are employed for area efficiency. As a case study, a rate-1/2 2304-bit irregular LDPC decoder is implemented in a 90 nm CMOS ASIC process. The decoder achieves a maximum decoding throughput of 2.2 Gbps at 10 iterations. The operating frequency is 950 MHz after synthesis and the chip area is 2.9 mm².
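For reference, the layered schedule that such architectures parallelize can be written in a few lines. This Python sketch (illustrative only; the toy two-layer code and normalization factor are my own choices, not from the paper) shows why layers are serially dependent: each layer updates the posteriors the next layer reads, which is the dependency PLDA and critical path splitting attack in hardware.

```python
import numpy as np

def layered_min_sum(chan_llr, layers, n_iter=10, alpha=0.8):
    """Normalized min-sum with a layered schedule.
    layers: list of layers; each layer is a list of (check_id, var_indices)."""
    post = chan_llr.astype(float).copy()
    old = {c: np.zeros(len(vs)) for layer in layers for c, vs in layer}
    for _ in range(n_iter):
        for layer in layers:                 # layers run strictly in sequence
            for c, vs in layer:
                v2c = post[vs] - old[c]      # remove this check's old messages
                s = np.prod(np.sign(v2c)) * np.sign(v2c)
                mins = np.sort(np.abs(v2c))
                extr = np.where(np.abs(v2c) == mins[0], mins[1], mins[0])
                new = alpha * s * extr       # normalized min-sum update
                post[vs] += new - old[c]     # posteriors updated immediately
                old[c] = new
    return (post < 0).astype(int)            # hard decisions

# Toy usage: a length-6 code, two layers of one check each.
layers = [[(0, np.array([0, 1, 2]))], [(1, np.array([2, 3, 4, 5]))]]
print(layered_min_sum(np.array([2.0, -1.0, 3.0, 4.0, 1.0, 2.0]), layers))
```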
9.
10.
M. Meysam Zargham, Christian Schlegel, Integration, the VLSI Journal, 2010, 43(4): 365-377
Analog implementations of digital error-control decoders, generally referred to as analog decoding, have recently been proposed as an energy- and area-competitive methodology. Despite several successful implementations of small analog error-control decoders, little is currently known about how this methodology scales to smaller process technologies and copes with the non-idealities of nano-scale transistor sizing. This paper presents a comprehensive analysis of the potential of sub-threshold analog decoding. It is shown that mismatch effects, dominated by threshold mismatch, impose firm lower limits on transistor sizes. The effect of various forms of leakage current is also investigated, and the minimum ratios of leakage current to normalizing current are found using density evolution and control simulations. Finally, the convergence speed of analog decoders is examined via a density-evolution approach. The results are compiled and predictions are given which show that process scaling below 90 nm brings no advantages and, in some cases, may even degrade performance or increase required resources.
11.
Z. Zhang, L. Dolecek, B. Nikolic, V. Anantharam, M. J. Wainwright, IEEE Transactions on Communications, 2009, 57(11): 3258-3268
Many classes of high-performance low-density parity-check (LDPC) codes are based on parity-check matrices composed of permutation submatrices. We describe the design of a parallel-serial decoder architecture that can be used to map any LDPC code with such a structure to a hardware emulation platform. High-throughput emulation allows for the exploration of the low bit-error-rate (BER) region and provides statistics of the error traces, which illuminate the causes of the error floors of the (2048,1723) Reed-Solomon based LDPC (RS-LDPC) code and the (2209,1978) array-based LDPC code. Two classes of error events are observed: oscillatory behavior and convergence to a class of non-codewords termed absorbing sets. The influence of absorbing sets can be exacerbated by message quantization and decoder implementation. In particular, quantization and the log-tanh function approximation in sum-product decoders strongly affect which absorbing sets dominate in the error-floor region. We show that conventional sum-product decoder implementations of the (2209,1978) array-based LDPC code allow low-weight absorbing sets to have a strong effect and, as a result, elevate the error floor. Dually-quantized sum-product decoders and approximate sum-product decoders alleviate the effects of low-weight absorbing sets, thereby lowering the error floor.
12.
Tailbiting MAP decoders
We extend the MAP decoding algorithm of Bahl et al. (1974) to the case of tail-biting trellis codes. One algorithm is given that is based on finding an eigenvector, and another that avoids this. Several examples are given. The algorithm has application to turbo decoding and source-controlled channel decoding.
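The eigenvector-based variant can be sketched as follows (NumPy, with random stand-in branch-weight matrices; only the structure follows the algorithm described above): for a tail-biting trellis the forward recursion has no known starting state, and the consistent starting distribution is the dominant left eigenvector of the product of the per-step branch-weight matrices, found here by power iteration.

```python
import numpy as np

rng = np.random.default_rng(1)
K, S = 8, 4                         # trellis steps, trellis states
gammas = [rng.random((S, S)) for _ in range(K)]   # stand-in branch weights

M = np.linalg.multi_dot(gammas)     # one full trip around the tail-biting trellis
alpha0 = np.ones(S) / S
for _ in range(100):                # power iteration -> dominant left eigenvector
    alpha0 = alpha0 @ M
    alpha0 /= alpha0.sum()

print(alpha0)  # seeds the BCJR forward pass; the backward pass is seeded
               # analogously with the dominant right eigenvector of M
```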
13.
Decoder design involves choosing the optimal circuit style and determining the sizing, including adding buffers if necessary. The problem of sizing a simple chain of logic gates has an elegant analytical solution, though until now there have been no corresponding analytical results that include the resistive effects of the interconnect. Using simple RC models, we analyze the problem of optimally sizing the decoder chain with RC interconnect and find the optimum fan-out to be about 4, just as in the case of a simple buffer chain. As in the simple buffer chain, supporting a fan-out of 4 often requires a noninteger number of stages in the chain. Nevertheless, this result is used to arrive at a tight lower bound on the delay of a decoder. Two simple heuristics for sizing real decoders with an integer number of stages are examined. We evaluate a simple technique to reduce power, namely reducing the sizes of the inputs of the word drivers while sizing each of the subchains for maximum speed, and find that it provides an efficient mechanism to trade off speed and power. We then use the RC models to compare different circuit techniques in use today and find that decoders with two-input gates for all stages after the predecoder, together with pulse-mode circuit techniques with skewed N-to-P ratios, have the best performance.
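The fan-out-of-4 result is easy to reproduce numerically. The sketch below (my own back-of-the-envelope illustration; the paper's RC-interconnect analysis refines this) sizes a chain for a total electrical effort F = C_load / C_in and shows the rounding to an integer stage count mentioned above.

```python
import math

def size_chain(F, target_fanout=4.0):
    n_ideal = math.log(F) / math.log(target_fanout)  # noninteger stage count
    n = max(1, round(n_ideal))                       # nearest realizable chain
    f = F ** (1.0 / n)                               # resulting per-stage fan-out
    return n_ideal, n, f

for F in (64, 256, 1024):                            # illustrative total efforts
    n_ideal, n, f = size_chain(F)
    print(f"F={F}: ideal stages {n_ideal:.2f}, chosen {n}, fan-out {f:.2f}")
```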
14.
We outline a new technique to compute the EXIT characteristic of softbit source decoders analytically, without extensive histogram measurements. Based on the analytic considerations, it is straightforward to derive a compact determination rule for the maximum value of attainable extrinsic information. We also show that the area under the EXIT characteristic grows almost logarithmically with the prediction gain that is utilizable due to the residual redundancy in the source data.
15.
We present a decoder for parallel concatenated codes that incorporates a binary-input binary-output Markov channel model, thereby allowing the receiver to utilize the statistical structure of the channel during the decoding process. These decoders can enable reliable communication at rates above the capacity of a memoryless channel with the same stationary bit-error probability as the Markov channel, and therefore outperform systems based on the traditional approach of using a channel interleaver to create a channel that is assumed to be memoryless.
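A standard concrete instance of such a binary-input binary-output Markov channel is the two-state Gilbert-Elliott model sketched below (my choice of model and parameters, not the paper's); its bursty error process is what the proposed decoder exploits and what a channel interleaver would destroy.

```python
import numpy as np

def gilbert_elliott(bits, p_g2b=0.01, p_b2g=0.1, e_good=0.001, e_bad=0.2, seed=0):
    """Flip each bit with the error rate of the current channel state."""
    rng = np.random.default_rng(seed)
    out, bad = [], False                       # start in the good state
    for b in bits:
        e = e_bad if bad else e_good
        out.append(int(b) ^ int(rng.random() < e))
        if rng.random() < (p_b2g if bad else p_g2b):
            bad = not bad                      # Markov state transition
    return np.array(out)

rx = gilbert_elliott(np.zeros(100_000, dtype=int))
print("stationary BER:", rx.mean())            # errors arrive in bursts
```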
16.
D. V. Sarwate, R. D. Morrison, IEEE Transactions on Information Theory, 1990, 36(4): 884-889
A t-error-correcting bounded-distance decoder either produces the codeword nearest the received vector (if there is a codeword at distance no more than t) or indicates that no such codeword exists. However, BCH decoders based on the Peterson-Gorenstein-Zierler algorithm or the Euclidean algorithm can malfunction and produce output vectors that are not codewords at all. For any integer i no greater than t/2, if the received vector is at distance at most t-2i from a codeword belonging to a (t-i)-error-correcting BCH supercode, then the BCH decoder output is that codeword from the supercode.
17.
M. M. Mansour, N. R. Shanbhag, IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 2003, 11(4): 627-650
Very large scale integration (VLSI) design methodology and implementation complexities of high-speed, low-power soft-input soft-output (SISO) a posteriori probability (APP) decoders are considered. These decoders are used in iterative algorithms based on turbo codes and related concatenated codes, and have shown a significant advantage in error-correction capability over conventional maximum-likelihood decoders. This advantage, however, comes at the expense of increased computational complexity, decoding delay, and substantial memory overhead, all of which hinge primarily on the well-known recursion bottleneck of the SISO-APP algorithm. This paper provides a rigorous analysis of the requirements for computational hardware and memory at the architectural level, based on a tile-graph approach that models the resource-time scheduling of the recursions of the algorithm. The problem of constructing the decoder architecture and optimizing it for high speed and low power is formulated in terms of the individual recursion patterns, which together form a tile graph according to a tiling scheme. Using the tile-graph approach, optimized architectures are derived for the various forms of the sliding-window and parallel-window algorithms known in the literature. A proposed tiling scheme for the recursion patterns, called hybrid tiling, is shown to be particularly effective in reducing the memory overhead of high-speed SISO-APP architectures. Simulations demonstrate that the proposed approach achieves savings in area and power in the range of 4.2%-53.1% over the state of the art.
18.
High-speed architectures for Reed-Solomon decoders
IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 2001, 9(5): 641-655
New high-speed VLSI architectures for decoding Reed-Solomon codes with the Berlekamp-Massey algorithm are presented in this paper. The speed bottleneck in the Berlekamp-Massey algorithm is the iterative computation of discrepancies followed by the updating of the error-locator polynomial. This bottleneck is eliminated via a series of algorithmic transformations that result in a fully systolic architecture in which a single array of processors computes both the error-locator and the error-evaluator polynomials. In contrast to conventional Berlekamp-Massey architectures, in which the critical path passes through two multipliers and 1+⌈log2(t+1)⌉ adders, the critical path in the proposed architecture passes through only one multiplier and one adder, which is comparable to the critical path in architectures based on the extended Euclidean algorithm. More interestingly, the proposed architecture requires approximately 25% fewer multipliers and a simpler control structure than architectures based on the popular extended Euclidean algorithm. For block-interleaved Reed-Solomon codes, embedding the interleaver memory into the decoder results in a further reduction of the critical path delay to just one XOR gate and one multiplexer, leading to speed-ups of as much as an order of magnitude over conventional architectures.
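The serial dependency the architecture eliminates is visible in a plain software statement of Berlekamp-Massey. The GF(2) sketch below (illustrative; RS decoding runs the same loop over GF(2^m)) shows that each iteration's discrepancy d depends on the locator polynomial C updated in the previous iteration, which is exactly why the discrepancy computation plus update sits on the critical path.

```python
def berlekamp_massey(s):
    """Shortest LFSR (taps C, length L) generating the 0/1 sequence s."""
    C, B = [1], [1]        # current and previous connection polynomials
    L, m = 0, 1
    for n in range(len(s)):
        d = s[n]           # discrepancy: does the current LFSR predict s[n]?
        for i in range(1, L + 1):
            d ^= C[i] & s[n - i]
        if d == 0:
            m += 1
        else:
            T = list(C)
            C += [0] * max(0, len(B) + m - len(C))
            for i, b in enumerate(B):          # C(x) <- C(x) + x^m * B(x)
                C[i + m] ^= b
            if 2 * L <= n:
                B, L, m = T, n + 1 - L, 1
            else:
                m += 1
    return C, L

print(berlekamp_massey([0, 0, 1, 1, 0, 1]))    # -> ([1, 1, 1, 0], 3)
```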
19.
The design of a single-chip VLSI system implementing the Zigangirov-Jelinek sequential decoding algorithm for bit-error correction is described, and the dependence of performance on design parameters is discussed. By virtue of being self-contained, having few input and output pins, and processing stack elements once each clock cycle, the system should be capable of high-speed decoding. For constraint-length-21, rate-1/2 codes and 3-bit soft-decision detection, it is found that a system containing approximately 25000 stack cells reduces errors at a 3-dB signal-to-noise level, corresponding to a 7.8% hard-decision error rate, by two orders of magnitude. Higher decoding gain is obtained at lower noise levels through the use of a relatively long constraint length; the constraint length is not limited by the architecture. Chip-area estimates needed to obtain prescribed decoded error rates and average decoding rates are also given and indicate that an effective system is potentially achievable with current technology.
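The pop-extend-push loop that such a chip performs once per clock can be modelled in software. The toy below is a sketch only: it uses a constraint-length-3 code (the chip uses 21), a Fano-style metric, and illustrative parameters, none of which come from the paper.

```python
import heapq
from math import log2

G = (0b111, 0b101)                      # rate-1/2, constraint length 3 (toy)

def encode_step(state, bit):
    reg = (bit << 2) | state
    out = tuple(bin(reg & g).count("1") & 1 for g in G)
    return out, reg >> 1

def stack_decode(received, p=0.1, rate=0.5):
    good, bad = log2(2 * (1 - p)) - rate, log2(2 * p) - rate   # Fano metric
    heap = [(0.0, 0, 0, ())]            # (-metric, depth, state, decoded bits)
    while heap:
        neg, depth, state, bits = heapq.heappop(heap)   # best path on stack
        if depth == len(received):
            return bits
        for b in (0, 1):                # extend it by one branch, push both
            out, ns = encode_step(state, b)
            m = -neg + sum(good if o == r else bad
                           for o, r in zip(out, received[depth]))
            heapq.heappush(heap, (-m, depth + 1, ns, bits + (b,)))

msg, state, tx = (1, 0, 1, 1, 0, 0), 0, []
for b in msg:                           # encode, then inject one channel error
    out, state = encode_step(state, b)
    tx.append(out)
tx[1] = (tx[1][0] ^ 1, tx[1][1])
print(stack_decode(tx))                 # recovers (1, 0, 1, 1, 0, 0)
```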
20.
Jinwei Wang, Shiguo Lian, Leiming Yan, Jin Han, Yuxiang Wang, Telecommunication Systems, 2013, 54(3): 305-313
With the rapid development of Internet-based media consumption, the piracy issue has become urgent, making copyright protection a hot research topic. Various watermarking methods exist that are regarded as potential solutions to copyright protection. However, these methods have different properties, and a combinatorial method is expected to exploit the advantages of the individual methods and achieve a good tradeoff. In this paper, a novel hybrid watermark embedding rule is proposed that securely selects between the general additive embedding rule and the special additive embedding rule. The selection is controlled by a secret key, and the watermark is embedded into the Discrete Wavelet Transform (DWT) domain. For this hybrid embedding rule, a novel optimum and locally optimum hybrid additive decoder is proposed, based on the minimum Bayesian risk criterion. The performance of the optimum hybrid decoder is also analyzed theoretically, with the DWT coefficients modelled by the generalized Gaussian distribution. Furthermore, the security of the proposed hybrid watermarking scheme is shown to be higher than that of existing schemes. Finally, empirical experimental results are given to validate the theoretical analysis.
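The key-controlled selection can be sketched as follows. Both concrete rules here (plain additive and a magnitude-scaled additive variant) and every parameter are illustrative stand-ins for the paper's general and special additive rules, and a Laplacian sample plays the role of the generalized-Gaussian DWT coefficients.

```python
import numpy as np

def hybrid_embed(x, w, key, gamma=0.05):
    """Embed watermark w into coefficients x, choosing per coefficient
    between two additive rules under control of a secret key."""
    rng = np.random.default_rng(key)             # secret key -> rule pattern
    use_special = rng.random(x.shape) < 0.5
    general = x + gamma * w                      # rule A: plain additive
    special = x + gamma * np.abs(x) * w          # rule B: signal-scaled additive
    return np.where(use_special, special, general)

rng = np.random.default_rng(7)
coeffs = rng.laplace(size=1024)                  # GGD-like stand-in for DWT coefficients
wm = np.sign(rng.normal(size=1024))              # +/-1 watermark sequence
marked = hybrid_embed(coeffs, wm, key=42)
print("embedding distortion (MSE):", np.mean((marked - coeffs) ** 2))
```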