期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Design of Encoder and Decoder for LDPC Codes Using Hybrid H‐Matrix

Chanho Lee 《ETRI Journal》2005,27(5):557-562

Low‐density parity‐check (LDPC) codes have recently emerged due to their excellent performance. However, the parity check (H) matrices of the previous works are not adequate for hardware implementation of encoders or decoders. This paper proposes a hybrid parity check matrix which is efficient in hardware implementation of both decoders and encoders. The hybrid H‐matrices are constructed so that both the semi‐random technique and the partly parallel structure can be applied to design encoders and decoders. Using the proposed methods, the implementation of encoders can become practical while keeping the hardware complexity of the partly parallel decoder structures. An encoder and a decoder are designed using Verilog‐HDL and are synthesized using a 0.35 µm CMOS standard cell library. 相似文献

2.

基于FPGA的LDPC码编译码器联合设计 总被引：1，自引：0，他引：1

袁瑞佳白宝明《电子与信息学报》2012,34(1):38-44

该文通过对低密度校验(LDPC)码的编译码过程进行分析,提出了一种基于FPGA的LDPC码编译码器联合设计方法,该方法使编码器和译码器共用同一校验计算电路和复用相同的RAM存储块,有效减少了硬件资源的消耗量。该方法适合于采用校验矩阵进行编码和译码的情况,不仅适用于全并行的编译码器结构,同时也适用于目前广泛采用的部分并行结构,且能够使用和积、最小和等多种译码算法。采用该方法对两组不同的LDPC码进行部分并行结构的编译码器联合设计,在Xilinx XC4VLX80 FPGA上的实现结果表明,设计得到的编码器和译码器可并行工作,且仅占用略多于单个译码器的硬件资源,提出的设计方法能够在不降低吞吐量的同时有效减少系统对硬件资源的需求。相似文献

3.

Block-Interlaced LDPC Decoders With Reduced Interconnect Complexity 总被引：1，自引：0，他引：1

Darabiha A. Carusone A.C. Kschischang F.R. 《Circuits and Systems II: Express Briefs, IEEE Transactions on》2008,55(1):74-78

Two design techniques are proposed for high-throughput low-density parity-check (LDPC) decoders. A broadcasting technique mitigates routing congestion by reducing the total global wirelength. An interlacing technique increases the decoder throughput by processing two consecutive frames simultaneously. The brief discusses how these techniques can be used for both fully parallel and partially parallel LDPC decoders. For fully parallel decoders with code lengths in the range of a few thousand bits, the half-broadcasting technique reduces the total global wirelength by about 26% without any hardware overhead. The block interlacing scheme is applied to the design of two fully parallel decoders, increasing the throughput by 60% and 71% at the cost of 5.5% and 9.5% gate count overhead, respectively. 相似文献

4.

High-throughput LDPC decoders

Mansour M.M. Shanbhag N.R. 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2003,11(6):976-996

A high-throughput memory-efficient decoder architecture for low-density parity-check (LDPC) codes is proposed based on a novel turbo decoding algorithm. The architecture benefits from various optimizations performed at three levels of abstraction in system design-namely LDPC code design, decoding algorithm, and decoder architecture. First, the interconnect complexity problem of current decoder implementations is mitigated by designing architecture-aware LDPC codes having embedded structural regularity features that result in a regular and scalable message-transport network with reduced control overhead. Second, the memory overhead problem in current day decoders is reduced by more than 75% by employing a new turbo decoding algorithm for LDPC codes that removes the multiple checkto-bit message update bottleneck of the current algorithm. A new merged-schedule merge-passing algorithm is also proposed that reduces the memory overhead of the current algorithm for low to moderate-throughput decoders. Moreover, a parallel soft-input-soft-output (SISO) message update mechanism is proposed that implements the recursions of the Balh-Cocke-Jelinek-Raviv (BCJR) algorithm in terms of simple "max-quartet" operations that do not require lookup-tables and incur negligible loss in performance compared to the ideal case. Finally, an efficient programmable architecture coupled with a scalable and dynamic transport network for storing and routing messages is proposed, and a full-decoder architecture is presented. Simulations demonstrate that the proposed architecture attains a throughput of 1.92 Gb/s for a frame length of 2304 bits, and achieves savings of 89.13% and 69.83% in power consumption and silicon area over state-of-the-art, with a reduction of 60.5% in interconnect length. 相似文献

5.

Design of LDPC decoders for improved low error rate performance: quantization and algorithm choices

Zhang Z. Dolecek L. Nikolic B. Anantharam V. Wainwright M.J. 《Communications, IEEE Transactions on》2009,57(11):3258-3268

Many classes of high-performance low-density parity-check (LDPC) codes are based on parity check matrices composed of permutation submatrices. We describe the design of a parallel-serial decoder architecture that can be used to map any LDPC code with such a structure to a hardware emulation platform. High-throughput emulation allows for the exploration of the low bit-error rate (BER) region and provides statistics of the error traces, which illuminate the causes of the error floors of the (2048, 1723) Reed-Solomon based LDPC (RS-LDPC) code and the (2209, 1978) array-based LDPC code. Two classes of error events are observed: oscillatory behavior and convergence to a class of non-codewords, termed absorbing sets. The influence of absorbing sets can be exacerbated by message quantization and decoder implementation. In particular, quantization and the log-tanh function approximation in sum-product decoders strongly affect which absorbing sets dominate in the errorfloor region. We show that conventional sum-product decoder implementations of the (2209, 1978) array-based LDPC code allow low-weight absorbing sets to have a strong effect, and, as a result, elevate the error floor. Dually-quantized sum-product decoders and approximate sum-product decoders alleviate the effects of low-weight absorbing sets, thereby lowering the error floor. 相似文献

6.

Optimal Overlapped Message Passing Decoding of Quasi-Cyclic LDPC Codes

Yongmei Dai Zhiyuan Yan Ning Chen 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2008,16(5):565-578

Efficient hardware implementation of low-density parity-check (LDPC) codes is of great interest since LDPC codes are being considered for a wide range of applications. Recently, overlapped message passing (OMP) decoding has been proposed to improve the throughput and hardware utilization efficiency (HUE) of decoder architectures for LDPC codes. In this paper, we first study the scheduling for the OMP decoding of LDPC codes, and show that maximizing the throughput gain amounts to minimizing the intra- and inter-iteration waiting times. We then focus on the OMP decoding of quasi-cyclic (QC) LDPC codes. We propose a partly parallel OMP decoder architecture and implement it using FPGA. For any QC LDPC code, our OMP decoder achieves the maximum throughput gain and HUE due to overlapping, hence has higher throughput and HUE than previously proposed OMP decoders while maintaining the same hardware requirements. We also show that the maximum throughput gain and HUE achieved by our OMP decoder are ultimately determined by the given code. Thus, we propose a coset-based construction method, which results in QC LDPC codes that allow our optimal OMP decoder to achieve higher throughput and HUE. 相似文献

7.

Multiple-Rate Low-Density Parity-Check Codes with Constant Blocklength

Casado A.I.V. Wen-yen Weng Valle S. Wesel R. 《Communications, IEEE Transactions on》2009,57(1):75-83

This paper describes and analyzes low-density parity-check code families that support variety of different rates while maintaining the same fundamental decoder architecture. Such families facilitate the decoding hardware design and implementation for applications that require communication at different rates, for example to adapt to changing channel conditions. Combining rows of the lowest-rate parity-check matrix produces the parity-check matrices for higher rates. An important advantage of this approach is that all effective code rates have the same blocklength. This approach is compatible with well known techniques that allow low-complexity encoding and parallel decoding of these LDPC codes. This technique also allows the design of programmable analog LDPC decoders. The proposed design method maintains good graphical properties and hence low error floors for all rates. 相似文献

8.

Window-constrained interconnect-efficient progressive edge growth LDPC codes

Aiman H. El-Maleh Mohamed Adnan Landolsi Esa A. AlGhoneim 《AEUE-International Journal of Electronics and Communications》2013,67(7):588-594

One of the attractive features of low-density parity-check (LDPC) codes is the parallel iterative nature of their iterative belief propagation decoding, making them amenable to efficient hardware implementation. However, for an arbitrary code construction, the random-like connections between the code's Tanner graph variable and check nodes makes fully-parallel implementation a difficult task as this leads to complex interconnect wiring and routing congestion. In this paper, we present a novel LDPC code design approach, based on the progressive edge growth (PEG) Tanner graph construction, to solve the problem of dense connections between processing nodes. The approach is based on controlling the maximum connection length between processing nodes in order to make fully parallel implementation feasible. The proposed algorithm offers a good compromise between error correction performance and decoder complexity. Simulation results and FPGA-based implementation comparisons are presented to demonstrate the advantages of the proposed LDPC code constructions, and it is shown that, with proper window-constrained node placement design, an improvement of up to 40% in interconnect efficiency is achievable without any significant degradation in error correction capability. 相似文献

9.

A 0.18-$muhbox m$CMOS Analog Min-Sum Iterative Decoder for a (32,8) Low-Density Parity-Check (LDPC) Code

Hemati S. Banihashemi A.H. Plett C. 《Solid-State Circuits, IEEE Journal of》2006,41(11):2531-2540

Current-mode circuits are presented for implementing analog min-sum (MS) iterative decoders. These decoders are used to efficiently decode the best known error correcting codes such as low-density parity-check (LDPC) codes and turbo codes. The proposed circuits are devised based on current mirrors, and thus, in any fabrication technology that accurate current mirrors can be designed, analog MS decoders can be implemented. The functionality of the proposed circuits is verified by implementing an analog MS decoder for a (32,8) LDPC code in a 0.18-mum CMOS technology. This decoder is the first reported analog MS decoder. For low signal to noise ratios where the circuit imperfections are dominated by the noise of the channel, the measured error correcting performance of this chip in steady-state condition surpasses that of the conventional floating-point discrete-time synchronous MS decoder. When data throughput is 6 Mb/s, loss in the coding gain compared to the conventional MS decoder at BER of 10^-3 is about 0.3 dB and power consumption is about 5 mW. This is the first time that an analog decoder has been successfully tested for an LDPC code, though a short one 相似文献

10.

Design of a 20-mb/s 256-state Viterbi decoder 总被引：1，自引：0，他引：1

Xun Liu Papaefthymiou M.C. 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2003,11(6):965-975

The design of high-throughput large-state Viterbi decoders relies on the use of multiple arithmetic units. The global communication channels among these parallel processors often consist of long interconnect wires, resulting in large area and high power consumption. In this paper, we propose a data transfer oriented design methodology to implement a low-power 256-state rate-1/3 Viterbi decoder. Our architectural level scheme uses operation partitioning, packing, and scheduling to analyze and optimize interconnect effects in early design stages. In comparison with other published Viterbi decoders, our approach reduces the global data transfers by up to 75% and decreases the amount of global buses by up to 48%, while enabling the use of deeply pipelined datapaths with no data forwarding. In the register-transfer level (RTL) implementation, we apply precomputation in conjunction with saturation arithmetic to further reduce power dissipation with provably no coding performance degradation. Designed using a 0.25 /spl mu/m standard cell library, our decoder achieves a throughput of 20 Mb/s in simulation and dissipates only 0.45 W. 相似文献

11.

采用C语言FPGA技术实现LDPC码译码算法

张培陶志福周昌雄汪一鸣《微电子学与计算机》2011,28(1)

针对LDPC码(Low Density Parity Check Codes)译码算法的特点和最新一代Impulse C语言的并行编程技术,提出一种对LDPC码译码器进行FPGA(Field Programmable Gate Array)设计与实现的便捷新方案,以获得译码速率和硬件资源消耗的平衡.在XC2V2000芯片上实现了一种码率1/2,码长2500的(3,6)LDPC码译码器.实验表明当最大迭代次数为10次,主频50MHz时,译码速率可达10Mbps. 相似文献

12.

Parallel high-throughput limited search trellis decoder VLSI design

Fei Sun Tong Zhang 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2005,13(9):1013-1022

Limited search trellis decoding algorithms have great potentials of realizing low power due to their largely reduced computational complexity compared with the widely used Viterbi algorithm. However, because of the lack of operational parallelism and regularity in their original formulations, the limited search decoding algorithms have been traditionally ruled out for applications demanding very high throughput. We believe that, through appropriate algorithm and hardware architecture co-design, certain limited search trellis decoding algorithms can become serious competitors to the Viterbi algorithm for high-throughout applications. Focusing on the well-known T-algorithm, this paper presents techniques at the algorithm and VLSI architecture levels to design fully parallel T-algorithm limited search trellis decoders. We first develop a modified T-algorithm, called SPEC-T, to improve the algorithmic parallelism. Then, based on the conventional state-parallel register exchange Viterbi decoder, we develop a parallel SPEC-T decoder architecture that can effectively transform the reduced computational complexity at the algorithm level to the reduced switching activities in the hardware. We demonstrate the effectiveness of the SPEC-T design solution in the context of convolutional code decoding. Compared with state-parallel register exchange Viterbi decoders, the SPEC-T convolutional code decoders can achieve almost the same throughput and decoding performance, while realizing up to 56% power savings. For the first time, this work provides an approach to exploit the low power potential of the T-algorithm in very high throughput applications. 相似文献

13.

Block-LDPC: a practical LDPC coding system design approach

Hao Zhong Tong Zhang 《IEEE transactions on circuits and systems. I, Regular papers》2005,52(4):766-775

This paper presents a joint low-density parity-check (LDPC) code-encoder-decoder design approach, called Block-LDPC, for practical LDPC coding system implementations. The key idea is to construct LDPC codes subject to certain hardware-oriented constraints that ensure the effective encoder and decoder hardware implementations. We develop a set of hardware-oriented constraints, subject to which a semi-random approach is used to construct Block-LDPC codes with good error-correcting performance. Correspondingly, we develop an efficient encoding strategy and a pipelined partially parallel Block-LDPC encoder architecture, and a partially parallel Block-LDPC decoder architecture. We present the estimation of Block-LDPC coding system implementation key metrics including the throughput and hardware complexity for both encoder and decoder. The good error-correcting performance of Block-LDPC codes has been demonstrated through computer simulations. With the effective encoder/decoder design and good error-correcting performance, Block-LDPC provides a promising vehicle for real-life LDPC coding system implementations. 相似文献

14.

多元LDPC译码器设计

刘飞黎海涛《信号处理》2012,28(3):397-403

在多元低密度奇偶校验码(NB-LDPC)的扩展最小和译码算法（EMS）中,由于消息向量的递归计算和校验/变量节点信息之间的迭代交换,导致译码器存在较大延迟。针对此问题本文提出了一种新型译码器结构,它优化了校验节点更新单步运算单元,根据前向后向算法规则,以3路单步运算单元完成校验节点更新,硬件资源消耗略有增加,但所需时钟周期约降为一般结构的1/3;并采用全并行运算的变量节点信息更新单元,无需利用前向后向算法将更新过程分解为多个单步运算,消除了变量节点更新的递归计算,且具有低复杂度低延时等优点,并在现场可编程门阵列(FPGA)Xilinx Virtex-4 (XC4VLX200)平台上对一个GF(16)域上(480,360)的准循环多元LDPC码进行了综合仿真。仿真结果证明,设计的译码器在较小资源消耗条件下能成倍提高吞吐量。相似文献

15.

适于分层译码算法的LDPC码构造方法

下载免费PDF全文

王达董明科陈晨金野项海格《中国通信》2012,9(7):99-107

The layered decoding algorithm has been widely used in the implementation of Low Density Parity Check (LDPC) decoders, due to its high convergence speed. However, the pipeline operation of the layered decoder may introduce memory access conflicts, which heavily deteriorates the decoder throughput. To essentially deal with the issue of memory access conflicts, we propose a construction algorithm of LDPC codes, to which a constraint condition is added in the Progressive Edge-Growth (PEG) algorithm. The constraint condition can guarantee that for our constructed LDPC codes, the sets of all the variable nodes connected to the consecutive layers do not share any common variable node, which can avoid the memory access conflicts. Simulation results show that the performance of our constructed LDPC codes is close to the several other LDPC codes adopted in wireless standards. Moreover, compared with the decoder for IEEE 802. 16e LDPC codes, the throughput of our LDPC decoder has large improvement, while the chip resource consumption is unchanged. Thus, our constructed LD-PC codes can be adopted in the high-speed transmission. 相似文献

16.

FPGA implementation of low complexity LDPC iterative decoder

Shivani Verma Sanjay Sharma 《International Journal of Electronics》2016,103(7):1112-1126

Low-density parity-check (LDPC) codes, proposed by Gallager, emerged as a class of codes which can yield very good performance on the additive white Gaussian noise channel as well as on the binary symmetric channel. LDPC codes have gained lots of importance due to their capacity achieving property and excellent performance in the noisy channel. Belief propagation (BP) algorithm and its approximations, most notably min-sum, are popular iterative decoding algorithms used for LDPC and turbo codes. The trade-off between the hardware complexity and the decoding throughput is a critical factor in the implementation of the practical decoder. This article presents introduction to LDPC codes and its various decoding algorithms followed by realisation of LDPC decoder by using simplified message passing algorithm and partially parallel decoder architecture. Simplified message passing algorithm has been proposed for trade-off between low decoding complexity and decoder performance. It greatly reduces the routing and check node complexity of the decoder. Partially parallel decoder architecture possesses high speed and reduced complexity. The improved design of the decoder possesses a maximum symbol throughput of 92.95 Mbps and a maximum of 18 decoding iterations. The article presents implementation of 9216 bits, rate-1/2, (3, 6) LDPC decoder on Xilinx XC3D3400A device from Spartan-3A DSP family. 相似文献

17.

A Flexible LDPC/Turbo Decoder Architecture

Yang Sun Joseph R. Cavallaro 《Journal of Signal Processing Systems》2011,64(1):1-16

Low-density parity-check (LDPC) codes and convolutional Turbo codes are two of the most powerful error correcting codes that are widely used in modern communication systems. In a multi-mode baseband receiver, both LDPC and Turbo decoders may be required. However, the different decoding approaches for LDPC and Turbo codes usually lead to different hardware architectures. In this paper we propose a unified message passing algorithm for LDPC and Turbo codes and introduce a flexible soft-input soft-output (SISO) module to handle LDPC/Turbo decoding. We employ the trellis-based maximum a posteriori (MAP) algorithm as a bridge between LDPC and Turbo codes decoding. We view the LDPC code as a concatenation of n super-codes where each super-code has a simpler trellis structure so that the MAP algorithm can be easily applied to it. We propose a flexible functional unit (FFU) for MAP processing of LDPC and Turbo codes with a low hardware overhead (about 15% area and timing overhead). Based on the FFU, we propose an area-efficient flexible SISO decoder architecture to support LDPC/Turbo codes decoding. Multiple such SISO modules can be embedded into a parallel decoder for higher decoding throughput. As a case study, a flexible LDPC/Turbo decoder has been synthesized on a TSMC 90 nm CMOS technology with a core area of 3.2 mm². The decoder can support IEEE 802.16e LDPC codes, IEEE 802.11n LDPC codes, and 3GPP LTE Turbo codes. Running at 500 MHz clock frequency, the decoder can sustain up to 600 Mbps LDPC decoding or 450 Mbps Turbo decoding. 相似文献

18.

The Development of Turbo and LDPC Codes for Deep-Space Applications 总被引：3，自引：0，他引：3

Andrews K.S. Divsalar D. Dolinar S. Hamkins J. Jones C.R. Pollara F. 《Proceedings of the IEEE. Institute of Electrical and Electronics Engineers》2007,95(11):2142-2156

The development of error-correcting codes has been closely coupled with deep-space exploration since the early days of both. Since the discovery of turbo codes in 1993, the research community has invested a great deal of work on modern iteratively decoded codes, and naturally NASA's Jet Propulsion Laboratory (JPL) has been very much involved. This paper describes the research, design, implementation, and standardization work that has taken place at JPL for both turbo and low-density parity-check (LDPC) codes. Turbo code development proceeded from theoretical analyses of polynomial selection, weight distributions imposed by interleaver designs, decoder error floors, and iterative decoding thresholds. A family of turbo codes was standardized and implemented and is currently in use by several spacecraft. JPL's LDPC codes are built from protographs and circulants, selected by analyses of decoding thresholds and methods to avoid loops in the code graph. LDPC encoders and decoders have been implemented in hardware for planned spacecraft, and standardization is under way. 相似文献

19.

同步部分并行结构的准循环LDPC码译码器

许恩杨姜明赵春明《电子与信息学报》2008,30(7):1630-1634

该文根据准循环LDPC码的结构特点,提出了一种同步部分并行结构的译码器。在译码器中,校验节点处理单元和变量节点处理单元同时并行工作,使得迭代过程中新产生的软信息能够被提前使用,加快迭代的收敛速度。同时,采用差分演化的方法对各节点处理单元的起始位置进行优化,进一步提高了译码器的性能。仿真结果表明,该方案在译码性能和复杂度上都要优于现有其他方案,适合高速译码器的实现。相似文献

20.

A generic,comprehensive and granular decoder complexity model for the H.264/AVC standard

《Journal of Visual Communication and Image Representation》2014,25(7):1686-1703

With recent advances in computing and communication technologies, ubiquitous access to high quality multimedia content such as high definition video using smartphones, netbooks, or tablets is a fact of our daily life. However, power consumption is still a major concern for portable devices. One approach to address this concern is to control and optimize power consumption using a power model for each multimedia application, such as a video decoder. In this paper, a generic, comprehensive and granular decoder complexity model for the baseline profile of H.264/AVC decoder has been proposed. The modeling methodology was designed to ensure a platform and implementation independent complexity model. Simulation results indicate that the proposed model estimates decoder complexity with an average accuracy of 92.15% for a wide range of test sequences using both the JM reference software and the x264 software implementation of H.264/AVC, and 89.61% for a dedicated hardware implementation of the motion compensation module. It should be noted that in addition to power consumption control, the proposed model can be used for designing a receiver-aware H.264/AVC encoder, where the complexity constraints of the receiver side are taken into account during compression. To further evaluate the proposed model, a receiver-aware encoder has been designed and implemented. Our simulation results indicate that using the proposed model the designed receiver aware encoder performs similar to the original encoder, while still being able to satisfy the complexity constraints of various decoders. 相似文献