1.

Optimal Overlapped Message Passing Decoding of Quasi-Cyclic LDPC Codes

Yongmei Dai Zhiyuan Yan Ning Chen 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2008,16(5):565-578

Efficient hardware implementation of low-density parity-check (LDPC) codes is of great interest since LDPC codes are being considered for a wide range of applications. Recently, overlapped message passing (OMP) decoding has been proposed to improve the throughput and hardware utilization efficiency (HUE) of decoder architectures for LDPC codes. In this paper, we first study the scheduling for the OMP decoding of LDPC codes, and show that maximizing the throughput gain amounts to minimizing the intra- and inter-iteration waiting times. We then focus on the OMP decoding of quasi-cyclic (QC) LDPC codes. We propose a partly parallel OMP decoder architecture and implement it using FPGA. For any QC LDPC code, our OMP decoder achieves the maximum throughput gain and HUE due to overlapping, hence has higher throughput and HUE than previously proposed OMP decoders while maintaining the same hardware requirements. We also show that the maximum throughput gain and HUE achieved by our OMP decoder are ultimately determined by the given code. Thus, we propose a coset-based construction method, which results in QC LDPC codes that allow our optimal OMP decoder to achieve higher throughput and HUE.

2.

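Design and Implementation of a Quasi-Cyclic LDPC Decoder  Total citations: 5, self-citations: 5, citations by others: 0

Li Gang, Hei Yong, Qiu Yulin (李刚, 黑勇, 仇玉林) 《Microelectronics & Computer (微电子学与计算机)》2008,25(7)

Targeting hardware implementation of quasi-cyclic LDPC codes, this paper analyzes the fixed-point decoding performance of several decoding algorithms and finds that the offset min-sum (OMS) algorithm combines high decoding performance with low implementation complexity. A partially parallel quasi-cyclic LDPC decoder architecture is proposed, and the LDPC decoder specified in the WiMAX standard is successfully implemented on an FPGA using this architecture. FPGA verification results show that a decoder using this architecture offers good performance, low implementation complexity, and high data throughput.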

3.

High-throughput layered decoder implementation for quasi-cyclic LDPC codes  Total citations: 2, self-citations: 0, citations by others: 2

Zhang K. Huang X. Wang Z. 《Selected Areas in Communications, IEEE Journal on》2009,27(6):985-994

This paper presents a high-throughput decoder design for the Quasi-Cyclic (QC) Low-Density Parity-Check (LDPC) codes. Two new techniques are proposed, including parallel layered decoding architecture (PLDA) and critical path splitting. PLDA enables parallel processing for all layers by establishing dedicated message passing paths among them. The decoder avoids crossbar-based large interconnect network. Critical path splitting technique is based on articulate adjustment of the starting point of each layer to maximize the time intervals between adjacent layers, such that the critical path delay can be split into pipeline stages. Furthermore, min-sum and loosely coupled algorithms are employed for area efficiency. As a case study, a rate-1/2 2304-bit irregular LDPC decoder is implemented using ASIC design in 90 nm CMOS process. The decoder can achieve the maximum decoding throughput of 2.2 Gbps at 10 iterations. The operating frequency is 950 MHz after synthesis and the chip area is 2.9 mm².

4.

A Flexible LDPC/Turbo Decoder Architecture

Yang Sun Joseph R. Cavallaro 《Journal of Signal Processing Systems》2011,64(1):1-16

Low-density parity-check (LDPC) codes and convolutional Turbo codes are two of the most powerful error correcting codes that are widely used in modern communication systems. In a multi-mode baseband receiver, both LDPC and Turbo decoders may be required. However, the different decoding approaches for LDPC and Turbo codes usually lead to different hardware architectures. In this paper we propose a unified message passing algorithm for LDPC and Turbo codes and introduce a flexible soft-input soft-output (SISO) module to handle LDPC/Turbo decoding. We employ the trellis-based maximum a posteriori (MAP) algorithm as a bridge between LDPC and Turbo codes decoding. We view the LDPC code as a concatenation of n super-codes where each super-code has a simpler trellis structure so that the MAP algorithm can be easily applied to it. We propose a flexible functional unit (FFU) for MAP processing of LDPC and Turbo codes with a low hardware overhead (about 15% area and timing overhead). Based on the FFU, we propose an area-efficient flexible SISO decoder architecture to support LDPC/Turbo codes decoding. Multiple such SISO modules can be embedded into a parallel decoder for higher decoding throughput. As a case study, a flexible LDPC/Turbo decoder has been synthesized on a TSMC 90 nm CMOS technology with a core area of 3.2 mm². The decoder can support IEEE 802.16e LDPC codes, IEEE 802.11n LDPC codes, and 3GPP LTE Turbo codes. Running at 500 MHz clock frequency, the decoder can sustain up to 600 Mbps LDPC decoding or 450 Mbps Turbo decoding.

5.

CORDIC instructions for LDPC decoding on SDR platforms

Murugappan Senthilvelan Meng Yu Daniel Iancu Mihai Sima Michael Schulte 《Analog Integrated Circuits and Signal Processing》2011,69(2-3):191-206

Wireless protocols strive to increase spectral efficiency and achieve high data throughput. Low-density parity-check (LDPC) codes are advanced forward error correction (FEC) codes that use iterative decoding techniques to achieve close to the Shannon capacity. Due to their superior performance, state-of-the-art wireless protocols, such as WiMAX and LTE Advanced, are adopting LDPC codes. LDPC codes come with the high cost of drastically increased computational effort for decoding. Among the proposed decoding algorithms, the belief propagation (BP) algorithm leads to a good approximation of an optimal decoder; however, it uses compute-intensive hyperbolic trigonometric functions. To reduce the computational complexity, typical LDPC decoder implementations use simplified algorithms, such as the min-sum algorithm, at the expense of reduced signal processing performance. Efficient and accurate methods to compute hyperbolic trigonometric functions can facilitate the use of the BP algorithm in real-time LDPC decoder implementations. This paper investigates hyperbolic COordinate Rotation DIgital Computer (CORDIC) instruction set architecture (ISA) extensions for software-defined radio (SDR) processors to compute the hyperbolic trigonometric functions for LDPC decoding efficiently. The CORDIC ISA extensions are evaluated on the low-power multi-threaded Sandbridge Sandblaster SB3000 platform. The computational performance, numerical accuracy, hardware estimates, power consumption estimates, and memory requirements with the CORDIC ISA extensions are compared to a baseline implementation without these extensions on the SB3000.
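
To make the hyperbolic CORDIC idea in the abstract above concrete, here is a minimal floating-point sketch of rotation-mode hyperbolic CORDIC, which produces cosh and sinh using only shift-and-add style updates. It is not the ISA extension evaluated in the paper: the function name `hyperbolic_cordic`, the iteration count, and the use of Python floats instead of fixed-point registers are all assumptions made for illustration. The repeated iterations (indices 4, 13, ...) are the standard requirement for hyperbolic CORDIC convergence, and the input is assumed to lie within the basic convergence range of roughly |z| < 1.1.

```python
import math


def hyperbolic_cordic(z, n_iter=16):
    """Return (cosh(z), sinh(z)) via rotation-mode hyperbolic CORDIC.

    Illustrative sketch only; assumes |z| is inside the basic
    convergence range (about 1.1) and uses floats, not fixed point.
    """
    # Build the iteration schedule 1, 2, 3, 4, 4, 5, ..., 13, 13, ...
    # Indices 4, 13, 40, ... (k -> 3k + 1) must be repeated to converge.
    indices = []
    i, repeat_at = 1, 4
    while len(indices) < n_iter:
        indices.append(i)
        if i == repeat_at:
            indices.append(i)
            repeat_at = 3 * repeat_at + 1
        i += 1
    indices = indices[:n_iter]

    # Each micro-rotation scales the vector by sqrt(1 - 2^(-2i)).
    gain = 1.0
    for i in indices:
        gain *= math.sqrt(1.0 - 2.0 ** (-2 * i))

    # Pre-scale x so the final result needs no gain correction.
    x, y = 1.0 / gain, 0.0
    for i in indices:
        d = 1.0 if z >= 0 else -1.0      # rotate toward z == 0
        t = 2.0 ** -i                    # a right shift in hardware
        x, y = x + d * y * t, y + d * x * t
        z -= d * math.atanh(t)           # a small lookup table in hardware
    return x, y


if __name__ == "__main__":
    c, s = hyperbolic_cordic(0.5)
    print("cosh:", c, "sinh:", s, "tanh:", s / c)
```

The ratio `s / c` gives tanh, which is the function the BP check-node update needs; in a fixed-point design the `atanh(2^-i)` constants would be precomputed.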

6.

High-throughput LDPC decoders

Mansour M.M. Shanbhag N.R. 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2003,11(6):976-996

A high-throughput memory-efficient decoder architecture for low-density parity-check (LDPC) codes is proposed based on a novel turbo decoding algorithm. The architecture benefits from various optimizations performed at three levels of abstraction in system design, namely LDPC code design, decoding algorithm, and decoder architecture. First, the interconnect complexity problem of current decoder implementations is mitigated by designing architecture-aware LDPC codes having embedded structural regularity features that result in a regular and scalable message-transport network with reduced control overhead. Second, the memory overhead problem in current-day decoders is reduced by more than 75% by employing a new turbo decoding algorithm for LDPC codes that removes the multiple check-to-bit message update bottleneck of the current algorithm. A new merged-schedule message-passing algorithm is also proposed that reduces the memory overhead of the current algorithm for low to moderate-throughput decoders. Moreover, a parallel soft-input-soft-output (SISO) message update mechanism is proposed that implements the recursions of the Bahl-Cocke-Jelinek-Raviv (BCJR) algorithm in terms of simple "max-quartet" operations that do not require lookup-tables and incur negligible loss in performance compared to the ideal case. Finally, an efficient programmable architecture coupled with a scalable and dynamic transport network for storing and routing messages is proposed, and a full-decoder architecture is presented. Simulations demonstrate that the proposed architecture attains a throughput of 1.92 Gb/s for a frame length of 2304 bits, and achieves savings of 89.13% and 69.83% in power consumption and silicon area over state-of-the-art, with a reduction of 60.5% in interconnect length.

7.

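A Turbo-Structured LDPC Code with Low Decoding Complexity

Xiong Lei, Tan Zhenhui, Yao Dongping (熊磊, 谈振辉, 姚冬苹) 《Journal of Electronics & Information Technology (电子与信息学报)》2007,29(12):2907-2911

To address the high decoding complexity and RAM usage of low-density parity-check (LDPC) codes, this paper proposes a Turbo-structured LDPC code with low decoding complexity, the Parallel Interleaved Concatenated Gallager Code (PICGC). The paper presents the design method and the encoding and decoding algorithms of PICGC, compares the RAM required by the PICGC decoder with that of an LDPC decoder, and derives an upper bound on the RAM saving ratio. Theoretical analysis and simulation results show that, at the cost of a slight loss in error-correcting performance, PICGC effectively reduces decoding complexity and RAM storage without increasing decoding latency, making it an effective and easily implemented channel coding scheme.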

8.

Block-LDPC: a practical LDPC coding system design approach

Hao Zhong Tong Zhang 《IEEE transactions on circuits and systems. I, Regular papers》2005,52(4):766-775

This paper presents a joint low-density parity-check (LDPC) code-encoder-decoder design approach, called Block-LDPC, for practical LDPC coding system implementations. The key idea is to construct LDPC codes subject to certain hardware-oriented constraints that ensure the effective encoder and decoder hardware implementations. We develop a set of hardware-oriented constraints, subject to which a semi-random approach is used to construct Block-LDPC codes with good error-correcting performance. Correspondingly, we develop an efficient encoding strategy and a pipelined partially parallel Block-LDPC encoder architecture, and a partially parallel Block-LDPC decoder architecture. We present the estimation of Block-LDPC coding system implementation key metrics including the throughput and hardware complexity for both encoder and decoder. The good error-correcting performance of Block-LDPC codes has been demonstrated through computer simulations. With the effective encoder/decoder design and good error-correcting performance, Block-LDPC provides a promising vehicle for real-life LDPC coding system implementations.

9.

A Memory Efficient Partially Parallel Decoder Architecture for Quasi-Cyclic LDPC Codes

Wang Z. Cui Z. 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2007,15(4):483-488

This paper presents a memory efficient partially parallel decoder architecture suited for high rate quasi-cyclic low-density parity-check (QC-LDPC) codes using (modified) min-sum algorithm for decoding. In general, over 30% of memory can be saved over conventional partially parallel decoder architectures. Efficient techniques have been developed to reduce the computation delay of the node processing units and to minimize hardware overhead for parallel processing. The proposed decoder architecture can linearly increase the decoding throughput with a small percentage of extra hardware. Consequently, it facilitates the applications of LDPC codes in area/power sensitive high-speed communication systems.

10.

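Research on a Fast Encoding Structure for LDPC Codes Based on Matrix Partitioning

Dou Jinfang, Zhou Quan (窦金芳, 周诠) 《Microelectronics & Computer (微电子学与计算机)》2007,24(1):166-168

Low-density parity-check (LDPC) codes have become a research focus because of their near-Shannon-limit performance and high-speed parallel decoding structure. However, when the code length is very long, hardware implementation of the encoder and decoder becomes difficult. From the standpoint of practical encoder and decoder implementation, this paper proposes a block-based lower-triangular parity-check matrix structure for LDPC codes that reduces encoding and decoding complexity, enabling both linear-time encoding and partially parallel decoding. Simulation results show that LDPC codes with this structure achieve error-correcting performance as good as that of randomly constructed LDPC codes.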

11.

A 690-mW 1-Gb/s 1024-b, rate-1/2 low-density parity-check code decoder

Blanksby A.J. Howland C.J. 《Solid-State Circuits, IEEE Journal of》2002,37(3):404-412

A 1024-b, rate-1/2, soft decision low-density parity-check (LDPC) code decoder has been implemented that matches the coding gain of equivalent turbo codes. The decoder features a parallel architecture that supports a maximum throughput of 1 Gb/s while performing 64 decoder iterations. The parallel architecture enables rapid convergence in the decoding algorithm to be translated into low decoder switching activity resulting in a power dissipation of only 690 mW from a 1.5-V supply.

12.

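Design and Implementation of an LDPC Decoder for the IEEE 802.16e Standard  Total citations: 1, self-citations: 1, citations by others: 0

Yang Jianping, Chen Qingchun (杨建平, 陈庆春) 《Communications Technology (通信技术)》2010,43(5):84-86,206

Since their rediscovery in the 1990s, LDPC codes have attracted attention for their near-Shannon-limit error-control performance, low decoding complexity, and high throughput, becoming another research focus in channel coding after Turbo codes. An LDPC decoder based on the IEEE 802.16e standard is designed and implemented on an FPGA. The decoder uses the Offset Min-Sum algorithm with an offset factor β of 0.125 and achieves performance close to that of the floating-point Belief Propagation algorithm. It adopts a partially parallel architecture and flexibly supports decoding of LDPC codes of all code rates and code lengths defined in the standard. In addition, the decoder supports processing of continuously input data blocks and provides dynamic early termination of iterations. Hardware synthesis results show that, at an operating frequency of 150 MHz with a fixed 15 iterations, the decoder achieves a decoding throughput of at least 95 Mb/s, fully meeting the requirements of the 802.16e standard.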
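
As a rough illustration of the Offset Min-Sum check-node update named in the abstract above, the sketch below computes the outgoing check-to-variable messages for a single check node. Only the offset value β = 0.125 comes from the abstract; the function name `offset_min_sum_check_update`, the floating-point message representation, and the two-minimum bookkeeping are assumptions for illustration and are not taken from the paper's hardware design.

```python
def offset_min_sum_check_update(incoming, beta=0.125):
    """Offset min-sum update for one check node (illustrative sketch).

    incoming: variable-to-check LLRs on the edges of this check node
              (at least two entries).
    Returns the check-to-variable LLR for each edge, computed from all
    the OTHER edges: sign product times max(min |LLR| - beta, 0).
    """
    signs = [-1.0 if v < 0 else 1.0 for v in incoming]
    total_sign = 1.0
    for s in signs:
        total_sign *= s

    mags = [abs(v) for v in incoming]
    # Only the two smallest magnitudes are ever needed.
    order = sorted(range(len(mags)), key=mags.__getitem__)
    min1_idx = order[0]
    min1, min2 = mags[order[0]], mags[order[1]]

    outgoing = []
    for i in range(len(incoming)):
        # Exclude edge i itself: use the second minimum if i holds the first.
        mag = min2 if i == min1_idx else min1
        mag = max(mag - beta, 0.0)          # subtract offset, clamp at zero
        # total_sign * signs[i] equals the sign product of the other edges.
        outgoing.append(total_sign * signs[i] * mag)
    return outgoing


if __name__ == "__main__":
    print(offset_min_sum_check_update([1.5, -0.5, 2.0]))
    # prints [-0.375, 1.375, -0.375]
```

Tracking only the two smallest magnitudes and a single sign product is what keeps this update cheap in hardware compared with the tanh-based BP rule.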

13.

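FPGA Implementation of an 800 Mbps Quasi-Cyclic LDPC Decoder  Total citations: 1, self-citations: 0, citations by others: 1

Zhang Zhongming, Xu Ba, Yang Jun, Zhang Eryang (张仲明, 许拔, 杨军, 张尔扬) 《Signal Processing (信号处理)》2010,26(2):255-261

This paper proposes a low-complexity, highly parallel decoder architecture for quasi-cyclic low-density parity-check codes. Quasi-cyclic LDPC codes are generally not well suited to efficient highly parallel, high-throughput decoder designs. By exploiting the structure of the parity-check matrix of a quasi-cyclic LDPC code, we transform it into a block quasi-cyclic structure so that the row and column operations of the decoding algorithm can be processed in parallel. Using this architecture, we implemented a decoder for the (8176, 7154) finite-geometry LDPC code on a Xilinx Virtex-5 LX330 FPGA, achieving a decoding throughput of 800 Mbps at 15 iterations.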

14.

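DSP Implementation of an Improved LDPC Decoding Algorithm

Zhang Yun, Wu Lenan (张赟, 吴乐南) 《Modern Electronics Technique (现代电子技术)》2007,30(1):58-60

As an emerging channel code, the LDPC code has been successfully applied in the European digital satellite broadcasting (DVB-S2) system. Among message-passing decoding algorithms for LDPC codes, the Log-BP algorithm is optimal, but its two nonlinear operations increase implementation complexity. The Min-Sum algorithm has low complexity but incurs a performance loss of 0.5 dB. After optimizing the normalization factor introduced into the Min-Sum algorithm by means of Gaussian-approximation density evolution, the improved algorithm is found to achieve performance very close to that of Log-BP with essentially no added complexity. Finally, the algorithm is implemented on an Analog Devices (AD) DSP.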
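
The relationship between Log-BP, Min-Sum, and a normalized Min-Sum described above can be seen on a single check-node message. The sketch below compares the exact tanh-rule message with the min-sum approximation, with and without a scaling factor. The function names and the value `alpha=0.75` are illustrative assumptions; the abstract does not state the factor the authors obtained from density evolution.

```python
import math


def bp_check_message(others):
    """Exact log-domain BP (tanh rule) check-to-variable message.

    others: LLRs arriving on all the other edges of the check node.
    """
    prod = 1.0
    for v in others:
        prod *= math.tanh(v / 2.0)
    return 2.0 * math.atanh(prod)


def min_sum_check_message(others, alpha=1.0):
    """Min-sum approximation; alpha < 1 gives normalized min-sum.

    alpha is a hypothetical normalization factor chosen for illustration.
    """
    sign = 1.0
    for v in others:
        if v < 0:
            sign = -sign
    return alpha * sign * min(abs(v) for v in others)


if __name__ == "__main__":
    llrs = [1.0, -2.0, 3.0]
    print("BP:                 ", bp_check_message(llrs))
    print("min-sum:            ", min_sum_check_message(llrs))
    print("normalized min-sum: ", min_sum_check_message(llrs, alpha=0.75))
```

Plain min-sum never underestimates the magnitude of the BP message, which is why a factor below one (or an offset) tends to move it closer to the BP value.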

15.

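FPGA-Based Decoder for Non-Layerable LDPC Codes

Jiang Tao, Yang Fengfan (江涛, 仰枫帆) 《Radio Communications Technology (无线电通信技术)》2012,38(1):25-28,62

To address the problem that non-layerable LDPC codes cannot use the layered decoding algorithm, a new layered decoder for LDPC codes is designed. Unlike the conventional layered decoder structure, the new structure updates the layers in parallel and updates serially within each layer. By ensuring that the same variable node in different layers is never updated simultaneously, it achieves the layer-by-layer progressive update that layered decoding aims for. Using an Altera Cyclone III EP3C120 device, place and route was completed for a decoder for a rate-3/4, length-8192, (3,12)-regular non-layerable QC-LDPC code. With a maximum of 5 iterations, the maximum clock frequency reaches 45.44 MHz and the throughput reaches 47.6 Mbps.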

16.

Low Complexity Decoder Architecture for Low-Density Parity-Check Codes

Daesun Oh Keshab K. Parhi 《Journal of Signal Processing Systems》2009,56(2-3):217-228

In this paper, we propose a low complexity decoder architecture for low-density parity-check (LDPC) codes using a variable quantization scheme as well as an efficient highly-parallel decoding scheme. In the sum-product algorithm for decoding LDPC codes, the finite precision implementations have an important tradeoff between decoding performance and hardware complexity caused by two dominant area-consuming factors: one is the memory for updated messages storage and the other is the look-up table (LUT) for implementation of the nonlinear function Ψ(x). The proposed variable quantization schemes offer a large reduction in the hardware complexities for LUT and memory. Also, an efficient highly-parallel decoder architecture for quasi-cyclic (QC) LDPC codes can be implemented with the reduced hardware complexity by using the partially block overlapped decoding scheme and the minimized power consumption by reducing the total number of memory accesses for updated messages. For (3, 6) QC LDPC codes, our proposed schemes in implementing the highly-parallel decoder architecture offer a great reduction of implementation area by 33% for memory area and approximately by 28% for the check node unit and variable node unit computation units without significant performance degradation. Also, the memory accesses are reduced by 20%.

17.

Configurable Multi-Rate Decoder Architecture for QC-LDPC Codes Based Broadband Broadcasting System

Luoming Zhang Lin Gui Youyun Xu Wenjun Zhang 《Broadcasting, IEEE Transactions on》2008,54(2):226-235

In this paper we present a base-matrix-based decoder architecture for the multi-rate QC-LDPC codes proposed for a broadband broadcasting system. We use the Modified Min-Sum Algorithm (MMSA) as the decoding algorithm in this architecture, which lowers the complexity of the LDPC decoder while keeping almost the same performance or even better. Based on this algorithm, we designed a novel check node processing unit to reduce the complexity of the decoder and facilitate multiplexing of the processing units. The decoder, designed with hardware constraints in mind, is not only scalable in throughput but also easily configurable to support different QC-LDPC codes with flexible code rates and code lengths.
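
For readers unfamiliar with the term, a QC-LDPC base matrix is a small array of circulant shift values that expands into the full parity-check matrix. The sketch below shows one common convention, assumed here for illustration: an entry of -1 denotes a z×z all-zero block, and an entry s ≥ 0 denotes the z×z identity matrix cyclically shifted right by s columns. The function name `expand_base_matrix` and the tiny example matrix are hypothetical and are not taken from the paper or from any broadcasting standard.

```python
def expand_base_matrix(base, z):
    """Expand a base matrix of circulant shifts into a binary H matrix.

    base: list of rows; base[r][c] is -1 for a zero block, or a shift
          s >= 0 for an identity block cyclically shifted right by s.
    z:    circulant (lifting) size.
    Returns H as a list of lists of 0/1 with len(base)*z rows.
    """
    rows, cols = len(base), len(base[0])
    h = [[0] * (cols * z) for _ in range(rows * z)]
    for r in range(rows):
        for c in range(cols):
            s = base[r][c]
            if s < 0:
                continue                      # all-zero block
            for i in range(z):
                # Row i of the shifted identity has its 1 in column (i + s) mod z.
                h[r * z + i][c * z + (i + s) % z] = 1
    return h


if __name__ == "__main__":
    toy_base = [[0, 1],
                [2, -1]]                      # hypothetical 2x2 base matrix
    for row in expand_base_matrix(toy_base, z=3):
        print(row)
```

Because a decoder only needs the shift values and z, supporting another code rate or length amounts to loading a different base matrix, which is one reason base-matrix-driven architectures are easy to reconfigure.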

18.

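Research on Decoding Algorithms for LDPC Codes over the AWGN Channel

Yuan Jianguo, Tong Qingzhen, Huang Sheng, Wang Yong (袁建国, 仝青振, 黄胜, 王永) 《Semiconductor Optoelectronics (半导体光电)》2013,34(4):642-644,648

For the additive white Gaussian noise (AWGN) channel, after an in-depth analysis of decoding algorithms for low-density parity-check (LDPC) codes, hard-decision and soft-decision decoding algorithms applicable to LDPC codes are simulated and compared, and a multiplicative correction factor is introduced to reduce the correlation among variable messages in the log-domain belief propagation (LLR-BP) soft-decision algorithm. Simulation analysis shows that, compared with the original algorithm, the improved LLR-BP algorithm noticeably improves error-correcting performance with almost no increase in computational complexity. The improved LLR-BP algorithm therefore has a clear advantage.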

19.

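A Decoding Algorithm for Improving LDPC Decoder Throughput

Zhang Jingui, Fei Wenduan, Xu Xingchen, Jiang Wenzhe (张金贵, 斐文端, 许星辰, 姜文哲) 《Radio Engineering (无线电工程)》2008,38(6):49-52

To design an efficient LDPC decoder, this paper combines the regularity of the parity-check matrix H of quasi-cyclic LDPC codes, the fact that the multiplicatively corrected min-sum decoding algorithm does not require channel quality estimation, and the low implementation complexity of partially parallel decoding, and introduces a new decoding algorithm: the overlapped partially parallel decoding algorithm. Compared with a BP decoding algorithm using a partially parallel architecture, this algorithm not only reduces hardware implementation complexity and memory resource overhead but also increases decoder throughput.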

20.

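Joint Iterative Decoding of RS Codes and LDPC Codes

Jia Zhenze (贾镇泽) 《Video Engineering (电视技术)》2013,37(17)

For the serial concatenation of RS codes and LDPC codes, a joint iterative decoding method based on adaptive belief propagation (ABP) is proposed. During decoding, the soft information output by the LDPC belief propagation decoder serves as the input to the RS ABP decoder; after a certain number of decoding iterations, the soft information output by the RS decoder is in turn fed back as input to the LDPC decoder. With multiple exchanges of information between the soft-input soft-output RS decoder and the LDPC decoder, decoding performance improves considerably. Applying this concatenation scheme to LDPC codes of moderate length effectively overcomes the influence of short cycles and eliminates the error floor. Simulation results show that, over the AWGN channel, this ABP-based joint iterative decoding scheme for RS and LDPC codes obtains a gain of about 0.8 dB.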