首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 31 毫秒
1.
《Microelectronics Journal》2014,45(11):1489-1498
In this paper, an area efficient and high throughput multi-rate quasi-cyclic low-density parity-check (QC-LDPC) decoder for IEEE 802.11n applications is proposed. An overlapped message passing scheme and the non-uniform quantization scheme are incorporated to reduce the overall area and power of the proposed QC-LDPC decoder. In order to enhance the decoding throughput and reduce the size of memories storing soft messages, an improved early termination (ET) scheme and base matrix reordering technique is employed. These techniques significantly reduce the total number of decoding iterations and memory accessing conflicts without mitigating the decoding performance. Equipped with these techniques an area efficient and high throughput multi-rate QC-LDPC decoder is designed, simulated and implemented with Xilinx Virtex6 (XC6VLX760-2FF1760) for an irregular LDPC code of length 1944 and code rates (1/2–5/6) specified in IEEE 802.11n standard. With a maximum clock frequency of 574.136–587.458 MHz the proposed QC-LDPC decoder can achieve throughput in the range of 1.27–2.17 Gb/s for 10 decoding iterations. Furthermore, by using Cadence RTL compiler with UMC 130 nm VLSI technology, the core area of the proposed QC-LDPC decoder is found to be 1.42 mm2 with a power dissipation in the range of 101.25–140.42 mW at 1.2 V supply voltage.  相似文献   

2.
Low-density parity-check (LDPC) codes and convolutional Turbo codes are two of the most powerful error correcting codes that are widely used in modern communication systems. In a multi-mode baseband receiver, both LDPC and Turbo decoders may be required. However, the different decoding approaches for LDPC and Turbo codes usually lead to different hardware architectures. In this paper we propose a unified message passing algorithm for LDPC and Turbo codes and introduce a flexible soft-input soft-output (SISO) module to handle LDPC/Turbo decoding. We employ the trellis-based maximum a posteriori (MAP) algorithm as a bridge between LDPC and Turbo codes decoding. We view the LDPC code as a concatenation of n super-codes where each super-code has a simpler trellis structure so that the MAP algorithm can be easily applied to it. We propose a flexible functional unit (FFU) for MAP processing of LDPC and Turbo codes with a low hardware overhead (about 15% area and timing overhead). Based on the FFU, we propose an area-efficient flexible SISO decoder architecture to support LDPC/Turbo codes decoding. Multiple such SISO modules can be embedded into a parallel decoder for higher decoding throughput. As a case study, a flexible LDPC/Turbo decoder has been synthesized on a TSMC 90 nm CMOS technology with a core area of 3.2 mm2. The decoder can support IEEE 802.16e LDPC codes, IEEE 802.11n LDPC codes, and 3GPP LTE Turbo codes. Running at 500 MHz clock frequency, the decoder can sustain up to 600 Mbps LDPC decoding or 450 Mbps Turbo decoding.  相似文献   

3.
A Split decoding algorithm is proposed which divides each row of the parity check matrix into two or multiple nearly-independent simplified partitions. The proposed method significantly reduces the wire interconnect and decoder complexity and therefore results in fast, small, and high energy efficiency circuits. Three full-parallel decoder chips for a (2,048, 1,723) LDPC code compliant with the 10GBASE-T standard using MinSum normalized, MinSum Split-2, and MinSum Split-4 methods are designed in 65 nm, seven metal layer CMOS. The Split-4 decoder occupies 6.1 mm2, operates at 146 MHz, delivers 19.9 Gbps throughput, with 15 decoding iterations. At 0.79 V, it operates at 47 MHz, delivers 6.4 Gbps and dissipates 226 mW. Compared to MinSum normalized, the Split-4 decoder chip is 3.3 times smaller, has a clock rate and throughput 2.5 times higher, is 2.5 times more energy efficient, and has an error performance degradation of 0.55 dB with 15 iterations.  相似文献   

4.
800Mbps准循环LDPC码译码器的FPGA实现   总被引:1,自引:0,他引:1  
张仲明  许拔  杨军  张尔扬 《信号处理》2010,26(2):255-261
本文提出了一种适用于准循环低密度校验码的低复杂度的高并行度译码器架构。通常准循环低密度校验码不适于设计有效的高并行度高吞吐量译码器。我们通过利用准循环低密度校验码的奇偶校验矩阵的结构特点,将其转化为块准循环结构,从而能够并行化处理译码算法的行与列操作。使用这个架构,我们在Xilinx Virtex-5 LX330 FPGA上实现了(8176,7154)有限几何LDPC码的译码器,在15次迭代的条件下其译码吞吐量达到800Mbps。   相似文献   

5.
Efficient hardware implementation of low-density parity-check (LDPC) codes is of great interest since LDPC codes are being considered for a wide range of applications. Recently, overlapped message passing (OMP) decoding has been proposed to improve the throughput and hardware utilization efficiency (HUE) of decoder architectures for LDPC codes. In this paper, we first study the scheduling for the OMP decoding of LDPC codes, and show that maximizing the throughput gain amounts to minimizing the intra- and inter-iteration waiting times. We then focus on the OMP decoding of quasi-cyclic (QC) LDPC codes. We propose a partly parallel OMP decoder architecture and implement it using FPGA. For any QC LDPC code, our OMP decoder achieves the maximum throughput gain and HUE due to overlapping, hence has higher throughput and HUE than previously proposed OMP decoders while maintaining the same hardware requirements. We also show that the maximum throughput gain and HUE achieved by our OMP decoder are ultimately determined by the given code. Thus, we propose a coset-based construction method, which results in QC LDPC codes that allow our optimal OMP decoder to achieve higher throughput and HUE.  相似文献   

6.
Low Density Parity-Check (LDPC) codes achieve the best performance when they are decoded with the sum-product (SP) algorithm. This is a two-phase iterative algorithm where two types of messages are interchanged and updated in each iteration. The group-shuffled or layered decoding schemes applied to the SP algorithm speed up its convergence by modifying its schedule, so they yield a reduction in the number of iterations required to achieve a given performance. However, the two-phase processing is still maintained. In this paper a modification of the group-shuffled scheme suitable for high-rate LDPC codes is proposed. The modification allows the overlapping of the two-phase computation, achieving a convergence speed up close to that of the group-shuffled scheme with higher throughput. Besides, high throughput architectures are presented for the modified algorithm. As an example, the proposed architecture has been implemented for the 2048-bit LDPC code of the IEEE 802.3an standard and it was synthesized in a 90 nm CMOS process achieving a throughput of 22.40 Gbps at 14 iterations with a clock frequency of 306 MHz and a total area of 10.5 mm2. Furthermore, the decoder performs within 0.5 dB of the floating-point 100 iterations sum-product algorithm at a PER of 10−5.  相似文献   

7.
Ma Zhuo  Du Shuanyi 《ETRI Journal》2015,37(4):736-742
A serial concatenated decoding algorithm with dynamic threshold is proposed for low‐density parity‐check codes with short and medium code lengths. The proposed approach uses a dynamic threshold to select a decoding result from belief propagation decoding and order statistic decoding, which improves the performance of the decoder at a negligible cost. Simulation results show that, under a high SNR region, the proposed concatenated decoder performs better than a serial concatenated decoder without threshold with an Eb/N0 gain of above 0.1 dB.  相似文献   

8.
In this paper we present a Base-matrix based decoder architecture for multi-rate QC-LDPC codes proposed in broadband broadcasting system. We use the Modified Min-Sum Algorithm (MMSA) as the decoding algorithm in this architecture, which lowers the complexity of the LDPC decoder while keeping almost the same performance or even better. Based on this algorithm, we designed a novel check node processing unit to reduce the complexity of the decoder and facilitate the multiplex of the processing units. The decoder designed with hardware constraints is not only scalable in throughput, but also easily configurable to support different QC-LDPC codes flexible in code rate and code length.  相似文献   

9.
This article proposes to explore parallelism in Turbo-Product Code (TPC) decoding through a parallelism level classification and characterization. From this design space exploration, an innovative TPC decoder architecture without any interleaving resource is presented. This architecture includes a fully-parallel SISO decoder capable of processing n symbols in one clock period. Syntheses results show the better efficiency of such an architecture compared with existing solutions. Considering a six-iteration turbo decoder of a BCH(32,26)2 product code, synthesized in 90 nm CMOS technology, 10 Gb/s can be achieved with an area of 600 Kgates. Moreover, a second architecture enhancing parallelism rate is described. The throughput is 50 Gb/s while an area estimation gives 2.2 Mgates. Finally, comparisons with existing TPC decoders and existing LDPC decoders are performed. They validate the potential of proposed TPC decoder for Gb/s optical fiber transmission systems.  相似文献   

10.
Random linear network coding is an efficient technique for disseminating information in networks, but it is highly susceptible to errors. Kötter-Kschischang (KK) codes and Mahdavifar-Vardy (MV) codes are two important families of subspace codes that provide error control in noncoherent random linear network coding. List decoding has been used to decode MV codes beyond half distance. Existing hardware implementations of the rank metric decoder for KK codes suffer from limited throughput, long latency and high area complexity. The interpolation-based list decoding algorithm for MV codes still has high computational complexity, and its feasibility for hardware implementations has not been investigated. In this paper we propose efficient decoder architectures for both KK and MV codes and present their hardware implementations. Two serial architectures are proposed for KK and MV codes, respectively. An unfolded decoder architecture, which offers high throughput, is also proposed for KK codes. The synthesis results show that the proposed architectures for KK codes are much more efficient than rank metric decoder architectures, and demonstrate that the proposed decoder architecture for MV codes is affordable.  相似文献   

11.
Area-efficient design methodology is proposed for the analog decoding implementations of the rate-½ accumulate repeat-4 jagged-accumulate (AR4JA) low density parity check (LDPC) code. The proposed approach is designed using optimized decoding architecture and regularized routing network, in such a way that the overall wiring overhead is minimized and the silicon area utilization is significantly improved. The prototyping chip used to verify the approach is fully integrated in a four-metal double-poly 0.35 μm complementary metal oxide semiconductor (CMOS) technology, and includes an input-output interface that maximizes the decoder throughput. The decoding core area is 2.02 mm2 with a post-layout area utilization of 80%. The decoder was successfully tested at the maximum data rate of 10 Mbit/s, with a core power consumption of 6.78 mW at 3.3 V, which corresponds to an energy per decoded bit of 0.677 nJ. The proposed analog LDPC decoder with low processing power and high-reliability is suitable for space- and power-constrained spacecraft system.  相似文献   

12.
The Block Decoder (BD) which is an indispensable component of the JPEG 2000 image compression standard has the highest computational complexity and determines the speed of the overall decoder system. This paper proposes a high throughput pass parallel BD architecture, which can decode more than one bit per clock cycle. In BD, the dependency between context generation and arithmetic decoding unit incorporates stalling and reduces the throughput of the decoding process. The proposed selective byte input and synchronous sample skipping techniques are used to prevent stalling in the decoding process. The proposed architecture achieves 86% more throughput with 50% increment in the hardware cost than that of the best available serial BD architecture. In comparison with the best available pass parallel architecture, throughput improves almost 8.2 times with 61% increment in the hardware cost. Incorporation of the speed up techniques in the design is the main reason for more hardware consumption. The Figure of Merit of the proposed design, which is the ratio of throughput and hardware cost, is more than that of the available BD architectures for typical code block (CB) size of 32 × 32. The ASIC implementation of the proposed design consumes 66 mW power at maximum operating frequency.  相似文献   

13.
Fountain码编译码算法结构的改进   总被引:2,自引:0,他引:2  
在LT码的基础上对Raptor码进行了分析。根据IRA码的结构及其线性时间的编译码特性,提出了Raptor码和IRA码相结合的改进Fountain码结构。研究结果表明,该结构既保持了线性的编译码特性又能有效增强信元的恢复能力。利用其它编码技术与Raptor码相结合是今后进一步的研究方向。  相似文献   

14.
We propose a low-cost serial decoder architecture for low-density parity-check convolutional codes (LDPC-CCs). It has been shown that LDPC-CCs can achieve comparable performance to LDPC block codes with constraint length much less than the block length. The proposed serial decoder architecture for LDPC-CCs uses a single decoding processor. Terminated data frames are sent through the processor iteratively until correctly decoded or a maximum number of iterations is reached. This architecture saves memory consumption and uses a very small number of logic elements, making it especially suitable for strong LDPC-CCs with large code memory. The proposed architecture is realized for a (2048,3,6) regular LDPC-CC on an Altera Stratix FPGA. With a maximum of 100 iterations, the design achieves up to 9-Mb/s throughput using only a very small portion of the field-programmable gate array resources.   相似文献   

15.
This paper presents an architecture for high-throughput decoding of high-rate Low-Density Parity-Check (LDPC) codes. The proposed architecture is a modification of the sliced message passing (SMP) decoding architecture which overlaps the check-node and variable-node update stages, achieving a good tradeoff between area and throughput, and also, high hardware utilization efficiency (HUE). The proposed modification does not affect the performance of the SMP algorithm and yields an area reduction of 33%. As an example, SMP architecture and the proposed modification was synthesized in a 90 nm CMOS process for the 2048-bit LDPC code of the IEEE802.3an standard with 16 iterations achieving a throughput of 5.9 Gbps with 15.3 mm2 and 6.2 Gbps with 10.2 mm2, respectively.  相似文献   

16.
In this paper, a systematic approach is proposed to develop a high throughput decoder for quasi-cyclic low-density parity check (LDPC) codes, whose parity check matrix is constructed by circularly shifted identity matrices. Based on the properties of quasi-cyclic LDPC codes, the two stages of belief propagation decoding algorithm, namely, check node update and variable node update, could be overlapped and thus the overall decoding latency is reduced. To avoid the memory access conflict, the maximum concurrency of the two stages is explored by a novel scheduling algorithm. Consequently, the decoding throughput could be increased by about twice assuming dual-port memory is available.  相似文献   

17.
纠错编码技术通过引入冗余增加可靠性,是现代通信的关键技术之一。无速率编码是一类新兴纠错编码,其速率可以根据信道状态自适应改变,编译码算法较为简单,且性能优异,可以适用于不同的应用场景,因此受到了国内外学者和工业界的关注。介绍了4种经典或新兴的无速率编码方案,包括卢比变换(Luby Transform,LT)码、Raptor码、在线喷泉码(OFC)和BATS(Batched Sparse)码。介绍无速率编码的基本原理,通过其发展过程比较不同无速率编码的特点。阐述了这些无速率编码的编译码方法,并简要介绍其最新的研究进展。最后,介绍无速率编码在广播通信及不等差保护、无线传感器网络、车联网、存储以及分布式计算等新老场景中的应用。无速率编码是一种复杂度低、灵活度高的编码,随着新型无速率编码的发展,在未来的分布式系统等场景中将会有更广泛的应用。  相似文献   

18.
Many recent reconfigurable/multi-mode quasi-cyclic low density parity check (QC-LDPC) decoder designs have shown appealing implementation results in the literature. However, most of them are based on datapath multiplexing techniques with ad hoc matrix arrangement. There is still room for further interconnection reduction, throughput enhancement, and a more sophisticated early termination scheme. In this paper, we will focus on these issues and present a two-level design approach, which optimizes the design at (1) matrix merging level, and (2) module design level. First, direct multiplexing datapaths between multiple modes leads to great overhead on wiring complexity. In order to mitigate this problem, we merge multiple parity check matrices by proposing an efficient algorithm at matrix merging level, which helps to minimize multiplexer and wiring overhead. Second, for efficient decoding issues, we propose two design techniques at module design level. One is data wrapping scheme. It enhances the decoding throughput by using the data-wrapped memory with the proposed reconfigurable data-switching circuits (R-DSC) to conquer the data alignment problem and achieve multi-mode reconfigurability. The other is the adaptive early termination (AET) scheme. It can save the unnecessary decoding procedures under both high-SNR and low-SNR regions. Finally, to verify our design approach, we implement a triple-mode LDPC decoder chip which is compatible to IEEE 802.11n standard by using UMC 90 nm CMOS technology. This chip only occupies 3.32 mm2 and features high core utilization up to 70% with low power dissipation of 135.3 mW. The prototyping chip not only validates the proposed approach, but also outperforms the state-of-the-art QC-LDPC decoders for IEEE 802.11n systems.  相似文献   

19.

Luby变换(LT)码作为一种抗干扰编码技术,应用于认知无线电系统,可提高次用户数据传输的可靠性。编译码是影响LT码抗干扰性能的关键因素。为提高数据传输的可靠性和速度,该文提出一种适用于认知无线电系统的LT码联合泊松鲁棒孤子分布-叠层(CPRSD-H)编译码算法。编码过程中,编码器首先采用CPRSD进行编码产生编码分组和编码矩阵,随后通过编码矩阵中度数为1和度数为2对应的列向量携带双层信息:度数为1和度数为2的编码分组和与其相连接的输入分组的连接关系;部分原始数据信息。译码过程中,译码器首先通过第1层存储信息采用置信传播(BP)算法译码完成,随后一些未被成功译出的信息再通过第2层存储信息进行填补。仿真结果表明,将CPRSD-H编译码算法应用于认知无线电系统中,能够显著降低LT码的误比特率(BER),提高次用户有效吞吐量以及加快LT码编译码速度。

  相似文献   

20.
Quasi-cyclic (QC) low-density parity-check (LDPC) codes have the parity-check matrices consisting of circulant matrices. Since QC LDPC codes whose parity-check matrices consist of only circulant permutation matrices are difficult to support layered decoding and, at the same time, have a good degree distribution with respect to error correcting performance, adopting multi-weight circulant matrices to parity-check matrices is useful but it has not been much researched. In this paper, we propose a new code structure for QC LDPC codes with multi-weight circulant matrices by introducing overlapping matrices. This structure enables a system to operate on dual mode in an efficient manner, that is, a standard QC LDPC code is used when the channel is relatively good and an enhanced QC LDPC code adopting an overlapping matrix is used otherwise. We also propose a new dual mode parallel decoder which supports the layered decoding both for the standard QC LDPC codes and the enhanced QC LDPC codes. Simulation results show that QC LDPC codes with the proposed structure have considerably improved error correcting performance and decoding throughput.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号