Similar literature
20 similar articles found.
1.
We propose a decision-feedback decoder for coded signals transmitted over finite-state Markov channels. The decoder achieves maximum-likelihood sequence detection (in the absence of feedback errors) with very low complexity by exploiting previous bit decisions and the Markov structure of the channel. We also propose a similar decoder, the output-feedback decoder, that does not use previous bit decisions and therefore does not suffer from error propagation. The decoder performance is determined using a new sliding window analysis technique as well as by simulation. Both decoders exhibit excellent bit error rate performance with a relatively low complexity that is independent of the channel decorrelation time.

2.
This brief studies very large-scale integration (VLSI) decoder architectures for RS-based low-density parity-check (LDPC) codes, which are a special class of LDPC codes based on Reed-Solomon codes. The considered code ensemble is well known for its excellent error-correcting performance and has been selected as the forward error correction coding scheme for 10GBase-T systems. By exploiting the shift-structured properties hidden in the algebraically generated parity-check matrices, novel decoder architectures are developed with the significant advantages of a high level of parallel decoding, efficient memory usage, and low interconnection complexity. To demonstrate the effectiveness of the proposed techniques, we completed a high-speed decoder design for a (2048, 1723) regular RS-LDPC code, which achieves 10-Gb/s throughput with only 820 000 gates. Furthermore, to support all possible RS-LDPC codes, two special cases in code construction are considered, and the corresponding extensions of the decoder architecture are investigated.

3.
This paper presents a novel low-complexity multi-mode multi-way split-row (split by factors of 2, 4, and 8) partially parallel pipelined layered low-density parity-check (LDPC) decoder architecture that is suitable for gigabit wireless communications. The innovative feature of the proposed decoder is related to the multi-way split-row layered LDPC decoding algorithm and architecture design techniques. Furthermore, we employed an efficient parity-check matrix-reordering method that uses row reordering, column reordering, and a local switching network to develop a multi-mode decoder that can support all four code rates specified in the IEEE 802.11ad standard. The proposed decoder can effectively reduce the complexity by a factor that is equal to the splitting factor, while the effect on the overall error-performance loss is negligible. Post-synthesis implementation results on TSMC 40-nm CMOS technology show that the proposed approach achieves higher area efficiency (throughput-to-area ratio) compared with other related previous works. For all four code rates, the proposed split-row pipelined layered LDPC decoder architecture (s = 2) occupies an area of 0.168 mm² and achieves an encoded throughput of 11.8 Gb/s at five decoding iterations.

4.
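Turbo product codes (TPCs) are a high-performance class of forward error-correcting codes with low decoding complexity that achieve near-optimal performance at low signal-to-noise ratios. This paper introduces the soft-in/soft-out (SISO) iterative decoding algorithm for Turbo product codes based on the Chase algorithm, proposes a TPC decoder design described in the VHDL hardware description language, and simulates and verifies it on an FPGA. The simulation results show that the decoder is highly practical and flexible.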

5.
Decoding the Golden Code: A VLSI Design (cited by: 1; self-citations: 0; citations by others: 1)
The recently proposed Golden code is an optimal space-time block code for 2 × 2 multiple-input–multiple-output (MIMO) systems. The aim of this work is the design of a VLSI decoder for a MIMO system coded with the Golden code. The architecture is based on a rearrangement of the sphere decoding algorithm that achieves maximum-likelihood (ML) decoding performance. Compared to other approaches, the proposed solution exhibits an inherent flexibility in terms of QAM modulation size, which makes the architecture particularly suitable for adaptive modulation schemes. Relying on the flexibility of this approach, two different architectures are proposed: a parametric one able to achieve high decoding throughput (> 165 Mb/s) while keeping overall decoder complexity low (45 KGates), and a flexible implementation able to adapt dynamically to the modulation scheme (4-, 16-, 64-QAM) while retaining the low complexity and high throughput.

6.
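To address the high power consumption of DSP software AAC+ decoders in current digital audio broadcasting (DAB) receivers, this paper proposes an ASIC design of a low-power AAC LC decoder that performs basic DAB+ program decoding at very low hardware cost. When added to a DAB decoder chip, it provides compatibility with both the DAB+ and DAB standards. The design optimizes the inverse quantization and IMDCT algorithms and uses time-shared operation to achieve low power. The system clock is 16.384 MHz, the process is 0.18 µm CMOS, and the power consumption is about 6.5 mW. Combined with DAB channel decoding, the design passed real-time verification on an FPGA development board, and the chip layout has been completed with a die area of 14 mm².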

7.
An implementation of a 16-state, rate 8/9 six-dimensional (6-D) 8PSK rotationally invariant trellis decoder for use in a concatenated codec is described. The concatenated codec allows transmission of STM-1 signals (at the 155.52 Mb/s information rate) over a 72 MHz satellite transponder. The inner trellis decoder is used with an outer (255,239) RS block decoder. The trellis decoder operates at 165.93 Mb/s and currently has an implementation loss of only 0.2 dB. The concatenated codec achieves a bit error ratio of 10⁻¹⁰ at an Eb/N0 of 8.2 dB (assuming an ideal modem and AWGN channel). Details are given of many Viterbi decoding ‘tricks’ that were used in order to implement the main functions of the decoder on two 10,000 gate equivalent CMOS programmable gate arrays.
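The two bit rates quoted in this abstract appear to be related by the outer code's rate expansion. The following is a minimal sketch of that arithmetic, assuming the only overhead between the STM-1 information stream and the trellis decoder output is the (255,239) RS code (framing or other overhead is not considered); the function name is hypothetical.

```python
def inner_decoder_rate(info_rate_mbps: float, n: int, k: int) -> float:
    """Bit rate after an outer (n, k) block code expands the information stream.

    Assumption: the outer code is the only overhead, so the rate grows by n / k.
    """
    return info_rate_mbps * n / k


# STM-1 information rate and the outer (255, 239) RS code from the abstract.
rate = inner_decoder_rate(155.52, 255, 239)
print(f"{rate:.2f} Mb/s")  # prints "165.93 Mb/s"
```

Under that assumption the result matches the 165.93 Mb/s operating rate stated for the trellis decoder.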

8.
This paper describes three new march tests for multiport memories. A read (or write) port in such a memory consists of an n-bit address register, an n-to-2^n-bit decoder (with column multiplexers for the column addresses) and drivers, and a K-bit data register. This approach gives comprehensive fault coverage for both array and multiport decoder coupling faults. It lends itself to a useful BIST implementation with a modest area overhead that tests these faults and achieves low test application time.

9.
Turbo Decoder Using Contention-Free Interleaver and Parallel Architecture (cited by: 1; self-citations: 0; citations by others: 1)
This paper introduces a turbo decoder that utilizes multiple soft-in/soft-out (SISO) decoders to decode one codeword. In addition, each SISO decoder is modified to allow simultaneous execution over multiple successive trellis stages. The design issues related to the architecture with parallel high-radix SISO decoders are discussed. First, a contention-free interleaver for the hybrid parallelism is presented to overcome the complicated collision problem as well as reduce interconnection network complexity. Second, two techniques for the high-speed add-compare-select (ACS) circuits are given to lessen the area overhead of the SISO decoder. Third, a modification of the processing schedule is made for higher operating efficiency. Two designs with parallel architecture have been implemented. The first design, with 32 SISO decoders that each process 2 symbols per cycle, achieves 160 Mb/s and 0.22 nJ/b/iter in measurement. The second design uses 16 SISO decoders that each process 4 symbols per cycle and achieves 100% efficiency, leading to 1000 Mb/s and 0.15 nJ/b/iter in post-layout simulation.

10.
A very low power consumption Viterbi decoder LSIC has been developed by using a low supply voltage 0.8 μm CMOS masterslice process technology. By employing the scarce state transition (SST) scheme, this LSIC achieves a drastic reduction in power consumption to below 600 μW at a supply voltage of 1 V when the data rate is 1152 kbit/s and the bit error rate is less than 10⁻³. This performance paves the way to employing strong forward error correction in low-power portable terminals for personal communications, mobile multimedia communications, and digital audio broadcasting.

11.
The sliding window (SW) approach has been proposed as an effective means of reducing the memory requirements as well as the decoding latency of the maximum a posteriori (MAP) based soft-input soft-output (SISO) decoder in a Turbo decoder. In this paper, we present sub-banked memory implementations (both single port and dual port) of the SW SISO decoder that achieve high throughput, low decoding latency, and reduced memory energy consumption. Our contributions include derivation of the optimal memory sub-banked structure for different SW configurations, a study of the relationship between memory size and energy consumption for different SW configurations, and a study of the effect of the number of sub-banks on the throughput and decoding latency for a given SW configuration.

12.
Orthogonal frequency-division multiplexing (OFDM) is known as an efficient technique to combat frequency-selective channels. In this paper, we show that the combination of bit-interleaved coded modulation (BICM) and OFDM achieves the full frequency diversity offered by a frequency-selective channel with any kind of power delay profile (PDP), conditioned on the minimum Hamming distance d_free of the convolutional code. This system has a simple Viterbi decoder with a modified metric. We then show that by combining such a system with space-time block coding (STBC), one can achieve the full space and frequency diversity of a frequency-selective channel with N transmit and M receive antennas. BICM-STBC-OFDM achieves the maximum diversity order of NML over L-tap frequency-selective channels regardless of the PDP of the channel. This latter system also has a simple Viterbi decoder with a properly modified metric. We verify our analytical results via simulations, including channels employed in the IEEE 802.11 standards.

13.

Low-latency and energy-efficient multi-Gbps LDPC decoding requires fast-converging iterative schedules. Hardware decoder architectures based on such schedules can achieve high throughput at low clock speeds, resulting in reduced power consumption and relaxed timing closure requirements for physical VLSI design. In this work, a fast column message-passing (FCMP) schedule for decoding LDPC codes is presented and investigated. FCMP converges in half the number of iterations compared to existing serial decoding schedules, has a significantly lower computational complexity than residual-belief-propagation (RBP)-based schedules, and consumes less power compared to state-of-the-art schedules. An FCMP decoder architecture supporting IEEE 802.11ad (WiGig) LDPC codes is presented. The decoder is fully pipelined to decode two frames with no idle cycles. The architecture is synthesized using the TSMC 40 nm and 65 nm CMOS technology nodes, and operates at a clock-frequency of 200 MHz. The decoder achieves a throughput of 8.4 Gbps, and it consumes 72 mW of power when synthesized using the 40 nm technology node. This results in an energy efficiency of 8.6 pJ/bit, which is the best-reported energy-efficiency in the literature for a WiGig LDPC decoder.
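The energy-efficiency figure quoted above follows from dividing power by throughput. A minimal sketch of that unit conversion, using the 40 nm numbers from the abstract (the function name is hypothetical):

```python
def energy_per_bit_pj(power_mw: float, throughput_gbps: float) -> float:
    """Energy per decoded bit in pJ/bit, given power in mW and throughput in Gb/s."""
    power_w = power_mw * 1e-3                # mW -> W (J/s)
    bits_per_second = throughput_gbps * 1e9  # Gb/s -> b/s
    return power_w / bits_per_second * 1e12  # J/bit -> pJ/bit


# 72 mW at 8.4 Gbps, as reported for the 40 nm synthesis.
print(f"{energy_per_bit_pj(72, 8.4):.1f} pJ/bit")  # prints "8.6 pJ/bit"
```

The unrounded value is about 8.57 pJ/bit, which rounds to the 8.6 pJ/bit stated in the abstract.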


14.
This work proposes a VLSI decoding architecture for concatenated convolutional codes. The novelty of this architecture is twofold: 1) the possibility to switch on-the-fly from the universal mobile telecommunication system turbo decoder to the WiMax duo-binary turbo decoder with a limited resource overhead compared to a single-mode WiMax architecture; and 2) the design of a parallel, collision-free WiMax decoder architecture. Compared to two single-mode solutions, the proposed architecture achieves a complexity reduction of 17.1% and 27.3% in terms of logic and memory, respectively. The proposed flexible architecture has been characterized in terms of performance and complexity on a 0.13-µm standard cell technology, and sustains a maximum throughput of more than 70 Mb/s.

15.
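Based on the structure of perfect space-time block codes (STBCs), an equivalent vertical Bell Labs layered space-time (V-BLAST) model is proposed. After minimum mean-square error decision-feedback equalization (MMSE-DFE) preprocessing of this model, a Fano decoder with boundary constraints is proposed. The decoder achieves near maximum-likelihood (ML) performance, has lower complexity than typical existing decoders over a wide range of signal-to-noise ratios, and can be used in systems with more transmit antennas than receive antennas. Because it computes directly in the complex domain, the decoder is applicable to any constellation. It can be used in any space-time system that can be expressed as an equivalent V-BLAST model. Simulation results demonstrate the effectiveness of the decoder.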

16.
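A 600 bps very low bit rate speech coding algorithm (cited by: 1; self-citations: 0; citations by others: 1)
To meet the need for low-rate speech coding in anti-jamming communications, this paper proposes a 600 bps very low bit rate speech coding algorithm. It uses a six-frame superframe comprising two basic frames and four interpolated frames. The linear prediction (LPC) parameters of the interpolated frames are quantized with a four-stage residual matrix quantizer based on closed-loop optimal first-order linear prediction. At the decoder, a closed-loop method for estimating excitation pulse amplitudes is proposed, which improves the naturalness of the synthesized speech and the intelligibility of nasal syllables. The algorithm delivers good synthesized speech quality, scoring 88.55 on the DRT.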

17.
For pt. I see ibid. vol.46, p.1592-1601 (1998). Soft-decision maximum-likelihood decoding of convolutional codes over GF(q) can be accomplished by searching through an error-trellis for the least-weight error sequence. The error-trellis is obtained by a syndrome-based construction. Its structure lends itself particularly well to the application of expedited search procedures. The method to carry out such error-trellis-based decoding is formulated by four algorithms. Three of these algorithms are aimed at reducing the worst case computational complexity, whereas the fourth algorithm reduces the average computational complexity at low to moderate channel noise levels. The syndrome decoder achieves substantial worst case and average computational gains in comparison with the conventional maximum-likelihood decoder, namely the Viterbi decoder, which searches for the most likely codeword directly within the code.

18.
Turbo code is a computationally intensive channel code that is widely used in current and upcoming wireless standards. The general-purpose graphics processing unit (GPGPU) is a programmable commodity processor that achieves high computational performance by using many simple cores. In this paper, we present a 3GPP LTE compliant Turbo decoder accelerator that takes advantage of the processing power of the GPU to offer high Turbo decoding throughput. Several techniques are used to improve the performance of the decoder. To fully utilize the computational resources on the GPU, our decoder can decode multiple codewords simultaneously, divide the workload for a single codeword across multiple cores, and pack multiple codewords to fit the single instruction multiple data (SIMD) instruction width. In addition, we use shared memory judiciously to enable hundreds of concurrent threads while keeping frequently used data local to keep memory access fast. To improve the efficiency of the decoder in the high SNR regime, we also present a low complexity early termination scheme based on average extrinsic LLR statistics. Finally, we examine how different workload partitioning choices affect the error correction performance and the decoder throughput.
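The abstract does not spell out its early termination rule. As an illustration only, one common form of such a test stops iterating once the mean magnitude of the extrinsic LLRs exceeds a threshold; the function name, the threshold value, and the exact statistic below are all assumptions and not taken from the paper.

```python
def should_stop(extrinsic_llrs, threshold=10.0):
    """Return True when the average |extrinsic LLR| reaches the threshold.

    Assumption: large average magnitude means the decoder is confident,
    so further iterations are unlikely to change the hard decisions.
    The threshold of 10.0 is a placeholder, not a value from the paper.
    """
    if not extrinsic_llrs:
        raise ValueError("need at least one LLR value")
    mean_abs = sum(abs(x) for x in extrinsic_llrs) / len(extrinsic_llrs)
    return mean_abs >= threshold


print(should_stop([12.0, -11.0, 15.0]))  # confident LLRs: True
print(should_stop([0.5, -1.0, 2.0]))     # weak LLRs: False
```

A check of this kind costs one accumulation per bit, which is why statistics on the average LLR are attractive as a low-complexity stopping criterion.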

19.
An area-efficient design methodology is proposed for analog decoding implementations of the rate-½ accumulate repeat-4 jagged-accumulate (AR4JA) low-density parity-check (LDPC) code. The proposed approach uses an optimized decoding architecture and a regularized routing network, so that the overall wiring overhead is minimized and the silicon area utilization is significantly improved. The prototype chip used to verify the approach is fully integrated in a four-metal double-poly 0.35 μm complementary metal oxide semiconductor (CMOS) technology, and includes an input-output interface that maximizes the decoder throughput. The decoding core area is 2.02 mm² with a post-layout area utilization of 80%. The decoder was successfully tested at the maximum data rate of 10 Mbit/s, with a core power consumption of 6.78 mW at 3.3 V, which corresponds to an energy per decoded bit of 0.677 nJ. With its low processing power and high reliability, the proposed analog LDPC decoder is suitable for space- and power-constrained spacecraft systems.

20.
In this paper, we propose an improvement of the normalized min-sum (MS) decoding algorithm and novel MS decoder architectures with reduced word length using nonuniform quantization schemes for low-density parity-check (LDPC) codes. The proposed normalized MS algorithm introduces a more exact adjustment with two optimized correction factors for check-node-updating computations, while the conventional normalized MS algorithm applies only one correction factor. The proposed algorithm provides a significant performance gain without any additional computation or hardware complexity. The finite word-length analysis in implementing an LDPC decoder is a very important factor since it directly impacts the size of the memory needed to store the intrinsic and extrinsic messages and the overall hardware area in the partially parallel LDPC decoder. The proposed nonuniform quantization scheme can reduce the finite word length while achieving similar performance compared to a conventional quantization scheme. Simulation results show that the proposed 4-bit nonuniform quantization scheme achieves an acceptable decoding performance, unlike the conventional 4-bit uniform quantization scheme. Finally, the proposed MS decoder architectures with the nonuniform quantization scheme provide significant reductions of 20% and up to 8% for the memory area and combinational logic area, respectively, compared to the conventional 5-bit ones.
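For readers unfamiliar with the check-node update mentioned above, here is a minimal sketch of normalized min-sum for a single check node. With `alpha_min1 == alpha_min2` it is the conventional single-factor version. Applying a separate factor to the second minimum is only one plausible reading of "two correction factors" and is an assumption here, not the paper's stated rule; the function and parameter names are hypothetical.

```python
def check_node_update(msgs, alpha_min1=0.75, alpha_min2=0.75):
    """Normalized min-sum update for one check node.

    msgs: incoming variable-to-check LLRs (at least two).
    Returns the outgoing check-to-variable LLR for each edge.
    alpha_min1 scales outputs whose magnitude is the smallest input magnitude;
    alpha_min2 scales the single output that uses the second smallest
    (assumed split; equal values give conventional normalized min-sum).
    """
    if len(msgs) < 2:
        raise ValueError("a check node needs at least two incoming messages")
    mags = [abs(m) for m in msgs]
    idx_min1 = min(range(len(mags)), key=mags.__getitem__)
    min1 = mags[idx_min1]
    min2 = min(m for i, m in enumerate(mags) if i != idx_min1)

    # Product of all input signs (zero is treated as positive).
    sign_product = 1
    for m in msgs:
        if m < 0:
            sign_product = -sign_product

    out = []
    for i, m in enumerate(msgs):
        # Extrinsic magnitude is the minimum over all OTHER edges.
        mag = alpha_min2 * min2 if i == idx_min1 else alpha_min1 * min1
        # Removing this edge's own sign from the total product.
        own_sign = -1 if m < 0 else 1
        out.append(sign_product * own_sign * mag)
    return out


print(check_node_update([2.0, -1.0, 4.0], 0.5, 0.5))  # [-0.5, 1.0, -0.5]
```

Tracking only the two smallest magnitudes and the sign product is what keeps the check-node hardware small, which is why word length and quantization of these messages dominate the memory area discussed in the abstract.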
