Similar Literature
A total of 20 similar documents were found (search time: 15 ms)
1.
Current-mode circuits are presented for implementing analog min-sum (MS) iterative decoders. These decoders are used to efficiently decode the best known error-correcting codes, such as low-density parity-check (LDPC) codes and turbo codes. The proposed circuits are built from current mirrors, and thus analog MS decoders can be implemented in any fabrication technology in which accurate current mirrors can be designed. The functionality of the proposed circuits is verified by implementing an analog MS decoder for a (32,8) LDPC code in a 0.18-μm CMOS technology. This decoder is the first reported analog MS decoder. For low signal-to-noise ratios, where circuit imperfections are dominated by the noise of the channel, the measured error-correcting performance of this chip in steady-state condition surpasses that of the conventional floating-point discrete-time synchronous MS decoder. At a data throughput of 6 Mb/s, the loss in coding gain compared to the conventional MS decoder at a BER of 10⁻³ is about 0.3 dB, and the power consumption is about 5 mW. This is the first time that an analog decoder has been successfully tested for an LDPC code, albeit a short one.
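As an illustration of the decoding rule behind item 1, below is a minimal Python sketch of the standard min-sum check-node update, the operation the analog circuit realizes with current mirrors. It is a generic sketch, not the paper's circuit; the example LLR values are hypothetical.

```python
import numpy as np

def min_sum_check_node(incoming_llrs):
    """One check node: each outgoing message is the product of the signs and
    the minimum magnitude of all *other* incoming variable-to-check LLRs."""
    llrs = np.asarray(incoming_llrs, dtype=float)
    signs = np.sign(llrs)
    signs[signs == 0] = 1.0                    # treat exact zeros as positive
    mags = np.abs(llrs)
    total_sign = np.prod(signs)
    order = np.argsort(mags)                   # the two smallest magnitudes suffice
    min1, min2 = mags[order[0]], mags[order[1]]
    out = np.empty_like(llrs)
    for i in range(len(llrs)):
        other_min = min2 if i == order[0] else min1
        out[i] = total_sign * signs[i] * other_min   # sign product excludes edge i
    return out

print(min_sum_check_node([1.2, -0.4, 3.0, -2.1]))    # [0.4, -1.2, 0.4, -0.4]
```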

2.
This paper describes a 4-state rate-1/2 analog convolutional decoder fabricated in 0.8-μm CMOS technology. Although analog implementations have been described in the literature, this decoder is the first to be reported realizing the add-compare-select section entirely with current-mode analog circuits. It operates at data rates up to 115 Mb/s (channel rate 230 Mb/s) and consumes 39 mW at that rate from a single 2.8-V power supply. At a rate of 100 Mb/s, the power consumption per trellis state is about 1/3 that of a comparable digital system. In addition, at 50 Mb/s (the only rate at which comparative data were available), the power consumption per trellis state is similarly about 1/3 that of the best competing analog realization (i.e., excluding, for example, PR4 detectors, which use a simplified form of the Viterbi algorithm). The chip contains 3.7 K transistors, of which fewer than 1 K are used in the analog part of the decoder. The die has a core area of 1 mm², of which about 1/3 contains the analog section. The measured performance is close to that of an ideal Viterbi decoder with infinite quantization. In addition, a technique is described which extends the application of the circuits to decoders with a larger number of states. A typical example is a 64-state decoder for use in high-speed satellite communications.
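For reference, here is a minimal Python sketch of the add-compare-select (ACS) step that item 2 implements with current-mode circuits; the 4-state trellis connectivity and metric values below are hypothetical.

```python
def acs(path_metrics, branch_metrics, predecessors):
    """One trellis step of the Viterbi algorithm (distance metrics: smaller is better).
    path_metrics[s]   : accumulated metric of state s at time t-1
    branch_metrics[s] : (metric via predecessor 0, metric via predecessor 1) into state s
    predecessors[s]   : (pred0, pred1) state indices feeding state s"""
    new_metrics, decisions = [], []
    for s, (p0, p1) in enumerate(predecessors):
        m0 = path_metrics[p0] + branch_metrics[s][0]      # add
        m1 = path_metrics[p1] + branch_metrics[s][1]
        if m0 <= m1:                                      # compare
            new_metrics.append(m0); decisions.append(0)   # select
        else:
            new_metrics.append(m1); decisions.append(1)
    return new_metrics, decisions

pm   = [0.0, 2.0, 1.0, 3.0]                               # hypothetical state metrics
bm   = [(0.5, 1.5), (1.5, 0.5), (0.5, 1.5), (1.5, 0.5)]   # hypothetical branch metrics
pred = [(0, 2), (0, 2), (1, 3), (1, 3)]                   # 4-state rate-1/2 butterflies
print(acs(pm, bm, pred))    # ([0.5, 1.5, 2.5, 3.5], [0, 0, 0, 0])
```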

3.
This work presents the design and the test results of an analog decoder for the 40-bit block length, rate-1/3 Turbo Code defined in the UMTS standard. The prototype is fully integrated in a three-metal double-poly 0.35-μm CMOS technology and includes an I/O interface that maximizes the decoder throughput. After the successful implementation of proof-of-concept analog iterative decoders by different research groups in both bipolar and CMOS technologies, this is the first reported prototype of an analog decoder for a realistic error-correcting code. The decoder was successfully tested at the maximum data rate defined in the standard (2 Mb/s), with an overall power consumption of 10.3 mW at 3.3 V, going down to 7.6 mW with the decoder core operated at 2 V, and an extremely low energy per decoded bit and trellis state (0.85 nJ for the decoder core alone).

4.
Iterative decoders such as turbo decoders have become integral components of modern broadband communication systems because of their ability to provide substantial coding gains. A key computational kernel in iterative decoders is the maximum a posteriori probability (MAP) decoder. The MAP decoder is recursive and complex, which makes high-speed implementations extremely difficult to realize. In this paper, we present block-interleaved pipelining (BIP) as a new high-throughput technique for MAP decoders. An area-efficient symbol-based BIP MAP decoder architecture is proposed by combining BIP with the well-known look-ahead computation. These architectures are compared with conventional parallel architectures in terms of speed-up, memory and logic complexity, and area. Compared to the parallel architecture, the BIP architecture provides the same speed-up with a reduction in logic complexity by a factor of M, where M is the level of parallelism. The symbol-based architecture provides a speed-up in the range of 1 to 2 with a logic complexity that grows exponentially with M and a state-metric storage requirement that is reduced by a factor of M compared to a parallel architecture. The symbol-based BIP architecture provides a speed-up in the range of M to 2M with an exponentially higher logic complexity and a reduced memory complexity compared to a parallel architecture. These high-throughput architectures are synthesized in a 2.5-V 0.25-μm CMOS standard cell library, and post-layout simulations are conducted. For turbo decoder applications, we find that the BIP architecture provides a throughput gain of 1.96 at the cost of 63% area overhead. For turbo equalizer applications, the symbol-based BIP architecture enables us to achieve a throughput gain of 1.79 with an area savings of 25%.
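A minimal Python sketch of the block-interleaved pipelining (BIP) idea from item 4: because the MAP recursion is a tight feedback loop, a single block cannot be deeply pipelined, but round-robin interleaving of M independent sub-blocks keeps an M-stage pipelined recursion unit busy every cycle. The schedule below is a schematic illustration, not the paper's architecture.

```python
def bip_schedule(num_subblocks, steps_per_subblock):
    """Return the (sub_block, trellis_step) processed in each clock cycle.
    Each sub-block sees its own previous state metrics exactly num_subblocks
    cycles later, so the recursion loop can hold num_subblocks pipeline stages."""
    schedule = []
    for t in range(steps_per_subblock):
        for m in range(num_subblocks):        # round-robin over sub-blocks
            schedule.append((m, t))
    return schedule

# M = 3 sub-blocks, 4 trellis steps each
print(bip_schedule(3, 4)[:6])   # [(0, 0), (1, 0), (2, 0), (0, 1), (1, 1), (2, 1)]
```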

5.
Two eight-state 7-bit soft-output Viterbi decoders matched to an EPR4 channel and a rate-8/9 convolutional code are implemented in a 0.18-μm CMOS technology. The throughput of the decoders is increased through architectural transformation of the add-compare-select recursion, with a small area overhead. The register-exchange survivor-path decoding logic of a conventional Viterbi decoder is adapted to detect the two most likely paths. The 4-mm² chip has been verified to decode at 500 Mb/s with a 1.8-V supply. These decoders can be used as constituent decoders for Turbo codes in high-performance applications requiring information rates that are very close to the Shannon limit.

6.
Turbo Decoder Using Contention-Free Interleaver and Parallel Architecture
This paper introduces a turbo decoder that utilizes multiple soft-in/soft-out (SISO) decoders to decode one codeword. In addition, each SISO decoder is modified to allow simultaneous execution over multiple successive trellis stages. The design issues related to the architecture with parallel high-radix SISO decoders are discussed. First, a contention-free interleaver for the hybrid parallelism is presented to overcome the complicated collision problem as well as to reduce interconnection-network complexity. Second, two techniques for the high-speed add-compare-select (ACS) circuits are given to lessen the area overhead of the SISO decoder. Third, a modification of the processing schedule is made for higher operating efficiency. Two designs with the parallel architecture have been implemented. The first design, with 32 SISO decoders each processing 2 symbols per cycle, achieves a measured 160 Mb/s at 0.22 nJ/b/iter. The second design uses 16 SISO decoders to process 4 symbols per cycle and achieves 100% efficiency, yielding 1000 Mb/s at 0.15 nJ/b/iter in post-layout simulation.
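A minimal Python sketch of the memory-bank condition behind the contention-free interleaver in item 6: when M SISO decoders each handle one length-W window, the M interleaved addresses accessed in the same cycle must fall in M distinct banks. This is the generic definition, not the specific interleaver proposed in the paper.

```python
def is_contention_free(perm, window):
    """perm: interleaver permutation of length K = M * window (bank b holds
    addresses b*window .. (b+1)*window - 1). Returns True if, for every local
    offset j, the M parallel accesses hit M different banks."""
    K = len(perm)
    M = K // window
    for j in range(window):
        banks = {perm[j + t * window] // window for t in range(M)}
        if len(banks) != M:            # two decoders would collide on one bank
            return False
    return True

# the identity permutation is trivially contention-free for any window dividing K
print(is_contention_free(list(range(16)), 4))   # True
```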

7.
Although recent implementations of analog iterative decoders have proven their potential for higher decoding speed and lower power consumption than their digital counterparts, the CMOS or conventional BiCMOS technologies used so far seem incapable of coping with the throughput that high-speed applications require. Within this context, this work presents the design and test results of a high-speed analog SISO (Soft-Input Soft-Output) channel decoder for an 8-bit trellis code, exploiting the high-speed features of SiGe heterojunction bipolar transistors (HBTs). It is one of the first successful implementations of an error-correcting decoder in SiGe BiCMOS technology that incorporates a high-speed I/O interface. A high-level model of the mismatch effects indicates that there is no significant performance penalty. Moreover, simulations and performance evaluations of an analog Turbo decoder based on the designed SISO decoder are provided. Although the IC of the SISO module was tested at throughputs of up to 3 Mb/s, simulation results show that the decoder is capable of operating at 50 Mb/s. The measured power consumption is 860 mW and the die area is 3.4 × 3 mm².

8.
Wireless communication standards make use of parallel turbo decoders for higher data rates at the cost of large hardware resources. This paper presents a memory-reduced back-trace technique, based on a new method of estimating backward-recursion factors, for maximum a posteriori probability (MAP) decoding. Mathematical reformulations of the branch-metric equations are performed to reduce the memory requirement of the branch metrics for each trellis stage. Subsequently, an architecture of the MAP decoder and its scheduling based on the proposed back-trace and branch-metric reformulation are presented. A comparative analysis of bit-error-rate (BER) performance in an additive white Gaussian noise channel environment for the MAP as well as parallel turbo decoders is carried out: a MAP decoder with a code rate of 1/2 and a parallel turbo decoder with a code rate of 1/3 achieve coding gains of 1.28 dB at a BER of 10⁻⁵ and 0.4 dB at a BER of 10⁻⁴, respectively. In order to meet the high-data-rate benchmarks of recently deployed wireless communication standards, very-large-scale integration implementations of parallel turbo decoders with 8–64 MAP decoders have been reported. The hardware savings achieved by such parallel turbo decoders based on the suggested memory-reduction technique are therefore accounted for in terms of CMOS transistor count: the parallel turbo decoders with 32 and 64 MAP decoders show hardware savings of 34% and 44%, respectively.

9.
Turbo decoder     
We propose an adaptive channel SNR estimation algorithm required for the iterative MAP decoding of turbo decoders. The proposed algorithm uses the extrinsic values generated within the iterative MAP decoder to update the channel SNR estimate toward its optimum value per decoder iteration or per turbo-code frame.

10.
One of the most significant impediments to the use of LDPC codes in many communication and storage systems is the error-rate floor phenomenon associated with their iterative decoders. The error floor has been attributed to certain subgraphs of an LDPC code's Tanner graph induced by so-called trapping sets. We show in this paper that once we identify the trapping sets of an LDPC code of interest, a sum-product algorithm (SPA) decoder can be custom-designed to yield floors that are orders of magnitude lower than those of the conventional SPA decoder. We present three classes of such decoders: (1) a bi-mode decoder, (2) a bit-pinning decoder which utilizes one or more outer algebraic codes, and (3) three generalized-LDPC decoders. We demonstrate the effectiveness of these decoders for two codes: the rate-1/2 (2640,1320) Margulis code, which is notorious for its floors, and a rate-0.3 (640,192) quasi-cyclic code devised for this study. Although the paper focuses on these two codes, the decoder design techniques presented are fully generalizable to any LDPC code.
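For context on item 10, here is a minimal Python sketch of the conventional sum-product (SPA) check-node update that the bi-mode, bit-pinning, and generalized-LDPC decoders modify; the example LLRs are hypothetical.

```python
import numpy as np

def spa_check_node(incoming_llrs):
    """Tanh-rule check-node update: each outgoing LLR combines all other inputs."""
    t = np.tanh(np.asarray(incoming_llrs, dtype=float) / 2.0)
    out = np.empty_like(t)
    for i in range(len(t)):
        prod = np.prod(np.delete(t, i))            # product over the other edges
        prod = np.clip(prod, -0.999999, 0.999999)  # guard against atanh(+/-1)
        out[i] = 2.0 * np.arctanh(prod)
    return out

print(spa_check_node([1.2, -0.4, 3.0, -2.1]))
```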

11.
The turbo decoder is the most challenging component in a digital HSDPA receiver in terms of computation requirements and power consumption, since the large block size and the recursive algorithm prevent pipelining or parallelism from being deployed effectively. This paper addresses the complexity and power-consumption issues at the algorithmic, arithmetic, and gate levels of ASIC design, in order to bring the power consumption and die area of turbo decoders to a level commensurate with wireless applications. Realized in 0.13-μm CMOS technology, the turbo decoder ASIC measures 1.2 mm² excluding pads and achieves 10.8 Mb/s throughput while consuming only 32 mW.

12.
Reed-Solomon (RS) codes play an important role in providing error correction and data integrity in various communication and storage applications. For high-speed applications, most RS decoders are implemented as dedicated application-specific integrated circuits (ASICs) based on parallel architectures, which can deliver a high data throughput rate. For lower-speed applications, the RS decoding operations are usually performed by fine-grained processing elements (PEs) controlled by a programmable digital signal processing (DSP) core, which provides high flexibility. In this paper, we propose a novel m-PE multi-symbol-sliced (MSS) RS datapath structure. The m-PE RS architecture is highly scalable and can be dynamically reconfigured in 1-PE, 2-PE, ..., m/2-PE, and m-PE modes to deliver the necessary data throughput rate. With the help of a gated-clock scheme that turns off the idle PEs, the proposed runtime-configurable ASIC design provides a good tradeoff between data throughput rate and power consumption. Hence, it can save energy and extend the battery life of portable devices. We demonstrate a prototype design with 4 PEs in UMC 0.18-μm CMOS technology. The design can be dynamically reconfigured to operate in 1-PE, 2-PE, and 4-PE modes, with performance of 140 Mb/s at 18.91 mW, 280 Mb/s at 28.77 mW, and 560 Mb/s at 48.47 mW, respectively. Compared with existing RS designs, the proposed m-PE RS decoder has better normalized area/power efficiency than most DSP-type and ASIC-type RS designs. The reconfigurable feature makes our design a good candidate for the error-control coding (ECC) unit of the storage system in power-aware portable devices.
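To illustrate the first stage of RS decoding referred to in item 12, here is a minimal Python sketch of syndrome computation over GF(2^8); the primitive polynomial (0x11D), the narrow-sense syndrome convention, and the example data are assumptions for illustration, not details of the paper's datapath.

```python
def build_gf256_tables(prim=0x11D):
    """Exp/log tables for GF(2^8) with primitive polynomial x^8+x^4+x^3+x^2+1."""
    exp, log = [0] * 512, [0] * 256
    x = 1
    for i in range(255):
        exp[i] = x
        log[x] = i
        x <<= 1
        if x & 0x100:
            x ^= prim
    for i in range(255, 512):
        exp[i] = exp[i - 255]          # duplicate so exponent sums need no mod
    return exp, log

EXP, LOG = build_gf256_tables()

def gf_mul(a, b):
    return 0 if a == 0 or b == 0 else EXP[LOG[a] + LOG[b]]

def syndromes(received, num_syndromes):
    """S_j = r(alpha^j), j = 1..num_syndromes; received[0] is the highest-degree
    coefficient. All-zero syndromes mean no detectable error."""
    syn = []
    for j in range(1, num_syndromes + 1):
        s = 0
        for coeff in received:          # Horner evaluation at alpha^j
            s = gf_mul(s, EXP[j]) ^ coeff
        syn.append(s)
    return syn

print(syndromes([0] * 255, 4))          # [0, 0, 0, 0]: clean (all-zero) codeword
corrupted = [0] * 255
corrupted[10] = 0x5A                    # a single symbol error
print(syndromes(corrupted, 4))          # nonzero syndromes flag the error
```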

13.
The Viterbi algorithm is a maximum-likelihood method for decoding convolutional codes and has thus played an important role in applications ranging from satellite communications to cellular telephony. In the past, Viterbi decoders have usually been implemented using digital circuits. The speed of these digital decoders is directly related to the amount of parallelism in the design. As the constraint length of the code increases, parallelism becomes problematic due to the complexity of the decoder. In this paper, an artificial neural network (ANN) Viterbi decoder is presented. The ANN decoder is significantly faster than comparable digital-only designs due to its fully parallel architecture. The fully parallel structure is obtained by implementing most of the Viterbi algorithm using analog neurons rather than digital circuits. Several modifications to the ANN decoder are considered, including an analog/digital hybrid design that results in an extremely fast and efficient decoder. The ANN decoder requires one-sixth the number of transistors required by the digital decoder. The connection weights of the ANN decoder are either +1 or -1, so weight considerations in the implementation are eliminated. This, together with the design's modularity and local connectivity, makes the ANN Viterbi decoder a natural fit for VLSI implementation. Simulation results are provided to show that the performance of the ANN decoder matches that of an ideal Viterbi decoder.
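As a rough illustration of the ±1-weight property noted in item 13, the branch-metric "neuron" can be viewed as a correlation between the received soft values and the antipodally mapped code bits; the 0→−1 / 1→+1 mapping and the numbers below are illustrative assumptions, not the paper's exact formulation.

```python
def branch_metric(received, expected_bits):
    """Correlate soft channel values with a branch label using only +1/-1 weights,
    so no multipliers or stored weights are needed (larger = better match)."""
    weights = [1.0 if b else -1.0 for b in expected_bits]   # hypothetical mapping
    return sum(w * r for w, r in zip(weights, received))

# rate-1/2 branch labelled (1, 0) against a noisy received pair
print(branch_metric([0.8, -1.1], [1, 0]))   # 0.8*1 + (-1.1)*(-1) = 1.9
```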

14.
This paper investigates VLSI architectures for low-density parity-check (LDPC) decoders amenable to low-voltage and low-power operation. First, a highly parallel decoder architecture with low routing overhead is described. Second, we propose an efficient method to detect early convergence of the iterative decoder and terminate the computations, thereby reducing dynamic power. We report on a bit-serial fully parallel LDPC decoder fabricated in a 0.13-μm CMOS process and show how the above techniques affect the power consumption. With early termination, the prototype is capable of decoding at 10.4 pJ/bit/iteration, while performing within 3 dB of the Shannon limit at a BER of 10⁻⁵ and with 3.3 Gb/s total throughput. If operated from a 0.6-V supply, the energy consumption can be further reduced to 2.7 pJ/bit/iteration while maintaining a total throughput of 648 Mb/s, thanks to the highly parallel architecture. To demonstrate the applicability of the proposed architecture to longer codes, we also report on a bit-serial fully parallel decoder for the (2048, 1723) LDPC code of the 10GBase-T standard, synthesized with a 90-nm CMOS library.
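A minimal Python sketch of one common early-convergence test relevant to item 14: stop iterating as soon as the hard decisions satisfy every parity check (H·x = 0 over GF(2)). This is the generic syndrome check, not necessarily the specific detection method proposed in the paper; the small H and LLR values are hypothetical.

```python
import numpy as np

def converged(H, llrs):
    """True if the hard decisions from the current LLRs satisfy all parity checks."""
    hard = (np.asarray(llrs) < 0).astype(int)   # LLR < 0 -> bit 1
    return not (H.dot(hard) % 2).any()          # zero syndrome -> valid codeword

H = np.array([[1, 1, 0, 1, 0, 0],
              [0, 1, 1, 0, 1, 0],
              [1, 0, 1, 0, 0, 1]])
llrs = [-2.1, -1.3, 0.8, 3.0, -0.7, -1.9]       # hard decisions 1 1 0 0 1 1
print(converged(H, llrs))                       # True -> terminate iterations
```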

15.
In this paper, we present an all-analog implementation of the rate-1/3, block-length-40 UMTS turbo decoder. The prototype was designed and fabricated in a 0.35-μm CMOS technology and operates at 3.3 V. We also introduce a discrete-time first-order model for analog decoders which allows fast BER simulations while taking into account circuit transient behavior and component mismatch. The model is applied to the rate-1/3 analog turbo decoder for UMTS defined in the 3GPP standard, and the discrete-time model predictions are compared with the decoder's experimental performance and the transistor-level simulations. These results demonstrate that the model can be successfully used both to predict analog decoder performance and to give design guidelines for complex decoders for which circuit-level simulations are impractical.

16.
Turbo codes have received tremendous attention and have entered practical application owing to their excellent error-correcting capability. Investigation of efficient iterative decoder realizations is of particular interest because the underlying soft-input soft-output decoding algorithms usually lead to highly complicated implementations. This paper describes the architectural design and analysis of sliding-window (SW) Log-MAP decoders in terms of a set of predetermined parameters. The derived mathematical representations can be applied to construct a variety of VLSI architectures for different applications. Based on our development, a SW-Log-MAP decoder complying with the specification of third-generation mobile radio systems is realized to demonstrate the performance tradeoffs among latency, average decoding rate, area/computation complexity, and memory power consumption. This paper thus provides useful and general information on the practical implementation of SW-Log-MAP decoders.
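The kernel of a Log-MAP SISO decoder such as the one in item 16 is the max* (Jacobian logarithm) operation; below is a minimal Python sketch of the exact form and of a coarse lookup-table approximation of the correction term of the kind hardware implementations typically use (the table size and step are assumptions).

```python
import math

def max_star_exact(a, b):
    """max*(a, b) = max(a, b) + log(1 + exp(-|a - b|))."""
    return max(a, b) + math.log1p(math.exp(-abs(a - b)))

# hypothetical 8-entry correction table quantized in steps of 0.5
CORRECTION = [math.log1p(math.exp(-0.5 * i)) for i in range(8)]

def max_star_lut(a, b):
    d = abs(a - b)
    idx = min(int(d / 0.5), len(CORRECTION) - 1)
    return max(a, b) + CORRECTION[idx]          # max-log-MAP would drop this term

print(max_star_exact(1.0, 1.4), max_star_lut(1.0, 1.4))
```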

17.
This paper analyses different VLSI architectures for 3GPP LTE/LTE-Advanced turbo decoders in terms of their throughput and area trade-offs. Data-flow graphs for the standard SISO MAP (maximum a posteriori) turbo decoder, the sliding-window (SW) SISO MAP turbo decoder, and the parallel-window (PW) SISO MAP turbo decoder are presented and their performance analysed. Two variants of the quadratic permutation polynomial (QPP) interleaver are proposed which reduce the complexity of implementing the 'mod' operator and provide the best compromise between area, delay, and power dissipation. Implementation of a decoder using one variant of the QPP interleaver is also discussed. A novel approach to area optimisation is proposed to reduce the required number of interleavers for the parallel-window turbo decoder, and multi-port memory is used for the parallel turbo decoder. To increase throughput without any effective increase in area complexity, circuit-level pipelining and retiming are used. The proposed architectures have been synthesised with Synopsys Design Compiler using 45-nm CMOS technology.
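For item 17, here is a minimal Python sketch of the QPP interleaver π(i) = (f1·i + f2·i²) mod K and of a standard recursive update that needs only additions and conditional subtractions instead of multiplications and an explicit 'mod'; this illustrates the general simplification idea, not necessarily the two variants proposed in the paper. The coefficients f1 = 3, f2 = 10 are the LTE values for block length K = 40.

```python
def qpp_direct(K, f1, f2):
    return [(f1 * i + f2 * i * i) % K for i in range(K)]

def qpp_recursive(K, f1, f2):
    """pi(i+1) = (pi(i) + g(i)) mod K, with g(i+1) = (g(i) + 2*f2) mod K.
    Both running sums stay below 2K, so one conditional subtraction replaces 'mod'."""
    pi, g, step = 0, (f1 + f2) % K, (2 * f2) % K
    out = []
    for _ in range(K):
        out.append(pi)
        pi += g
        if pi >= K:
            pi -= K
        g += step
        if g >= K:
            g -= K
    return out

assert qpp_direct(40, 3, 10) == qpp_recursive(40, 3, 10)
print(qpp_recursive(40, 3, 10)[:8])   # [0, 13, 6, 19, 12, 25, 18, 31]
```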

18.
Design of a 20-Mb/s 256-state Viterbi decoder
The design of high-throughput large-state Viterbi decoders relies on the use of multiple arithmetic units. The global communication channels among these parallel processors often consist of long interconnect wires, resulting in large area and high power consumption. In this paper, we propose a data-transfer-oriented design methodology to implement a low-power 256-state rate-1/3 Viterbi decoder. Our architectural-level scheme uses operation partitioning, packing, and scheduling to analyze and optimize interconnect effects in early design stages. In comparison with other published Viterbi decoders, our approach reduces the global data transfers by up to 75% and decreases the amount of global buses by up to 48%, while enabling the use of deeply pipelined datapaths with no data forwarding. In the register-transfer-level (RTL) implementation, we apply precomputation in conjunction with saturation arithmetic to further reduce power dissipation with provably no coding-performance degradation. Designed using a 0.25-μm standard cell library, our decoder achieves a throughput of 20 Mb/s in simulation and dissipates only 0.45 W.

19.
Presented in this paper is a pipelined 285-MHz maximum a posteriori probability (MAP) decoder IC. The 8.7-mm² IC is implemented in a 1.8-V 0.18-μm CMOS technology and consumes 330 mW at maximum frequency. The MAP decoder chip features a block-interleaved pipelined architecture, which enables the pipelining of the add-compare-select kernels. Measured results indicate that a turbo decoder based on the presented MAP decoder core can achieve: 1) a decoding throughput of 27.6 Mb/s with an energy efficiency of 2.36 nJ/b/iter; 2) the highest clock frequency compared to existing 0.18-μm designs with the smallest area; and 3) comparable throughput with an area reduction of 3-4.3× with reference to a look-ahead based high-speed design (Radix-4 design) and a parallel architecture.

20.
A data separator that can work in Winchester disk drives at read/write speeds of up to 30 Mb/s is described. To realize high stability and accuracy in reproducing data at high transfer speeds, a digital synchronization-field detector and an analog dual-mode phase-locked loop (PLL), whose phase detector has constant gain in the data field independent of the data pattern, are used. The dual-mode analog PLL has a wide decode margin, locks up quickly, and operates stably without being affected by the frequency deviation of the data. The digital sync-field detector is adjustment-free and detects sync fields very accurately. The IC incorporates an RLL 2-7 code encoder/decoder and a write compensator. Use of the 2-μm BiCMOS process keeps the total power consumption as low as 400 mW, even at the high transfer rate of 30 Mb/s.
