共查询到20条相似文献,搜索用时 15 毫秒
1.
Fei Sun Tong Zhang 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2005,13(9):1013-1022
Limited search trellis decoding algorithms have great potentials of realizing low power due to their largely reduced computational complexity compared with the widely used Viterbi algorithm. However, because of the lack of operational parallelism and regularity in their original formulations, the limited search decoding algorithms have been traditionally ruled out for applications demanding very high throughput. We believe that, through appropriate algorithm and hardware architecture co-design, certain limited search trellis decoding algorithms can become serious competitors to the Viterbi algorithm for high-throughout applications. Focusing on the well-known T-algorithm, this paper presents techniques at the algorithm and VLSI architecture levels to design fully parallel T-algorithm limited search trellis decoders. We first develop a modified T-algorithm, called SPEC-T, to improve the algorithmic parallelism. Then, based on the conventional state-parallel register exchange Viterbi decoder, we develop a parallel SPEC-T decoder architecture that can effectively transform the reduced computational complexity at the algorithm level to the reduced switching activities in the hardware. We demonstrate the effectiveness of the SPEC-T design solution in the context of convolutional code decoding. Compared with state-parallel register exchange Viterbi decoders, the SPEC-T convolutional code decoders can achieve almost the same throughput and decoding performance, while realizing up to 56% power savings. For the first time, this work provides an approach to exploit the low power potential of the T-algorithm in very high throughput applications. 相似文献
2.
This paper describes a 4-state rate-1/2 analog convolutional decoder fabricated in 0.8-μm CMOS technology. Although analog implementations have been described in the literature, this decoder is the first to be reported realizing the add-compare-select section entirely with current-mode analog circuits. It operates at data rates up to 115 Mb/s (channel rate 230 Mb/s) and consumes 39 mW at that rate from a single 2.8-V power supply. At a rate of 100 Mb/s, the power consumption per trellis state is about 1/3 that of a comparable digital system. In addition, at 50 Mb/s (the only rate at which comparative data were available), the power consumption per trellis state is similarly about 1/3 that of the best competing analog realization (i.e., excluding, for example, PR4 detectors which use a simplified form of the Viterbi algorithm). The chip contains 3.7 K transistors of which less than 1 K are used in the analog part of the decoder. The die has a core area of 1 mm2, of which about 1/3 contains the analog section. The measured performance is close to that of an ideal Viterbi decoder with infinite quantization. In addition, a technique is described which extends the application of the circuits to decoders with a larger number of states. A typical example is a 64-state decoder for use in high-speed satellite communications 相似文献
3.
本文介绍了高速数字流水Viterbi译码器的VLSI设计。在符号4值系统的基础上,给出Viterbi算法的新的功能分解公式,并介绍了用于译码器实现的两个重要的快速运算部件ADD和MAX的原理及其现场可编程(序)门阵列(FPGA)实现。文中详细讨论了译码器的VLSI结构、设计和性能分析。本文给出的Viterbi译码器可塑性强,并具有高度的并行性和很高的数据吞吐率。 相似文献
4.
Today's data reconstruction in digital communication systems requires designs of highest throughput rate at low power. The Viterbi algorithm is a key element in such digital signal processing applications. The nonlinear and recursive nature of the Viterbi decoder makes its high-speed implementation challenging. Several promising approaches to achieve either high throughput or low power have been proposed in the past. A combination of these is developed in this paper. Additional new concepts allow building a signal-flow graph suitable for the design of high-speed Viterbi decoders with low power. Using a flexible datapath generator facilitates the essential quantitative optimization from architectural down to physical level to fully exploit the low-power and high-speed potential of a given technology. With parameterizable design entry, this datapath generator establishes the basis of a scalable platform-based design library. Altogether, this allows coverage of the range of today's industrial interest in high throughput rates, from 150 Msymbols/s up to 1.2 Gsymbols/s using conventional CMOS logic. The features of two exemplary Viterbi decoder implementations prove the benefit of this physically oriented design methodology in terms of speed and low power, when compared to other leading edge implementations 相似文献
5.
Hyongsuk Kim Son H. Roska T. Chua L.O. 《IEEE transactions on circuits and systems. I, Regular papers》2005,52(10):2208-2218
A very-high-performance Viterbi decoder with a circularly connected two-dimensional analog cellular neural network (CNN) cell array is disclosed. In the proposed Viterbi decoder, the CNN cells with nonlinear unilateral connections are implemented with electronic circuits at nodes on a trellis diagram. The circuits are circularly connected, forming a cylindrical shape so that the cells of the last stage are connected to those of the first stage. Unilateral connections guide the information to flow circularly around the cylindrical surface. Such configuration enables the conceptually infinite length of the trellis diagram to be reduced to a circuit of limited size. The analog circuits does not require any analog-digital converters, which is the major cause of high power consumption and the quantization error. With the parallel analog processing structure, its decoding speed becomes very high. Also, the decoding mechanism using triggering wave of the CNN circuit does not require the path memory. Circuits for the proposed structure have been designed with HSPICE. Features of the proposed Viterbi decoder are compared with those of the conventional digital Viterbi decoder. 相似文献
6.
Jun Jin Kong Parhi K.K. 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2004,12(6):642-651
In this paper, a novel K-nested layered look-ahead method and its corresponding architecture, which combine K-trellis steps into one trellis step (where K is the encoder constraint length), are proposed for implementing low-latency high-throughput rate Viterbi decoders. The proposed method guarantees parallel paths between any two-trellis states in the look-ahead trellises and distributes the add-compare-select (ACS) computations to all trellis layers. It leads to regular and simple architecture for the Viterbi decoding algorithm. The look-ahead ACS computation latency of the proposed method increases logarithmically with respect to the look-ahead step (M) divided by the encoder constraint length (K) as opposed to linearly as in prior work. For a 4-state (i.e., K=3) convolutional code, the decoding latency of the Viterbi decoder using proposed method is reduced by 84%, at the expense of about 22% increase in hardware complexity, compared with conventional M-step look-ahead method with M=48 (where M is also the level of parallelism). The main advantage of our proposed design is that it has the least latency among all known look-ahead Viterbi decoders for a given level of parallelism. 相似文献
7.
Viterbi算法是卷积码的最优译码算法.设计并实现了一种高速(3,1,7)Viterbi译码器,该译码器由分支度量单元(BMU)、加比选单元(ACSU)、幸存路径存储单元(SMU)、控制单元(CU)组成.在StratixⅡ FPGA上实现、验证了该Viterbi译码器.验证结果表明,该译码器数据吞吐率达到231Mbit/s,在加性高斯白噪声(AWGN)信道下的误码率十分接近理论仿真值.与同类型Viterbi译码器比较,该译码器具有高速、硬件实现代价低的特点. 相似文献
8.
Martin Vaupel Uwe Lambrette Herbert Dawid Olaf Joeressen Stefan Bitterlich Heinrich Meyr Focko Frieling Karsten Müller Götz Kluge 《Design Automation for Embedded Systems》1998,3(4):255-290
This contribution describes design methodology and implementation of a single-chip timing and carrier synchronizer and channel decoder for digital video broadcasting over satellite (DVB-S). The device consists of an A /D converter with AGC, timing and carrier synchronizer with matched filter, Viterbi decoder including node synchronization, byte and frame synchronizer, convolutional de-interleaver, Reed Solomon decoder, and a descrambler.The system was designed in accordance with the DVB specifications. It is able to perform Viterbi decoding at data rates up to 56 Mbit /s and to sample the analog input values with up to 88 MHz. The chip allows automatic acquisition of the convolutional code rate and the position of the puncturing mask. The symbol synchronization is performed fully digitally by means of interpolation and controlled decimation. Hence, no external analog clock recovery circuit is needed.For algorithm design, system performance evaluation, co-verification of the building blocks, and functional hardware verification an advanced design methodology and the corresponding tool framework are presented which guarantee both short design time and highly reliable results. The chip has been fabricated in a 0.5 µm CMOS technology with three metal layers. A die photograph is included. 相似文献
9.
Wong C.-C. Lai M.-W. Lin C.-C. Chang H.-C. Lee C.-Y. 《Solid-State Circuits, IEEE Journal of》2010,45(2):422-432
10.
Winstead C. Jie Dai Shuhuan Yu Myers C. Harrison R.R. Schlegel C. 《Solid-State Circuits, IEEE Journal of》2004,39(1):122-131
Design and test results for a fully integrated translinear tail-biting MAP error-control decoder are presented. Decoder designs have been reported for various applications which make use of analog computation, mostly for Viterbi-style decoders. MAP decoders are more complex, and are necessary components of powerful iterative decoding systems such as turbo codes. Analog circuits may require less area and power than digital implementations in high-speed iterative applications. Our (8, 4) Hamming decoder, implemented in an AMI 0.5-/spl mu/m process, is the first functioning CMOS analog MAP decoder. While designed to operate in subthreshold, the decoder also functions above threshold with a small performance penalty. The chip has been tested at bit rates up to 2 Mb/s, and simulations indicate a top speed of about 10 Mb/s in strong inversion. The decoder circuit size is 0.82 mm/sup 2/, and typical power consumption is 1 mW at 1 Mb/s. 相似文献
11.
Commercial realisations of analogue Viterbi decoder are available for magnetic read channels. However, these implementations are restricted to a particular coding scheme. The authors present a general approach to the implementation of analogue Viterbi decoders having arbitrary coding schemes. The proposed method exploits the ability of simple analogue circuits to perform the required mathematical functions. Simulation results are presented for a 0.8 μm BiCMOS process 相似文献
12.
Current-mode circuits are presented for implementing analog min-sum (MS) iterative decoders. These decoders are used to efficiently decode the best known error correcting codes such as low-density parity-check (LDPC) codes and turbo codes. The proposed circuits are devised based on current mirrors, and thus, in any fabrication technology that accurate current mirrors can be designed, analog MS decoders can be implemented. The functionality of the proposed circuits is verified by implementing an analog MS decoder for a (32,8) LDPC code in a 0.18-mum CMOS technology. This decoder is the first reported analog MS decoder. For low signal to noise ratios where the circuit imperfections are dominated by the noise of the channel, the measured error correcting performance of this chip in steady-state condition surpasses that of the conventional floating-point discrete-time synchronous MS decoder. When data throughput is 6 Mb/s, loss in the coding gain compared to the conventional MS decoder at BER of 10-3 is about 0.3 dB and power consumption is about 5 mW. This is the first time that an analog decoder has been successfully tested for an LDPC code, though a short one 相似文献
13.
14.
Design of a 20-mb/s 256-state Viterbi decoder 总被引:1,自引:0,他引:1
Xun Liu Papaefthymiou M.C. 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2003,11(6):965-975
The design of high-throughput large-state Viterbi decoders relies on the use of multiple arithmetic units. The global communication channels among these parallel processors often consist of long interconnect wires, resulting in large area and high power consumption. In this paper, we propose a data transfer oriented design methodology to implement a low-power 256-state rate-1/3 Viterbi decoder. Our architectural level scheme uses operation partitioning, packing, and scheduling to analyze and optimize interconnect effects in early design stages. In comparison with other published Viterbi decoders, our approach reduces the global data transfers by up to 75% and decreases the amount of global buses by up to 48%, while enabling the use of deeply pipelined datapaths with no data forwarding. In the register-transfer level (RTL) implementation, we apply precomputation in conjunction with saturation arithmetic to further reduce power dissipation with provably no coding performance degradation. Designed using a 0.25 /spl mu/m standard cell library, our decoder achieves a throughput of 20 Mb/s in simulation and dissipates only 0.45 W. 相似文献
15.
Arzel M. Lahuec C. Seguin F. Gnaedig D. Jezequel M. 《IEEE transactions on circuits and systems. I, Regular papers》2007,54(6):1305-1316
Based on multiple-slice turbo codes, a novel semi-iterative analog turbo decoding algorithm and its corresponding decoder architecture are presented. This work paves the way for integrating flexible analog decoders dealing with frame lengths over thousands of bits. The algorithm benefits from a partially continuous exchange of extrinsic information to improve decoding speed and correction performance. The proposed algorithm and architecture are applied to design an analog decoder for double-binary codes. Taking full advantage of multiple slice codes, the on-chip area is shown to be reduced by ten when compared to a conventional fully parallelized analog slice turbo decoder. The reconfigurable analog core area for frames of 40 bits up to 2432 bits is 37 nm2 in a 0.25-mum BiCMOS process. 相似文献
16.
This letter is concerned with the implementation of iterative decoding algorithms in analog integrated circuits. We study the convergence speed and the throughput of analog decoders for low-density parity-check codes, and show that they depend on the code, the decoding algorithm, the signal-to-noise ratio, and the average time constant of the analog circuit interconnections. However, they are not a function of the variance of the time constants. The analysis presented here can be used for selecting suitable codes and decoding algorithms for analog decoding. Furthermore, it can be used to estimate the throughput of an analog decoder, if the average time constant of the analog circuit is known 相似文献
17.
Convolutional codes are widely used in many communication systems due to their excellent error-control performance. High-speed Viterbi decoders for convolutional codes are of great interest for high-data-rate applications. In this paper, an improved most-significant-bit (MSB) -first bit-level pipelined add-compare select (ACS) unit structure is proposed. The ACS unit is the main bottleneck on the decoding speed of a Viterbi decoder. By balancing the settling time of different paths in the ACS unit, the length of the critical path is reduced as close as possible to the iteration bound in the ACS unit. With the proposed retimed structure, it is possible to decrease the critical path of the ACS unit by 12% to 15% compared with the conventional MSB-first structures. This reduction in critical path can reduce the level of parallelism (and area) required for a very high-speed Viterbi decoder. 相似文献
18.
19.
Yun-Nan Chang 《The Journal of VLSI Signal Processing》2003,33(3):317-324
This paper presents a novel design of Viterbi decoder based on in-place state metric update and hybrid survivor path management. By exploiting the in-place computation feature of the Viterbi algorithm, the proposed design methodology can result in high-speed and modular architectures suitable for those Viterbi applications with large constraint length. This feature is not only applied to the design of highly regular ACS units, but also exploited in the design of trace-back units for the first time. The proposed hybrid survivor path management based on the combination of register-exchange and trace-back schemes cannot only reduce the number of memory operations, but also the size of memory required. Compared with the general hybrid trace-back structure, the overhead of register-exchange circuit in our architecture is significantly less. Therefore, the proposed architecture can find promising applications in digital communication systems where high-speed large state Viterbi decoders are desirable. 相似文献
20.
LDPC codes can be designed to perform extremely close to the Shannon limit. Achieving such performance with high energy efficiency
is now a main goal in the research community. This work combines knowledge of LDPC decoder message statistics, provided by
density evolution, with knowledge of the physical implementation of decoders to predict switching activity in the decoder
interconnect. In this work we provide results for the switching activity on the interconnect for fully parallel decoders.
However, our model can be applied to partially parallel and serial implementations, and is not limited to interconnect. It
is shown that switching activity can vary by as much as 300%, depending on several hardware design choices. Results of this
work validate the usefulness of the presented model for providing designers with an understanding of how their decoder implementation
choices affect power consumption for any size of LDPC code. This knowledge can be used for making design choices that minimize
decoder power consumption very early in the hardware design process. 相似文献