1.

Implementation of scalable power and area efficient high-throughput Viterbi decoders

Gemmeke T. Gansen M. Noll T.G. 《Solid-State Circuits, IEEE Journal of》2002,37(7):941-948

Today's data reconstruction in digital communication systems requires designs of highest throughput rate at low power. The Viterbi algorithm is a key element in such digital signal processing applications. The nonlinear and recursive nature of the Viterbi decoder makes its high-speed implementation challenging. Several promising approaches to achieve either high throughput or low power have been proposed in the past. A combination of these is developed in this paper. Additional new concepts allow building a signal-flow graph suitable for the design of high-speed Viterbi decoders with low power. Using a flexible datapath generator facilitates the essential quantitative optimization from architectural down to physical level to fully exploit the low-power and high-speed potential of a given technology. With parameterizable design entry, this datapath generator establishes the basis of a scalable platform-based design library. Altogether, this allows coverage of the range of today's industrial interest in high throughput rates, from 150 Msymbols/s up to 1.2 Gsymbols/s using conventional CMOS logic. The features of two exemplary Viterbi decoder implementations prove the benefit of this physically oriented design methodology in terms of speed and low power, when compared to other leading-edge implementations.

2.

Parallel high-throughput limited search trellis decoder VLSI design

Fei Sun Tong Zhang 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2005,13(9):1013-1022

Limited search trellis decoding algorithms have great potential for realizing low power due to their largely reduced computational complexity compared with the widely used Viterbi algorithm. However, because of the lack of operational parallelism and regularity in their original formulations, the limited search decoding algorithms have been traditionally ruled out for applications demanding very high throughput. We believe that, through appropriate algorithm and hardware architecture co-design, certain limited search trellis decoding algorithms can become serious competitors to the Viterbi algorithm for high-throughput applications. Focusing on the well-known T-algorithm, this paper presents techniques at the algorithm and VLSI architecture levels to design fully parallel T-algorithm limited search trellis decoders. We first develop a modified T-algorithm, called SPEC-T, to improve the algorithmic parallelism. Then, based on the conventional state-parallel register exchange Viterbi decoder, we develop a parallel SPEC-T decoder architecture that can effectively transform the reduced computational complexity at the algorithm level to the reduced switching activities in the hardware. We demonstrate the effectiveness of the SPEC-T design solution in the context of convolutional code decoding. Compared with state-parallel register exchange Viterbi decoders, the SPEC-T convolutional code decoders can achieve almost the same throughput and decoding performance, while realizing up to 56% power savings. For the first time, this work provides an approach to exploit the low power potential of the T-algorithm in very high throughput applications.
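The pruning step at the heart of the conventional T-algorithm can be sketched as follows. This is a minimal illustration, not the SPEC-T variant from the paper: the function name `t_algorithm_prune`, the dictionary representation of path metrics, and the inclusive threshold comparison are all assumptions made for the example. The call to `min` is the global best-metric search that limits parallelism in a straightforward hardware mapping.

```python
def t_algorithm_prune(path_metrics, threshold):
    """Keep only the states whose path metric is within `threshold` of the best.

    path_metrics: dict mapping state index -> accumulated path metric
                  (smaller is better, as with Hamming or Euclidean distance).
    threshold:    the T parameter; the inclusive comparison is an assumption.
    """
    best = min(path_metrics.values())  # global search for the best survivor
    return {s: m for s, m in path_metrics.items() if m - best <= threshold}


# Hypothetical metrics for a four-state trellis at one time step.
survivors = t_algorithm_prune({0: 3, 1: 10, 2: 5, 3: 4}, threshold=2)
```

Only the surviving states need add-compare-select work in the next trellis step, which is where the reduction in computation comes from.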

3.

A 2-Mb/s 256-state 10-mW rate-1/3 Viterbi decoder

Yun-Nan Chang Suzuki H. Parhi K.K. 《Solid-State Circuits, IEEE Journal of》2000,35(6):826-834

This paper presents a low-power bit-serial Viterbi decoder chip with the code rate r=1/3 and the constraint length K=9 (256 states) for next generation wireless communication applications. The architecture of the add-compare-select (ACS) module is based on the bit-serial arithmetic and implemented with the pass transistor logic circuit. A cluster-based ACS placement and state metric routing topology is described for the 256 bit-serial ACS units, which achieves very high area efficiency. In the trace-back operation, a power efficient trace-back scheme, allowing higher memory read access rate than memory write in a time-multiplexing method, is implemented to reduce the number of iterations required to generate a decoded output. In addition, a low-power application-specific memory suitable for the function of survivor path memory has also been developed. The chip's core, implemented using 0.5-μm CMOS technology, contains approximately 200 K transistors and occupies 2.46 mm by 4.17 mm area. This chip can achieve the decode rate of 20 Mb/s under 3.3 V and 2 Mb/s under 1.8 V. The measured power dissipation at 2 Mb/s under 1.8 V is only about 9.8 mW. The Viterbi decoder presented here can be applied to next generation wide-band code division multiple access (W-CDMA) systems.

4.

A 1-Gb/s, four-state, sliding block Viterbi decoder (cited by 1: 0 self-citations, 1 by others)

Black P.J. Meng T.H.-Y. 《Solid-State Circuits, IEEE Journal of》1997,32(6):797-805

To achieve unlimited concurrency and hence throughput in an area-efficient manner, a sliding block Viterbi decoder (SBVD) is implemented that combines the filtering characteristics of a sliding block decoder with the computational efficiency of the Viterbi algorithm. The SBVD approach reduces decode of a continuous input stream to decode of independent overlapping blocks, without constraining the encoding process. A systolic SBVD architecture is presented that combines forward and backward processing of the block interval. The architecture is demonstrated in a four-state, R=1/2, eight-level soft decision Viterbi decoder that has been designed and fabricated in double-metal CMOS. The 9.21 mm×8.77 mm chip containing 150 k transistors is fully functional at a clock rate of 83 MHz and dissipates 3.0 W under typical operating conditions (V_DD=5.0 V, T_A=27°C). This corresponds to a block decode rate of 83 MHz, equivalent to a decode rate of 1 Gb/s. For low-power operation, typical parts are fully functional at a clock rate of greater than 12 MHz, equivalent to a decode rate of 144 Mb/s, and dissipate 24 mW at V_DD=1.5 V, demonstrating extremely low power consumption at such high rates.
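The rate and power figures quoted in this abstract can be cross-checked with a little arithmetic. The sketch below assumes 12 decoded bits per block; that value is not stated in the abstract but is inferred from the two quoted pairs (144 Mb/s at 12 MHz, and roughly 1 Gb/s at 83 MHz). The helper names are hypothetical.

```python
BITS_PER_BLOCK = 12  # assumption inferred from 144 Mb/s / 12 MHz, not stated in the abstract


def decode_rate_bps(block_rate_hz, bits_per_block=BITS_PER_BLOCK):
    """Decoded bits per second when one block completes per clock cycle."""
    return block_rate_hz * bits_per_block


def energy_per_bit_nj(power_w, rate_bps):
    """Energy per decoded bit in nanojoules."""
    return power_w / rate_bps * 1e9


fast_rate = decode_rate_bps(83e6)   # high-speed operating point (5.0 V)
slow_rate = decode_rate_bps(12e6)   # low-power operating point (1.5 V)

fast_energy = energy_per_bit_nj(3.0, fast_rate)     # about 3 nJ per bit
slow_energy = energy_per_bit_nj(24e-3, slow_rate)   # about 0.17 nJ per bit
```

Under that assumption the 83 MHz point works out to 996 Mb/s, which the abstract rounds to 1 Gb/s, and the low-voltage point uses roughly one eighteenth of the energy per decoded bit.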

5.

Adaptive Viterbi algorithm with ARQ for bounded complexity decoding

Harvey B.A. 《Wireless Communications, IEEE Transactions on》2004,3(6):1948-1952

Adaptive bounded computational and memory requirements for a Viterbi decoder can be achieved using an error trapping Viterbi decoder algorithm initially developed for hybrid automatic repeat request (ARQ) implementations. Partial path metrics and a sliding window are used to eliminate unreliable paths in the decoder trellis, thus reducing the computational and memory requirements. An ARQ is issued if all paths are eliminated. The algorithm is adaptive, allowing the receiver to dynamically allocate memory and processing, to improve reliability of received packets, or to reject packets with lower reliability to avoid buffer overruns. The result is the ability to trade off resources versus delay and throughput.

6.

Low-Power State-Parallel Relaxed Adaptive Viterbi Decoder

Sun F. Zhang T. 《IEEE transactions on circuits and systems. I, Regular papers》2007,54(5):1060-1068

Although the adaptive Viterbi algorithm possesses reduced computational complexity and great power saving potential, conventional implementations contain a global best survivor path metric search operation that prevents them from being directly implemented in a high-throughput state-parallel decoder. This limitation also incurs power and silicon area overhead. This paper presents a modified adaptive Viterbi algorithm, referred to as the relaxed adaptive Viterbi algorithm, that completely eliminates the global best survivor path metric search operation. A state-parallel decoder VLSI architecture has been developed to implement the relaxed adaptive Viterbi algorithm. Using convolutional code decoding as a test vehicle, we demonstrate that state-parallel relaxed adaptive Viterbi decoders, versus Viterbi counterparts, can achieve significant power savings and modest silicon area reduction, while maintaining almost the same decoding performance and very high throughput.

7.

RTL-level design optimization of a Viterbi decoder (cited by 1: 0 self-citations, 1 by others)

喻希 《现代电子技术》2006,29(23):137-139,142

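Competition in today's chip industry is fierce, and products that are slow, large in area, or power-hungry struggle to hold a place in the market. The Viterbi decoder, an optimal convolutional code decoder based on maximum a posteriori probability, is widely used in many digital communication systems, but its relatively high algorithmic complexity poses challenges for chip design. Targeting chip speed, area, and power, this paper studies and discusses several optimization methods for the RTL-level design of a Viterbi decoder and implements a Viterbi decoder of about 20,000 gates for use in a DVB-S system.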

8.

Design and implementation of a high-speed Viterbi decoder (cited by 3: 0 self-citations, 3 by others)

李刚 黑勇 乔树山 仇玉林 《电子器件》2007,30(5):1886-1889

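The Viterbi algorithm is the optimal decoding algorithm for convolutional codes. A high-speed (3,1,7) Viterbi decoder is designed and implemented, consisting of a branch metric unit (BMU), an add-compare-select unit (ACSU), a survivor path memory unit (SMU), and a control unit (CU). The decoder was implemented and verified on a Stratix II FPGA. Verification results show that the decoder reaches a data throughput of 231 Mbit/s and that its bit error rate over an additive white Gaussian noise (AWGN) channel is very close to the theoretical simulation value. Compared with Viterbi decoders of the same type, this decoder offers high speed and low hardware cost.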
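The division of labour between the branch metric, add-compare-select, and survivor memory units named in this abstract can be illustrated in software. The sketch below is a hard-decision decoder under stated assumptions: it uses a small rate-1/2, K=3 code with generators 7 and 5 (octal) chosen for brevity, not the paper's (3,1,7) code, whose generator polynomials the abstract does not give; survivors are kept as whole bit lists (register-exchange flavour) rather than by trace-back; and all function names are hypothetical.

```python
def parity(x):
    """Return the XOR of all bits of x."""
    return bin(x).count("1") & 1


def encode(bits, gens, K):
    """Convolutional encoder; the state holds the K-1 most recent input bits."""
    state, out = 0, []
    for b in bits:
        reg = (b << (K - 1)) | state            # newest bit sits in the MSB
        out.extend(parity(reg & g) for g in gens)
        state = reg >> 1
    return out


def viterbi_decode(received, gens, K):
    """Hard-decision Viterbi decoding of a bit list produced by encode()."""
    n, n_states = len(gens), 1 << (K - 1)
    inf = float("inf")
    metrics = [0] + [inf] * (n_states - 1)      # encoder starts in state 0
    survivors = [[] for _ in range(n_states)]   # survivor memory: one path per state
    for t in range(0, len(received), n):
        symbol = received[t:t + n]
        new_metrics = [inf] * n_states
        new_survivors = [None] * n_states
        for state in range(n_states):
            if metrics[state] == inf:           # state not reachable yet
                continue
            for b in (0, 1):
                reg = (b << (K - 1)) | state
                expected = [parity(reg & g) for g in gens]
                # Branch metric: Hamming distance to the received symbol.
                bm = sum(e != r for e, r in zip(expected, symbol))
                nxt = reg >> 1
                cand = metrics[state] + bm      # add
                if cand < new_metrics[nxt]:     # compare and select
                    new_metrics[nxt] = cand
                    new_survivors[nxt] = survivors[state] + [b]
        metrics, survivors = new_metrics, new_survivors
    best = min(range(n_states), key=metrics.__getitem__)
    return survivors[best]


GENS, K = (0b111, 0b101), 3                     # illustrative code, not the paper's
message = [1, 0, 1, 1, 0, 0, 1, 0] + [0, 0]     # two zero tail bits flush the encoder
codeword = encode(message, GENS, K)
noisy = list(codeword)
noisy[3] ^= 1                                   # inject a single channel bit error
decoded = viterbi_decode(noisy, GENS, K)
```

A hardware decoder would evaluate all states in parallel each cycle and bound the survivor memory to a fixed depth; the nested loops and unbounded lists here only mirror the data flow.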

9.

Testing a high-speed Viterbi decoder based on an IP core

管立新 《电子质量》2006,(9):21-23

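This paper studies in depth a test method for high-speed Viterbi decoders implemented with Altera's Viterbi v4.3.0 IP core, analyzes the decoder's Atlantic interface signals in detail, and gives simulation results for a Viterbi decoder using the Parallel architecture. The results show that Viterbi v4.3.0 can be used to design high-performance Viterbi decoders that meet different performance requirements, and that the packet-oriented Atlantic interface gives the Viterbi decoder very high throughput.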

10.

FPGA design and implementation of a low-power systolic array-based adaptive Viterbi decoder (cited by 1: 0 self-citations, 1 by others)

Man Guo Ahmad M.O. Swamy M.N.S. Chunyan Wang 《IEEE transactions on circuits and systems. I, Regular papers》2005,52(2):350-365

In this paper, by modifying the well-known Viterbi algorithm, an adaptive Viterbi algorithm that is based on strongly connected trellis decoding is proposed. Using this algorithm, the design and a field-programmable gate array implementation of a low-power adaptive Viterbi decoder with a constraint length of 9 and a code rate of 1/2 is presented. In this design, a novel systolic array-based architecture with time multiplexing and arithmetic pipelining for implementing the proposed algorithm is used. It is shown that the proposed algorithm can reduce by up to 70% the average number of ACS computations over that by using the nonadaptive Viterbi algorithm, without degradation in the error performance. This results in lowering the switching activities of the logic cells, with a consequent reduction in the dynamic power. Further, it is shown that the total power consumption in the implementation of the proposed algorithm can be reduced by up to 43% compared to that in the implementation of the nonadaptive Viterbi algorithm, with a negligible increase in the hardware.

11.

Design and FPGA implementation of an efficient parallel Turbo decoder architecture with merged state metric computation

张茜 詹明 章坚武 王富龙 冯云开 唐浩 《电信科学》2022,38(2):47-58

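To meet the high-throughput, low-power requirements of wireless communication, the architectural design of parallel decoders has attracted wide attention. Based on the parallel Turbo decoding algorithm, the symmetry in the forward and backward metric computations is studied, and an efficient parallel Turbo decoder architecture based on merged forward and backward computation is proposed and implemented on a field-programmable gate array (FPGA). The results show that, compared with existing parallel Turbo decoder architectures, the proposed design reduces the logic resources of the state metric computation module by about 50% and lowers dynamic power by 5.26% at 125 MHz, while its decoding performance is close to that of the parallel algorithm.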

12.

Low-power Viterbi decoder for CDMA mobile terminals

Kang I. Willson A.N. Jr 《Solid-State Circuits, IEEE Journal of》1998,33(3):473-482

An efficient state-sequential very large scale integration (VLSI) architecture and low-power design methodologies ranging from the system-level to the layout-level are presented for a large-constraint-length Viterbi decoder for code division multiple access (CDMA) digital cellular/personal communication services (PCS) applications. The low-power design approaches are also applicable to many other systems and algorithms. VLSI implementation issues and prototype fabrication results for a state-sequential Viterbi decoder for convolutional codes of rate 1/2 and constraint-length 9 are also described. The chip's core, consisting of approximately 65 k transistors, occupies 1.9 mm by 3.4 mm in a 0.8-μm triple-layer-metal n-well CMOS technology. The chip's measured total power dissipation is 0.24 mW at a 14.4 kb/s data-rate with 0.9216 MHz clocking at a supply voltage of 1.65 V. The Viterbi decoder presented here is the lowest power and smallest area core in its class, to the best of our knowledge.

13.

A 40 Mb/s soft-output Viterbi decoder

Joeressen O.J. Meyr H. 《Solid-State Circuits, IEEE Journal of》1995,30(7):812-818

Soft-output decoding has evolved as a key technology for new error correction approaches with unprecedented performance as well as for improvement of well established transmission techniques. In this paper, we present a high-speed VLSI implementation of the soft-output Viterbi algorithm, a low complexity soft-output algorithm, for a 16-state convolutional code. The 43 mm² standard cell chip achieves a simulated throughput of 40 Mb/s, while tested samples achieved a throughput of 50 Mb/s. The chip is roughly twice as big as a 16-state Viterbi decoder without soft outputs. It is thus shown with the design that transmission schemes using soft-output decoding can be considered practical even at very high throughput. Since such decoding systems are more complex to design than hard output systems, special emphasis is placed on the employed design methodology.

14.

Design of a 20-Mb/s 256-state Viterbi decoder (cited by 1: 0 self-citations, 1 by others)

Xun Liu Papaefthymiou M.C. 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2003,11(6):965-975

The design of high-throughput large-state Viterbi decoders relies on the use of multiple arithmetic units. The global communication channels among these parallel processors often consist of long interconnect wires, resulting in large area and high power consumption. In this paper, we propose a data transfer oriented design methodology to implement a low-power 256-state rate-1/3 Viterbi decoder. Our architectural level scheme uses operation partitioning, packing, and scheduling to analyze and optimize interconnect effects in early design stages. In comparison with other published Viterbi decoders, our approach reduces the global data transfers by up to 75% and decreases the number of global buses by up to 48%, while enabling the use of deeply pipelined datapaths with no data forwarding. In the register-transfer level (RTL) implementation, we apply precomputation in conjunction with saturation arithmetic to further reduce power dissipation with provably no coding performance degradation. Designed using a 0.25-μm standard cell library, our decoder achieves a throughput of 20 Mb/s in simulation and dissipates only 0.45 W.

15.

Optimized design of a Viterbi decoder

袁金仕 卢焕章 《电讯技术》2005,45(3):159-161

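When the Viterbi decoding algorithm is implemented on an FPGA, hardware resource consumption and decoding speed constrain each other; arranging the ACS units and the path metric memory units sensibly can effectively ease this conflict. Taking the (2,1,6) convolutional code as an example, this paper proposes a dynamic path metric memory management method based on the radix-4 algorithm that effectively reduces the decoder's hardware complexity without affecting decoding speed.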

16.

An Efficient In-Place VLSI Architecture for Viterbi Algorithm

Yun-Nan Chang 《The Journal of VLSI Signal Processing》2003,33(3):317-324

This paper presents a novel design of Viterbi decoder based on in-place state metric update and hybrid survivor path management. By exploiting the in-place computation feature of the Viterbi algorithm, the proposed design methodology can result in high-speed and modular architectures suitable for those Viterbi applications with large constraint length. This feature is not only applied to the design of highly regular ACS units, but also exploited in the design of trace-back units for the first time. The proposed hybrid survivor path management, based on the combination of register-exchange and trace-back schemes, can reduce not only the number of memory operations but also the size of memory required. Compared with the general hybrid trace-back structure, the overhead of register-exchange circuit in our architecture is significantly less. Therefore, the proposed architecture can find promising applications in digital communication systems where high-speed large state Viterbi decoders are desirable.

17.

VLSI design of a digit-pipelined Viterbi decoder

冯昭志 黄载禄 《通信学报》1995,16(3):50-55

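This paper presents the VLSI design of a high-speed digit-pipelined Viterbi decoder. Based on a signed four-valued number system, a new functional decomposition formula for the Viterbi algorithm is given, and the principles of two key fast arithmetic units used in the decoder, ADD and MAX, are described together with their field-programmable gate array (FPGA) implementation. The decoder's VLSI architecture, design, and performance analysis are discussed in detail. The Viterbi decoder presented here is highly adaptable and offers a high degree of parallelism and very high data throughput.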

18.

A Viterbi Decoder with Efficient Memory Management

Chanho Lee 《ETRI Journal》2004,26(1):21-26

This paper proposes a new architecture for a Viterbi decoder with an efficient memory management scheme. The trace‐back operation is eliminated in the architecture and the memory storing intermediate decision information can be removed. The elimination of the trace‐back operation also reduces the number of operation cycles needed to determine decision bits. The memory size of the proposed scheme is reduced to 1/(5×constraint length) of that of the register exchange scheme, and the throughput is increased up to twice that of the trace‐back scheme. A Viterbi decoder complying with the IS‐95 reverse link specification is designed to verify the proposed architecture. The decoder has a code rate of 1/3, a constraint length of 9, and a trace‐forward depth of 45.

19.

Design of the path memory unit in a large-constraint-length Viterbi decoder

王春光 陈新 《现代电子技术》2007,30(13):51-54

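The Viterbi decoder is very widely used in communications because of its excellent error-correction performance. When the Viterbi decoding algorithm is implemented on an FPGA, hardware resource consumption and decoding speed constrain each other; arranging the add-compare-select units and the path metric memory units sensibly can effectively ease this conflict. The in-place path metric memory management method proposed here, based on the radix-4 algorithm, raises decoding speed while effectively reducing the decoder's hardware resource requirements.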

20.

A Memory Efficient Partially Parallel Decoder Architecture for Quasi-Cyclic LDPC Codes

Wang Z. Cui Z. 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2007,15(4):483-488

This paper presents a memory efficient partially parallel decoder architecture suited for high rate quasi-cyclic low-density parity-check (QC-LDPC) codes using (modified) min-sum algorithm for decoding. In general, over 30% of memory can be saved over conventional partially parallel decoder architectures. Efficient techniques have been developed to reduce the computation delay of the node processing units and to minimize hardware overhead for parallel processing. The proposed decoder architecture can linearly increase the decoding throughput with a small percentage of extra hardware. Consequently, it facilitates the applications of LDPC codes in area/power sensitive high-speed communication systems.
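For readers unfamiliar with min-sum decoding, the check node update it relies on can be sketched as follows. This shows the plain min-sum rule only; the paper's modification (for example a scaling or offset correction) is not described in the abstract and is not reproduced here. The function name and the convention of treating a zero input as positive are assumptions made for the example.

```python
def min_sum_check_update(llrs):
    """Plain min-sum check node update.

    llrs: incoming variable-to-check messages (log-likelihood ratios), length >= 2.
    Returns one outgoing message per edge: the product of the signs of all
    *other* inputs times the smallest magnitude among all *other* inputs.
    """
    mags = [abs(x) for x in llrs]
    # Only the two smallest magnitudes are needed, which keeps storage small.
    i_min = min(range(len(mags)), key=mags.__getitem__)
    min1 = mags[i_min]
    min2 = min(m for i, m in enumerate(mags) if i != i_min)

    sign_product = 1
    for x in llrs:
        if x < 0:
            sign_product = -sign_product

    out = []
    for i, x in enumerate(llrs):
        mag = min2 if i == i_min else min1
        own_sign = -1 if x < 0 else 1
        out.append(sign_product * own_sign * mag)  # multiplying by own sign removes it
    return out


messages = min_sum_check_update([2.0, -3.0, 0.5, -1.0])
```

Because each check node only has to remember two magnitudes, the index of the minimum, and the signs, this update lends itself to compact message storage in partially parallel architectures.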