Similar literature
 20 similar articles found (search time: 62 ms)
1.
A 1-Gb/s, four-state, sliding block Viterbi decoder   (total citations: 1; self-citations: 0; citations by others: 1)
To achieve unlimited concurrency and hence throughput in an area-efficient manner, a sliding block Viterbi decoder (SBVD) is implemented that combines the filtering characteristics of a sliding block decoder with the computational efficiency of the Viterbi algorithm. The SBVD approach reduces the decoding of a continuous input stream to the decoding of independent overlapping blocks, without constraining the encoding process. A systolic SBVD architecture is presented that combines forward and backward processing of the block interval. The architecture is demonstrated in a four-state, R=1/2, eight-level soft decision Viterbi decoder that has been designed and fabricated in double-metal CMOS. The 9.21 mm × 8.77 mm chip containing 150 k transistors is fully functional at a clock rate of 83 MHz and dissipates 3.0 W under typical operating conditions (VDD = 5.0 V, TA = 27°C). This corresponds to a block decode rate of 83 MHz, equivalent to a decode rate of 1 Gb/s. For low-power operation, typical parts are fully functional at a clock rate of greater than 12 MHz, equivalent to a decode rate of 144 Mb/s, and dissipate 24 mW at VDD = 1.5 V, demonstrating extremely low power consumption at such high rates.
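The clock rates and decode rates quoted in this abstract can be cross-checked with a short calculation. The sketch below assumes that one block is decoded per clock cycle (the abstract equates the 83 MHz clock with an 83 MHz block decode rate). The block size is not stated in the abstract; it is inferred here from the 12 MHz / 144 Mb/s operating point, so treat it as an assumption.

```python
# Sanity check of the throughput figures quoted in the abstract.
# Assumption: one block is decoded per clock cycle, so
#   decode_rate = clock_rate * bits_per_block.

def bits_per_block(decode_rate_bps: float, clock_hz: float) -> float:
    """Infer the block size implied by a decode rate and a clock rate."""
    return decode_rate_bps / clock_hz

def decode_rate(clock_hz: float, block_bits: float) -> float:
    """Decode rate in bit/s for a decoder that finishes one block per cycle."""
    return clock_hz * block_bits

# Low-power operating point from the abstract: 12 MHz clock, 144 Mb/s.
block_bits = bits_per_block(144e6, 12e6)

# High-speed operating point: 83 MHz clock with the same inferred block size.
fast_rate = decode_rate(83e6, block_bits)

print(f"implied block size: {block_bits:.0f} bits")
print(f"decode rate at 83 MHz: {fast_rate / 1e9:.3f} Gb/s")
```

Under these assumptions the 83 MHz figure comes out just under 1 Gb/s, which is consistent with the rounded value in the abstract.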

2.
In this paper, a low-power Viterbi decoder design based on scarce state transition (SST) is presented. A low complexity algorithm based on a limited search algorithm, which reduces the average number of the add-compare-select computation of the Viterbi algorithm, is proposed and seamlessly integrated with the SST-based decoder. The new decoding scheme has low overhead and facilitates low-power implementation for high throughput applications. We also propose an uneven-partitioned memory architecture for the trace-back survivor memory unit to reduce the overall memory access power. The new Viterbi decoder is designed and implemented in TSMC 0.18-μm CMOS process. Simulation results show that power consumption is reduced by up to 80% for high throughput wireless systems such as Multiband-OFDM Ultra-wideband applications.
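For readers unfamiliar with the add-compare-select (ACS) computation mentioned in this abstract, the following is a minimal sketch of the textbook hard-decision Viterbi recursion. It is not the SST or limited-search scheme of the paper. The 4-state, rate-1/2 code with octal generators (7, 5) and all function names are assumptions chosen only for illustration.

```python
# Textbook hard-decision Viterbi decoder for a 4-state, rate-1/2 code.
# Assumption: generators (7, 5) octal, chosen only for illustration.

G = (0b111, 0b101)   # generator polynomials, constraint length K = 3
N_STATES = 4         # 2 ** (K - 1)
INF = float("inf")

def parity(x: int) -> int:
    """Return 1 if x has an odd number of set bits, else 0."""
    return bin(x).count("1") & 1

def branch(state: int, bit: int):
    """One encoder step: return (next_state, (out0, out1))."""
    reg = (bit << 2) | state              # newest input bit in the MSB
    out = tuple(parity(reg & g) for g in G)
    return reg >> 1, out

def encode(bits):
    """Encode a bit list, starting from the all-zero state."""
    state, out = 0, []
    for b in bits:
        state, sym = branch(state, b)
        out.extend(sym)
    return out

def viterbi_decode(received):
    """Return the most likely input bits for a list of received code bits."""
    metrics = [0] + [INF] * (N_STATES - 1)   # encoder starts in state 0
    paths = [[] for _ in range(N_STATES)]
    for i in range(0, len(received), 2):
        r = received[i:i + 2]
        new_metrics = [INF] * N_STATES
        new_paths = [None] * N_STATES
        for s in range(N_STATES):
            if metrics[s] == INF:            # state not reachable yet
                continue
            for b in (0, 1):
                nxt, sym = branch(s, b)
                # add: path metric plus Hamming branch metric
                cand = metrics[s] + (sym[0] != r[0]) + (sym[1] != r[1])
                # compare-select: keep the better path into state nxt
                if cand < new_metrics[nxt]:
                    new_metrics[nxt] = cand
                    new_paths[nxt] = paths[s] + [b]
        metrics, paths = new_metrics, new_paths
    best = min(range(N_STATES), key=lambda s: metrics[s])
    return paths[best]

message = [1, 0, 1, 1, 0, 0, 1, 0, 0]   # last two zeros flush the encoder
codeword = encode(message)
codeword[3] ^= 1                         # inject a single channel bit error
print(viterbi_decode(codeword) == message)
```

The inner loop performs one ACS operation per branch at every trellis step. This per-step workload is what the limited-search and SST techniques described above aim to reduce.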

3.
A combined 8-PSK modulation and rate 7/9 convolutional coding technique is proposed for 140 Mb/s information rate transmission over the 80 MHz INTELSAT transponders, thus achieving a bandwidth efficiency of 1.75 b/s/Hz of allocated bandwidth. The desired power efficiency is to achieve a bit error rate of 10⁻⁶ at an Eb/N0 of 11 dB, including modem and codec implementation losses. The proposed system employs an 8-PSK modem operating at a 60 MHz symbol rate (or 180 Mb/s bit rate), as well as a rate 7/9 convolutional encoder and a 16-state Viterbi algorithm decoder operating at 60 MHz. The rate 7/9 code is periodically time varying and is designed to maximize the Euclidean distance between the modulated codeword sequences, thereby achieving a 3 dB asymptotic coding gain relative to the conventional QPSK system over an AWGN channel. This code is also designed to reduce decoder complexity for high-speed operations. The performance of the proposed system over INTELSAT V and VI non-linear transponders was evaluated by Monte Carlo computer simulation. The 180 Mb/s 8-PSK modem, including the automatic frequency control, automatic gain control, carrier recovery and clock recovery circuits, has been implemented and tested. The complete Viterbi decoder is being implemented on five boards, and the critical add-compare-select (ACS) circuit of the high-speed Viterbi algorithm decoder is being implemented with hybrid technology employing 100-K series emitter-coupled logic dies on specially designed ceramic substrates. The ACS circuit operates at a speed exceeding 120 MHz, well over the design goal of 60 MHz. Construction of this codec is almost complete.
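The rate and bandwidth-efficiency figures in this abstract follow from one another. The short sketch below reproduces the arithmetic using only numbers quoted in the abstract; the variable names are illustrative.

```python
from fractions import Fraction

# All input figures are taken from the abstract; names are illustrative.
BITS_PER_8PSK_SYMBOL = 3            # log2(8)
symbol_rate_hz = 60_000_000         # 60 MHz symbol rate
code_rate = Fraction(7, 9)          # rate-7/9 convolutional code
transponder_bw_hz = 80_000_000      # 80 MHz allocated bandwidth

# Channel bit rate carried by the 8-PSK modem.
channel_bit_rate = symbol_rate_hz * BITS_PER_8PSK_SYMBOL

# Information rate after removing the code redundancy.
info_bit_rate = channel_bit_rate * code_rate

# Bandwidth efficiency in information bits per second per hertz.
bandwidth_efficiency = info_bit_rate / transponder_bw_hz

print(f"channel bit rate: {channel_bit_rate / 1e6:.0f} Mb/s")
print(f"information rate: {float(info_bit_rate) / 1e6:.0f} Mb/s")
print(f"bandwidth efficiency: {float(bandwidth_efficiency):.2f} b/s/Hz")
```

Using `Fraction` keeps the 7/9 code rate exact, so the 140 Mb/s and 1.75 b/s/Hz values come out without rounding error.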

4.
This paper presents a channel decoder that completes both turbo and Viterbi decodings, which are pervasive in many wireless communication systems, especially those that require very low signal-to-noise ratios. The trellis decoding algorithm merges them with less redundancy. However, the implementation is still challenging due to the power consumption in wearable devices. This research investigates an optimized memory scheme and rescheduled data flow to reduce power consumption and chip area. The memory access is reduced by buffering the input symbols, and the area is reduced by reducing the embedded interleaver memory. A test chip is fabricated in a 1.8 V 0.18-μm standard CMOS technology and verified to provide 4.25-Mb/s turbo decoding and 5.26-Mb/s Viterbi decoding. The measured power dissipation is 83 mW while decoding a 3.1 Mb/s turbo encoded data stream with six iterations for each block, and 25.1 mW for Viterbi decoding at a 1-Mb/s data rate.

5.
A channel decoder chip compliant with the 3GPP mobile wireless standard is described. It supports both data and voice calls simultaneously in a unified turbo/Viterbi decoder architecture. For voice services, the decoder can process over 128 voice channels encoded with rate 1/2 or 1/3, constraint length 9 convolutional codes. For data services, the turbo decoder is capable of processing any mix of rate 1/3, constraint length 4 turbo encoded data streams with an aggregate data rate of up to 2.5 Mb/s with 10 iterations per block (or 4.1 Mb/s with six iterations). The turbo decoder uses the logMAP algorithm with a programmable logsum correction table. It features an interleaver address processor that computes the 3GPP interleaver addresses for all block sizes enabling it to quickly switch context to support different data services for several users. The decoder also contains the 3GPP first channel de-interleaving function and a post-decoder bit error rate estimation unit. The chip is fabricated in a 0.18-μm six-layer metal CMOS technology, has an active area of 9 mm², and has a peak clock frequency of 110.8 MHz at 1.8 V (nominal). The power consumption is 306 mW when turbo decoding a 2-Mb/s data stream with ten iterations per block and eight voice calls simultaneously.

6.
An efficient state-sequential very large scale integration (VLSI) architecture and low-power design methodologies ranging from the system-level to the layout-level are presented for a large-constraint-length Viterbi decoder for code division multiple access (CDMA) digital cellular/personal communication services (PCS) applications. The low-power design approaches are also applicable to many other systems and algorithms. VLSI implementation issues and prototype fabrication results for a state-sequential Viterbi decoder for convolutional codes of rate 1/2 and constraint-length 9 are also described. The chip's core, consisting of approximately 65 k transistors, occupies 1.9 mm by 3.4 mm in a 0.8-μm triple-layer-metal n-well CMOS technology. The chip's measured total power dissipation is 0.24 mW at a 14.4 kb/s data-rate with 0.9216 MHz clocking at a supply voltage of 1.65 V. The Viterbi decoder presented here is the lowest power and smallest area core in its class, to the best of our knowledge.
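The operating point quoted in this abstract implies a clock budget per decoded bit and an energy per bit. The sketch below derives both from the quoted figures. The derived values are not reported in the abstract, and the "states per cycle" figure assumes that all 2^(K-1) trellis states are updated once per decoded bit using every available clock cycle.

```python
# Figures quoted in the abstract.
clock_hz = 921_600            # 0.9216 MHz clock
data_rate_bps = 14_400        # 14.4 kb/s data rate
power_w = 0.24e-3             # 0.24 mW total power dissipation
constraint_length = 9

# Number of trellis states for a constraint-length-9 code.
n_states = 2 ** (constraint_length - 1)

# Clock cycles available to decode one bit.
cycles_per_bit = clock_hz / data_rate_bps

# Assumption: every state is updated once per bit using all cycles.
states_per_cycle = n_states / cycles_per_bit

# Energy spent per decoded bit, in nanojoules.
energy_per_bit_nj = power_w / data_rate_bps * 1e9

print(f"trellis states: {n_states}")
print(f"clock cycles per bit: {cycles_per_bit:.0f}")
print(f"states per cycle (derived): {states_per_cycle:.0f}")
print(f"energy per bit (derived): {energy_per_bit_nj:.2f} nJ")
```

The small cycle budget per bit illustrates why a state-sequential design, which reuses a few ACS units across many states, can fit this data rate at such a low clock frequency.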

7.
We present a rate-1/2 (128,3,6) LDPC convolutional code encoder and decoder that we implemented in a 90-nm CMOS process. The 1.1-Gb/s encoder is a compact, low-power implementation that includes one-hot encoding for phase generation and built-in termination. The decoder design uses a memory-based interface with a minimum number of memory banks to deliver an information throughput of 1 b per clock cycle. The decoder shares one controller among a pipeline of decoder processors. The decoder dissipates 0.61 nJ of energy per decoded information bit at an SNR of 2 dB and a decoded throughput of 600 Mb/s. On-chip test circuitry permits accurate power measurements to be made at selectable SNR settings.

8.
In this paper, a 64-state four-bit soft-decision Viterbi decoder with a power saving mechanism for high speed wireless local area network applications is presented. Based on path merging and prediction techniques, a survivor memory unit with hierarchical memory design is proposed to reduce memory access operations. It is found that more than 70% of memory accesses can be eliminated by taking advantage of locality. Moreover, a low complexity compare-select-add unit is also presented, saving 15% area and 14.3% power dissipation compared to the conventional add-compare-select design. A test chip has been designed and implemented in a 0.18-μm standard CMOS process. The test results show that power dissipation can be reduced by 30~40%, and the power efficiency reaches 0.75 mW per Mb/s at 6 Mb/s and 1.26 mW per Mb/s at 54 Mb/s as specified in IEEE 802.11a.
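The power-efficiency figures in this abstract are given in mW per Mb/s, so the absolute power at each data rate follows by multiplication. The sketch below makes that conversion explicit. The resulting totals are derived here, not reported in the abstract, and the helper name is illustrative.

```python
def total_power_mw(mw_per_mbps: float, rate_mbps: float) -> float:
    """Convert a power efficiency (mW per Mb/s) into total power in mW."""
    return mw_per_mbps * rate_mbps

# Efficiency figures quoted in the abstract for the two IEEE 802.11a rates.
power_at_6 = total_power_mw(0.75, 6)
power_at_54 = total_power_mw(1.26, 54)

print(f"derived power at 6 Mb/s: {power_at_6:.2f} mW")
print(f"derived power at 54 Mb/s: {power_at_54:.2f} mW")
```

Note that the efficiency is worse (more mW per Mb/s) at the higher rate, so the total power grows faster than the ninefold increase in data rate.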

9.
This paper presents a novel design of Viterbi decoder based on in-place state metric update and hybrid survivor path management. By exploiting the in-place computation feature of the Viterbi algorithm, the proposed design methodology can result in high-speed and modular architectures suitable for those Viterbi applications with large constraint length. This feature is not only applied to the design of highly regular ACS units, but also exploited in the design of trace-back units for the first time. The proposed hybrid survivor path management, based on the combination of register-exchange and trace-back schemes, reduces not only the number of memory operations but also the size of the memory required. Compared with the general hybrid trace-back structure, the overhead of the register-exchange circuit in our architecture is significantly less. Therefore, the proposed architecture can find promising applications in digital communication systems where high-speed large state Viterbi decoders are desirable.

10.
Two eight-state 7-bit soft-output Viterbi decoders matched to an EPR4 channel and a rate-8/9 convolutional code are implemented in a 0.18-μm CMOS technology. The throughput of the decoders is increased through architectural transformation of the add-compare-select recursion, with a small area overhead. The survivor-path decoding logic of a conventional Viterbi decoder register exchange is adapted to detect the two most likely paths. The 4-mm² chip has been verified to decode at 500 Mb/s with 1.8-V supply. These decoders can be used as constituent decoders for Turbo codes in high-performance applications requiring information rates that are very close to the Shannon limit.

11.
A high-speed Viterbi decoder VLSI with coding rate R=1/2 and constraint length K=7 for bit-error correction has been developed using 1.5-μm n-well CMOS technology. To reduce both hardware size and power dissipation, a recently developed scarce-state-transition (SST) Viterbi decoding scheme has been utilized. In addition, three-layer metallization and an advanced hierarchical macrocell design method (HMCM) have been adopted to improve packing density and reduce chip size. As a result, active chip area has been reduced by half, compared to the conventional standard cell design method (SCM) with two-layer metallization, and 42 K gates have been integrated on a chip with a die size of 9.52 × 10.0 mm². The VLSI decoder has achieved a maximum data throughput rate of 23 Mb/s with a net coding gain of 4.4 dB (at 10⁻⁴ bit-error rate). The chip dissipates only 825 mW at a data rate of 10 Mb/s.

12.
A 1.5-ns address access time, 256-kb BiCMOS SRAM has been developed. To attain this ultra-high-speed access time, an emitter-coupled logic (ECL) word driver is used to access 6-T CMOS memory cells, eliminating the ECL-MOS level-shifter time delay. The RAM uses a low-power active pull down ECL decoder. The chip contains 11-K, 60-ps ECL circuit gates. It provides variable RAM configurations and general logic functions. RAM power consumption is 18 W; chip power consumption is 35 W. The chip is fabricated by using a 0.5-μm BiCMOS process. The memory cell size is 58 μm² and the chip size is 11×11 mm.

13.
An advanced, high-speed, and universal-coding-rate Viterbi decoder VLSI implementation is presented. Two novel circuit design schemes have been proposed: scarce state transition (SST) decoding and direct high-coding-rate convolutional code generation and variable-rate decoding. SST makes it possible to omit the final decision circuit and to reduce the required path memory length without degrading error probability performance. Moreover, the power consumption of the SST Viterbi decoder is significantly reduced when implemented as a CMOS device. These features overcome the speed limits of high-speed and high-coding-gain Viterbi decoder VLSIs in the rate one-half mode imposed by the thermal limitation. The other Viterbi decoding scheme makes it possible to realize a simple and variable coding-rate forward-error-correction circuit by changing only the branch metric calculation ROM tables. By employing these schemes, high-speed (25-Mb/s) and universal-coding-rate Viterbi decoder VLSIs have been developed.

14.
This paper investigates VLSI architectures for low-density parity-check (LDPC) decoders amenable to low-voltage and low-power operation. First, a highly-parallel decoder architecture with low routing overhead is described. Second, we propose an efficient method to detect early convergence of the iterative decoder and terminate the computations, thereby reducing dynamic power. We report on a bit-serial fully-parallel LDPC decoder fabricated in a 0.13-μm CMOS process and show how the above techniques affect the power consumption. With early termination, the prototype is capable of decoding with 10.4 pJ/bit/iteration, while performing within 3 dB of the Shannon limit at a BER of 10⁻⁵ and with 3.3 Gb/s total throughput. If operated from a 0.6 V supply, the energy consumption can be further reduced to 2.7 pJ/bit/iteration while maintaining a total throughput of 648 Mb/s, due to the highly-parallel architecture. To demonstrate the applicability of the proposed architecture for longer codes, we also report on a bit-serial fully-parallel decoder for the (2048, 1723) LDPC code in 10GBase-T standard synthesized with a 90-nm CMOS library.

15.
Soft-output decoding has evolved as a key technology for new error correction approaches with unprecedented performance as well as for improvement of well established transmission techniques. In this paper, we present a high-speed VLSI implementation of the soft-output Viterbi algorithm, a low complexity soft-output algorithm, for a 16-state convolutional code. The 43 mm² standard cell chip achieves a simulated throughput of 40 Mb/s, while tested samples achieved a throughput of 50 Mb/s. The chip is roughly twice as big as a 16-state Viterbi decoder without soft outputs. It is thus shown with the design that transmission schemes using soft-output decoding can be considered practical even at very high throughput. Since such decoding systems are more complex to design than hard output systems, special emphasis is placed on the employed design methodology.

16.
A node-parallel Viterbi decoding architecture with bit-serial processing and communication is presented. An important aspect of this structure is that short-constraint-length decoders may be interconnected, without loss of throughput, to implement a Viterbi decoder of larger constraint length. The convolutional encoder trellis is modeled by appropriate wiring of decoder processing nodes: a variety of generating codes can be accommodated. Bit-serial communication links between nodes require only a single wire each and thus interconnection area is relatively small. During each decoding cycle, more than 50 b need to be communicated on each serial link and thus the technique is limited to moderate bit rate applications. A constraint length K=4 proof-of-concept chip was developed using 9860 transistors in 3 μm CMOS on a 4.51-mm × 4.51-mm die size. The complete circuit operates at 280 kb/s and supports any rate 1/2 or 1/3 code with eight-level soft decision.

17.
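Yuan Jinshi (袁金仕), Lu Huanzhang (卢焕章). Telecommunication Engineering (《电讯技术》), 2005, 45(3): 159-161
When the Viterbi decoding algorithm is implemented on an FPGA, hardware resource consumption and decoding speed always constrain each other. This conflict can be effectively eased by arranging the ACS units and the path metric storage units appropriately. Taking the (2,1,6) convolutional code as an example, this paper proposes a dynamic path metric storage management method based on the radix-4 algorithm that effectively reduces the hardware complexity of the decoder without affecting decoding speed.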

18.
Design of a 20-Mb/s 256-state Viterbi decoder   (total citations: 1; self-citations: 0; citations by others: 1)
The design of high-throughput large-state Viterbi decoders relies on the use of multiple arithmetic units. The global communication channels among these parallel processors often consist of long interconnect wires, resulting in large area and high power consumption. In this paper, we propose a data transfer oriented design methodology to implement a low-power 256-state rate-1/3 Viterbi decoder. Our architectural level scheme uses operation partitioning, packing, and scheduling to analyze and optimize interconnect effects in early design stages. In comparison with other published Viterbi decoders, our approach reduces the global data transfers by up to 75% and decreases the number of global buses by up to 48%, while enabling the use of deeply pipelined datapaths with no data forwarding. In the register-transfer level (RTL) implementation, we apply precomputation in conjunction with saturation arithmetic to further reduce power dissipation with provably no coding performance degradation. Designed using a 0.25 μm standard cell library, our decoder achieves a throughput of 20 Mb/s in simulation and dissipates only 0.45 W.

19.
This paper describes a 256 Mb DRAM chip architecture which provides up to ×32 wide organization. In order to minimize the die size, three new techniques were introduced: an exchangeable hierarchical data line structure, an irregular sense amp layout, and a split address bus with a local redrive scheme in the both-ends DQ. A chip has been developed based on the architecture with 0.25 μm CMOS technology. The chip measures 13.25 mm × 21.55 mm, which is the smallest 256 Mb DRAM ever reported. A row address strobe (RAS) access time of 26 ns was obtained under 2.8 V power supply and 85°C. In addition, a 100 MHz × 32 page mode operation, namely 400 Mbyte/s data rate, in the standard extended data output (EDO) cycle has been successfully demonstrated.

20.
A 640-Mb/s 2048-bit programmable LDPC decoder chip   (total citations: 3; self-citations: 0; citations by others: 3)
A 14.3-mm² code-programmable and code-rate tunable decoder chip for 2048-bit low-density parity-check (LDPC) codes is presented. The chip implements the turbo-decoding message-passing (TDMP) algorithm for architecture-aware (AA-)LDPC codes which has a faster convergence rate and hence a throughput advantage over the standard decoding algorithm. It employs a reduced complexity message computation mechanism free of lookup tables, and features a programmable network for message interleaving based on the code structure. The chip decodes any mix of 2048-bit rate-1/2 (3,6)-regular AA-LDPC codes in standard mode by programming the network, and attains a throughput of 640 Mb/s at 125 MHz for 10 TDMP-decoding iterations. In augmented mode, the code rate can be tuned up to 14/16 in steps of 1/16 by augmenting the code. The chip is fabricated in 0.18-μm six-metal-layer CMOS technology, operates at a peak clock frequency of 125 MHz at 1.8 V (nominal), and dissipates an average power of 787 mW.
