1.
This paper proposes a high-speed and area-efficient three-parallel Reed-Solomon (RS) decoder using the simplified degree computationless
modified Euclid (S-DCME) algorithm for the key equation solver (KES) block. To achieve a high throughput rate, the inner signals,
such as the syndrome, error locator and error value polynomials, are computed in parallel. In addition, the key equations are solved by using the S-DCME algorithm to
reduce the hardware complexity. To handle the many problems caused by applying the S-DCME algorithm to the KES block, we modify
the architectures of some of the blocks in the three-parallel RS decoder. The proposed RS architecture can reduce the hardware
complexity by about 80% with respect to the KES block. In addition, the proposed RS architecture has an approximately 25%
shorter latency than the conventional parallel RS architectures.
2.
3.
New degree computationless modified Euclid algorithm and architecture for Reed-Solomon decoder
Baek J.H. Sunwoo M.H. 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2006,14(8):915-920
This paper proposes a new degree computationless modified Euclid (DCME) algorithm and its dedicated architecture for Reed-Solomon (RS) decoders. The architecture has low hardware complexity compared with conventional modified Euclid (ME) architectures, since it completely removes the degree computation and comparison circuits. Employing a systolic array, it requires a latency of only 2t clock cycles to solve the key equation, with no initial latency. In addition, the DCME architecture, which uses 3t+2 basic cells, is regular and scalable since it uses only one processing element. Hence, the proposed DCME architecture provides short-latency, low-cost RS decoding. The DCME architecture has been synthesized using the 0.25-μm Faraday CMOS standard cell library and operates at 200 MHz. The gate count of the DCME architecture is 21,760. Hence, an RS decoder using the proposed DCME architecture can reduce the total gate count by at least 23% and the total latency by at least 10% compared with conventional ME decoders.
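The cell count and latency quoted in the abstract above scale linearly with the error-correction capability t. As a rough illustration only, the sketch below evaluates 3t+2 and 2t for a given RS(n, k) code, assuming t = (n - k) / 2. The function name `dcme_resources` and the choice of RS(255, 239) as an example are hypothetical; the abstract does not state which code was synthesized.

```python
def dcme_resources(n: int, k: int) -> dict:
    """Evaluate the DCME figures quoted in the abstract for an RS(n, k) code.

    Assumption (not from the abstract): t = (n - k) / 2, the usual
    error-correction capability of an RS(n, k) code.
    """
    if n <= k or (n - k) % 2 != 0:
        raise ValueError("expected n > k with an even number of parity symbols")
    t = (n - k) // 2
    return {
        "t": t,
        "basic_cells": 3 * t + 2,     # "3t+2 basic cells"
        "kes_latency_cycles": 2 * t,  # "latency of 2t clock cycles"
    }


if __name__ == "__main__":
    # Hypothetical example code: RS(255, 239), so t = 8.
    print(dcme_resources(255, 239))
```

Under these assumptions an RS(255, 239) decoder would need 26 basic cells and 16 clock cycles for the key equation solver.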
4.
Xinmiao Zhang Parhi K.K. 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2005,13(4):413-426
Reed-Solomon (RS) codes are among the most widely utilized block error-correcting codes in modern communication and computer systems. Compared to its hard-decision counterpart, soft-decision decoding offers considerably higher error-correcting capability. The recent development of soft-decision RS decoding algorithms makes their hardware implementations feasible. Among these algorithms, the Koetter-Vardy (KV) algorithm can achieve substantial coding gain for high-rate RS codes, while maintaining a polynomial complexity with respect to the code length. In the KV algorithm, the factorization step can consume a major part of the decoding latency. A novel architecture based on root-order prediction is proposed in this paper to speed up the factorization step. As a result, the time-consuming exhaustive-search-based root computation in each iteration level, except the first one, of the factorization step is circumvented with more than 99% probability. Using the proposed architecture, a speedup of 141% can be achieved over prior efforts for a (255, 239) RS code, while the area consumption is reduced to 31.4%.
5.
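This paper proposes a new low-latency, high-throughput, programmable VLSI tree architecture that implements the FSA and TSSA motion estimation algorithms very efficiently. The architecture uses one third fewer processing elements (PEs) than other tree architectures, and the delay of each PE is halved. A distinctive ME window buffer structure greatly reduces the I/O bandwidth and the number of I/O pins, and an interleaved pipelining technique allows hardware utilization to reach 100%. These features make the architecture well suited to VLSI implementation.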
6.
A taxonomy of VLSI grid model layouts is presented for the implementation of certain types of digital communication receivers based on the Viterbi algorithm. We deal principally with networks of many simple processors connected to perform the Viterbi algorithm in a highly parallel way. Two interconnection patterns of interest are the "shuffle-exchange" and the "cube-connected cycles." The results are generally applicable to the development of area-efficient VLSI circuits for decoding convolutional codes, coded modulation with multilevel/phase signals, punctured convolutional codes, and correlatively encoded MSK signals, and for maximum likelihood sequence estimation of M-ary signals on intersymbol interference channels. In a companion paper, we elaborate on how the concepts presented here can be applied to the problem of building encoded MSK Viterbi receivers. Lower bounds are established on the product (chip area) × (baud rate)^-2 and on the energy consumption that any VLSI implementation of the Viterbi algorithm must obey, regardless of the architecture employed or the intended application.
7.
《IEEE transactions on circuits and systems. I, Regular papers》2008,55(10):3050-3062
8.
9.
Lin J. Sha J. Wang Z. Li L. 《Circuits and Systems II: Express Briefs, IEEE Transactions on》2010,57(1):51-55
10.
Reed-Solomon (RS) codes play an important role in providing error protection and data integrity. Among the various Reed-Solomon decoding algorithms, the Peterson-Gorenstein-Zierler (PGZ) algorithm in general has the least computational complexity for small t values. However, unlike the iterative approaches (e.g., the Berlekamp-Massey and Euclidean algorithms), it encounters divide-by-zero problems when handling multiple t values. In this paper, we propose a multi-mode hardware architecture for error numbers ranging from zero to three. We first propose a cost-down technique to reduce the hardware complexity of a t = 3 decoder. A Finite-field Inversion (FFI) elimination scheme is also proposed in our PGZ kernel. Next, we perform an algorithmic-level derivation to identify the configurable features of our design. With those manipulations, we are able to perform multi-mode RS decoding in one unified VLSI architecture with a very simple control scheme. The very low cost and simple data path make our design a good choice for small-footprint embedded VLSI systems such as Error Control Coding (ECC) in memory/storage systems.
11.
Cardarilli G.C. Pontarelli S. Re M. Salsano A. 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2007,15(7):842-846
Reed-Solomon (RS) codes are widely used to identify and correct errors in transmission and storage systems. When RS codes are used in highly reliable systems, the designer should also take into account the occurrence of faults in the encoder and decoder subsystems. In this paper, self-checking RS encoder and decoder architectures are presented. The RS encoder architecture exploits some properties of the arithmetic operations in GF(2^m). These properties are related to the parity of the binary representation of the elements of the Galois field. In the RS decoder, the implicit redundancy of the received codeword, under suitable assumptions explained in this paper, allows concurrent error detection schemes to be implemented that are useful for a wide range of decoding algorithms and require no intervention on the decoder architecture. Moreover, the performance of the proposed circuits in terms of area and delay overhead is presented.
12.
Tzu-Der Chuang Yu-Jen Chen Yi-Hau Chen Shao-Yi Chien Liang-Gee Chen 《Journal of Signal Processing Systems》2010,60(3):363-375
In addition to coding efficiency, the scalable extension of H.264/AVC provides good functionality for video adaptation in
heterogeneous environments. Fine grain scalability (FGS) is a technique to extract video bitstream at the finest quality level
under the given bandwidth. In this paper, an architecture of FGS encoder with low external memory bandwidth and low hardware
cost is proposed. Up to 99% of bandwidth reduction can be attained by the proposed scan bucket algorithm, early context modeling
with context reduction, and first scan pre-encoding. The area-efficient hardware architecture is implemented by layer-wise
hardware reuse. Besides, three design strategies for enhancement layer coder are explored so that the trade-off between external
memory bandwidth and silicon area is allowed. The proposed hardware architecture can real-time encode HDTV 1920×1080 video
with two FGS enhancement layers at 200 MHz working frequency, or HDTV 1280×720 video with three FGS enhancement layers at
130 MHz working frequency.
13.
The Lee metric measures the circular distance between two elements in a cyclic group and is particularly appropriate as a measure of distance for data transmission under phase-shift-keying modulation over a white noise channel. In this paper, using newly derived properties on Newton's identities, we initially investigate the Lee distance properties of a class of BCH codes and show that (for an appropriate range of parameters) their minimum Lee distance is at least twice their designed Hamming distance. We then make use of properties of these codes to devise an efficient algebraic decoding algorithm that successfully decodes within the above lower bound of the Lee error-correction capability. Finally, we propose an attractive design for the corresponding VLSI architecture that is only mildly more complex than popular decoder architectures under the Hamming metric; since the proposed architecture can also be used for decoding under the Hamming metric without extra hardware, one can use the proposed architecture to decode under both distance metrics (Lee and Hamming).
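To make the "circular distance" in the abstract above concrete, here is a minimal sketch of the standard Lee distance over the integers modulo q, next to the Hamming distance for comparison. The function names and the example words are illustrative choices, not taken from the paper, and the sketch does not reproduce the paper's decoding algorithm.

```python
from typing import Sequence


def lee_distance(a: int, b: int, q: int) -> int:
    """Circular distance between residues a and b in Z_q."""
    d = (a - b) % q
    return min(d, q - d)


def lee_distance_words(x: Sequence[int], y: Sequence[int], q: int) -> int:
    """Lee distance between two equal-length words over Z_q (sum over symbols)."""
    if len(x) != len(y):
        raise ValueError("words must have the same length")
    return sum(lee_distance(a, b, q) for a, b in zip(x, y))


def hamming_distance(x: Sequence[int], y: Sequence[int]) -> int:
    """Number of positions in which the two words differ."""
    if len(x) != len(y):
        raise ValueError("words must have the same length")
    return sum(a != b for a, b in zip(x, y))


if __name__ == "__main__":
    q = 8  # e.g. symbols of an 8-PSK constellation (illustrative choice)
    x, y = [0, 1, 7], [0, 3, 1]
    print(lee_distance_words(x, y, q))  # 0 + 2 + 2 = 4
    print(hamming_distance(x, y))       # 2
```

In the example, symbols 7 and 1 are only two steps apart around the circle of eight phases, which is why the Lee metric suits phase-shift keying: neighbouring phases are the likeliest confusions.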
14.
Reed-Solomon (RS) codes are among the most widely utilized error-correcting codes in digital communication and storage systems. Among the decoding algorithms for RS codes, the recently developed Koetter-Vardy (KV) soft-decision decoding algorithm can achieve substantial coding gain while retaining polynomial complexity. One of the major steps of the KV algorithm is the factorization. Each iteration of the factorization mainly consists of root computations over finite fields and polynomial updating. To speed up the factorization step, a fast factorization architecture has been proposed to circumvent the exhaustive-search-based root computation from the second iteration level onward by using a root-order prediction scheme. Based on this scheme, a partial parallel factorization architecture was proposed to combine the polynomial updating in adjacent iteration levels. However, in both of these architectures, the root computation in the first iteration level is still carried out by exhaustive search, which accounts for a significant part of the overall factorization latency. In this paper, a novel iterative prediction scheme is proposed for the root computation in the first iteration level. The proposed scheme can substantially reduce the latency of the factorization while incurring only negligible area overhead. Applying this scheme to a (255, 239) RS code, speedups of 36% and 46% can be achieved over the fast factorization and partial parallel factorization architectures, respectively.
15.
In this paper, we propose a low-complexity decoder architecture for low-density parity-check (LDPC) codes using a variable quantization scheme as well as an efficient highly-parallel decoding scheme. In the sum-product algorithm for decoding LDPC codes, finite-precision implementations face an important tradeoff between decoding performance and hardware complexity, caused by two dominant area-consuming factors: one is the memory for storing updated messages and the other is the look-up table (LUT) that implements the nonlinear function Ψ(x). The proposed variable quantization schemes offer a large reduction in the hardware complexity of the LUT and memory. Also, an efficient highly-parallel decoder architecture for quasi-cyclic (QC) LDPC codes can be implemented with reduced hardware complexity by using the partially block overlapped decoding scheme, and with minimized power consumption by reducing the total number of memory accesses for updated messages. For (3, 6) QC LDPC codes, our proposed schemes for implementing the highly-parallel decoder architecture reduce the memory area by 33% and the area of the check node unit and variable node unit computation units by approximately 28%, without significant performance degradation. Also, the memory accesses are reduced by 20%.
16.
Bo Yuan Zhongfeng Wang Li Li Minglun Gao Jin Sha Chuan Zhang 《Circuits and Systems II: Express Briefs, IEEE Transactions on》2009,56(6):469-473
A high-speed low-complexity Reed-Solomon (RS) decoder architecture based on the recursive degree computationless modified Euclidean (rDCME) algorithm is presented in this brief. The proposed architecture has very low hardware complexity compared with the conventional modified Euclidean and degree computationless modified Euclidean (DCME) architectures, since it can reduce the degree computation circuitry and replace the conventional systolic architecture that uses many processing elements (PEs) with a recursive architecture using a single PE. A high-throughput data rate is also facilitated by employing a pipelining technique. The proposed rDCME architecture has been designed and implemented using SMIC 0.18-μm CMOS technology. Synthesized results show that the proposed RS (255, 239) decoder requires only about 18 K gates and can operate at 640 MHz to achieve a throughput of 5.1 Gb/s, which meets the requirement of modern high-speed optical communications.
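The clock rate and throughput quoted in the abstract above are consistent with a simple back-of-the-envelope check. The sketch below assumes the decoder processes one 8-bit symbol per clock cycle (RS (255, 239) is defined over GF(2^8)); that per-cycle rate is an assumption made here for illustration and is not stated in the abstract. The function name `throughput_gbps` is hypothetical.

```python
def throughput_gbps(clock_mhz: float,
                    bits_per_symbol: int = 8,
                    symbols_per_cycle: int = 1) -> float:
    """Raw decoder throughput in Gb/s.

    Assumptions (not from the abstract): one symbol is consumed per clock
    cycle, and each symbol is 8 bits wide, as for a code over GF(2^8).
    """
    bits_per_second = clock_mhz * 1e6 * bits_per_symbol * symbols_per_cycle
    return bits_per_second / 1e9


if __name__ == "__main__":
    # 640 MHz x 8 bits per cycle = 5.12 Gb/s, in line with the quoted 5.1 Gb/s.
    print(throughput_gbps(640))
```

The same helper shows how a parallel design scales: with the hypothetical setting `symbols_per_cycle=3`, the raw rate triples at the same clock frequency.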
17.
Random linear network coding is an efficient technique for disseminating information in networks, but it is highly susceptible to errors. Kötter-Kschischang (KK) codes and Mahdavifar-Vardy (MV) codes are two important families of subspace codes that provide error control in noncoherent random linear network coding. List decoding has been used to decode MV codes beyond half distance. Existing hardware implementations of the rank metric decoder for KK codes suffer from limited throughput, long latency and high area complexity. The interpolation-based list decoding algorithm for MV codes still has high computational complexity, and its feasibility for hardware implementations has not been investigated. In this paper we propose efficient decoder architectures for both KK and MV codes and present their hardware implementations. Two serial architectures are proposed for KK and MV codes, respectively. An unfolded decoder architecture, which offers high throughput, is also proposed for KK codes. The synthesis results show that the proposed architectures for KK codes are much more efficient than rank metric decoder architectures, and demonstrate that the proposed decoder architecture for MV codes is affordable.
18.
Two efficient approaches are proposed to improve the performance of soft-output Viterbi algorithm (SOVA)-based turbo decoders. In the first approach, an easily obtainable variable and a simple mapping function are used to compute a target scaling factor to normalize the extrinsic information output from turbo decoders. An extra coding gain of 0.5 dB can be obtained with additive white Gaussian noise channels. This approach does not introduce extra latency and the hardware overhead is negligible. In the second approach, an adaptive upper bound based on the channel reliability is set for computing the metric difference between competing paths. By combining the two approaches, we show that the new SOVA-based turbo decoders can approach maximum a posteriori probability (MAP)-based turbo decoders within 0.1 dB when the target bit-error rate (BER) is moderately low (e.g., BER < 10^-4 for 1/2 rate codes). Following this, practical implementation issues are discussed and finite precision simulation results are provided. An area-efficient parallel decoding architecture is presented in this paper as an effective approach to designing high-throughput turbo/SOVA decoders. With the efficient parallel architecture, a throughput several times that of a conventional serial decoder can be obtained by increasing the overall hardware by a small percentage. To resolve the problem of multiple memory accesses per cycle in the efficient parallel architecture, a novel two-level hierarchical interleaver architecture is proposed. Simulation results show that the proposed interleaver architecture performs as well as random interleavers, while requiring much less storage of random patterns.
19.
20.
Low-density parity-check (LDPC) codes and convolutional Turbo codes are two of the most powerful error correcting codes that
are widely used in modern communication systems. In a multi-mode baseband receiver, both LDPC and Turbo decoders may be required.
However, the different decoding approaches for LDPC and Turbo codes usually lead to different hardware architectures. In this
paper we propose a unified message passing algorithm for LDPC and Turbo codes and introduce a flexible soft-input soft-output
(SISO) module to handle LDPC/Turbo decoding. We employ the trellis-based maximum a posteriori (MAP) algorithm as a bridge between LDPC and Turbo codes decoding. We view the LDPC code as a concatenation of n super-codes where each super-code has a simpler trellis structure so that the MAP algorithm can be easily applied to it.
We propose a flexible functional unit (FFU) for MAP processing of LDPC and Turbo codes with a low hardware overhead (about
15% area and timing overhead). Based on the FFU, we propose an area-efficient flexible SISO decoder architecture to support
LDPC/Turbo codes decoding. Multiple such SISO modules can be embedded into a parallel decoder for higher decoding throughput.
As a case study, a flexible LDPC/Turbo decoder has been synthesized on a TSMC 90 nm CMOS technology with a core area of 3.2 mm^2. The decoder can support IEEE 802.16e LDPC codes, IEEE 802.11n LDPC codes, and 3GPP LTE Turbo codes. Running at 500 MHz clock
frequency, the decoder can sustain up to 600 Mbps LDPC decoding or 450 Mbps Turbo decoding.