首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 31 毫秒
1.
A universal variable-to-fixed length algorithm for binary memoryless sources which converges to the entropy of the source at the optimal rate is known. We study the problem of universal variable-to-fixed length coding for the class of Markov sources with finite alphabets. We give an upper bound on the performance of the code for large dictionary sizes and show that the code is optimal in the sense that no codes exist that have better asymptotic performance. The optimal redundancy is shown to be H log log M/log M where H is the entropy rate of the source and M is the code size. This result is analogous to Rissanen's (1984) result for fixed-to-variable length codes. We investigate the performance of a variable-to-fixed coding method which does not need to store the dictionaries, either at the coder or the decoder. We also consider the performance of both these source codes on individual sequences. For individual sequences we bound the performance in terms of the best code length achievable by a class of coders. All the codes that we consider are prefix-free and complete  相似文献   

2.
Universal lossless source coding with the Burrows Wheeler transform   总被引:6,自引:0,他引:6  
The Burrows Wheeler transform (1994) is a reversible sequence transformation used in a variety of practical lossless source-coding algorithms. In each, the BWT is followed by a lossless source code that attempts to exploit the natural ordering of the BWT coefficients. BWT-based compression schemes are widely touted as low-complexity algorithms giving lossless coding rates better than those of the Ziv-Lempel codes (commonly known as LZ'77 and LZ'78) and almost as good as those achieved by prediction by partial matching (PPM) algorithms. To date, the coding performance claims have been made primarily on the basis of experimental results. This work gives a theoretical evaluation of BWT-based coding. The main results of this theoretical evaluation include: (1) statistical characterizations of the BWT output on both finite strings and sequences of length n → ∞, (2) a variety of very simple new techniques for BWT-based lossless source coding, and (3) proofs of the universality and bounds on the rates of convergence of both new and existing BWT-based codes for finite-memory and stationary ergodic sources. The end result is a theoretical justification and validation of the experimentally derived conclusions: BWT-based lossless source codes achieve universal lossless coding performance that converges to the optimal coding performance more quickly than the rate of convergence observed in Ziv-Lempel style codes and, for some BWT-based codes, within a constant factor of the optimal rate of convergence for finite-memory sources  相似文献   

3.
Fisher information and stochastic complexity   总被引:8,自引:0,他引:8  
By taking into account the Fisher information and removing an inherent redundancy in earlier two-part codes, a sharper code length as the stochastic complexity and the associated universal process are derived for a class of parametric processes. The main condition required is that the maximum-likelihood estimates satisfy the central limit theorem. The same code length is also obtained from the so-called maximum-likelihood code  相似文献   

4.
A new lower bound for the mean code length of all one-to-one codes for a random variable withnoutcomes is derived. The bound, which is tight, improves an earlier one due to Leung-Yan-Cheong and Cover. Another bound for one-to-one codes for binary information sources is derived.  相似文献   

5.
A class of distortionless codes designed by Bayes decision theory   总被引:1,自引:0,他引:1  
The problem of distortionless encoding when the parameters of the probabilistic model of a source are unknown is considered from a statistical decision theory point of view. A class of predictive and nonpredictive codes is proposed that are optimal within this framework. Specifically, it is shown that the codeword length of the proposed predictive code coincides with that of the proposed nonpredictive code for any source sequence. A bound for the redundancy for universal coding is given in terms of the supremum of the Bayes risk. If this supremum exists, then there exists a minimax code whose mean code length approaches it in the proposed class of codes, and the minimax code is given by the Bayes solution relative to the prior distribution of the source parameters that maximizes the Bayes risk  相似文献   

6.
Redundancy of universal codes for a class of sources determines by how much the actual code length exceeds the optimal code length. In the minimax scenario, one designs the best code for the worst source within the class. Such minimax redundancy comes in two flavors: average minimax or worst case minimax. We study the worst case minimax redundancy of universal block codes for Markovian sources of any order. We prove that the maximal minimax redundancy for Markov sources of order r is asymptotically equal to 1/2m/sup r/(m-1)log/sub 2/n+log/sub 2/A/sub m//sup r/-(lnlnm/sup 1/(m-1)/)/lnm+o(1), where n is the length of a source sequence, m is the size of the alphabet, and A/sub m//sup r/ is an explicit constant (e.g., we find that for a binary alphabet m=2 and Markov of order r=1 the constant A/sub 2//sup 1/=16/spl middot/G/spl ap/14.655449504 where G is the Catalan number). Unlike previous attempts, we view the redundancy problem as an asymptotic evaluation of certain sums over a set of matrices representing Markov types. The enumeration of Markov types is accomplished by reducing it to counting Eulerian paths in a multigraph. In particular, we propose exact and asymptotic formulas for the number of strings of a given Markov type. All of these findings are obtained by analytic and combinatorial tools of analysis of algorithms.  相似文献   

7.
Complexity of strings in the class of Markov sources   总被引:1,自引:0,他引:1  
Shannon's self-information of a string is generalized to its complexity relative to the class of finite-state-machine (FSM) defined sources. Unlike an earlier generalization, the new one is valid for both short and long strings. The definition is justified in part by a theorem stating that, asymptotically, the mean complexity provides a tight lower bound for the mean length of all so-called regular codes. This also generalizes Shannon's noiseless coding theorem. For a large subclass of FSM sources a simple algorithm is described for computing the complexity.  相似文献   

8.
A lower bound is derived on the achievable redundancy for universal lossless coding of parametric sources with piecewise stationary, abruptly changing, occasionally repeating statistics. In particular, it is shown that if the number of repeating statistical parameter vectors (or states) is not too large, for any uniquely decipherable code, for almost every set of states that govern all the different segments in the data sequence, for almost every arrangement of these states in the different segments, and for almost every vector of transition times, the minimum achievable redundancy is composed of 0.5 log d extra code bits for each unknown component of each state, log m extra code bits for each unknown transition time, and log s extra code bits for each repetition of a state, where d is the average duration of each state in the input string, TO is the average length of a segment, and s is the total number of states. If s is essentially large compared to TO, it is shown that the minimum redundancy is composed of 0.5 log 77i bits for each unknown component in each segment and log TO bits for each unknown transition time, which is the same lower bound as that of general piecewise stationary sources (PSSs). These results are true also in the minimax and maximin senses. The lower bound is shown to be achievable through construction of mixture and estimation based codes. Different special cases are reviewed, and it is shown that unless s is essentially large compared to m, optimal codes that are designed particularly for sources with repeating statistics outperform codes designed for PSSs when coding sources with repeating statistics. In particular, the bound for general PSSs is shown to be a special case of the new bound.  相似文献   

9.
Universal lossless compression via multilevel pattern matching   总被引:2,自引:0,他引:2  
A universal lossless data compression code called the multilevel pattern matching code (MPM code) is introduced. In processing a finite-alphabet data string of length n, the MPM code operates at O(log log n) levels sequentially. At each level, the MPM code detects matching patterns in the input data string (substrings of the data appearing in two or more nonoverlapping positions). The matching patterns detected at each level are of a fixed length which decreases by a constant factor from level to level, until this fixed length becomes one at the final level. The MPM code represents information about the matching patterns at each level as a string of tokens, with each token string encoded by an arithmetic encoder. From the concatenated encoded token strings, the decoder can reconstruct the data string via several rounds of parallel substitutions. A O(1/log n) maximal redundancy/sample upper bound is established for the MPM code with respect to any class of finite state sources of uniformly bounded complexity. We also show that the MPM code is of linear complexity in terms of time and space requirements. The results of some MPM code compression experiments are reported  相似文献   

10.
The random coding bound of information theory provides a well-known upper bound to the probability of decoding error for the best code of a given rate and block length. The bound is constructed by upper-bounding the average error probability over an ensemble of codes. The bound is known to give the correct exponential dependence of error probability on block length for transmission rates above the critical rate, but it gives an incorrect exponential dependence at rates below a second lower critical rate. Here we derive an asymptotic expression for the average error probability over the ensemble of codes used in the random coding bound. The result shows that the weakness of the random coding bound at rates below the second critical rate is due not to upperbounding the ensemble average, but rather to the fact that the best codes are much better than the average at low rates.  相似文献   

11.
Perfect binary codes: constructions, properties, and enumeration   总被引:3,自引:0,他引:3  
Properties of nonlinear perfect binary codes are investigated and several new constructions of perfect codes are derived from these properties. An upper bound on the cardinality of the intersection of two perfect codes of length n is presented, and perfect codes whose intersection attains the upper bound are constructed for all n. As an immediate consequence of the proof of the upper bound the authors obtain a simple closed-form expression for the weight distribution of a perfect code. Furthermore, they prove that the characters of a perfect code satisfy certain constraints, and provide a sufficient condition for a binary code to be perfect. The latter result is employed to derive a generalization of the construction of Phelps (1983), which is shown to give rise to some perfect codes that are nonequivalent to the perfect codes obtained from the known constructions. Moreover, for any m⩾4 the authors construct full-rank perfect binary codes of length 2m -1. These codes are obviously nonequivalent to any of the previously known perfect codes. Furthermore the latter construction exhibits the existence of full-rank perfect tilings. Finally, they construct a set of 2(2cn) nonequivalent perfect codes of length n, for sufficiently large n and a constant c=0.5-ϵ. Precise enumeration of the number of codes in this set provides a slight improvement over the results reported by Phelps  相似文献   

12.
Universal codeword sets and representations of the integers   总被引:9,自引:0,他引:9  
Countable prefix codeword sets are constructed with the universal property that assigning messages in order of decreasing probability to codewords in order of increasing length gives an average code-word length, for any message set with positive entropy, less than a constant times the optimal average codeword length for that source. Some of the sets also have the asymptotically optimal property that the ratio of average codeword length to entropy approaches one uniformly as entropy increases. An application is the construction of a uniformly universal sequence of codes for countable memoryless sources, in which thenth code has a ratio of average codeword length to source rate bounded by a function ofnfor all sources with positive rate; the bound is less than two forn = 0and approaches one asnincreases.  相似文献   

13.
14.
The attractiveness of majority-logic decoding is its simple implementation. Several classes of majority-logic decodable block codes have been discovered for the past two decades. In this paper, a method of constructing a new class of majority-logic decodable block codes is presented. Each code in this class is formed by combining majority-logic decodable codes of shorter lengths. A procedure for orthogonalizing codes of this class is formulated. For each code, a lower bound on the number of correctable errors with majority-logic decoding is obtained. An upper bound on the number of orthogonalization steps for decoding each code is derived. Several majority-logic decodable codes that have more information digits than the Reed-Muller codes of the same length and the same minimum distance are found. Some results presented in this paper are extensions of the results of Lin and Weldon [11] and Gore [12] on the majority-logic decoding of direct product codes.  相似文献   

15.
Although the existence of universal noiseless variable-rate codes for the class of discrete stationary ergodic sources has previously been established, very few practical universal encoding methods are available. Efficient implementable universal source coding techniques are discussed in this paper. Results are presented on source codes for which a small value of the maximum redundancy is achieved with a relatively short block length. A constructive proof of the existence of universal noiseless codes for discrete stationary sources is first presented. The proof is shown to provide a method for obtaining efficient universal noiseless variable-rate codes for various classes of sources. For memoryless sources, upper and lower bounds are obtained for the minimax redundancy as a function of the block length of the code. Several techniques for constructing universal noiseless source codes for memoryless sources are presented and their redundancies are compared with the bounds. Consideration is given to possible applications to data compression for certain nonstationary sources.  相似文献   

16.
On beamforming with finite rate feedback in multiple-antenna systems   总被引:9,自引:0,他引:9  
We study a multiple-antenna system where the transmitter is equipped with quantized information about instantaneous channel realizations. Assuming that the transmitter uses the quantized information for beamforming, we derive a universal lower bound on the outage probability for any finite set of beamformers. The universal lower bound provides a concise characterization of the gain with each additional bit of feedback information regarding the channel. Using the bound, it is shown that finite information systems approach the perfect information case as (t-1)2/sup -B/t-1/, where B is the number of feedback bits and t is the number of transmit antennas. The geometrical bounding technique, used in the proof of the lower bound, also leads to a design criterion for good beamformers, whose outage performance approaches the lower bound. The design criterion minimizes the maximum inner product between any two beamforming vectors in the beamformer codebook, and is equivalent to the problem of designing unitary space-time codes under certain conditions. Finally, we show that good beamformers are good packings of two-dimensional subspaces in a 2t-dimensional real Grassmannian manifold with chordal distance as the metric.  相似文献   

17.
An upper bound on the minimum distance of turbo codes is derived, which depends only on the interleaver length and the component scramblers employed. The derivation of this bound considers exclusively turbo encoder input words of weight 2. The bound does not only hold for a particular interleaver but for all possible interleavers including the best. It is shown that in contrast to general linear binary codes the minimum distance of turbo codes cannot grow stronger than the square root of the block length. This implies that turbo codes are asymptotically bad. A rigorous proof for the bound is provided, which is based on a geometric approach  相似文献   

18.
We discuss algorithms for determining exactly the lower terms of the weight distribution of a turbo code. Several improvements on the recently introduced algorithm by Garello et al. are outlined. The techniques presented in this letter improve the observed asymptotic complexity by a factor proportional to the information length. As an example, the improved algorithm is applied to the determination of the minimum distance of all universal mobile telecommunications system turbo codes. We further apply the improved algorithm to high-rate turbo codes using high-rate nonpunctured constituent codes. To reduce complexity, the constituent codes are represented by a minimal information bit-oriented trellis.  相似文献   

19.
In this correspondence, we consider one-to-one encodings for a discrete memoryless source, which are "one-shot" encodings associating a distinct codeword with each source symbol. Such encodings could be employed when only a single source symbol rather than a sequence of source symbols needs to be transmitted. For example, such a situation can arise when the last message must be acknowledged before the next message can be transmitted. We consider two slightly different types of one-to-one encodings (depending on whether the empty codeword is used or not) and obtain lower and upper bounds on the expected length of optimal one-to-one codes. We first give an extension of a known tight lower bound on the expected length of optimal one-to-one codes for the case that the the size of the source alphabet is finite and partial information about the source symbol probabilities is available. As expected, our lower bound is no less than the previously known lower bound obtained without side information about the source symbol probabilities. We then consider the case that the source entropy is available and derive arbitrarily tight lower bounds on the expected length of optimal one-to-one codes. We also derive arbitrarily tight lower bounds for the case that the source entropy and the probability of the most likely source symbol are available. Finally, given that the probability of the most likely source symbol is available, we obtain an upper bound on the expected length of optimal one-to-one codes. Our upper bound is tighter than the best upper bound known in the literature  相似文献   

20.
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号