Similar Documents
20 similar documents found.
1.
A grammar transform is a transformation that converts any data sequence to be compressed into a grammar from which the original data sequence can be fully reconstructed. In a grammar-based code, a data sequence is first converted into a grammar by a grammar transform and then losslessly encoded. In this paper, a greedy grammar transform is first presented; this grammar transform sequentially constructs a sequence of irreducible grammars from which the original data sequence can be recovered incrementally. Based on this grammar transform, three universal lossless data compression algorithms, a sequential algorithm, an improved sequential algorithm, and a hierarchical algorithm, are then developed. These algorithms combine the power of arithmetic coding with that of string matching. It is shown that these algorithms are all universal in the sense that they can achieve asymptotically the entropy rate of any stationary, ergodic source. Moreover, it is proved that their worst case redundancies among all individual sequences of length n are upper-bounded by c log log n/log n, where c is a constant. Simulation results show that the proposed algorithms outperform the Unix Compress and Gzip algorithms, which are based on LZ78 and LZ77, respectively.
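To make the idea concrete, here is a minimal Re-Pair-style greedy grammar transform: it repeatedly replaces the most frequent adjacent pair with a fresh nonterminal, and the original sequence is recoverable by expanding the rules. This is only an illustrative sketch of grammar-based compression, not the paper's exact irreducible grammar transform; all names are invented for the example.

```python
from collections import Counter

def grammar_transform(seq):
    """Greedily replace the most frequent adjacent pair with a new rule."""
    rules, next_id, seq = {}, 0, list(seq)
    while True:
        pairs = Counter(zip(seq, seq[1:]))
        if not pairs:
            break
        pair, count = pairs.most_common(1)[0]
        if count < 2:
            break
        nt = ('R', next_id)          # fresh nonterminal symbol
        next_id += 1
        rules[nt] = list(pair)
        out, i = [], 0
        while i < len(seq):          # rewrite seq, replacing the pair
            if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
                out.append(nt); i += 2
            else:
                out.append(seq[i]); i += 1
        seq = out
    return seq, rules

def expand(sym, rules):
    """Fully expand a symbol back to terminal characters."""
    if sym in rules:
        return [t for s in rules[sym] for t in expand(s, rules)]
    return [sym]

start, rules = grammar_transform("abababab")
recovered = ''.join(t for s in start for t in expand(s, rules))
```

Running this on "abababab" yields two rules (for "ab" and for its repetition) and reconstructs the input exactly.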

2.
Grammar-based codes: a new class of universal lossless source codes
We investigate a type of lossless source code called a grammar-based code, which, in response to any input data string x over a fixed finite alphabet, selects a context-free grammar G_x representing x in the sense that x is the unique string belonging to the language generated by G_x. Lossless compression of x takes place indirectly via compression of the production rules of the grammar G_x. It is shown that, subject to some mild restrictions, a grammar-based code is a universal code with respect to the family of finite-state information sources over the finite alphabet. Redundancy bounds for grammar-based codes are established. Reduction rules for designing grammar-based codes are presented.
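The defining property, that x is the unique string in the language of G_x, holds for any grammar with exactly one production per nonterminal. A tiny sketch (rule names S and A are illustrative, not from the paper):

```python
def generate(grammar, symbol='S'):
    """Expand a grammar with exactly one production per nonterminal.
    Such a grammar generates a single string, which it thereby represents."""
    out = []
    for s in grammar[symbol]:
        out.append(generate(grammar, s) if s in grammar else s)
    return ''.join(out)

# G_x is a compact representation of x = "abcabcabc":
G_x = {'S': ['A', 'A', 'A'], 'A': ['a', 'b', 'c']}
x = generate(G_x)
```

Compressing the production rules of G_x (here, 6 symbols instead of 9) is what yields compression of x itself.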

3.
Universal lossless compression via multilevel pattern matching
A universal lossless data compression code called the multilevel pattern matching code (MPM code) is introduced. In processing a finite-alphabet data string of length n, the MPM code operates at O(log log n) levels sequentially. At each level, the MPM code detects matching patterns in the input data string (substrings of the data appearing in two or more nonoverlapping positions). The matching patterns detected at each level are of a fixed length which decreases by a constant factor from level to level, until this fixed length becomes one at the final level. The MPM code represents information about the matching patterns at each level as a string of tokens, with each token string encoded by an arithmetic encoder. From the concatenated encoded token strings, the decoder can reconstruct the data string via several rounds of parallel substitutions. An O(1/log n) maximal redundancy/sample upper bound is established for the MPM code with respect to any class of finite state sources of uniformly bounded complexity. We also show that the MPM code is of linear complexity in terms of time and space requirements. The results of some MPM code compression experiments are reported.
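A simplified sketch of the multilevel decomposition: at each level the current strings are cut into blocks whose length halves, duplicate blocks become references to their first occurrence, and only distinct blocks descend to the next level. This mirrors the MPM idea for power-of-two lengths; the arithmetic coding of the token strings is omitted, and the function name is invented for the example.

```python
def mpm_levels(data, block_len):
    """Return, per level, a token string marking new vs. repeated blocks."""
    levels, frontier = [], [data]
    while block_len >= 1:
        blocks = [s[i:i + block_len] for s in frontier
                  for i in range(0, len(s), block_len)]
        seen, tokens, distinct = {}, [], []
        for b in blocks:
            if b in seen:
                tokens.append(('match', seen[b]))    # repeated pattern
            else:
                seen[b] = len(distinct)
                tokens.append(('new', len(distinct)))
                distinct.append(b)
        levels.append(tokens)
        frontier = distinct if block_len > 1 else []
        block_len //= 2
    return levels

levels = mpm_levels("abababab", 4)
```

For "abababab" with an initial block length of 4, the repeated block "abab" is detected at the first level and only one copy is processed further.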

4.
A grammar transform is a transformation that converts any data sequence to be compressed into a grammar from which the original data sequence can be fully reconstructed. In a grammar-based code, a data sequence is first converted into a grammar by a grammar transform and then losslessly encoded. Among several previously proposed grammar transforms is the multilevel pattern matching (MPM) grammar transform. In this paper, the MPM grammar transform is first extended to the case of side information known to both the encoder and decoder, yielding a conditional MPM (CMPM) grammar transform. A new simple linear-time and space complexity algorithm is then proposed to implement the MPM and CMPM grammar transforms. Based on the CMPM grammar transform, a universal lossless data compression algorithm with side information is developed, which can achieve asymptotically the conditional entropy rate of any stationary, ergodic source pair. It is shown that the algorithm's worst case redundancy/sample against the k-context conditional empirical entropy among all individual sequences of length n is upper-bounded by c(1/log n), where c is a constant. The proposed algorithm with side information is the first member of the family of conditional grammar-based codes, whose expected high efficiency is due to the efficiency of the corresponding unconditional codes.

5.
Traditionally, the performance of distributed algorithms has been measured in terms of time and message complexity. Message complexity concerns the number of messages transmitted over all the edges during the course of the algorithm. However, in energy-constrained ad hoc wireless networks (e.g., sensor networks), energy is a critical factor in measuring the efficiency of a distributed algorithm. Transmitting a message between two nodes has an associated cost (energy), and moreover this cost can depend on the two nodes (e.g., on the distance between them, among other things). Thus in addition to the time and message complexity, it is important to consider energy complexity, which accounts for the total energy associated with the messages exchanged among the nodes in a distributed algorithm. This paper addresses the minimum spanning tree (MST) problem, a fundamental problem in distributed computing and communication networks. We study energy-efficient distributed algorithms for the Euclidean MST problem assuming random distribution of nodes. We show a non-trivial lower bound of Ω(log n) on the energy complexity of any distributed MST algorithm. We then give an energy-optimal distributed algorithm that constructs an optimal MST with energy complexity O(log n) on average and O(log n log log n) with high probability. This is an improvement over the previous best known bound on the average energy complexity of O(log^2 n). Our energy-optimal algorithm exploits a novel property of the giant component of sparse random geometric graphs. All of the above results assume that nodes do not know their geometric coordinates. If the nodes know their own coordinates, then we give an algorithm with O(1) energy complexity (which is the best possible) that gives an O(1) approximation to the MST.
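For reference, the object being constructed can be computed centrally with Kruskal's algorithm on random points. This sketch illustrates the Euclidean MST itself; it does not model the paper's contribution, which is the distributed algorithm and its energy complexity.

```python
import random, math

def euclidean_mst(points):
    """Kruskal's algorithm with union-find (path halving)."""
    n = len(points)
    edges = sorted((math.dist(points[i], points[j]), i, j)
                   for i in range(n) for j in range(i + 1, n))
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x
    mst, total = [], 0.0
    for w, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:                        # edge joins two components
            parent[ri] = rj
            mst.append((i, j))
            total += w
    return mst, total

random.seed(0)
pts = [(random.random(), random.random()) for _ in range(20)]
tree, weight = euclidean_mst(pts)
```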

6.
Better OPM/L Text Compression
An OPM/L data compression scheme suggested by Ziv and Lempel, LZ77, is applied to text compression. A slightly modified version suggested by Storer and Szymanski, LZSS, is found to achieve compression ratios as good as most existing schemes for a wide range of texts. LZSS decoding is very fast, and comparatively little memory is required for encoding and decoding. Although the time complexity of LZ77 and LZSS encoding is O(M) for a text of M characters, straightforward implementations are very slow. The time consuming step of these algorithms is a search for the longest string match. Here a binary search tree is used to find the longest string match, and experiments show that this results in a dramatic increase in encoding speed. The binary tree algorithm can be used to speed up other OPM/L schemes, and other applications where a longest string match is required. Although the LZSS scheme imposes a limit on the length of a match, the binary tree algorithm will work without any limit.
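A minimal LZSS sketch with a naive longest-match scan (the paper's point is that a binary search tree makes this search fast; a straightforward scan is used here for clarity, and the parameter values are illustrative):

```python
def lzss_encode(data, window=255, min_match=3):
    i, out = 0, []
    while i < len(data):
        best_len, best_off = 0, 0
        for j in range(max(0, i - window), i):   # naive longest-match search
            l = 0
            while i + l < len(data) and data[j + l] == data[i + l]:
                l += 1                           # overlapping matches allowed
            if l > best_len:
                best_len, best_off = l, i - j
        if best_len >= min_match:
            out.append(('M', best_off, best_len))  # (offset, length) token
            i += best_len
        else:
            out.append(('L', data[i]))             # literal token
            i += 1
    return out

def lzss_decode(tokens):
    out = []
    for t in tokens:
        if t[0] == 'L':
            out.append(t[1])
        else:
            _, off, length = t
            for _ in range(length):
                out.append(out[-off])  # copy one symbol at a time (overlap-safe)
    return ''.join(out)
```

Replacing the O(window) inner scan with a binary-search-tree lookup of the longest match is exactly the speedup the paper describes.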

7.
We clarify, extend, and solve a long-standing open problem concerning the computational complexity for packet scheduling algorithms to achieve tight end-to-end delay bounds. We first focus on the difference between the time a packet finishes service in a scheduling algorithm and its virtual finish time under a GPS (Generalized Processor Sharing) scheduler, called GPS-relative delay. We prove that, under a slightly restrictive but reasonable computational model, the lower bound computational complexity of any scheduling algorithm that guarantees an O(1) GPS-relative delay bound is Ω(log n). We also discover that, surprisingly, the complexity lower bound remains the same even if the delay bound is relaxed to O(n^a) for 0 < a < 1.

8.
For pt. I see ibid., vol. 46, p. 755-88 (2000). The concept of context-free grammar (CFG)-based coding is extended to the case of countable-context models, yielding context-dependent grammar (CDG)-based coding. Given a countable-context model, a greedy CDG transform is proposed. Based on this greedy CDG transform, two universal lossless data compression algorithms, an improved sequential context-dependent algorithm and a hierarchical context-dependent algorithm, are then developed. It is shown that these algorithms are all universal in the sense that they can achieve asymptotically the entropy rate of any stationary, ergodic source with a finite alphabet. Moreover, it is proved that these algorithms' worst case redundancies among all individual sequences of length n from a finite alphabet are upper-bounded by d log log n/log n, as long as the number of distinct contexts grows with the sequence length n in the order of O(n^α), where 0 < α < 1 and d are positive constants. It is further shown that for some nonstationary sources, the proposed context-dependent algorithms can achieve better expected redundancies than any existing CFG-based codes, including the Lempel-Ziv (1978) algorithm, the multilevel pattern matching algorithm, and the context-free algorithms in Part I of this series of papers.

9.
A sequence y = (y_1,...,y_n) is said to be a coarsening of a given finite-alphabet source sequence x = (x_1,...,x_n) if, for some function φ, y_i = φ(x_i) (i = 1,...,n). In lossless refinement source coding, it is assumed that the decoder already possesses a coarsening y of a given source sequence x. It is the job of the lossless refinement source encoder to furnish the decoder with a binary codeword B(x|y) which the decoder can employ in combination with y to obtain x. We present a natural grammar-based approach for finding the binary codeword B(x|y) in two steps. In the first step of the grammar-based approach, the encoder furnishes the decoder with O(√n log_2 n) code bits at the beginning of B(x|y) which tell the decoder how to build a context-free grammar G_y which represents y. The encoder possesses a context-free grammar G_x which represents x; in the second step of the grammar-based approach, the encoder furnishes the decoder with code bits in the rest of B(x|y) which tell the decoder how to build G_x from G_y. We prove that our grammar-based lossless refinement source coding scheme is universal in the sense that its maximal redundancy per sample is O(1/log_2 n) for n source samples, with respect to any finite-state lossless refinement source coding scheme. As a by-product, we provide a useful notion of the conditional entropy H(G_x|G_y) of the grammar G_x given the grammar G_y, which is approximately equal to the length of the codeword B(x|y).
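The setting can be illustrated directly: a symbol-wise map φ produces the coarsening y, and a refinement coder need only describe, per position, which symbol within the class φ^{-1}(y_i) occurred. The vowel/consonant map below is an invented example; the paper's grammar-building steps are not reproduced.

```python
VOW = 'aeiou'
CONS = 'bcdfghjklmnpqrstvwxyz'
phi = lambda c: 'V' if c in VOW else 'C'   # the coarsening map

def coarsen(x):
    """y_i = phi(x_i) for every position i."""
    return ''.join(phi(c) for c in x)

x = "refinement"
y = coarsen(x)
# Refinement information: index of each x_i within its class phi^{-1}(y_i).
residual = [(VOW if yi == 'V' else CONS).index(xi) for xi, yi in zip(x, y)]
# The decoder recovers x from y plus the residual indices.
x_rec = ''.join((VOW if yi == 'V' else CONS)[ri]
                for yi, ri in zip(y, residual))
```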

10.
In this paper, we investigate the performance of grammar-based codes for sources with countably infinite alphabets. Let Λ denote an arbitrary class of stationary, ergodic sources with a countably infinite alphabet. It is shown that grammar-based codes can be modified so that they are universal with respect to Λ if and only if there exists a universal code for Λ. Moreover, upper bounds on the worst case redundancies of grammar-based codes among large sets of length-n individual sequences from a countably infinite alphabet are established. Depending upon the conditions satisfied by length-n individual sequences, these bounds range from O(log log n/log n) to O(1/log^(1-α) n) for some 0 < α < 1.

11.
Zero-delay lossy source coding schemes are considered for both individual sequences and random sources. Performance is measured by the distortion redundancy, which is defined as the difference between the normalized cumulative mean squared distortion of the scheme and the normalized cumulative distortion of the best scalar quantizer of the same rate that is matched to the entire sequence to be encoded. By improving and generalizing a scheme of Linder and Lugosi, Weissman and Merhav showed the existence of a randomized scheme that, for any bounded individual sequence of length n, achieves a distortion redundancy O(n^(-1/3) log n). However, both schemes have prohibitive complexity (both space and time), which makes practical implementation infeasible. In this paper, we present an algorithm that computes Weissman and Merhav's scheme efficiently. In particular, we introduce an algorithm with encoding complexity O(n^(4/3)) and distortion redundancy O(n^(-1/3) log n). The complexity can be made linear in the sequence length n at the price of increasing the distortion redundancy to O(n^(-1/4) √(log n)). We also consider the problem of minimax distortion redundancy in zero-delay lossy coding of random sources. By introducing a simplistic scheme and proving a lower bound, we show that for the class of bounded memoryless sources, the minimax expected distortion redundancy is upper and lower bounded by constant multiples of n^(-1/2).

12.
The compression performance of grammar-based codes is revisited from a new perspective. Previously, the compression performance of grammar-based codes was evaluated against that of the best arithmetic coding algorithm with finite contexts. In this correspondence, we first define semifinite-state sources and finite-order semi-Markov sources. Based on the definitions of semifinite-state sources and finite-order semi-Markov sources, and the idea of run-length encoding (RLE), we then extend traditional RLE algorithms to context-based RLE algorithms: RLE algorithms with k contexts and RLE algorithms of order k, where k is a nonnegative integer. For each individual sequence x, let r*_(sr,k)(x) and r*_(sr|k)(x) be the best compression rates given by RLE algorithms with k contexts and by RLE algorithms of order k, respectively. It is proved that for any x, r*_(sr,k)(x) is no greater than the best compression rate among all arithmetic coding algorithms with k contexts. Furthermore, it is shown that there exist stationary, ergodic semi-Markov sources for which the best RLE algorithms without any context outperform the best arithmetic coding algorithms with any finite number of contexts. Finally, we show that the worst case redundancies of grammar-based codes against r*_(sr,k)(x) and r*_(sr|k)(x) among all length-n individual sequences x from a finite alphabet are upper-bounded by d_1 log log n/log n and d_2 log log n/log n, respectively, where d_1 and d_2 are constants. This redundancy result is stronger than all previous corresponding results.
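A plain run-length encoder/decoder pair shows the baseline being extended; the paper's context-based RLE algorithms additionally condition the run statistics on k preceding symbols, which this sketch does not model.

```python
def rle_encode(s):
    """Encode s as a list of (symbol, run-length) pairs."""
    out, i = [], 0
    while i < len(s):
        j = i
        while j < len(s) and s[j] == s[i]:
            j += 1
        out.append((s[i], j - i))
        i = j
    return out

def rle_decode(runs):
    return ''.join(c * n for c, n in runs)
```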

13.
An efficient algorithm for extraction of three-dimensional (3-D) capacitance on multilayered and lossy substrates is presented. The new algorithm presents a major improvement over the quasi-3-D approach used in a Green's function-based solver and takes into consideration the sidewalls of 3-D conductors. To improve the efficiency of the computation and the transformation of the Green's function, a nonuniform grid is adopted. The most computationally intensive part in the transformation of the Green's function is computed separately in advance as technology-independent matrices T_k. Once computed, T_k can be stored and used for any technology; thus the storage requirement and computational complexity are reduced from O(S^2) and O(S^2 log S^2), respectively, to just O((log S_max)^2). Extensive tests have been performed to verify the new algorithm, and its accuracy has been established by comparison with other programs.

14.
In this note, we propose an O(m log m) algorithm to find the k most probable configurations of a system of n multi-mode independent components, with at most d modes each; m denotes the size of the problem, i.e., max(nd, k). This problem originates in network performance analyses, in which focusing on the most probable configurations is a means to reduce computational costs. Prior to this note, the best known algorithm for extracting the most probable configurations ran in O(n^2 d^2 + k log k). Our algorithm thus achieves a substantial improvement.
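A generic best-first enumeration illustrates the problem being solved (it is not the note's algorithm and does not match its complexity bound): each component's modes are sorted by probability, successors of a configuration advance one component to its next-best mode, and a visited set avoids duplicates. Returned mode indices refer to the sorted order.

```python
import heapq

def k_most_probable(mode_probs, k):
    """Return the k most probable configurations of independent components.
    mode_probs: list of per-component mode probability lists."""
    comps = [sorted(p, reverse=True) for p in mode_probs]
    def prob(cfg):
        out = 1.0
        for c, m in zip(comps, cfg):
            out *= c[m]
        return out
    start = (0,) * len(comps)            # all components at their best mode
    heap, seen, results = [(-prob(start), start)], {start}, []
    while heap and len(results) < k:
        negp, cfg = heapq.heappop(heap)
        results.append((cfg, -negp))
        for i in range(len(cfg)):        # successors: demote one component
            if cfg[i] + 1 < len(comps[i]):
                nxt = cfg[:i] + (cfg[i] + 1,) + cfg[i + 1:]
                if nxt not in seen:
                    seen.add(nxt)
                    heapq.heappush(heap, (-prob(nxt), nxt))
    return results

res = k_most_probable([[0.7, 0.3], [0.6, 0.4]], 3)
```

Since every successor has probability no greater than its parent, the heap pops configurations in nonincreasing probability order.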

15.
Binary sequences with high linear complexity are of interest in cryptography. The linear complexity should remain high even when a small number of changes are made to the sequence. The error linear complexity spectrum of a sequence reveals how the linear complexity of the sequence varies as an increasing number of the bits of the sequence are changed. We present an algorithm which computes the error linear complexity for binary sequences of period ℓ = 2^n using O(ℓ (log ℓ)^2) bit operations. The algorithm generalizes both the Games-Chan (1983) and Stamp-Martin (1993) algorithms, which compute the linear complexity and the k-error linear complexity of a binary sequence of period ℓ = 2^n, respectively. We also discuss an application of an extension of our algorithm to decoding a class of linear subcodes of Reed-Muller codes.
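The Games-Chan algorithm, the base routine the paper generalizes, computes the linear complexity of a binary sequence of period 2^n in time linear in the period: split one period into halves, XOR them, and recurse on the XOR (adding half the length) if it is nonzero, otherwise on the left half.

```python
def games_chan(s):
    """Linear complexity of a binary sequence with period len(s) = 2^n.
    s: list of bits (one full period)."""
    c = 0
    while len(s) > 1:
        half = len(s) // 2
        left, right = s[:half], s[half:]
        b = [l ^ r for l, r in zip(left, right)]
        if any(b):          # halves differ: complexity gains `half`
            c += half
            s = b
        else:               # halves equal: recurse on one half
            s = left
    return c + s[0]
```

For example, the periodic impulse 1000 (period 4) has linear complexity 4, while the all-ones sequence has linear complexity 1.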

16.
A direct method for the computation of 2-D DCT/IDCT on a linear-array architecture is presented. The 2-D DCT/IDCT is first converted into its corresponding 1-D DCT/IDCT problem through proper input/output index reordering. Then, a new coefficient matrix factorisation is derived, leading to a cascade of several basic computation blocks. Unlike other previously proposed high-speed 2-D N × N DCT/IDCT processors, which usually require intermediate transpose memory and have computation complexity O(N^3), the proposed hardware-efficient architecture with distributed memory structure has computation complexity O(N^2 log_2 N) and requires only log_2 N multipliers. The new pipelinable and scalable 2-D DCT/IDCT processor uses storage elements local to the processing elements and thus does not require any address generation hardware or global memory-to-array routing.
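The separability the architecture exploits can be shown in a few lines: a 2-D DCT is a 1-D DCT applied to every row and then to every column. This row-column sketch uses an unnormalized DCT-II; the paper's actual contribution (index reordering plus matrix factorisation to avoid the transpose memory) is not reproduced here.

```python
import math

def dct1(x):
    """Unnormalized 1-D DCT-II."""
    N = len(x)
    return [sum(x[n] * math.cos(math.pi * (n + 0.5) * k / N)
                for n in range(N))
            for k in range(N)]

def dct2d(a):
    """2-D DCT by row-column decomposition (requires a transpose step,
    which is exactly what the paper's architecture avoids)."""
    rows = [dct1(list(r)) for r in a]             # transform rows
    cols = [dct1(list(c)) for c in zip(*rows)]    # transform columns
    return [list(r) for r in zip(*cols)]          # transpose back
```

For a constant 2 × 2 input, only the DC coefficient is nonzero.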

17.
This study presents an O(k^2 · log n) algorithm for computing the reliability of a linear as well as a circular consecutive-k-out-of-n:F system. The proposed algorithm is more efficient and much simpler than the O(k^3 · log(n/k)) algorithm of Hwang & Wright.
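For context, the linear consecutive-k-out-of-n:F reliability satisfies a simple O(n) recursion, R(i) = R(i-1) - p·q^k·R(i-k-1); the O(k^2 log n) bound in the paper comes from accelerating such a recursion, which this sketch does not do.

```python
def reliability(n, k, p):
    """P(no k consecutive failures among n i.i.d. components),
    where p is each component's probability of working."""
    if n < k:
        return 1.0
    q = 1.0 - p
    R = [1.0] * k               # R[0..k-1]: too few components to fail
    R.append(1.0 - q ** k)      # R[k]: fail only if all k are down
    for i in range(k + 1, n + 1):
        # System of size i fails anew only if component i-k works and
        # the last k components all fail.
        R.append(R[i - 1] - p * q ** k * R[i - k - 1])
    return R[n]
```

For example, with k = 2, n = 3, p = 0.9 the failure probability is q^2 + p·q^2 = 0.019, so the reliability is 0.981.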

18.
In this paper, we settle a long-standing open problem concerning the average redundancy r_n of the Lempel-Ziv'78 (LZ78) code. We prove that for a memoryless source the average redundancy rate attains asymptotically E[r_n] = (A + δ(n))/log n + O(log log n/log^2 n), where A is an explicitly given constant that depends on source characteristics, and δ(x) is a fluctuating function with a small amplitude. We also derive the leading term for the kth moment of the number of phrases. We conclude by conjecturing a precise formula on the expected redundancy for a Markovian source. The main result of this paper is a consequence of the second-order properties of the Lempel-Ziv algorithm obtained by Jacquet and Szpankowski (1995). These findings have been established by analytical techniques of the precise analysis of algorithms. We give a brief survey of these results since they are interesting in their own right, and shed some light on the probabilistic behavior of pattern matching based data compression.
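The quantity driving the redundancy analysis is the number of phrases in the LZ78 incremental parsing, where each phrase is the shortest prefix of the remaining input not yet in the dictionary:

```python
def lz78_parse(s):
    """Return the LZ78 phrases of s (the final phrase may repeat)."""
    dictionary, phrases, i = {}, [], 0
    while i < len(s):
        j = i + 1
        # extend the phrase while it has been seen before
        while j < len(s) and s[i:j] in dictionary:
            j += 1
        dictionary[s[i:j]] = len(dictionary) + 1
        phrases.append(s[i:j])
        i = j
    return phrases
```

For instance, "ababab" parses into a, b, ab, ab: four phrases.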

19.
Efficient pulse compressor for Golay complementary sequences
Budisin, S.Z., Electronics Letters, 1991, 27(3): 219-220.
The highly regular structure of binary Golay complementary sequences of length 2^N makes possible a very simple correlator for these sequences. The algorithm, called fast Golay correlation (FGC), performs the correlation of the input radar signal with the Golay sequence using only 2·log_2(M) operations per input sample, as opposed to the M operations required by standard correlators. This algorithm is also valid for complex sequences of the same length. This fast correlator and the good autocorrelation properties of Golay sequences make them the ideal choice for high speed, long sequence pulse compression. Sequence agility on a pulse to pulse basis is also possible at the expense of some added complexity.
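The regular structure in question comes from the standard recursive construction of a Golay pair, a' = a‖b and b' = a‖(-b); the resulting pair's aperiodic autocorrelations sum to zero at every nonzero shift, which is what makes the pair useful for pulse compression. This sketch shows the construction and the property, not the FGC correlator itself.

```python
def golay_pair(N):
    """Binary (+1/-1) Golay complementary pair of length 2^N."""
    a, b = [1], [1]
    for _ in range(N):
        a, b = a + b, a + [-x for x in b]   # concatenation recursion
    return a, b

def acorr(x, shift):
    """Aperiodic autocorrelation of x at a nonnegative shift."""
    return sum(x[i] * x[i + shift] for i in range(len(x) - shift))

a, b = golay_pair(3)   # length-8 pair
```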

20.
This paper introduces a new data compression algorithm. The goal underlying this new code design is to achieve a single lossless compression algorithm with the excellent compression ratios of the prediction by partial matching (PPM) algorithms and the low complexity of codes based on the Burrows-Wheeler Transform (BWT). Like the BWT-based codes, the proposed algorithm requires worst case O(n) computational complexity and memory; in contrast, the unbounded-context PPM algorithm, called PPM*, requires worst case O(n^2) computational complexity. Like PPM*, the proposed algorithm allows the use of unbounded contexts. Using standard data sets for comparison, the proposed algorithm achieves compression performance better than that of the BWT-based codes and comparable to that of PPM*. In particular, the proposed algorithm yields an average rate of 2.29 bits per character (bpc) on the Calgary corpus; this result compares favorably with the 2.33 and 2.34 bpc of PPM5 and PPM* (PPM algorithms), the 2.43 bpc of BW94 (the original BWT-based code), and the 3.64 and 2.69 bpc of compress and gzip (popular Unix compression algorithms based on Lempel-Ziv (LZ) coding techniques) on the same data set. The given code does not, however, match the best reported compression performance, 2.12 bpc with PPMZ9, listed on the Calgary corpus results web page at the time of this publication. Results on the Canterbury corpus give a similar relative standing: the proposed algorithm gives an average rate of 2.15 bpc on the Canterbury corpus, while the Canterbury corpus web page gives average rates of 1.99 bpc for PPMZ9, 2.11 bpc for PPM5, 2.15 bpc for PPM7, 2.23 bpc for BZIP2 (a popular BWT-based code), and 3.31 and 2.53 bpc for compress and gzip, respectively.
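For reference, the Burrows-Wheeler Transform underlying the BWT-based codes can be sketched in a few lines (naive O(n^2 log n) version with a sentinel; production codes use suffix arrays):

```python
def bwt(s):
    """Forward BWT: last column of the sorted rotations of s + sentinel."""
    s += '\0'                      # sentinel sorts before all other symbols
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return ''.join(r[-1] for r in rotations)

def ibwt(L):
    """Naive inverse: repeatedly prepend L as a column and re-sort."""
    table = [''] * len(L)
    for _ in range(len(L)):
        table = sorted(L[i] + table[i] for i in range(len(L)))
    row = next(r for r in table if r.endswith('\0'))
    return row.rstrip('\0')
```

The transform is invertible and tends to group identical symbols together, which is why a simple coder applied after it compresses well.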


Copyright © Beijing Qinyun Technology Development Co., Ltd.  京ICP备09084417号