Similar Articles (20 results)
1.
In this paper we focus on the second stage of the Burrows–Wheeler compression algorithm, which in the original version is the Move To Front transform. We discuss many of the replacements proposed for it so far and compare the compression results obtained with them. We then propose a new algorithm that yields a better compression ratio than the previous ones. Copyright © 2001 John Wiley & Sons, Ltd.
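
For readers unfamiliar with the stage in question, here is a minimal sketch of the Move To Front transform and its inverse (a textbook formulation over bytes; the replacements the paper surveys vary the update rule):

```python
def mtf_encode(data: bytes) -> list[int]:
    """Move To Front: emit each byte's current rank, then move it to the front."""
    table = list(range(256))
    ranks = []
    for b in data:
        r = table.index(b)
        ranks.append(r)
        table.pop(r)
        table.insert(0, b)
    return ranks

def mtf_decode(ranks: list[int]) -> bytes:
    """Inverse: rank r names the byte at position r of the evolving table."""
    table = list(range(256))
    out = bytearray()
    for r in ranks:
        b = table.pop(r)
        out.append(b)
        table.insert(0, b)
    return bytes(out)

assert mtf_decode(mtf_encode(b"bananaaa")) == b"bananaaa"
```

Because BWT output clusters identical symbols, the emitted ranks are heavily skewed toward small values, which the final entropy coder exploits.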

2.
The Burrows–Wheeler Transform (BWT) produces a permutation of a string X by sorting the n cyclic rotations of X into full lexicographical order and taking the last column of the resulting n×n matrix. The transformation is reversible in linear time. In this paper, we consider an alteration to the process, called k-BWT, where rotations are only sorted to a depth k. We propose new approaches to the forward and reverse transform, and show that the methods are efficient in practice. More than a decade ago, two algorithms were independently discovered for reversing the k-BWT, and two recent algorithms have lowered the asymptotic bounds for the reverse transformation further. We examine the practical performance of these reversal algorithms. We find that the original approach is the most efficient in practice, and we investigate new approaches, aimed at further speeding reversal, which store precomputed context boundaries in the compressed file. By explicitly encoding the context boundaries, we obtain a reversal technique that is both efficient and effective. Finally, our study elucidates an inherently cache-friendly, and hitherto unobserved, behavior in the reverse k-BWT, which could lead to new applications of the transform. In contrast to previous empirical studies, we show that the partial transform can be reversed significantly faster than the full transform, without significantly affecting compression effectiveness. Copyright © 2011 John Wiley & Sons, Ltd.
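
The forward k-BWT described above is easy to state naively; a quadratic-space reference sketch for illustration only (the paper's actual construction and reversal methods are far more efficient):

```python
def k_bwt(s: str, k: int) -> str:
    """Sort the cyclic rotations of s stably by their first k symbols and
    return the last column; with k >= len(s) this is the full BWT."""
    rotations = [s[i:] + s[:i] for i in range(len(s))]
    rotations.sort(key=lambda r: r[:k])   # stable sort keeps depth-k ties in order
    return "".join(r[-1] for r in rotations)

print(k_bwt("banana", 2))   # rotations sorted only to depth 2
print(k_bwt("banana", 6))   # full BWT: "nnbaaa"
```

The stable sort is what makes the partial transform well defined: rotations that agree on their first k symbols keep their original relative order.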

3.
Jürgen Abel 《Software》2010,40(9):751-777
The lossless Burrows–Wheeler compression algorithm has received considerable attention over recent years for both its simplicity and effectiveness. It is based on a permutation of the input sequence, the Burrows–Wheeler transformation (BWT), which groups symbols with a similar context close together. In the original version, this permutation was followed by a Move-To-Front transformation and a final entropy coding stage. Later versions used different algorithms after the BWT, since these following stages have a significant influence on the compression rate. This paper describes different algorithms and improvements for these post-BWT stages, including a new context-based approach. Compression rates are presented together with compression and decompression times on the Calgary corpus, the Canterbury corpus, the large Canterbury corpus and the Lukas 2D 16-bit medical image corpus. Copyright © 2010 John Wiley & Sons, Ltd.

4.
Jürgen Abel 《Software》2007,37(3):247-265
The stage after the Burrows–Wheeler transform (BWT) has a key function inside the Burrows–Wheeler compression algorithm, as it transforms the BWT output from a local context into a global context. This paper presents the Incremental Frequency Count stage, a post-BWT stage. The new stage is paired with a run length encoding stage between the BWT and the entropy coding stage of the algorithm. It offers throughput similar to that of a Move To Front stage and, at the same time, compression rates close to those of the strong but slow Weighted Frequency Count stage. The Incremental Frequency Count stage is compared to the Move To Front and Weighted Frequency Count stages in terms of compression rate and speed on the Calgary and large Canterbury corpora. Copyright © 2006 John Wiley & Sons, Ltd.
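
A heavily simplified frequency-count ranking stage, to convey the idea being compared (the increment and rescaling rules of Abel's actual Incremental Frequency Count stage differ; this sketch merely contrasts counting with pure recency):

```python
def freq_rank_encode(data: bytes, rescale_every: int = 256) -> list[int]:
    """Rank each byte by a running frequency count rather than recency alone;
    counts are halved periodically so the ranking tracks local context."""
    counts = [0] * 256
    order = list(range(256))                   # kept sorted by descending count
    ranks = []
    for i, b in enumerate(data):
        ranks.append(order.index(b))
        counts[b] += 1
        order.sort(key=lambda x: -counts[x])   # stable: ties keep their order
        if (i + 1) % rescale_every == 0:
            counts = [c // 2 for c in counts]  # age old statistics
    return ranks
```

The process is deterministic, so a decoder that mirrors the same updates can invert it, just as with Move To Front.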

5.
Peter Fenwick 《Software》1998,28(5):547-559
Methods used by Shannon in 1951 when investigating the information content of English text are shown to lead to a class of ‘symbol ranking’ text compressors. Included in this class are some recent compressors which combine high performance with completely novel techniques. These are described under the aegis of the symbol ranking classification. Some new compressors, deliberately designed as symbol ranking, are then described. One of these has operated at over 1 Mbyte per second in software, and should be suitable for hardware implementation at tens of megabytes per second, or higher. © 1998 John Wiley & Sons, Ltd.
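
One member of the symbol ranking class can be sketched directly: an order-1 model that keeps a recency-ordered symbol list per preceding byte and emits ranks (an illustrative toy, not one of the paper's compressors; a real coder must also handle the first-sighting escape and entropy-code the ranks):

```python
from collections import defaultdict

def symbol_rank_encode(data: bytes) -> list[int]:
    """Emit, for each byte, its rank in a recency list selected by the
    previous byte (order-1 context), then promote it to the front."""
    lists: dict[int, list[int]] = defaultdict(list)
    ranks, ctx = [], 0
    for b in data:
        lst = lists[ctx]
        if b in lst:
            r = lst.index(b)
            lst.pop(r)
        else:
            r = len(lst)        # escape: first occurrence in this context
        lst.insert(0, b)
        ranks.append(r)
        ctx = b
    return ranks
```

Rank 0 means "the same symbol as last time in this context", which for English text happens far more often than chance, the effect Shannon's guessing experiments measured.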

6.
We propose a fast and memory-efficient algorithm for lexicographically sorting the suffixes of a string, a problem that has important applications in data compression as well as string matching.
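
For comparison with such specialized algorithms, a compact prefix-doubling suffix sorter (Manber–Myers style, O(n log^2 n) with Python's sort; the paper's algorithm is considerably more time- and memory-efficient):

```python
def suffix_array(s: str) -> list[int]:
    """Sort suffix indices by ranking prefixes of length 2^j, doubling j
    until all ranks are distinct."""
    n = len(s)
    sa = list(range(n))
    if n < 2:
        return sa
    rank = [ord(c) for c in s]
    j = 1
    while True:
        key = lambda i: (rank[i], rank[i + j] if i + j < n else -1)
        sa.sort(key=key)
        new_rank = [0] * n
        for t in range(1, n):
            new_rank[sa[t]] = new_rank[sa[t - 1]] + (key(sa[t]) != key(sa[t - 1]))
        rank = new_rank
        if rank[sa[-1]] == n - 1:   # n distinct ranks: fully sorted
            break
        j *= 2
    return sa

assert suffix_array("banana") == [5, 3, 1, 0, 4, 2]
```

Sorting suffixes is the standard way to compute the BWT, which is why suffix sorting sits on the critical path of these compressors.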

7.
8.
9.
Peter Fenwick 《Software》2002,32(13):1307-1316
The final coder in Burrows–Wheeler compression is usually either an adaptive Huffman coder (for speed) or a complex of arithmetic coders for better compression. This article describes the use of conventional pre-defined variable length codes or universal codes and shows that they too can give excellent compression. The paper also describes a ‘sticky Move-to-Front’ modification which gives a useful improvement in compression for most files. Copyright © 2002 John Wiley & Sons, Ltd.
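
Universal codes of this kind are simple to implement; a sketch of the Elias gamma code for positive integers (mapping MTF ranks, which start at 0, onto n >= 1 is a detail left to the coder):

```python
def elias_gamma_encode(n: int) -> str:
    """Gamma code for n >= 1: (bitlength - 1) zeros, then n in binary."""
    assert n >= 1
    b = bin(n)[2:]
    return "0" * (len(b) - 1) + b

def elias_gamma_decode(bits: str) -> list[int]:
    """Decode a concatenation of gamma codewords."""
    out, i = [], 0
    while i < len(bits):
        z = 0
        while bits[i] == "0":   # leading zeros give the bitlength
            z += 1
            i += 1
        out.append(int(bits[i:i + z + 1], 2))
        i += z + 1
    return out

codes = "".join(elias_gamma_encode(n) for n in [1, 2, 7, 13])
assert elias_gamma_decode(codes) == [1, 2, 7, 13]
```

Short codewords for small integers match the rank distribution coming out of a Move-to-Front stage, which is why fixed codes can approach adaptive entropy coders here.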

10.
Suffix array (SA) construction is a time-and-memory bottleneck in many string processing applications. In this paper we improve the runtime of a small-space (semi-external) SA construction algorithm by Kärkkäinen (TCS, 2007) [5]. We achieve a speedup in practice of 2–4 times, without increasing memory usage. Our main contribution is a way to implement the “pointer copying” heuristic, used in less space-efficient SA construction algorithms, in a memory-efficient way.

11.
In this paper, we present a new technique for worst-case analysis of compression algorithms that are based on the Burrows–Wheeler Transform. We mainly deal with the algorithm proposed by Burrows and Wheeler in their first paper on the subject [M. Burrows, D.J. Wheeler, A block sorting lossless data compression algorithm, Technical Report 124, Digital Equipment Corporation, Palo Alto, California, 1994], called bw0. This algorithm consists of the following three essential steps: (1) obtain the Burrows–Wheeler Transform of the text; (2) convert the transform into a sequence of integers using the move-to-front algorithm; (3) encode the integers using arithmetic coding or any order-0 encoding (possibly with run-length encoding).
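
The three steps compose directly. A sketch that chains the naive BWT and MTF helpers from the earlier entries and reports the empirical order-0 entropy an ideal final coder would approach (no run-length step, and a sentinel-free toy BWT, so the figures are only indicative):

```python
import math
from collections import Counter

def bw0_cost(s: str) -> float:
    """Bits per symbol of an ideal order-0 coder applied to MTF(BWT(s))."""
    last = k_bwt(s, len(s))                     # step 1: full BWT (sketch above)
    ranks = mtf_encode(last.encode("latin-1"))  # step 2: move-to-front
    n = len(ranks)
    return -sum(c / n * math.log2(c / n) for c in Counter(ranks).values())

print(f"{bw0_cost('mississippi' * 20):.3f} bits/symbol")
```

Worst-case analyses of the kind the paper develops bound exactly this quantity against the empirical entropy of the input.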

12.
We introduce a combinatorial optimization framework that naturally induces a class of optimal word permutations with respect to a suitably defined cost function taking into account various measures of relatedness between words. The Burrows and Wheeler transform (bwt) (cf. [M. Burrows, D. Wheeler, A block sorting lossless data compression algorithm, Technical Report 124, Digital Equipment Corporation, 1994]), and its analog for labelled trees (cf. [P. Ferragina, F. Luccio, G. Manzini, S. Muthukrishnan, Structuring labeled trees for optimal succinctness, and beyond, in: Proc. of the 45th Annual IEEE Symposium on Foundations of Computer Science, 2005, pp. 198–207]), are special cases in the class. We also show that the class of optimal word permutations defined here is identical to the one identified by Ferragina et al. for compression boosting [P. Ferragina, R. Giancarlo, G. Manzini, M. Sciortino, Boosting textual compression in optimal linear time, Journal of the ACM 52 (2005) 688–713]. Therefore, they are all highly compressible. We also provide, by using techniques from Combinatorics on Words, a fast method to compute bwt without using any end-of-string symbol. We also investigate more general classes of optimal word permutations, where relatedness of symbols may be measured by functions more complex than context length. For this general problem we provide an instance that is MAX-SNP hard, and therefore unlikely to be solved or approximated efficiently. The results presented here indicate that a key feature of the Burrows and Wheeler transform seems to be, besides compressibility, the existence of efficient algorithms for its computation and inversion.

13.
14.
We present a four-stage algorithm that updates the Burrows–Wheeler Transform of a text T when this text is modified. The Burrows–Wheeler Transform is used by many text compression applications and some self-index data structures. It operates by reordering the letters of a text T to obtain a new text bwt(T) which can be better compressed.

15.
Based on a study of the lossless data compression standard recommended by the CCSDS (Consultative Committee for Space Data Systems) and a survey of compression algorithms in common use, this paper describes a lossless data compression algorithm characterized by low latency, high speed and strong error resilience: the Rice compression algorithm. It achieves compression ratios above 50% and gives satisfactory results on many types of data. The zero-block part of the algorithm is described in some detail: preprocessed data values are usually small, and images in particular contain a great many zeros, so applying zero-block coding when zeros are plentiful works very well. Software tests show that the results meet the requirements of the CCSDS standard.
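
A minimal Golomb–Rice coder conveys the core of the scheme (the CCSDS standard's adaptive parameter selection and its zero-block mode are omitted here; k is fixed and must be at least 1 in this sketch):

```python
def rice_encode(values: list[int], k: int) -> str:
    """Rice code with divisor 2**k: quotient in unary, '0' terminator,
    then the k-bit remainder. Small values get short codewords."""
    out = []
    for v in values:
        q, r = v >> k, v & ((1 << k) - 1)
        out.append("1" * q + "0" + format(r, f"0{k}b"))
    return "".join(out)

def rice_decode(bits: str, k: int) -> list[int]:
    out, i = [], 0
    while i < len(bits):
        q = 0
        while bits[i] == "1":
            q += 1
            i += 1
        i += 1                       # skip the '0' terminator
        out.append((q << k) | int(bits[i:i + k], 2))
        i += k
    return out

assert rice_decode(rice_encode([0, 3, 5, 12], k=2), k=2) == [0, 3, 5, 12]
```

After prediction-based preprocessing the residuals are mostly near zero, which is also why a dedicated mode for all-zero blocks pays off.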

16.
Lossless (also called reversible) watermarking is used mainly for integrity authentication of data such as images. One of its characteristics is that, as long as the watermarked image has not been modified, the embedded watermark can be removed from it completely, so the original image can be recovered losslessly at the receiving end. This paper proposes a new lossless watermarking algorithm based on a reversible integer transform, which embeds the authentication information by exploiting the redundancy present in the averages and differences of image pixel pairs after the integer transform. Previous methods in the literature exploit only the difference redundancy, whereas our algorithm further considers the correlation between averages and differences. By compressing the correlation bitstream obtained from both with arithmetic coding, a larger embedding capacity and a higher watermarked-image quality can be achieved.
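
The reversible integer transform at the base of such schemes can be illustrated by the pixel-pair average/difference (S-transform); the redundancy exploited above lives in these (l, h) pairs (a sketch of the transform only, not the embedding algorithm):

```python
def s_transform(x: int, y: int) -> tuple[int, int]:
    """Forward: truncated integer average and difference of a pixel pair."""
    return (x + y) // 2, x - y

def s_inverse(l: int, h: int) -> tuple[int, int]:
    """Inverse: exact despite the truncation in the average."""
    x = l + (h + 1) // 2
    return x, x - h

# exhaustive check over all 8-bit pixel pairs
for x in range(256):
    for y in range(256):
        assert s_inverse(*s_transform(x, y)) == (x, y)
```

Because the pair (l, h) carries exactly the information of (x, y), any bits freed by compressing their redundancy can be spent on the embedded authentication data.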

17.
Research and Design of Data Compression Algorithms
This paper implements, in Java, data compression programs based on statistical models, dictionary models and RLE, and reports experiments with them. The results show that no current algorithm is optimal for all data. Starting from the nature of file storage, namely that any file can in essence be converted into a sequence of digits, the paper also discusses a lossless compression algorithm based on digit sequences and gives a method for representing integers as sums of powers. The experimental results show that, in some cases, these two algorithms achieve better compression ratios than existing algorithms and can compress some files to roughly 40%–95% of their original size.
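
Of the three families covered by the experiments, RLE is the easiest to state; a minimal (count, byte) coder (the framing is illustrative, and RLE expands data with few runs, consistent with the finding that no single algorithm wins everywhere):

```python
def rle_encode(data: bytes) -> bytes:
    """Emit (count, byte) pairs; runs longer than 255 are split."""
    out, i = bytearray(), 0
    while i < len(data):
        run = 1
        while i + run < len(data) and run < 255 and data[i + run] == data[i]:
            run += 1
        out += bytes([run, data[i]])
        i += run
    return bytes(out)

def rle_decode(data: bytes) -> bytes:
    out = bytearray()
    for j in range(0, len(data), 2):
        out += bytes([data[j + 1]]) * data[j]
    return bytes(out)

assert rle_decode(rle_encode(b"aaaabbbcd")) == b"aaaabbbcd"
```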

18.
Document images, as a class of images, are used ever more widely in daily life, yet they differ both from ordinary text documents and from ordinary images: they consist mainly of character strings of different shapes carrying specific meanings, their local pixel values change sharply, and their high-frequency components are relatively rich, so conventional compression methods can hardly achieve high compression ratios on them. Common compression methods do not take these particularities into account, and their performance is therefore limited. Targeting the characteristics of document images, this paper compresses them by block matching: the whole image is partitioned according to specific rules, and the resulting blocks are classified and encoded, which removes the redundancy of the document image in two-dimensional space and yields a compression ratio far higher than that of conventional lossless methods. The paper describes the block matching algorithm and analyses its performance both theoretically and by simulation.
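
The block-classification idea can be sketched as deduplicating fixed-size tiles of a bilevel image: each distinct tile gets one codebook entry and the page is stored as a grid of indices (exact matching only; the paper classifies similar, not merely identical, blocks, and image dimensions are assumed to be multiples of the block size here):

```python
def block_dedupe(img: list[list[int]], bs: int = 8):
    """Partition a bilevel image into bs x bs tiles and map each distinct
    tile to a codebook index; returns (codebook, index grid)."""
    codebook: dict[tuple, int] = {}
    grid = []
    for r in range(0, len(img), bs):
        row = []
        for c in range(0, len(img[0]), bs):
            tile = tuple(tuple(img[r + dr][c + dc] for dc in range(bs))
                         for dr in range(bs))
            row.append(codebook.setdefault(tile, len(codebook)))
        grid.append(row)
    return list(codebook), grid
```

Repeated characters collapse onto shared codebook entries, which is where the two-dimensional redundancy of a printed page goes.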

19.
Lossless Image Compression Combining DPCM with the Integer Wavelet Transform
This paper discusses a method that combines a DPCM transform with the integer wavelet transform to achieve lossless image compression. In the proposed scheme, the image is first predicted by DPCM, the difference image is then passed through an integer wavelet transform, and the coefficients are compressed with the lossless SPIHT coding algorithm; applying the corresponding inverse transforms reconstructs the image without any distortion. The method is simple to understand and convenient to implement in hardware. Simulation results show that it is an effective method for lossless image compression.
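
The DPCM step of the pipeline is simple to state: predict each pixel from its neighbour and keep the residual. A row-wise previous-pixel sketch with its exact inverse (the paper's predictor and the subsequent integer wavelet and SPIHT stages are not reproduced):

```python
def dpcm_encode(row: list[int]) -> list[int]:
    """Keep the first pixel, then differences from the previous pixel."""
    return [row[0]] + [row[i] - row[i - 1] for i in range(1, len(row))]

def dpcm_decode(res: list[int]) -> list[int]:
    out = [res[0]]
    for d in res[1:]:
        out.append(out[-1] + d)
    return out

row = [100, 101, 103, 103, 98]
assert dpcm_decode(dpcm_encode(row)) == row
```

The residuals cluster near zero, which is what makes the following integer wavelet transform and lossless SPIHT coding effective.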

20.