期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Balancing and clustering of words in the Burrows-Wheeler transform

Antonio Restivo Giovanna Rosone 《Theoretical computer science》2011,412(27):3019-3032

Compression algorithms based on Burrows-Wheeler transform (BWT) take advantage of the fact that the word output of BWT shows a local similarity and then turns out to be highly compressible. The aim of the present paper is to study such “clustering effect” by using notions and methods from Combinatorics on Words.The notion of balance of a word plays a central role in our investigation. Empirical observations suggest that balance is actually the combinatorial property of input word that ensure optimal BWT compression. Moreover, it is reasonable to assume that the more balanced the input word is, the more local similarity we have after BWT (and therefore the better the compression is). This hypothesis is here corroborated by experiments on “real” text, by using local entropy as a measure of the degree of balance of a word.In the setting of Combinatorics on Words, a sound confirmation of previous hypothesis is given by a result of Mantaci et al. (2003) [27], which states that, in the case of a binary alphabet, there is an equivalence between circularly balanced words, words having a clusterized BWT, and the conjugates of standard words. In the case of alphabets of size greater than two, there is no more equivalence. The last section of the present paper is devoted to investigate the relationships between these notions, and other related ones (as, for instance, palindromic richness) in the case of a general alphabet. 相似文献

2.

Burrows-Wheeler transform and Sturmian words

S. Mantaci M. Sciortino 《Information Processing Letters》2003,86(5):241-246

相似文献

3.

On the possible patterns of inputs for block sorting in the Burrows-Wheeler transformation

Takashi Saso Atsuyoshi Nakamura 《Information Processing Letters》2011,111(12):595-599

Block sorting in the Burrows-Wheeler transformation is to sort all of the n circular shifts of a string of length n lexicographically. We introduce a notion called the width of a sequence of n strings of length n and show that the values of widths are very different between the two types of sequences of strings; (1) a sequence of n randomly generated strings of length n, and (2) the sequence of n circular shifts of a randomly generated string of length n. 相似文献

4.

PPM with the extended alphabet

Przemys?aw Skibiński 《Information Sciences》2006,176(7):861-874

In the following paper we propose modification of Prediction by Partial Matching (PPM)—a lossless data compression algorithm, which extends an alphabet, used in the PPM method, to long repeated strings. Usually the PPM algorithm’s alphabet consists of 256 characters only. We show, on the basis of the Calgary corpus [T.C. Bell, J. Cleary, I.H. Witten, Text compression. Advanced Reference Series, Prentice Hall, Englewood Cliffs, New Jersey, 1990], that for ordinary files such a modification improves the compression performance in lower, but not greater than 10, orders. However, for some kind of files, this modification gives much better compression performance than any known lossless data compression algorithm. 相似文献

5.

Arto Niemi Jukka Teuhola 《Software》2020,50(9):1858-1874

Lossless compression methods based on the Burrows-Wheeler transform (BWT) are regarded as an excellent compromise between speed and compression efficiency: they provide compression rates close to the PPM algorithms, with the speed of dictionary-based methods. Instead of the laborious statistics-gathering process used in PPM, the BWT reversibly sorts the input symbols, using as the sort key as many following characters as necessary to make the sort unique. Characters occurring in similar contexts are sorted close together, resulting in a clustered symbol sequence. Run-length encoding and Move-to-Front (MTF) recoding, combined with a statistical Huffman or arithmetic coder, is then typically used to exploit the clustering. A drawback of the MTF recoding is that knowledge of the character that produced the MTF number is lost. In this paper, we present a new, competitive Burrows-Wheeler posttransform stage that takes advantage of interpolative coding—a fast binary encoding method for integer sequences, being able to exploit clusters without requiring explicit statistics. We introduce a fast and simple way to retain knowledge of the run characters during the MTF recoding and use this to improve the clustering of MTF numbers and run-lengths by applying reversible, stable sorting, with the run characters as sort keys, achieving significant improvement in the compression rate, as shown here by experiments on common text corpora. 相似文献

6.

Sending compressed messages to a learned receiver on a bidirectional line

Bruno Carpentieri 《Information Processing Letters》2002,83(2):63-70

In this paper we present communication protocols that allow a “learned” sender and a “learned” receiver to communicate on a bidirectional line, compressing files with (static or dynamic) dictionaries initially built independently by the sender and the receiver from the examples of the source output they have available. We show that this leads to an improvement in compression paid with a negligible or almost null possibility of communication errors. 相似文献

7.

A linear-time algorithm for Euclidean feature transform sets

Wim H. Hesselink 《Information Processing Letters》2007,102(5):181-186

The Euclidean distance transform of a binary image is the function that assigns to every pixel the Euclidean distance to the background. The Euclidean feature transform is the function that assigns to every pixel the set of background pixels with this distance. We present an algorithm to compute the exact Euclidean feature transform sets in linear time. The algorithm is applicable in arbitrary dimensions. 相似文献

8.

Perceptually adaptive spread transform image watermarking scheme using Hadamard transform

Santi P. Maity Malay K. Kundu 《Information Sciences》2011,181(3):450-465

The present paper proposes a digital image watermarking scheme using the characteristics of the human visual system (HVS), spread transform technique and statistical information measure. Spread transform (ST) scheme is implemented using the transform coefficients of both the host and the watermark signal. Watermark embedding strength is adaptively adjusted using frequency sensitivity, luminance, contrast and entropy masking of HVS model. The choice of Hadamard transform as watermark embedding domain offers several advantages, such as low loss in image information (higher image fidelity), greater reliability of watermark detection and higher data hiding capacity at high degree of compression. Performance of the proposed method is compared with a number of recently reported watermarking schemes based on spread spectrum (SS) and quantization index modulation (QIM). 相似文献

9.

Faster index for property matching

Costas S. Iliopoulos 《Information Processing Letters》2008,105(6):218-223

In this paper, we revisit the Property Matching problem studied by Amir et al. [Property Matching and Weighted Matching, CPM 2006] and present a better indexing scheme for the problem. In particular, the data structure by Amir et al., namely PST, requires O(nlog|Σ|+nloglogn) construction time and O(mlog|Σ|+K) query time, where n and m are the length of, respectively, the text and the pattern, Σ is the alphabet and K is the output size. On the other hand, the construction time of our data structure, namely IDS_PIP, is dominated by the suffix tree construction time and hence is O(n) time for alphabets that are natural numbers from 1 to a polynomial in n and O(nlogσ) time otherwise, where σ=min(n,|Σ|). The query time is same as that of PST. Also, IDS_PIP has the advantage that it can be built on either a suffix tree or a suffix array and additionally, it retains the capability of answering normal pattern matching queries. 相似文献

10.

Optimal prefix and suffix queries on texts

Maxime Crochemore Costas S. Iliopoulos M. Sohel Rahman 《Information Processing Letters》2008,108(5):320-325

In this paper, we study a restricted version of the position restricted pattern matching problem introduced and studied by Mäkinen and Navarro [V. Mäkinen, G. Navarro, Position-restricted substring searching, in: J.R. Correa, A. Hevia, M.A. Kiwi (Eds.), LATIN, in: Lecture Notes in Computer Science, vol. 3887, Springer, 2006, pp. 703-714]. In the problem handled in this paper, we are interested in those occurrences of the pattern that lies in a suffix or in a prefix of the given text. We achieve optimal query time for our problem against a data structure which is an extension of the classic suffix tree data structure. The time and space complexity of the data structure is dominated by that of the suffix tree. Notably, the (best) algorithm by Mäkinen and Navarro, if applied to our problem, gives sub-optimal query time and the corresponding data structure also requires more time and space. 相似文献

11.

A new algorithm for the embedding of a prediction mechanism into the JPEG2000 coding chain

Marco Aguzzi Maria Grazia Albanesi Marco Martelli 《Pattern recognition》2006,39(8):1492-1508

In this paper, we propose the embedding of a prediction mechanism into a part of the coding structure of JPEG2000 image compression standard, in order to reduce the amount of bits sent to the arithmetic coder, without any significant changes into the standard architecture and without loosing performance. The prediction is based upon an innovative processing of the data structures used by the standard JPEG2000 in progressive coding and the addition of a Prediction Matrix, whose computation does not add any overhead at the decoder side. Experiments are performed to test the efficacy of the prediction mechanism, and results are compared to the standard JPEG2000 and other similar approaches. Tests are documented over a set of well-known images from literature, also against different kinds of added noise. Performance, in terms of saved bits are reported, and a new figure of merit is defined to test the efficiency of Prediction. The results prove that the new proposal overcomes the standard and other related approaches for the entire set of referenced images, with significant gain in synthetic images, also in presence of noise. 相似文献

12.

AVS标准中整数DCT变换快速算法的硬件设计

蔡海涛刘荣科《电子技术应用》2007,33(12):68-71

介绍了AVS标准中整数DCT变换矩阵的化简方法,该方法提高了一维整数DCT变换硬件实现的速度。基于此一维整数DCT变换,采用模块复用和流水线设计,实现了二维整数DCT直接变换在一个时钟周期内完成,工作频率可达160MHz。仿真结果证实了该算法的有效性。相似文献

13.

SPIHT算法新的上下文建模方法

王振华田金文柳健《计算机工程与应用》2005,41(21):74-76

SPIHT算法不仅具有递进的传输能力,而且是一种效率较高的静止图像压缩方法。针对SPIHT编码过程中熵编码简单的上下文建模过程,提出了一种改进的方法。试验结果表明在多级子带分解后,适当的上下文建模可以进一步有效地提高SPIHT算法的性能,在相同输出码率的条件下,改进的SPIHT算法比原算法峰值信噪比提高将近0.2dB。相似文献

14.

小波分析和偏最小二乘法相结合用于示波计时电位同时测定铅和铊中的研究

李华张书玲申琦高鸿《计算机与应用化学》2003,(Z1)

提出了基于小波分析和偏最小二乘(Partial Least Squares,PLS)基础上的化学计量学方法用于示波计时电位同时测定铅和铊的研究。利用小波变换可方便地从dE/dt-E信号中滤噪,提取与去极剂浓度变化有关的信号,获得利于多组分测定的示波图。该方法为示波过程分析奠定了一定的基础。相似文献

15.

Context-based embedded image compression using binary wavelet transform

Hong Pan Li-Zuo JinXiao-Hui Yuan Si-Yu XiaLiang-Zheng Xia 《Image and vision computing》2010

Binary wavelet transform (BWT) has several distinct advantages over the real wavelet transform (RWT), such as the conservation of alphabet size of wavelet coefficients, no quantization introduced during the transform and the simple Boolean operations involved. Thus, less coding passes are engaged and no sign bits are required in the compression of transformed coefficients. However, the use of BWT for the embedded grayscale image compression is not well established. This paper proposes a novel Context-based Binary Wavelet Transform Coding approach (CBWTC) that combines the BWT with a high-order context-based arithmetic coding scheme to embedded compression of grayscale images. In our CBWTC algorithm, BWT is applied to decorrelate the linear correlations among image coefficients without expansion of the alphabet size of symbols. In order to match up with the CBWTC algorithm, we employ the gray code representation (GCR) to remove the statistical dependencies among bi-level bitplane images and develop a combined arithmetic coding scheme. In the proposed combined arithmetic coding scheme, three highpass BWT coefficients at the same location are combined to form an octave symbol and then encoded with a ternary arithmetic coder. In this way, the compression performance of our CBWTC algorithm is improved in that it not only alleviate the degradation of predictability caused by the BWT, but also eliminate the correlation of BWT coefficients in the same level subbands. The conditional context of the CBWTC is properly modeled by exploiting the characteristics of the BWT as well as taking advantages of non-causal adaptive context modeling. Experimental results show that the average coding performance of the CBWTC is superior to that of the state-of-the-art grayscale image coders, and always outperforms the JBIG2 algorithm and other BWT-based binary coding technique for a set of test images with different characteristics and resolutions. 相似文献

16.

Moving-window discrete Fourier transform

Vladimir Ya. Krakovsky 《Journal of Real-Time Image Processing》2006,1(2):153-161

The moving-window discrete Fourier transform (MWDFT) is a dynamic spectrum analysis in which the next analysis interval differs from the previous one by including the next signal sample and excluding the first one from the previous analysis interval (Dillard in IEEE Trans Inform Theory 13:2–6, 1967, Comput Elect Eng 1:143–152, 1973, USA Patent 4023028, May 10, 1977). Such a spectrum analysis is necessary for time–frequency localization of analyzed signals with given peculiarities (Tolimieri and An in Time–frequency representations. Birkhauser, Basel, 1998). Using the well-known fast Fourier transform (FFT) towards this aim is not effective. Recursive algorithms which use only one complex multiplication for computing one spectrum sample during each analysis interval are more effective. The author improved one algorithm so that it is possible to use only one complex multiplication for computing two, four, and even eight (for complex signals) spectrum samples simultaneously. Problems of realization and application of the MWDFT are also considered in the paper. 相似文献

17.

用FPGA实现(5,3)小波变换 总被引：1，自引：0，他引：1

马伍新崔占忠代方震《计算机工程与设计》2003,24(1):47-49

小波变换是当代信号处理技术常用的数学方法之一,它在视频压缩方面也有很好的前景,根据Daubechies等人于1996年提出的整型小波变换方法,用FPGA设计并实现了4层整数（5,3）小波变换,硬件实现结果与软件实现结果基本一致。相似文献

18.

Fast 2D discrete cosine transform on compressed image in restricted quadtree and shading format

Kuo-Liang Chung Wen-Ming Yan 《Information Processing Letters》2002,81(1):13-21

Given a compressed image in the restricted quadtree and shading format, this paper presents a fast algorithm for computing 2D discrete cosine transform (DCT) on the compressed grey image directly without the need to decompress the compressed image. The proposed new DCT algorithm takes O(K²logK+N²) time where the decompressed image is of size N×N and K denotes the number of nodes in the restricted quadtree. Since commonly K<N, the proposed algorithm is faster than the indirect method by decompressing the compressed image first, then applying the conventional DCT algorithm on the decompressed image. The indirect method takes O(N²logN) time. 相似文献

19.

基于物联网的分布式通信数据高效压缩仿真

万川梅朱参世唐宏《计算机仿真》2020,(6):410-415

针对物联网中节点能量是有限的,无法长时间支持通信网络的大量数据传输,导致传统数据压缩方法存在整合数据失误率高、数据压缩率低以及传感节点能量消耗快等问题,提出一种基于物联网的分布式通信数据高效压缩方法.通过物联网分布式结构分析边界效应,采用小波函数对不同节点传送的数据进行数据变换,根据Mallat算法,计算因数据变换而得... 相似文献

20.

Edit distance of run-length encoded strings

Ora Arbell Gad M. Landau Joseph S.B. Mitchell 《Information Processing Letters》2002,83(6):307-314

Let X and Y be two run-length encoded strings, of encoded lengths k and l, respectively. We present a simple O(|X|l+|Y|k) time algorithm that computes their edit distance. 相似文献