Similar Documents
10 similar documents found (search time: 0 ms)
1.
Sebastian Deorowicz, Software, 2000, 30(13): 1465-1483
In 1994, Burrows and Wheeler presented a new algorithm for lossless data compression. The compression ratio achievable with their algorithm is comparable to that of the best known algorithms, while its complexity is relatively low. In this paper we explain the internals of this algorithm and discuss the various modifications of it that have been presented so far. We then propose new improvements to its effectiveness, which allow us to obtain a compression ratio of 2.271 bpc on the Calgary Corpus files, the best result in the class of Burrows–Wheeler-transform-based algorithms. Copyright © 2000 John Wiley & Sons, Ltd.
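
For readers new to the transform, a minimal textbook sketch of the forward and inverse BWT follows (this is an illustration, not Deorowicz's optimized implementation; the appended NUL sentinel is an assumption made here so the naive inverse is unambiguous):

```python
# Minimal Burrows-Wheeler transform: sort all cyclic rotations of the
# input and take the last column of the sorted rotation matrix.

def bwt(s: str) -> str:
    s += "\0"                                   # unique smallest sentinel
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rot[-1] for rot in rotations)

def ibwt(last: str) -> str:
    # Invert by repeatedly prepending the last column and re-sorting;
    # quadratic, but it makes the reversibility of the transform obvious.
    table = [""] * len(last)
    for _ in range(len(last)):
        table = sorted(last[i] + table[i] for i in range(len(last)))
    original = next(row for row in table if row.endswith("\0"))
    return original.rstrip("\0")

assert ibwt(bwt("banana")) == "banana"
```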

2.
Jürgen Abel, Software, 2010, 40(9): 751-777
The lossless Burrows–Wheeler compression algorithm has received considerable attention in recent years for both its simplicity and its effectiveness. It is based on a permutation of the input sequence, the Burrows–Wheeler transformation (BWT), which groups symbols with a similar context close together. In the original version, this permutation was followed by a Move-To-Front transformation and a final entropy coding stage. Later versions used different algorithms after the BWT, since these stages have a significant influence on the compression rate. This paper describes different algorithms and improvements for these post-BWT stages, including a new context-based approach. Compression rates are presented together with compression and decompression times on the Calgary corpus, the Canterbury corpus, the large Canterbury corpus and the Lukas 2D 16-bit medical image corpus. Copyright © 2010 John Wiley & Sons, Ltd.
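
The Move-To-Front stage mentioned above is simple enough to sketch directly: after the BWT, runs of locally frequent symbols become runs of small indices, which the entropy coder compresses well (an illustrative version over a byte alphabet, not Abel's tuned implementation):

```python
# Move-To-Front coding: emit each symbol's current rank in a
# recency-ordered alphabet, then move the symbol to the front.

def mtf_encode(data: bytes) -> list[int]:
    alphabet = list(range(256))
    out = []
    for b in data:
        i = alphabet.index(b)                  # current rank of the symbol
        out.append(i)
        alphabet.insert(0, alphabet.pop(i))    # move it to the front
    return out

def mtf_decode(ranks: list[int]) -> bytes:
    alphabet = list(range(256))
    out = bytearray()
    for i in ranks:
        b = alphabet.pop(i)
        out.append(b)
        alphabet.insert(0, b)
    return bytes(out)

assert mtf_decode(mtf_encode(b"aaabbbab")) == b"aaabbbab"
```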

3.
Jürgen Abel, Software, 2007, 37(3): 247-265
The stage after the Burrows–Wheeler transform (BWT) has a key function inside the Burrows–Wheeler compression algorithm, as it transforms the BWT output from a local context into a global context. This paper presents the Incremental Frequency Count stage, a post-BWT stage. The new stage is paired with a run-length encoding stage between the BWT and the entropy coding stage of the algorithm. It offers high throughput, similar to a Move-To-Front stage, and at the same time good compression rates, like the strong but slow Weighted Frequency Count stage. The properties of the Incremental Frequency Count stage are compared with those of the Move-To-Front and Weighted Frequency Count stages in terms of compression rate and speed on the Calgary and large Canterbury corpora. Copyright © 2006 John Wiley & Sons, Ltd.
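
Abel's exact Incremental Frequency Count update rules are not given in this abstract, so the sketch below is only a generic frequency-count ranking stage in the same spirit as WFC/IFC; the counter update and the rescaling period are assumptions chosen for illustration:

```python
# Hypothetical frequency-count ranking stage (NOT Abel's exact IFC):
# each symbol is coded by its rank in a list kept ordered by a running
# frequency counter, so currently frequent symbols get small indices.

def freq_count_encode(data: bytes, halve_every: int = 4096) -> list[int]:
    order = list(range(256))        # symbols, most frequent first
    count = [0] * 256
    out = []
    for n, b in enumerate(data, 1):
        rank = order.index(b)
        out.append(rank)
        count[b] += 1
        # bubble the symbol toward the front while its count dominates
        while rank > 0 and count[order[rank - 1]] < count[b]:
            order[rank - 1], order[rank] = order[rank], order[rank - 1]
            rank -= 1
        if n % halve_every == 0:    # age old statistics (assumed period)
            count = [c // 2 for c in count]
    return out
```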

4.
Peter Fenwick, Software, 2002, 32(13): 1307-1316
The final coder in Burrows–Wheeler compression is usually either an adaptive Huffman coder (for speed) or a combination of arithmetic coders (for better compression). This article describes the use of conventional pre-defined variable-length codes or universal codes, and shows that they too can give excellent compression. The paper also describes a 'sticky Move-to-Front' modification which gives a useful improvement in compression for most files. Copyright © 2002 John Wiley & Sons, Ltd.
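
As one example of a pre-defined universal code of the kind evaluated here, the Elias gamma code writes a positive integer n as floor(log2 n) zero bits followed by n in binary (whether Fenwick's paper uses gamma codes specifically is not stated in this abstract):

```python
# Elias gamma code: a classic pre-defined universal code. No adaptive
# statistics are needed; code lengths are fixed in advance by the value.

def elias_gamma(n: int) -> str:
    assert n >= 1
    binary = bin(n)[2:]                        # e.g. 9 -> '1001'
    return "0" * (len(binary) - 1) + binary    # 9 -> '0001001'

def elias_gamma_decode(bits: str) -> int:
    zeros = len(bits) - len(bits.lstrip("0"))  # count leading zeros
    return int(bits[zeros : 2 * zeros + 1], 2)

assert elias_gamma(9) == "0001001"
assert elias_gamma_decode("0001001") == 9
```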

5.
The Burrows–Wheeler Transform (BWT) produces a permutation of a string X by sorting the n cyclic rotations of X into full lexicographical order and taking the last column of the resulting n×n matrix. The transformation is reversible in linear time. In this paper, we consider an alteration to the process, called the k-BWT, where rotations are only sorted to a depth k. We propose new approaches to the forward and reverse transform, and show that the methods are efficient in practice. More than a decade ago, two algorithms were independently discovered for reversing the k-BWT; two recent algorithms have since lowered the asymptotic bounds for the reverse transformation. We examine the practical performance of these reversal algorithms and find that the original approach is the most efficient in practice. We then investigate new approaches, aimed at further speeding up reversal, which store precomputed context boundaries in the compressed file. By explicitly encoding the context boundaries, we present a reversal technique that is both efficient and effective. Finally, our study elucidates an inherently cache-friendly, and hitherto unobserved, behavior in the reverse k-BWT, which could lead to new applications of the transform. In contrast to previous empirical studies, we show that the partial transform can be reversed significantly faster than the full transform, without significantly affecting compression effectiveness. Copyright © 2011 John Wiley & Sons, Ltd.
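
A minimal sketch of the forward k-BWT follows. It assumes the common convention that ties between equal k-length prefixes are broken by the original rotation order, which a stable sort provides; the sentinel is again an illustrative assumption:

```python
# k-BWT: rotations are sorted only on their first k symbols. Python's
# sort is stable, so rotations with equal k-prefixes keep their
# original relative order, which is what makes the transform reversible.

def k_bwt(s: str, k: int) -> str:
    s += "\0"
    rotations = [s[i:] + s[:i] for i in range(len(s))]
    rotations.sort(key=lambda rot: rot[:k])    # depth-limited comparison
    return "".join(rot[-1] for rot in rotations)

# k = len(s) gives the full BWT; small k trades compression for speed.
print(k_bwt("banana", 2))
```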

6.
Based on a study of the lossless data compression standard recommended by the CCSDS (Consultative Committee for Space Data Systems) and a survey of compression algorithms in common use, this paper describes a lossless data compression algorithm with low latency, high speed and strong error resilience: the Rice compression algorithm. It achieves compression ratios above 50% and gives satisfactory results on many types of data. The zero-block part of the algorithm is described in some detail: preprocessed data values are usually small and, for images, contain a large number of zeros, so applying zero-block compression in such cases works very well. Software tests show that the results meet the requirements of the CCSDS standard.
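
A minimal sketch of the Rice code at the core of the algorithm: each (preprocessed, hence small) sample n is split as n = q·2^k + r and coded as q in unary plus r in k bits. The zero-block option additionally run-length codes all-zero blocks; the parameter choice below is illustrative, not the CCSDS packet format:

```python
# Rice (Golomb power-of-two) code: unary quotient, then k remainder bits.

def rice_encode(n: int, k: int) -> str:
    q, r = divmod(n, 1 << k)
    return "1" * q + "0" + format(r, f"0{k}b")

def rice_decode(bits: str, k: int) -> int:
    q = bits.index("0")                   # length of the unary quotient
    r = int(bits[q + 1 : q + 1 + k], 2)   # k-bit remainder
    return (q << k) + r

assert rice_encode(19, 3) == "110011"
assert rice_decode("110011", 3) == 19
```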

7.
In this paper we present two versions of a parallel algorithm to solve the block-Toeplitz least-squares problem on distributed-memory architectures. We derive a parallel algorithm based on the seminormal equations arising from the triangular decomposition of the product TᵀT. Our parallel algorithm exploits the displacement structure of the Toeplitz-like matrices using the Generalized Schur Algorithm to obtain the solution in O(mn) flops instead of the O(mn²) flops required by algorithms for unstructured matrices. The strong regularity of this product of matrices and an appropriate computation of the hyperbolic rotations improve the stability of the algorithms. We have reduced the communication cost of previous versions, and have also reduced the memory-access cost by appropriately arranging the elements of the matrices. Furthermore, the second version of the algorithm has a very low spatial cost, because it does not store the triangular factor of the decomposition. The experimental results show good scalability of the parallel algorithm on two different clusters of personal computers. Copyright © 2005 John Wiley & Sons, Ltd.
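
The seminormal-equations idea can be sketched in a few lines. Note that the dense Cholesky factorization below costs O(n³) and merely stands in for the Generalized Schur Algorithm, which the paper uses to obtain the triangular factor of TᵀT in O(mn) flops by exploiting displacement structure:

```python
# Seminormal equations for a Toeplitz least-squares problem min ||Tx - b||:
# factor T^T T = R^T R, then solve two triangular systems.
import numpy as np
from scipy.linalg import toeplitz, solve_triangular

m, n = 200, 50
T = toeplitz(np.random.rand(m), np.random.rand(n))  # m-by-n Toeplitz matrix
b = np.random.rand(m)

R = np.linalg.cholesky(T.T @ T).T       # upper triangular, T^T T = R^T R
y = solve_triangular(R.T, T.T @ b, lower=True)
x = solve_triangular(R, y)              # solves R^T R x = T^T b

# x satisfies the normal equations T^T (T x - b) = 0 up to roundoff.
print(np.allclose(T.T @ (T @ x - b), 0, atol=1e-6))
```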

8.
Uintah is a software framework that provides an environment for solving fluid–structure interaction problems on structured adaptive grids for large-scale science and engineering problems involving the solution of partial differential equations. Uintah uses a combination of fluid-flow solvers and particle-based methods for solids, together with adaptive meshing and a novel asynchronous task-based approach with fully automated load balancing. When applying Uintah to fluid–structure interaction problems, the combination of adaptive meshing and the movement of structures through space presents a formidable challenge to achieving scalability on large-scale parallel computers. Uintah's response to growing core counts per socket, together with the prospect of less memory per core, is to adopt a model that uses MPI to communicate between nodes and a shared-memory model on-node, so as to achieve scalability on large-scale systems. For this approach to be successful, it is necessary to design data structures that large numbers of cores can access simultaneously without contention. This scalability challenge is addressed here for Uintah by the development of new hybrid runtime and scheduling algorithms combined with novel lock-free data structures, making it possible for Uintah to achieve excellent scalability for a challenging fluid–structure problem with mesh refinement on as many as 260K cores. Copyright © 2013 John Wiley & Sons, Ltd.

9.
Many studies have recently been conducted on the evaluation of system performance with a two-stage network structure in the data envelopment analysis (DEA) literature. One topic of interest to researchers has been the incorporation of undesirable products and nondiscretionary factors into the corresponding production possibility set (PPS) and their impact on overall efficiency calculations. Determining which decision-making units (DMUs) have Pareto–Koopmans efficiency status is decisive in identifying benchmark units, and the calculated overall efficiency status is compromised when both undesirable products and nondiscretionary factors are present. This work takes an axiomatic approach. A novel PPS for a two-stage network in the presence of undesirable intermediate products and nondiscretionary exogenous inputs is introduced. Based on this PPS, and by focusing on the principle of mathematical dominance, new models for evaluating overall and divisional efficiencies are presented. In addition, a two-step network DEA approach yields a necessary and sufficient condition for detecting DMUs with Pareto–Koopmans efficiency status, and a two-step algorithm provides a novel technique for determining overall efficiency conditions. Finally, the proposed technology is applied to a practical example, and the outcomes are discussed.
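
For readers unfamiliar with DEA, the sketch below solves the standard input-oriented CCR envelopment model for each DMU. It is a single-stage model with desirable outputs only, deliberately much simpler than the paper's two-stage network PPS with undesirable intermediates, and the data are made up:

```python
# Basic CCR DEA model: minimize theta subject to
#   sum_j lambda_j * x_j <= theta * x_o   (inputs)
#   sum_j lambda_j * y_j >= y_o           (outputs), lambda >= 0.
import numpy as np
from scipy.optimize import linprog

X = np.array([[2.0, 3.0], [4.0, 2.0], [3.0, 5.0]])  # inputs, one row per DMU
Y = np.array([[1.0], [2.0], [1.5]])                 # outputs

def ccr_efficiency(o: int) -> float:
    n, m = X.shape
    c = np.r_[1.0, np.zeros(n)]                     # variables: [theta, lambda]
    A_in = np.c_[-X[o], X.T]                        # X^T lam - theta x_o <= 0
    A_out = np.c_[np.zeros((Y.shape[1], 1)), -Y.T]  # -Y^T lam <= -y_o
    A_ub = np.vstack([A_in, A_out])
    b_ub = np.r_[np.zeros(m), -Y[o]]
    bounds = [(None, None)] + [(0, None)] * n       # theta free, lambda >= 0
    return linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds).fun

for o in range(3):
    print(f"DMU {o}: efficiency = {ccr_efficiency(o):.3f}")
```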

10.
The most efficient way to parallelize a computation is to build and evaluate the task graph constrained only by the data dependencies between the tasks. Both Intel's C++ Concurrent Collections (CnC) and Threading Building Blocks (TBB) libraries allow such task-based parallel programming. CnC also adopts the macro data-flow model by providing only single-assignment data objects in its global data space. Although CnC makes parallel programming easier by specifying data-flow dependencies only through single-assignment data objects, its macro data-flow model incurs overhead. Intel's C++ CnC library is implemented on top of its C++ TBB library, so we can measure the overhead of CnC by comparing its performance with that of TBB. In this paper, we analyze all three types of data dependencies in the tiled in-place Gauss–Jordan elimination algorithm for the first time. We implement the task-based parallel tiled Gauss–Jordan algorithm in TBB using the data dependencies analyzed, and compare its performance with that of the CnC implementation. We find that the overhead of CnC over TBB is only 12-15% of the TBB time, and that CnC can deliver as much as 87-89% of the TBB performance for Gauss–Jordan elimination when using the optimal tile size. Copyright © 2012 John Wiley & Sons, Ltd.
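
For reference, the sketch below is a plain sequential Gauss–Jordan inversion. In the tiled, task-parallel version studied in the paper, the matrix is partitioned into tiles and each update step becomes a task whose input tiles (pivot tile, pivot row/column tiles, trailing tiles) define the dependencies of the task graph; this untiled version is for illustration only:

```python
# Sequential Gauss-Jordan inversion via the augmented matrix [A | I].
import numpy as np

def gauss_jordan_inverse(A: np.ndarray) -> np.ndarray:
    n = A.shape[0]
    M = np.hstack([A.astype(float), np.eye(n)])
    for k in range(n):
        p = k + np.argmax(np.abs(M[k:, k]))   # partial pivoting
        M[[k, p]] = M[[p, k]]                 # swap pivot row into place
        M[k] /= M[k, k]                       # scale pivot row
        for i in range(n):                    # eliminate column k everywhere
            if i != k:
                M[i] -= M[i, k] * M[k]
    return M[:, n:]

A = np.random.rand(5, 5) + 5 * np.eye(5)      # well-conditioned test matrix
print(np.allclose(A @ gauss_jordan_inverse(A), np.eye(5)))
```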
