Similar Documents

20 similar documents found.
1.
Homomorphisms between information systems, as tools for studying communication between information systems, are based on data compression: a large, complex information system is compressed into a relatively small-scale information system by means of homomorphisms, under the premise that some of the same data structures are maintained. This paper is devoted to obtaining invariant and inverse-invariant characteristics of information systems under homomorphisms based on data compression. Saturation reductions of an information system are first proposed. Then, relationships between reductions and saturation reductions of the same information system are given. Next, information structures in information systems are further investigated. Finally, it is proved that coordinate sets, necessary attributes, necessary s-attributes, saturated coordinate sets, saturation reductions, and certain dependence and independence relations between information structures in an information system are both invariant and inverse invariant under homomorphisms based on data compression; in other words, some of the same data structures of an information system are preserved. Moreover, an example is used to illustrate that information granulation, rough entropy, information entropy and information amount are neither invariant nor inverse invariant under homomorphisms based on data compression.

2.
Bitmap indexes are commonly used in databases and search engines. By exploiting bit‐level parallelism, they can significantly accelerate queries. However, they can use much memory, and thus, we might prefer compressed bitmap indexes. Following Oracle's lead, bitmaps are often compressed using run‐length encoding (RLE). Building on prior work, we introduce the Roaring compressed bitmap format: it uses packed arrays for compression instead of RLE. We compare it to two high‐performance RLE‐based bitmap encoding techniques: Word Aligned Hybrid compression scheme and Compressed ‘n’ Composable Integer Set. On synthetic and real data, we find that Roaring bitmaps (1) often compress significantly better (e.g., 2×) and (2) are faster than the compressed alternatives (up to 900× faster for intersections). Our results challenge the view that RLE‐based bitmap compression is best. Copyright © 2015 John Wiley & Sons, Ltd.
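The container idea behind Roaring can be sketched as follows. This is an illustrative sketch, not the real library: the actual format adds run containers, serialized layouts, and vectorized container-pair intersections, which is where the reported speedups come from.

```python
# Illustrative sketch of the Roaring container principle, NOT the real library:
# the 32-bit universe is split by the high 16 bits; each chunk stores its low
# 16 bits as a sorted packed array while sparse, or as a bitmap once dense.
import bisect

ARRAY_MAX = 4096  # beyond this cardinality a bitmap is smaller than the array

class RoaringSketch:
    def __init__(self, values=()):
        self.containers = {}  # high 16 bits -> sorted list (sparse) or int bitmap
        for v in values:
            self.add(v)

    def add(self, v):
        hi, lo = v >> 16, v & 0xFFFF
        c = self.containers.get(hi, [])
        if isinstance(c, list):
            i = bisect.bisect_left(c, lo)
            if i == len(c) or c[i] != lo:
                c.insert(i, lo)
            if len(c) > ARRAY_MAX:          # convert dense array -> bitmap
                bitmap = 0
                for x in c:
                    bitmap |= 1 << x
                c = bitmap
        else:
            c |= 1 << lo
        self.containers[hi] = c

    def __contains__(self, v):
        hi, lo = v >> 16, v & 0xFFFF
        c = self.containers.get(hi)
        if c is None:
            return False
        if isinstance(c, list):
            i = bisect.bisect_left(c, lo)
            return i < len(c) and c[i] == lo
        return bool(c >> lo & 1)
```

The 4096 threshold is the point at which a 2¹⁶-bit bitmap (8 KiB) becomes no larger than an array of 16-bit entries.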

3.
Simon Gog, Matthias Petri. Software, 2014, 44(11): 1287-1314
Succinct data structures provide the same functionality as their corresponding traditional data structure in compact space. We improve on functions rank and select, which are the basic building blocks of FM‐indexes and other succinct data structures. First, we present a cache‐optimal, uncompressed bitvector representation that outperforms all existing approaches. Next, we improve, in both space and time, on a recent result by Navarro and Providel on compressed bitvectors. Last, we show techniques to perform rank and select on 64‐bit words that are up to three times faster than existing methods. In our experimental evaluation, we first show how our improvements affect cache and runtime performance of both operations on data sets larger than commonly used in the evaluation of succinct data structures. Our experiments show that our improvements to these basic operations significantly improve the runtime performance and compression effectiveness of FM‐indexes on small and large data sets. To our knowledge, our improvements result in FM‐indexes that are either smaller or faster than all current state of the art implementations. Copyright © 2013 John Wiley & Sons, Ltd.

4.
Variable-resolution Compression of Vector Data
The compression of spatial data is a promising way to reduce data storage space and to decrease the transmission time of spatial data over the Internet. This paper proposes a new method for variable-resolution compression of vector data. The proposed method encompasses three key steps, namely, the simplification of vector data via the elimination of vertices, the compression of the removed vertices, and the decoding of the compressed vector data. The method was implemented and applied to compress vector data in order to investigate its performance in terms of compression ratio and distortion of geometric shapes. The results show that the proposed method provides a feasible and efficient solution for the compression of vector data: it achieves good compression ratios while maintaining the main shape characteristics of the spatial objects within the compressed vector data.
Bisheng Yang
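The abstract does not specify which vertex-elimination criterion is used; a common stand-in for line simplification is the Douglas-Peucker rule, sketched here purely for illustration:

```python
# Illustrative vertex elimination via the Douglas-Peucker criterion (an
# assumption -- the paper's exact elimination rule is not given here).
import math

def perp_dist(p, a, b):
    """Perpendicular distance from point p to the line through a and b."""
    (x, y), (x1, y1), (x2, y2) = p, a, b
    dx, dy = x2 - x1, y2 - y1
    if dx == 0 and dy == 0:
        return math.hypot(x - x1, y - y1)
    return abs(dy * x - dx * y + x2 * y1 - y2 * x1) / math.hypot(dx, dy)

def simplify(points, tol):
    """Recursively keep the vertex farthest from the chord while it exceeds tol."""
    if len(points) < 3:
        return list(points)
    d, idx = max((perp_dist(points[i], points[0], points[-1]), i)
                 for i in range(1, len(points) - 1))
    if d <= tol:
        return [points[0], points[-1]]        # all interior vertices eliminated
    left = simplify(points[:idx + 1], tol)
    return left[:-1] + simplify(points[idx:], tol)
```

The eliminated vertices (those dropped at each level) are what the paper's second step would then encode, enabling variable-resolution reconstruction.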

5.
M. Farach, M. Thorup. Algorithmica, 1998, 20(4): 388-404
String matching and compression are two widely studied areas of computer science. The theory of string matching has a long association with compression algorithms. Data structures from string matching can be used to derive fast implementations of many important compression schemes, most notably the Lempel-Ziv (LZ77) algorithm. Intuitively, once a string has been compressed, and its repetitive nature thereby elucidated, one might be tempted to exploit this knowledge to speed up string matching. The Compressed Matching Problem is that of performing string matching in a compressed text without uncompressing it. More formally, let T be a text, let Z be the compressed string representing T, and let P be a pattern. The Compressed Matching Problem is that of deciding whether P occurs in T, given only P and Z. Compressed matching algorithms have been given for several compression schemes such as LZW. In this paper we give the first nontrivial compressed matching algorithm for the classic adaptive compression scheme, the LZ77 algorithm. In practice, the LZ77 algorithm is known to compress more than other dictionary compression schemes, such as LZ78 and LZW, though for strings with constant per-bit entropy, all these schemes compress optimally in the limit. However, for strings with o(1) per-bit entropy, while it was recently shown that LZ77 compresses to within a constant factor of optimal, schemes such as LZ78 and LZW may deviate from optimality by an exponential factor. Asymptotically, compressed matching is only relevant if |Z| = o(|T|), i.e., if the compression ratio |T|/|Z| is more than a constant. These results show that LZ77 is the appropriate compression method in such settings. We present an LZ77 compressed matching algorithm which runs in time O(n log²(u/n) + p), where n = |Z|, u = |T|, and p = |P|. Compare this with the naive "decompression" algorithm, which takes time Θ(u + p) to decide whether P occurs in T. Writing u + p as n·(u/n) + p, we see that we have improved the complexity, replacing the compression factor u/n by a factor log²(u/n). Our algorithm is competitive in the sense that O(n log²(u/n) + p) = O(u + p), and opportunistic in the sense that O(n log²(u/n) + p) = o(u + p) if n = o(u) and p = o(u). Received December 20, 1995; revised October 29, 1996, and February 6, 1997.
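The LZ77 parse that the matching algorithm operates on can be illustrated with a toy greedy factorization; this quadratic scan is only a sketch of the parse itself (production parsers use suffix-based indexes, and the paper's matching algorithm works on Z without this decoding step):

```python
def lz77_parse(text):
    """Greedy LZ77 factorization: each phrase is (offset, length, next_char).
    Quadratic toy scan; matches may overlap past the current position."""
    i, phrases = 0, []
    while i < len(text):
        best_len, best_off = 0, 0
        for j in range(i):                       # candidate match starts
            l = 0
            while i + l < len(text) and text[j + l] == text[i + l]:
                l += 1                           # self-referential copies allowed
            if l > best_len:
                best_len, best_off = l, i - j
        nxt = text[i + best_len] if i + best_len < len(text) else ''
        phrases.append((best_off, best_len, nxt))
        i += best_len + 1
    return phrases

def lz77_decode(phrases):
    out = []
    for off, length, nxt in phrases:
        start = len(out) - off
        for k in range(length):
            out.append(out[start + k])           # copy handles overlaps
        out.append(nxt)
    return ''.join(out)
```

Here n is the number of phrases and u the text length; highly repetitive strings yield n = o(u), the regime where compressed matching pays off.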

6.
William H. Hsu, Amy E. Zwarico. Software, 1995, 25(10): 1097-1116
We present a compression technique for heterogeneous files, those files which contain multiple types of data such as text, images, binary, audio, or animation. The system uses statistical methods to determine the best algorithm to use in compressing each block of data in a file (possibly a different algorithm for each block). The file is then compressed by applying the appropriate algorithm to each block. We obtain better savings than are possible by using a single algorithm for compressing the file. The implementation of a working version of this heterogeneous compressor is described, along with examples of its value toward improving compression in both theoretical and applied contexts. We compare our results with those obtained using four commercially available compression programs, PKZIP, Unix compress, StuffIt, and Compact Pro, and show that our system provides better space savings.
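The block-wise idea can be sketched with standard-library codecs standing in for the paper's algorithm suite, and brute-force "try each, keep the smallest" standing in for its statistical selector (both substitutions are assumptions for illustration):

```python
# Per-block codec selection: each block is stored with a 1-byte codec tag and
# a 4-byte length, compressed by whichever codec yields the smallest output.
import bz2
import zlib

CODECS = {b'R': lambda b: b,            # raw: keeps incompressible blocks as-is
          b'Z': zlib.compress,
          b'B': bz2.compress}
DECODERS = {b'R': lambda b: b, b'Z': zlib.decompress, b'B': bz2.decompress}

def compress_hetero(data, block=4096):
    out = bytearray()
    for i in range(0, len(data), block):
        chunk = data[i:i + block]
        tag, enc = min(((t, c(chunk)) for t, c in CODECS.items()),
                       key=lambda p: len(p[1]))
        out += tag + len(enc).to_bytes(4, 'big') + enc
    return bytes(out)

def decompress_hetero(blob):
    out, i = bytearray(), 0
    while i < len(blob):
        tag, n = blob[i:i + 1], int.from_bytes(blob[i + 1:i + 5], 'big')
        out += DECODERS[tag](blob[i + 5:i + 5 + n])
        i += 5 + n
    return bytes(out)
```

The raw fallback guarantees that a block never grows by more than the 5-byte header, which is what makes mixing codecs safe on already-compressed regions.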

7.
An XML Data Stream Compression Method for XPath Execution
Because XML (extensible markup language) is self-describing, XML data streams contain a large amount of redundant structural information. How to compress XML data streams so as to reduce network transmission cost while effectively supporting query processing over the compressed stream has become a new research area. Existing XML compression techniques either require multiple scans of the data or do not support real-time query processing over streams. This paper proposes XSC (XML stream compression), a technique that compresses and decompresses XML data streams in real time. XSC dynamically builds a dictionary of XML element event sequences and outputs the corresponding indexes. Given the DTD that an XML stream conforms to, XSC can generate an XML element event sequence graph and produce more reasonable structural sequence codes before the compression scan. The compressed XML stream can be decompressed directly for XPath execution. Experiments show that, in XML streaming environments, XSC outperforms traditional algorithms in both compression ratio and compression time, while the cost of executing queries over the compressed data remains acceptable.

8.
Voxel‐based approaches are today's standard to encode volume data. Recently, directed acyclic graphs (DAGs) were successfully used for compressing sparse voxel scenes as well, but they are restricted to a single bit of (geometry) information per voxel. We present a method to compress arbitrary data, such as colors, normals, or reflectance information. By decoupling geometry and voxel data via a novel mapping scheme, we are able to apply the DAG principle to encode the topology, while using a palette‐based compression for the voxel attributes, leading to a drastic memory reduction. Our method outperforms existing state‐of‐the‐art techniques and is well‐suited for GPU architectures. We achieve real‐time performance on commodity hardware for colored scenes with up to 17 hierarchical levels (a 128K³ voxel resolution), which are stored fully in core.

9.
李鸣鹏, 高宏, 邹兆年. 软件学报 (Journal of Software), 2016, 27(9): 2265-2277
This paper studies maximal Steiner connected k-core query processing based on graph compression. It proposes a graph compression algorithm, SC, that supports maximal Steiner connected k-core queries, and proves the correctness of queries over the SC-compressed graph. Since a maximal Steiner connected k-core query only needs to find a qualifying connected region, a further compression algorithm, TC, is proposed that compresses the compressed graph into a tree. Query correctness over the compression tree is proved, and a linear-time query processing algorithm that requires no decompression is presented. Experimental results on real and synthetic data show that the compression algorithms reduce the original graph by 88% on average, and by 90% for dense graphs, where compression is even more effective; compared with query processing directly on the original graph, query processing on the compressed graph is more efficient, improving performance by one to two orders of magnitude on average.

10.
Power system fault recording data are the main basis for analyzing grid faults, and compressing them reduces storage requirements and improves transmission efficiency. Targeting the format and structure of power fault recording data, a new compression algorithm based on the Fourier transform and the wavelet packet transform is proposed. The discrete Fourier transform is used to compress and reconstruct the B-segment data of each analog channel, and the reconstruction error determines whether the channel is a fault channel. The transient disturbance segments of fault channels are compressed with the wavelet packet transform, while normal channels and the remaining segments of fault channels are compressed with the Fourier transform. Compression results on a large number of recording files, together with practical engineering applications, show that the proposed algorithm achieves both a high compression ratio and high accuracy, and has broad application prospects.
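The Fourier stage can be illustrated with a plain DFT that keeps only the largest-magnitude coefficients and reconstructs from them; this naive O(n²) transform and hard top-k threshold are stand-ins for illustration, not the paper's B-segment handling or error criterion:

```python
# Keep-top-k DFT compression sketch with reconstruction-error measurement.
import cmath

def dft(x):
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(X):
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * t / n)
                for k in range(n)).real / n
            for t in range(n)]

def compress_dft(x, keep):
    """Zero out all but the `keep` largest-magnitude DFT coefficients."""
    X = dft(x)
    top = set(sorted(range(len(X)), key=lambda k: -abs(X[k]))[:keep])
    return [X[k] if k in top else 0 for k in range(len(X))]
```

The reconstruction error `max(abs(a - b) for a, b in zip(idft(compress_dft(x, keep)), x))` is the kind of quantity the paper uses to decide whether a channel contains a fault.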

11.
Wireless sensor network (WSN) nodes have very limited energy and bandwidth, making the long-term transmission of large volumes of data difficult, so it is essential to aggregate or compress the raw collected data. Exploiting the temporal correlation of sensor data, a two-stage compression algorithm combining piecewise constant approximation with Haar wavelet compression is proposed, which compresses such temporally correlated sensor data under a tunable error bound. Experiments on real data sets analyze the algorithm's reconstruction error, compression ratio, and compression time, and compare it with other compression algorithms. The results show that the algorithm effectively exploits the temporal correlation in sensor data, significantly reduces redundant data, achieves a high compression ratio, and guarantees data accuracy.
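The Haar stage can be sketched as a multi-level averaging/differencing transform with hard thresholding of small detail coefficients (the piecewise-constant first stage and the paper's exact error control are not reproduced; the unnormalized transform below is an assumption for illustration):

```python
# Unnormalized multi-level Haar transform; input length must be a power of two.

def haar_forward(x):
    x, out = list(x), []
    n = len(x)
    while n > 1:
        avgs = [(x[2 * i] + x[2 * i + 1]) / 2 for i in range(n // 2)]
        diffs = [(x[2 * i] - x[2 * i + 1]) / 2 for i in range(n // 2)]
        out = diffs + out          # coarser details go in front of finer ones
        x, n = avgs, n // 2
    return x + out                 # [overall average] + details, coarsest first

def haar_inverse(coeffs):
    x, i = coeffs[:1], 1
    while i < len(coeffs):
        d = coeffs[i:i + len(x)]
        x = [v for a, dd in zip(x, d) for v in (a + dd, a - dd)]
        i += len(d)
    return x

def compress_haar(x, tol):
    """Drop detail coefficients whose magnitude is below tol."""
    c = haar_forward(x)
    return [c[0]] + [v if abs(v) >= tol else 0 for v in c[1:]]
```

Runs of near-constant readings, which temporal correlation makes common, produce near-zero detail coefficients, so thresholding discards them at a controlled reconstruction error.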

12.
Wireless sensor networks contain large amounts of redundant data. Data aggregation compresses the sampled data to eliminate this redundancy, effectively reducing the amount of data each node must send and prolonging the lifetime of the sensor network. A data aggregation algorithm combining compressed sensing with data forwarding is proposed: during the collection of sampled data, each node decides, according to its number of child nodes, whether to compress the data using compressed sensing or to forward it directly. Simulation results show that, compared with aggregation based on compressed sensing alone, the combined algorithm balances the load among nodes while effectively reducing the amount of data transmitted.

13.
In recent years, with the development of technology and the rise of the Internet, image data have been widely used on the Web. Image compression not only saves the memory occupied by image data but also speeds up transmission, so it is widely applied in medicine, mobile phones, data transmission, multimedia, and the Internet. This work focuses on lossless image compression: the original image is first converted to 256-color GIF format, and an index matrix is then built whose elements correspond to the RGB information of the original colors. An index-matrix sorting method, together with a codebook, is combined with different compression algorithms such as LZW, CALIC, JPEG2000, and JPEG-LS to compare compression performance.

14.
We propose GP-zip2, a new approach to lossless data compression based on Genetic Programming (GP). GP is used to optimally combine well-known lossless compression algorithms to maximise data compression. GP-zip2 evolves programs with multiple components. One component analyses statistical features extracted by sequentially scanning the data to be compressed and divides the data into blocks. These blocks are projected onto a two-dimensional Euclidean space via two further (evolved) program components. K-means clustering is then applied to group similar data blocks. Each cluster is labelled with the optimal compression algorithm for its member blocks. After evolution, evolved programs can be used to compress unseen data. The compression algorithms available to GP-zip2 are: Arithmetic coding, Lempel-Ziv-Welch, Unbounded Prediction by Partial Matching, Run Length Encoding, and Bzip2. Experimentation shows that the results produced by GP-zip2 are human-competitive, being typically superior to well-established human-designed compression algorithms in terms of the compression ratios achieved in heterogeneous archive files.

15.
A Contour Data Compression Model Based on Wavelet Analysis
Vector map data compression plays an important role in terrain environment simulation, cartographic generalization, GIS, and related research. Drawing on wavelet transform theory and the characteristics of vector map data, a model and method for contour data compression are proposed. First, based on the properties of the wavelet transform, the basic idea of wavelet-based contour data compression is presented. Then, boundary handling for the wavelet transform of contour data is studied, a feature-point selection method for contour data compression is given, and a wavelet-based contour data compression model is proposed. Finally, experiments on real contour data are conducted with the proposed model. Theoretical analysis and experimental results show that the method not only maintains a high compression ratio but also preserves the trend of the original data in the compressed data, thereby well reflecting the intrinsic characteristics and regularity of the original data.

16.
张宇, 刘燕兵, 熊刚, 贾焰, 刘萍, 郭莉. 软件学报 (Journal of Software), 2014, 25(9): 1937-1952
Efficient and compact representation and compression of graphs containing billions of nodes and edges is fundamental to large-scale graph analysis and processing. Graph compression techniques can effectively reduce the storage space of graph data while supporting fast access to the compressed form. Through an in-depth analysis of the state of the art, this survey classifies these techniques into four categories: compression based on traditional storage structures, web graph compression, social network graph compression, and query-oriented graph compression. For each category, representative methods are analyzed in detail and their performance differences are compared. Finally, the field is summarized and future directions are discussed.

17.
Modern computers typically make use of 64‐bit words as the fundamental unit of data access. However, the decade‐long migration from 32‐bit architectures has not been reflected in compression technology, because of a widespread assumption that effective compression techniques operate in terms of bits or bytes, rather than words. Here we demonstrate that the use of 64‐bit access units, especially in connection with word‐bounded codes, does indeed provide the opportunity for improving the compression performance. In particular, we extend several 32‐bit word‐bounded coding schemes to 64‐bit operation and explore their uses in information retrieval applications. Our results show that the Simple‐8b approach, a 64‐bit word‐bounded code, is an excellent self‐skipping code, and has a clear advantage over its competitors in supporting fast query evaluation when the data being compressed represents the inverted index for a large text collection. The advantages of the new code also accrue on 32‐bit architectures, and for all of Boolean, ranked, and phrase queries, which means that it can be used in any situation. Copyright © 2010 John Wiley & Sons, Ltd.
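The word-bounded packing idea can be sketched as follows: each 64-bit word carries a 4-bit selector naming how many equal-width values fill the remaining 60 payload bits. The selector table and greedy packing loop below are a simplified sketch; the real Simple-8b code additionally has selectors for long runs and padding, and its table differs.

```python
# Simplified Simple-8b-style packing: 4-bit selector + 60 payload bits.
OPTIONS = [(60, 1), (30, 2), (20, 3), (15, 4), (12, 5), (10, 6),
           (6, 10), (5, 12), (4, 15), (3, 20), (2, 30), (1, 60)]  # (count, width)

def pack(values):
    words, i = [], 0
    while i < len(values):
        for sel, (count, width) in enumerate(OPTIONS):
            chunk = values[i:i + count]
            if len(chunk) == count and all(v < (1 << width) for v in chunk):
                word = sel                       # selector in the low 4 bits
                for j, v in enumerate(chunk):
                    word |= v << (4 + j * width)
                words.append(word)
                i += count
                break
        else:
            raise ValueError("value does not fit in 60 bits")
    return words

def unpack(words, n):
    out = []
    for word in words:
        count, width = OPTIONS[word & 0xF]
        payload = word >> 4
        for j in range(count):
            out.append((payload >> (j * width)) & ((1 << width) - 1))
    return out[:n]
```

Because each word decodes independently from its selector, a decoder can skip whole words without examining their payloads, which is the self-skipping property the abstract highlights.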

18.
Objective: Existing reversible data hiding algorithms in the encrypted domain fail to fully exploit the correlation between bit planes when compressing them. To lower the bit-plane compression rate and thereby increase embedding capacity, a reversible data hiding algorithm in the encrypted domain that reduces redundancy between adjacent bit planes is proposed. Method: The image is divided into blocks and the block positions are scrambled. Scrambling does not change the correlation of pixels within a block, so the bit-plane blocks remain amenable to compression. In the block-scrambled image, the most significant bit plane is XORed with the second-most-significant plane to produce a new second plane, which is then XORed with the plane one level lower; repeating this operation for the remaining lower planes yields seven new lower bit planes, which together with the original most significant plane form the eight bit planes of a new image. The BBE (binary-block embedding) algorithm is then used to compress the bit planes of this new image to free space for the embedded information. To guarantee the security of the encrypted image, the image with freed space is encrypted by XOR. Results: XORing adjacent bit planes makes the planes below the most significant one smoother, reducing the number of blocks that BBE cannot compress or compresses poorly, and thus making the image more compressible with BBE. Compared with existing algorithms based on bit-plane compression, the proposed algorithm achieves a higher embedding rate, increasing capacity by 0.4 bit/pixel on average across images of different textures. Conclusion: Experimental results show that the algorithm frees more space for embedding additional information while guaranteeing security, and information can be embedded flexibly as needed in practice. The embedded information can be extracted losslessly, and the image can be fully recovered. Overall, the proposed algorithm performs well.

19.
庄越, 庄浩. 计算机时代 (Computer Era), 2009, (7): 9-10, 14
For the transmission of data with small changes, a transposed-matrix bit compression algorithm is proposed, which clearly separates changed bits from unchanged bits and compresses the changed bits at bit granularity. Experimental results show that the algorithm compresses small-change data very well: even an extremely simple run-length encoding achieves a very high compression ratio. When applied to data such as the state parameters of industrial control systems, the algorithm can relieve network load and improve transmission efficiency.

20.
Microprocessor speed has been growing exponentially faster than memory system speed in the recent past. This paper explores the long term implications of this trend. We define scalable locality, which measures our ability to apply ever faster processors to increasingly large problems (just as scalable parallelism measures our ability to apply more numerous processors to larger problems). We provide an algorithm called time skewing that derives an execution order and storage mapping to produce any desired degree of locality, for certain programs that can be made to exhibit scalable locality. Our approach is unusual in that it derives the transformation from the algorithm's dataflow (a fundamental characteristic of the algorithm) instead of searching a space of transformations of the execution order and array layout used by the programmer (artifacts of the expression of the algorithm). We provide empirical results for data sets using L2 cache, main memory, and virtual memory.
