首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到8条相似文献,搜索用时 0 毫秒
1.
2.
3.
Suffix trees are among the most important data structures in stringology, with a number of applications in flourishing areas like bioinformatics. Their main problem is space usage, which has triggered much research striving for compressed representations that are still functional. A smaller suffix tree representation could fit in a faster memory, outweighing by far the theoretical slowdown brought by the space reduction. We present a novel compressed suffix tree, which is the first achieving at the same time sublogarithmic complexity for the operations, and space usage that asymptotically goes to zero as the entropy of the text does. The main ideas in our development are compressing the longest common prefix information, totally getting rid of the suffix tree topology, and expressing all the suffix tree operations using range minimum queries and a novel primitive called next/previous smaller value in a sequence. Our solutions to those operations are of independent interest.  相似文献   

4.
We consider succinct, or highly space-efficient, representations of a (static) string consisting of n   pairs of balanced parentheses, which support natural operations such as finding the matching parenthesis for a given parenthesis, or finding the pair of parentheses that most tightly enclose a given pair. This problem was considered by Jacobson [Space-efficient static trees and graphs, in: Proc. of the 30th FOCS, 1989, pp. 549–554] and Munro and Raman [Succinct representation of balanced parentheses and static trees, SIAM J. Comput. 31 (2001) 762–776] who gave O(n)O(n)-bit and 2n+o(n)2n+o(n)-bit representations, respectively, that supported the above operations in O(1)O(1) time on the RAM model of computation. This data structure is a fundamental tool in succinct representations, and has applications in representing suffix trees, ordinal trees, planar graphs and permutations.  相似文献   

5.
We develop a new lower bound technique for data structures. We show an optimal Ω(nlglgn/lgn)Ω(nlglgn/lgn) space lower bounds for storing an index that allows to implement rank and select queries on a bit vector BB provided that BB is stored explicitly. These results improve upon [Peter Bro Miltersen, Lower bounds on the size of selection and rank indexes, in: Proceedings of the 16th Annual ACM–SIAM Symposium on Discrete Algorithms, 2005, pp. 11–12]. We show Ω((m/t)lgt)Ω((m/t)lgt) lower bounds for storing rank/select index in the case where BB has mm 11-bits in it and the algorithm is allowed to probe tt bits of BB. We also present an improved data structure that implements both rank and select queries with an index of size (1+o(1))(nlglgn/lgn)+O(n/lgn)(1+o(1))(nlglgn/lgn)+O(n/lgn), that is, compared to existing results we give an explicit constant for storage in the RAM model with word size lgnlgn. An advantage of this data structure is that both rank and select indexes share the most space consuming part of order Θ(nlglgn/lgn)Θ(nlglgn/lgn) making it more practical for implementation.  相似文献   

6.
7.
The methods most heavily used by search engines to answer conjunctive queries on binary relations (such as one associating keywords with web-pages) are based on computing the intersection of postings lists stored as sorted arrays and using variants of binary search. We show that a succinct representation of the binary relation permits much better results, while using less space than traditional methods. We apply our results not only to conjunctive queries on binary relations, but also to queries on semi-structured documents such as XML documents or file-system indexes, using a variant of an adaptive algorithm used to solve conjunctive queries on binary relations.  相似文献   

8.
Moritz G. Maaß 《Software》2006,36(3):305-331
We present new algorithms for computing matching statistics with suffix arrays. We show how the Multiple Common Substring Problem can be solved efficiently in practice with a new approach using matching statistics. This problem consists of finding the common substrings of a set of strings. For the computation of matching statistics we compare seven different methods based on suffix trees and suffix arrays. Most of the suffix array algorithms have an inferior asymptotic worst case running time but a very low memory overhead and small constants in the running time complexity. Our experiments show a good performance in practice. Copyright © 2005 John Wiley & Sons, Ltd.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号