首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 15 毫秒
1.
In this paper, we study a restricted version of the position restricted pattern matching problem introduced and studied by Mäkinen and Navarro [V. Mäkinen, G. Navarro, Position-restricted substring searching, in: J.R. Correa, A. Hevia, M.A. Kiwi (Eds.), LATIN, in: Lecture Notes in Computer Science, vol. 3887, Springer, 2006, pp. 703-714]. In the problem handled in this paper, we are interested in those occurrences of the pattern that lies in a suffix or in a prefix of the given text. We achieve optimal query time for our problem against a data structure which is an extension of the classic suffix tree data structure. The time and space complexity of the data structure is dominated by that of the suffix tree. Notably, the (best) algorithm by Mäkinen and Navarro, if applied to our problem, gives sub-optimal query time and the corresponding data structure also requires more time and space.  相似文献   

2.
In [C.S. Iliopoulos, M.S. Rahman, Faster index for property matching, Information Processing Letters 105 (2008) 218-223], Iliopoulos and Rahman proposed a data structure called IDS-PIP for solving the property indexing pattern matching problem. IDS-PIP can be constructed in O(n) time, where n is the length of the text. Then, based on ID-PIP, each query takes O(mlog|Σ|+K) time, where m is the length the pattern, Σ is the alphabet, and K is the output size. They assume that all intervals in property π are disjoint. If the intervals in property π are not disjoint, then they create an equivalent set of disjoint intervals π to replace π. However, property π is not equivalent to property π at all. In this erratum, we propose a way for finding correct property π and modify IDS-PIP slightly.  相似文献   

3.
In this paper we present a concise O(n) implementation of Cleary's algorithm for generating a sequence of restricted rotations between any two binary trees. The algorithm is described directly in terms of the binary trees, without using any intermediate representation.  相似文献   

4.
The problem of pattern matching with wildcards is to find all the occurrences of a pattern of length m in a text of length n over a finite alphabet Σ (both the text and the pattern are allowed to contain wildcards). Based on the prime number encoding scheme (Chaim Linhart, Ron Shamir, Faster pattern matching with character classes using prime number encoding, J. Comput. Syst. Sci. 75 (3) (2009) 155-162), we present a new integer encoding and an efficient fast Fourier transforms based algorithm for this problem. The algorithm takes time to search the pattern in the text by computing one convolution. For matching with wildcards, our encoding uses fewer prime numbers and has shorter code words comparing with the prime number encoding. We use at most 2lg|Σ| prime numbers to encode the symbols while in the prime number encoding |Σ| prime numbers are required. This number reduces to 1.5lg|Σ| when |Σ|>40. The code word used in the algorithm is at most 2⌊lg|Σ|⌋⌈lg(5m)⌉ bits while in the prime encoding it is at least bits. We also show that the length of words can be further reduced by increasing the number of convolutions computed.  相似文献   

5.
The classical pattern matching paradigm is that of seeking occurrences of one string in another, where both strings are drawn from an alphabet set Σ. In the parameterized pattern matching model, a consistent renaming of symbols from Σ is allowed in a match. The parameterized matching paradigm has proven useful in problems in software engineering, computer vision, and other applications. In classical pattern matching, both the text and pattern are strings. Applications such as searching in xml or searching in hypertext require searching strings in non-linear structures such as trees or graphs. There has been work in the literature on exact and approximate parameterized matching, as well as work on exact and approximate string matching on non-linear structures. In this paper we explore parameterized matching in non-linear structures. We prove that exact parameterized matching on trees can be computed in linear time for alphabets in an O(n)-size integer range, and in time O(nlogm) in general, where n is the tree size and m the pattern length. These bounds are optimal in the comparison model. We also show that exact parameterized matching on directed acyclic graphs (DAGs) is NP-complete.  相似文献   

6.
It has long been known that pattern matching in the Hamming distance metric can be done in time, where n is the length of the text, m is the length of the pattern, and Σ is the alphabet. The classic algorithm for this is due to Abrahamson and Kosaraju. This paper considers the following generalization, motivated by the situation where the entries in the text and pattern are analog, or distorted by additive noise, or imprecisely given for some other reason: in any alignment of the pattern with the text, two aligned symbols a and b contribute +1 to the similarity score if they differ by no more than a given threshold θ, otherwise they contribute zero. We give an time algorithm for this more general version of the problem; the classic Hamming distance matching problem is the special case of θ=0.  相似文献   

7.
8.
We show that i-directable nondeterministic automata can be i-directed with a word of length O(2n) for i=1,2, where n stands for the number of states. Since for i=1,2 there exist i-directable automata having i-directing words of length Ω(2n), these upper bounds are asymptotically optimal. We also show that a 3-directable nondeterministic automaton with n states can be 3-directed with a word of length , improving the previously known upper bound O(2n). Here the best known lower bound is .  相似文献   

9.
10.
We present a filtering based algorithm for the k-mismatch pattern matching problem with don't cares. Given a text t of length n and a pattern p of length m with don't care symbols in either p or t (but not both), and a bound k, our algorithm finds all the places that the pattern matches the text with at most k mismatches. The algorithm is deterministic and runs in Θ(nm1/3k1/3log2/3m) time.  相似文献   

11.
We introduce a simple, linear time algorithm for recognizing trivially perfect (TP) graphs. It improves upon the algorithm of Yan et al. [J.-H. Yan, J.-J. Chen, G.J. Chang, Quasi-threshold graphs, Discrete Appl. Math. 69 (3) (1996) 247–255] in that it is certifying, producing a P4 or a C4 when the graph is not TP. In addition, our algorithm can be easily modified to recognize the complement of TP graphs (co-TP) in linear time as well. It is based on lexicographic BFS, and in particular the technique of partition refinement, which has been used in the recognition of many other graph classes [D.G. Corneil, Lexicographic breadth first search—a survey, in: WG 2004, in: Lecture Notes in Comput. Sci., vol. 3353, Springer, 2004, pp. 1–19].  相似文献   

12.
13.
Approximate pattern matching algorithms have become an important tool in computer assisted music analysis and information retrieval. The number of different problem formulations has greatly expanded in recent years, not least because of the subjective nature of measuring musical similarity. From an algorithmic perspective, the complexity of each problem depends crucially on the exact definition of the difference between two strings. We present an overview of advances in approximate string matching in this field focusing on new measures of approximation.  相似文献   

14.
一种快速的单模式匹配算法   总被引:4,自引:0,他引:4  
在对Boyer-Moore(BM)算法及其改进的Tuned Boyer-Moore(TunedBM)算法进行分析的基础上,提出了一种更加快速的单模式匹配算法--NFS.该算法利用当前尝试中匹配失败字符的位置信息进行更大的尝试位置移动,使算法具有更高的效率.实验结果表明,NFS算法的性能优于同类的其他算法,特别是在模式长度较短的情况下,优势更为明显.  相似文献   

15.
New efficient algorithms for the LCS and constrained LCS problems   总被引:1,自引:0,他引:1  
In this paper, we study the classic and well-studied longest common subsequence (LCS) problem and a recent variant of it, namely the constrained LCS (CLCS) problem. In the CLCS problem, the computed LCS must also be a supersequence of a third given string. In this paper, we first present an efficient algorithm for the traditional LCS problem that runs in O(Rloglogn+n) time, where R is the total number of ordered pairs of positions at which the two strings match and n is the length of the two given strings. Then, using this algorithm, we devise an algorithm for the CLCS problem having time complexity O(pRloglogn+n) in the worst case, where p is the length of the third string.  相似文献   

16.
17.
Simple deterministic wildcard matching   总被引:2,自引:0,他引:2  
We present a simple and fast deterministic solution to the string matching with don't cares problem. The task is to determine all positions in a text where a pattern occurs, allowing both pattern and text to contain single character wildcards. Our algorithm takes O(nlogm) time for a text of length n and a pattern of length m and in our view the algorithm is conceptually simpler than previous approaches.  相似文献   

18.
19.
The k-clique problem is a cornerstone of NP-completeness and parametrized complexity. When k is a fixed constant, the asymptotically fastest known algorithm for finding a k-clique in an n-node graph runs in O(n0.792k) time (given by Nešet?il and Poljak). However, this algorithm is infamously inapplicable, as it relies on Coppersmith and Winograd's fast matrix multiplication.We present good combinatorial algorithms for solving k-clique problems. These algorithms do not require large constants in their runtime, they can be readily implemented in any reasonable random access model, and are very space-efficient compared to their algebraic counterparts. Our results are the following:
We give an algorithm for k-clique that runs in O(nk/(εlogn)k−1) time and O(nε) space, for all ε>0, on graphs with n nodes. This is the first algorithm to take o(nk) time and O(nc) space for c independent of k.
Let k be even. Define a k-semiclique to be a k-node graph G that can be divided into two disjoint subgraphs U={u1,…,uk/2} and V={v1,…,vk/2} such that U and V are cliques, and for all i?j, the graph G contains the edge {ui,vj}. We give an time algorithm for determining if a graph has a k-semiclique. This yields an approximation algorithm for k-clique, in the following sense: if a given graph contains a k-clique, then our algorithm returns a subgraph with at least 3/4 of the edges in a k-clique.
  相似文献   

20.
We give a new algorithm for computing a prepositional Horn CNF formula given the set of its models. Its running time is O(|R|n(|R|+n)), where |R| is the number of models and n that of variables, and the computed CNF contains at most |R|n clauses. This algorithm also uses the well-known closure property of Horn relations in a new manner.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号