首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到10条相似文献,搜索用时 156 毫秒
1.
Simple deterministic wildcard matching   总被引:2,自引:0,他引:2  
We present a simple and fast deterministic solution to the string matching with don't cares problem. The task is to determine all positions in a text where a pattern occurs, allowing both pattern and text to contain single character wildcards. Our algorithm takes O(nlogm) time for a text of length n and a pattern of length m and in our view the algorithm is conceptually simpler than previous approaches.  相似文献   

2.
We present a filtering based algorithm for the k-mismatch pattern matching problem with don't cares. Given a text t of length n and a pattern p of length m with don't care symbols in either p or t (but not both), and a bound k, our algorithm finds all the places that the pattern matches the text with at most k mismatches. The algorithm is deterministic and runs in Θ(nm1/3k1/3log2/3m) time.  相似文献   

3.
This paper revisits the problem of indexing a text for approximate string matching. Specifically, given a text T of length n and a positive integer k, we want to construct an index of T such that for any input pattern P, we can find all its k-error matches in T efficiently. This problem is well-studied in the internal-memory setting. Here, we extend some of these recent results to external-memory solutions, which are also cache-oblivious. Our first index occupies O((nlogkn)/B) disk pages and finds all k-error matches with O((|P|+occ)/B+logknloglogBn) I/Os, where B denotes the number of words in a disk page. To the best of our knowledge, this index is the first external-memory data structure that does not require Ω(|P|+occ+poly(logn)) I/Os. The second index reduces the space to O((nlogn)/B) disk pages, and the I/O complexity is O((|P|+occ)/B+logk(k+1)nloglogn).  相似文献   

4.
We present a randomized and a deterministic data structure for maintaining a dynamic family of sequences under equality tests of pairs of sequences and creations of new sequences by joining or splitting existing sequences. Both data structures support equality tests inO(1) time. The randomized version supports new sequence creations inO(log2 n) expected time wheren is the length of the sequence created. The deterministic solution supports sequence creations inO(logn(logmlog* m+logn)) time for themth operation. This work was supported by the ESPRIT Basic Research Actions Program, under Contract No. 7141 (Project ALCOM II).  相似文献   

5.
In this paper, we revisit the Property Matching problem studied by Amir et al. [Property Matching and Weighted Matching, CPM 2006] and present a better indexing scheme for the problem. In particular, the data structure by Amir et al., namely PST, requires O(nlog|Σ|+nloglogn) construction time and O(mlog|Σ|+K) query time, where n and m are the length of, respectively, the text and the pattern, Σ is the alphabet and K is the output size. On the other hand, the construction time of our data structure, namely IDS_PIP, is dominated by the suffix tree construction time and hence is O(n) time for alphabets that are natural numbers from 1 to a polynomial in n and O(nlogσ) time otherwise, where σ=min(n,|Σ|). The query time is same as that of PST. Also, IDS_PIP has the advantage that it can be built on either a suffix tree or a suffix array and additionally, it retains the capability of answering normal pattern matching queries.  相似文献   

6.
We consider the following problem: Given an unsorted array of n elements, and a sequence of intervals in the array, compute the median in each of the subarrays defined by the intervals. We describe a simple algorithm which needs O(nlogk+klogn) time to answer k such median queries. This improves previous algorithms by a logarithmic factor and matches a comparison lower bound for k=O(n). The space complexity of our simple algorithm is O(nlogn) in the pointer machine model, and O(n) in the RAM model. In the latter model, a more involved O(n) space data structure can be constructed in O(nlogn) time where the time per query is reduced to O(logn/loglogn). We also give efficient dynamic variants of both data structures, achieving O(log2n) query time using O(nlogn) space in the comparison model and O((logn/loglogn)2) query time using O(nlogn/loglogn) space in the RAM model, and show that in the cell-probe model, any data structure which supports updates in O(logO(1)n) time must have Ω(logn/loglogn) query time.Our approach naturally generalizes to higher-dimensional range median problems, where element positions and query ranges are multidimensional—it reduces a range median query to a logarithmic number of range counting queries.  相似文献   

7.
The classical pattern matching paradigm is that of seeking occurrences of one string in another, where both strings are drawn from an alphabet set Σ. In the parameterized pattern matching model, a consistent renaming of symbols from Σ is allowed in a match. The parameterized matching paradigm has proven useful in problems in software engineering, computer vision, and other applications. In classical pattern matching, both the text and pattern are strings. Applications such as searching in xml or searching in hypertext require searching strings in non-linear structures such as trees or graphs. There has been work in the literature on exact and approximate parameterized matching, as well as work on exact and approximate string matching on non-linear structures. In this paper we explore parameterized matching in non-linear structures. We prove that exact parameterized matching on trees can be computed in linear time for alphabets in an O(n)-size integer range, and in time O(nlogm) in general, where n is the tree size and m the pattern length. These bounds are optimal in the comparison model. We also show that exact parameterized matching on directed acyclic graphs (DAGs) is NP-complete.  相似文献   

8.
In [C.S. Iliopoulos, M.S. Rahman, Faster index for property matching, Information Processing Letters 105 (2008) 218-223], Iliopoulos and Rahman proposed a data structure called IDS-PIP for solving the property indexing pattern matching problem. IDS-PIP can be constructed in O(n) time, where n is the length of the text. Then, based on ID-PIP, each query takes O(mlog|Σ|+K) time, where m is the length the pattern, Σ is the alphabet, and K is the output size. They assume that all intervals in property π are disjoint. If the intervals in property π are not disjoint, then they create an equivalent set of disjoint intervals π to replace π. However, property π is not equivalent to property π at all. In this erratum, we propose a way for finding correct property π and modify IDS-PIP slightly.  相似文献   

9.
We deal with the problem of routing messages on a slotted ring network in this paper. We study the computational complexity and algorithms for this routing by means of the results known in the literature for the multi-slot just-in-time scheduling problem. We consider two criteria for the routing problem: makespan, or minimum routing time, and diagonal makespan. A?diagonal is simply a schedule of ring links i=0,??,q?1 in q consecutive time slots, respectively. The number of diagonals between the earliest and the latest diagonals with non-empty packets is referred to as the diagonal makespan. For the former, we show that the optimal routing of messages of size k, is NP-hard in the strong sense, while an optimal routing when k=q can be computed in O(n 2log2 n) time. We also give an O(nlogn)-time constant factor approximation algorithm for unit size messages. For the latter, we prove that the optimal routing of messages of size k, where k divides the size of the ring q, is NP-hard in the strong sense even for any fixed k??1, while an optimal routing when k=q can be computed in O(nlogn) time. We also give an O(nlogn)-time approximation algorithm with an absolute error 2q?k.  相似文献   

10.
Given a list of n items and a function defined over sub-lists, we study the space required for computing the function for arbitrary sub-lists in constant time.For the function mode we improve the previously known space bound O(n2/logn) to O(n2loglogn/log2n) words.For median the space bound is improved to O(n2loglog2n/log2n) words from O(n2⋅log(k)n/logn), where k is an arbitrary constant and log(k) is the iterated logarithm.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号