首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 31 毫秒
1.
The problem of pattern matching with wildcards is to find all the occurrences of a pattern of length m in a text of length n over a finite alphabet Σ (both the text and the pattern are allowed to contain wildcards). Based on the prime number encoding scheme (Chaim Linhart, Ron Shamir, Faster pattern matching with character classes using prime number encoding, J. Comput. Syst. Sci. 75 (3) (2009) 155-162), we present a new integer encoding and an efficient fast Fourier transforms based algorithm for this problem. The algorithm takes time to search the pattern in the text by computing one convolution. For matching with wildcards, our encoding uses fewer prime numbers and has shorter code words comparing with the prime number encoding. We use at most 2lg|Σ| prime numbers to encode the symbols while in the prime number encoding |Σ| prime numbers are required. This number reduces to 1.5lg|Σ| when |Σ|>40. The code word used in the algorithm is at most 2⌊lg|Σ|⌋⌈lg(5m)⌉ bits while in the prime encoding it is at least bits. We also show that the length of words can be further reduced by increasing the number of convolutions computed.  相似文献   

2.
3.
In this paper, we revisit the Property Matching problem studied by Amir et al. [Property Matching and Weighted Matching, CPM 2006] and present a better indexing scheme for the problem. In particular, the data structure by Amir et al., namely PST, requires O(nlog|Σ|+nloglogn) construction time and O(mlog|Σ|+K) query time, where n and m are the length of, respectively, the text and the pattern, Σ is the alphabet and K is the output size. On the other hand, the construction time of our data structure, namely IDS_PIP, is dominated by the suffix tree construction time and hence is O(n) time for alphabets that are natural numbers from 1 to a polynomial in n and O(nlogσ) time otherwise, where σ=min(n,|Σ|). The query time is same as that of PST. Also, IDS_PIP has the advantage that it can be built on either a suffix tree or a suffix array and additionally, it retains the capability of answering normal pattern matching queries.  相似文献   

4.
5.
6.
Given a string P of length m over an alphabet Σ of size σ, a swapped version of P is a string derived from P   by a series of local swaps, i.e., swaps of adjacent symbols, such that each symbol can participate in at most one swap. We present a theoretical analysis of the nondeterministic finite automaton for the language ?PΠPΣ?P?PΠPΣ?P (swap automaton, for short), where ΠPΠP is the set of swapped versions of P  . Our study is based on the bit-parallel simulation of the same automaton due to Fredriksson, and reveals an interesting combinatorial property that links the automaton to the one for the language Σ?PΣ?P. By exploiting this property and the method presented by Cantone et al. (2012), we obtain a bit-parallel encoding of the swap automaton which takes O(σ2⌈k/w⌉)O(σ2k/w) space and allows one to simulate the automaton on a string of length n   in time O(n⌈k/w⌉)O(nk/w), where ⌈m/σ⌉?k?mm/σ?k?m is the size of a specific factorization of P defined by Cantone et al. (2012) and w is the word size in bits.  相似文献   

7.
A special class of map labeling problem is studied. Let P={p1,p2,…,pn} be a set of point sites distributed on a 2D map. A label associated with each point pi is an axis-parallel rectangle ri of specified width. The height of all are same. The placement of ri must contain pi at its top-left or bottom-left corner, and it does not obscure any other point in P. The objective is to label the maximum number of points on the map so that the placed labels are mutually non-overlapping. We first consider a simple model for this problem. Here, for each point pi, the corner specification (i.e., whether the point pi would appear at the top-left or bottom-left corner of the label) is known a priori. We show that the time complexity of this problem is , and then propose an algorithm for this problem which runs in O(nlogn) time. If the corner specifications of the points in P are not known, our algorithm is a 2-approximation algorithm. Here we propose an efficient heuristic algorithm that is easy to implement. Experimental evidences show that it produces optimal solutions for most of the randomly generated instances and for all the standard benchmarks available in http://www.math-inf.uni-greifswald.de/map-labeling/.  相似文献   

8.
9.
It has long been known that pattern matching in the Hamming distance metric can be done in time, where n is the length of the text, m is the length of the pattern, and Σ is the alphabet. The classic algorithm for this is due to Abrahamson and Kosaraju. This paper considers the following generalization, motivated by the situation where the entries in the text and pattern are analog, or distorted by additive noise, or imprecisely given for some other reason: in any alignment of the pattern with the text, two aligned symbols a and b contribute +1 to the similarity score if they differ by no more than a given threshold θ, otherwise they contribute zero. We give an time algorithm for this more general version of the problem; the classic Hamming distance matching problem is the special case of θ=0.  相似文献   

10.
This paper presents simple and deterministic algorithms for partial point set pattern matching in 2D. Given a set P of n points, called sample set, and a query set Q of k points (n?k), the problem is to find a matching of Q with a subset of P under rigid motion. The match may be of two types: exact and approximate. If an exact matching exists, then each point in Q coincides with the corresponding point in P under some translation and/or rotation. For an approximate match, some translation and/or rotation may be allowed such that each point in Q lies in a predefined ε-neighborhood region around some point in P. The proposed algorithm for the exact matching needs O(n2) space and preprocessing time. The existence of a match for a given query set Q can be checked in time in the worst-case, where α is the maximum number of equidistant pairs of point in P. For a set of n points, α may be O(n4/3) in the worst-case. Some applications of the partial point set pattern matching are then illustrated. Experimental results on random point sets and some fingerprint databases show that, in practice, the computation time is much smaller than the worst-case requirement. The algorithm is then extended for checking the exact match of a set of k line segments in the query set with a k-subset of n line segments in the sample set under rigid motion in time. Next, a simple version of the approximate matching problem is studied where one point of Q exactly matches with a point of P, and each of the other points of Q lie in the ε-neighborhood of some point of P. The worst-case time and space complexities of the proposed algorithm are and O(n), respectively. The proposed algorithms will find many applications to fingerprint matching, image registration, and object recognition.  相似文献   

11.
In [C.S. Iliopoulos, M.S. Rahman, Faster index for property matching, Information Processing Letters 105 (2008) 218-223], Iliopoulos and Rahman proposed a data structure called IDS-PIP for solving the property indexing pattern matching problem. IDS-PIP can be constructed in O(n) time, where n is the length of the text. Then, based on ID-PIP, each query takes O(mlog|Σ|+K) time, where m is the length the pattern, Σ is the alphabet, and K is the output size. They assume that all intervals in property π are disjoint. If the intervals in property π are not disjoint, then they create an equivalent set of disjoint intervals π to replace π. However, property π is not equivalent to property π at all. In this erratum, we propose a way for finding correct property π and modify IDS-PIP slightly.  相似文献   

12.
We consider the combination of function and permuted matching, each of which has fast solutions in their own right. Given a pattern p of length m and a text t of length n, a function match at position i of the text is a mapping f from Σp to Σt with the property that f(pj)=ti+j−1 for all j. We show that the problem of determining for each substring of the text, if any permutation of the pattern has a function match is in general NP-Complete. However where the mapping is also injective, so-called parameterised matching, the problem can be solved efficiently in O(nlog|Σp|) time. We then give a 1/2-approximation for a Hamming distance based optimisation variant by reduction to multiple knapsack with colour constraints.  相似文献   

13.
We present a parallel algorithm for finding a maximum weight matching in general bipartite graphs with an adjustable time complexity of using O(nmax(2ω,4+ω)) processing elements for ω?1. Parameter ω is not bounded. This is the fastest known strongly polynomial parallel algorithm to solve this problem. This is also the first adjustable parallel algorithm for the maximum weight bipartite matching problem in which the execution time can be reduced by an unbounded factor. We also present a general approach for finding efficient parallel algorithms for the maximum matching problem.  相似文献   

14.
In this paper, we deal with both the complexity and the approximability of the labeled perfect matching problem in bipartite graphs. Given a simple graph G=(V,E) with |V|=2n vertices such that E contains a perfect matching (of size n), together with a color (or label) function , the labeled perfect matching problem consists in finding a perfect matching on G that uses a minimum or a maximum number of colors.  相似文献   

15.
16.
We study input sensitive algorithms for point pattern matching under various transformations and the Hausdorff metric as a distance function. Given point sets P and Q in the plane, the problem of point pattern matching is to determine whether P is similar to some portion of Q, where P may undergo transformations from a group G of allowed transformations. All algorithms are based on methods for extracting small subsets from Q that can be matched to a small subset of P. The runtime is proportional to the number k of these subsets. Let d be the number of points in P that are needed to define a transformation in G. The key observation is that for some set BP of cardinality larger than d, the number of subsets of Q of this cardinality that match B, is practically small, as the problem becomes more constrained. We present methods to extract efficiently all these subsets in Q. We provide algorithms for homothetic, rigid and similarity transformations in the plane and give a general method that works for any dimension and for any group of transformations. The runtime of our algorithms depends roughly linearly on the number of subsets k, in addition to an factor. Thus our approximate matching algorithms run roughly in time , where m and n are the number of points in P and Q, respectively. The constants hidden in the big O vary depending on the group of transformations G.  相似文献   

17.
There is no known algorithm that solves the general case of the approximate edit distance problem, where the edit operations are insertion, deletion, mismatch, and swap, in time o(nm), where n is the length of the text and m is the length of the pattern.In the effort to study this problem, the edit operations have been analyzed independently. Karloff [10] showed an algorithm that approximates the edit distance problem with only the mismatch operation in time . Amir et al. [4] showed that if the only edit operations allowed are swap and mismatch, then the exact edit distance problem can be solved in time .In this paper, we discuss the problem of approximate edit distance with swap and mismatch. We show a randomized time algorithm for the problem. The algorithm guarantees an approximation factor of (1+?) with probability of at least .  相似文献   

18.
We consider the problem of preemptively scheduling n imprecise computation tasks on m?1 uniform processors, with each task Ti having two weights wi and . Three objectives are considered: (1) minimizing the maximum w-weighted error; (2) minimizing the total w-weighted error subject to the constraint that the maximum w-weighted error is minimized; (3) minimizing the maximum w-weighted error subject to the constraint that the total w-weighted error is minimized. For these objectives, we give polynomial time algorithms with time complexity O(mn4), O(mn4) and O(kmn4), respectively, where k is the number of distinct w-weights. We also present alternative algorithms for the three objectives, with time complexity O(cmn3), O(cmn3+mn4) and O(kcmn3), respectively, where c is a parameter-dependent number.  相似文献   

19.
20.
We present an NC approximation algorithm for the weighted matching problem in graphs with an approximation ratio of (1−ε). This improves the previously best approximation ratio of of an NC algorithm for this problem.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号