Similar Literature
20 similar documents found (search time: 125 ms)
1.
Matching with don't-cares and a small number of mismatches (cited 1 time: 0 self-citations, 1 by others)
In matching with don't-cares and k mismatches we are given a pattern of length m and a text of length n, both of which may contain don't-cares (a symbol that matches all symbols), and the goal is to find all locations in the text that match the pattern with at most k mismatches, where k is a parameter. We present new algorithms that solve this problem using a combination of convolutions and a dynamic programming procedure. We give randomized and deterministic solutions that run in time O(nk² log m) and O(nk³ log m), respectively, and are faster than the most efficient extant methods for small values of k. Our deterministic algorithm is the first to obtain an O(polylog(k)·n log m) running time.
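As a point of reference for the problem being solved, the following is a naive O(nm) sketch (not the authors' convolution-and-dynamic-programming algorithm); the '?' wildcard symbol and the function name are illustrative assumptions.

def k_mismatch_with_dont_cares(text, pattern, k, wildcard='?'):
    # All positions i where pattern matches text[i:i+m] with at most k
    # mismatches; the wildcard matches any symbol. Naive O(n*m) reference.
    n, m = len(text), len(pattern)
    matches = []
    for i in range(n - m + 1):
        mismatches = 0
        for j in range(m):
            a, b = pattern[j], text[i + j]
            if a != wildcard and b != wildcard and a != b:
                mismatches += 1
                if mismatches > k:
                    break
        if mismatches <= k:
            matches.append(i)
    return matches

print(k_mismatch_with_dont_cares("abracadabra", "a?ra", 1))   # [0, 7]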

2.
Simple deterministic wildcard matching (cited 2 times: 0 self-citations, 2 by others)
We present a simple and fast deterministic solution to the string matching with don't cares problem. The task is to determine all positions in a text where a pattern occurs, allowing both pattern and text to contain single character wildcards. Our algorithm takes O(n log m) time for a text of length n and a pattern of length m, and in our view the algorithm is conceptually simpler than previous approaches.
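The core convolution idea behind fast wildcard matching can be sketched as follows: encode the wildcard as 0 and every other symbol as a positive integer, so that position i is a match exactly when the sum over j of p_j·t_{i+j}·(p_j − t_{i+j})² is zero; each of the three resulting cross-correlations can be computed by FFT. This is a hedged illustration using numpy/scipy (lowercase text assumed), not necessarily the paper's exact formulation.

import numpy as np
from scipy.signal import fftconvolve

def wildcard_match(text, pattern, wildcard='?'):
    # Encode wildcard as 0, letters as positive integers (lowercase assumed).
    enc = lambda s: np.array([0 if c == wildcard else ord(c) - ord('a') + 1 for c in s], dtype=float)
    t, p = enc(text), enc(pattern)
    n, m = len(t), len(p)
    pr = p[::-1]                       # reversed pattern turns convolution into correlation
    corr = lambda a, b: fftconvolve(a, b)[m - 1:n]
    # sum_j p_j^3 t_{i+j} - 2 p_j^2 t_{i+j}^2 + p_j t_{i+j}^3 == 0  <=>  match at i
    scores = corr(t, pr ** 3) - 2 * corr(t ** 2, pr ** 2) + corr(t ** 3, pr)
    return [i for i, s in enumerate(scores) if abs(s) < 1e-6]

print(wildcard_match("abcabcab", "a?c"))   # [0, 3]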

3.
We consider the combination of function and permuted matching, each of which has fast solutions in their own right. Given a pattern p of length m and a text t of length n, a function match at position i of the text is a mapping f from Σ_p to Σ_t with the property that f(p_j) = t_{i+j−1} for all j. We show that the problem of determining, for each substring of the text, whether any permutation of the pattern has a function match is in general NP-complete. However, where the mapping is also injective, so-called parameterised matching, the problem can be solved efficiently in O(n log |Σ_p|) time. We then give a 1/2-approximation for a Hamming distance based optimisation variant by reduction to multiple knapsack with colour constraints.
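For concreteness, checking a plain (non-permuted) function match at a single text position only requires building the mapping greedily and checking consistency; the sketch below is an illustrative helper, not the paper's algorithm for the permuted or approximate variants.

def function_match_at(text, pattern, i):
    # Is there a mapping f with f(pattern[j]) == text[i+j] for all j?
    f = {}
    for j, a in enumerate(pattern):
        b = text[i + j]
        if f.setdefault(a, b) != b:    # symbol a already forced to a different image
            return False
    return True                        # parameterised matching would also require f injective

print(function_match_at("xyxabc", "aba", 0))   # True: f(a)=x, f(b)=y
print(function_match_at("xyxabc", "aab", 0))   # False: 'a' cannot map to both x and y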

4.
We present a simple and faster solution to the problem of matching a set of patterns with variable length don't cares. Given an alphabet Σ, a pattern p is a word p_1@p_2@…@p_m, where each p_i is a string over Σ called a keyword and @ ∉ Σ is a symbol called a variable length don't care (VLDC) symbol. Pattern p matches a text t if t = u_0 p_1 u_1 … u_{m−1} p_m u_m for some u_0, …, u_m ∈ Σ*. The problem addressed in this paper is: given a set of patterns P and a text t, determine whether one of the patterns of P matches t. Kucherov and Rusinowitch (1997) [9] presented an algorithm that solves the problem in time O((|t|+|P|) log |P|), where |P| is the total length of keywords in every pattern of P. We give a new algorithm based on the Aho-Corasick automaton. It uses the solutions of the Dynamic Marked Ancestor Problem of Chan et al. (2007) [5]. The algorithm takes O((|t|+‖P‖) log κ / log log κ) time, where ‖P‖ is the total number of keywords in every pattern of P, and κ is the number of distinct keywords in P. The algorithm is faster and simpler than the previous approach.
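For a single VLDC pattern, the matching condition itself can be tested greedily by placing each keyword at its earliest occurrence after the previous one; the following sketch (assuming '@' as the VLDC symbol) illustrates the problem definition rather than the Aho-Corasick/marked-ancestor algorithm above.

def vldc_match(pattern, text):
    # pattern is k1@k2@...@km; '@' matches any (possibly empty) string.
    # Greedy earliest placement of the keywords is correct for a single pattern.
    keywords = [w for w in pattern.split('@') if w]
    pos = 0
    for w in keywords:
        hit = text.find(w, pos)
        if hit < 0:
            return False
        pos = hit + len(w)
    return True

print(vldc_match("ab@cd@e", "xxabyycdzze"))   # True
print(vldc_match("ab@cd@e", "cdabe"))         # False (keywords out of order)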

5.
Given q+1 strings (a text t of length n and q patterns m_1, …, m_q) and a natural number w, the multiple serial episode matching problem consists in finding the number of size-w windows of text t which contain patterns m_1, …, m_q as subsequences, i.e., for each m_i, if m_i = p_1, …, p_k, the letters p_1, …, p_k occur in the window, in the same order as in m_i, but not necessarily consecutively (they may be interleaved with other letters). Our main contribution here is an algorithm solving this problem on-line in time O(nq) with an MP-RAM model (which is a RAM model equipped with extra operations).
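The quantity being counted can be stated directly as follows; this naive sketch tests every window against every pattern and is far slower than the O(nq) on-line algorithm, but it makes the definition concrete.

def is_subsequence(p, s):
    it = iter(s)
    return all(c in it for c in p)     # 'c in it' consumes the iterator up to c

def count_episode_windows(text, patterns, w):
    # Number of length-w windows containing every pattern as a subsequence.
    return sum(all(is_subsequence(p, text[i:i + w]) for p in patterns)
               for i in range(len(text) - w + 1))

print(count_episode_windows("abcabc", ["ac", "b"], 3))   # 2 (windows "abc" at 0 and 3)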

6.
Given a binary string of length n, we give a representation of its suffix array that takes O(nt(lg n)^(1/t)) bits of space such that given i, 1 ≤ i ≤ n, the ith entry in the suffix array of the string can be retrieved in O(t) time, for any parameter 1 ≤ t ≤ lg lg n. For t = lg lg n, this gives a compressed suffix array representation of Grossi and Vitter [Proc. Symp. on Theory Comput., 2000, pp. 397-406]. For t = O(1/ε), this gives the best known (in terms of space) compressed suffix array representation with constant query time. From this representation one can construct a suffix tree structure for a text of length n that uses o(n lg n) bits of space and can be used to find all the k occurrences of a given pattern of length m in O(m/lg n + k) time. No such structure was known earlier.
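To make the retrieval interface concrete, here is a plain, uncompressed suffix array built by sorting suffixes, together with a binary-search pattern lookup (Python 3.10+ is assumed for the bisect key argument); the paper's contribution is storing the same SA entries in compressed form with O(t) access, which this sketch does not attempt.

import bisect

def suffix_array(s):
    # Toy O(n^2 log n) construction: sort suffix start positions lexicographically.
    return sorted(range(len(s)), key=lambda i: s[i:])

def occurrences(s, sa, pattern):
    # Binary search over the sorted suffixes (needs Python 3.10+ for the key argument).
    key = lambda i: s[i:i + len(pattern)]
    lo = bisect.bisect_left(sa, pattern, key=key)
    hi = bisect.bisect_right(sa, pattern, key=key)
    return sorted(sa[lo:hi])

s = "banana"
sa = suffix_array(s)                  # [5, 3, 1, 0, 4, 2]
print(occurrences(s, sa, "ana"))      # [1, 3]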

7.
We present a new simple algorithm that constructs an Aho-Corasick automaton for a set of patterns P of total length n in O(n) time and space for integer alphabets. Processing a text of size m over an alphabet Σ with the automaton costs O(m log |Σ| + k), where there are k occurrences of patterns in the text. A new, efficient implementation of nodes in the Aho-Corasick automaton is introduced, which works for suffix trees as well.
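A compact, dictionary-based Aho-Corasick sketch (BFS-built failure links; not the specialised node representation introduced in the paper) illustrating the construction and the O(m + k)-style scan:

from collections import deque

def build_aho_corasick(patterns):
    # Trie nodes as dicts; fail links built breadth-first; out[v] lists pattern indices.
    goto, fail, out = [{}], [0], [[]]
    for idx, p in enumerate(patterns):
        s = 0
        for c in p:
            if c not in goto[s]:
                goto.append({}); fail.append(0); out.append([])
                goto[s][c] = len(goto) - 1
            s = goto[s][c]
        out[s].append(idx)
    q = deque(goto[0].values())
    while q:
        s = q.popleft()
        for c, t in goto[s].items():
            q.append(t)
            f = fail[s]
            while f and c not in goto[f]:
                f = fail[f]
            fail[t] = goto[f].get(c, 0)
            out[t] += out[fail[t]]
    return goto, fail, out

def ac_search(text, patterns):
    goto, fail, out = build_aho_corasick(patterns)
    s, hits = 0, []
    for i, c in enumerate(text):
        while s and c not in goto[s]:
            s = fail[s]
        s = goto[s].get(c, 0)
        hits += [(i, patterns[j]) for j in out[s]]
    return hits

print(ac_search("ushers", ["he", "she", "his", "hers"]))
# [(3, 'she'), (3, 'he'), (5, 'hers')]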

8.
We present a randomized parallel list ranking algorithm for distributed memory multiprocessors, using a BSP type model. We first describe a simple version which requires, with high probability, log(3p) + log ln(n) = Õ(log p + log log n) communication rounds (h-relations with h = Õ(n/p)) and Õ(n/p) local computation. We then outline an improved version that requires, with high probability, only r ≤ (4k+6) log((2/3)p) + 8 = Õ(k log p) communication rounds, where k = min{i ≥ 0 | ln^(i+1)(n) ≤ ((2/3)p)^(2^(i+1))}. Note that k is an extremely small number. For n and p ≥ 4, the value of k is at most 2. Hence, for a given number of processors p, the number of communication rounds required is, for all practical purposes, independent of n. For n ≤ 1,500,000 and 4 ≤ p ≤ 2048, the number of communication rounds in our algorithm is bounded, with high probability, by 78, but the actual number of communication rounds observed so far is 25 in the worst case. For n ≤ 100^(10^100) and 4 ≤ p ≤ 2048, the number of communication rounds in our algorithm is bounded, with high probability, by 118; and we conjecture that the actual number of communication rounds required will not exceed 50. Our algorithm has a considerably smaller number of communication rounds than the list ranking algorithm used in Reid-Miller's empirical study of parallel list ranking on the Cray C-90 (1). To our knowledge, Reid-Miller's algorithm (1) was the fastest list ranking implementation so far. Therefore, we expect that our result will have considerable practical relevance.

9.
The input of the Edge Multicut problem consists of an undirected graph G and pairs of terminals {s_1,t_1}, …, {s_m,t_m}; the task is to remove a minimum set of edges such that s_i and t_i are disconnected for every 1 ≤ i ≤ m. The parameterized complexity of the problem, parameterized by the maximum number k of edges that are allowed to be removed, is currently open. The main result of the paper is a parameterized 2-approximation algorithm: in time f(k)·n^(O(1)), we can either find a solution of size 2k or correctly conclude that there is no solution of size k. The proposed algorithm is based on a transformation of the Edge Multicut problem into a variant of the parameterized Max-2SAT problem, where the parameter is related to the number of clauses that are not satisfied. It follows from previous results that the latter problem can be 2-approximated in fixed-parameter time; on the other hand, we show here that it is W[1]-hard. Thus the additional contribution of the present paper is introducing the first natural W[1]-hard problem that is constant-ratio fixed-parameter approximable.

10.
We present an efficient algorithm for finding all approximate occurrences of a given pattern p of length m in a text t of length n, allowing for translocations of equal length adjacent factors and inversions of factors. The algorithm is based on an efficient filtering method and has an O(nm·max(α,β))-time complexity in the worst case and O(max(α,β,σ))-space complexity, where α and β are, respectively, the maximum length of the factors involved in any translocation and inversion, and σ is the alphabet size. Moreover, we show that our algorithm has an O(n) average time complexity, whenever , for ε>0. Experiments show that the proposed algorithm achieves very good results in practical cases.

11.
We present an efficient parameterized algorithm for the (k,t)-set packing problem, in which we are looking for a collection of k disjoint sets whose union consists of t elements. The complexity of the algorithm is O(2^(O(t)) nN log N). For the special case of sets of bounded size, this improves the O(k^(ck) n) algorithm of Jia et al. [J. Algorithms 50 (1) (2004) 106].
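A brute-force statement of the (k,t)-set packing question, exponential in k and intended only to pin down the definition (the parameterized algorithm above is of course far more efficient):

from itertools import combinations

def kt_set_packing(sets, k, t):
    # k sets, pairwise disjoint, union of exactly t elements (brute force).
    for combo in combinations(sets, k):
        if sum(len(s) for s in combo) == t and len(set().union(*combo)) == t:
            return combo
    return None

print(kt_set_packing([{1, 2}, {2, 3}, {3, 4}, {5}], 2, 4))   # ({1, 2}, {3, 4})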

12.
13.
The goal of scaled permuted string matching is to find all occurrences of a pattern in a text, in all possible scales and permutations. Given a text of length n and a pattern of length m, we present an O(n) algorithm.

14.
15.
Dr. K.-L. Chung, Computing, 1995, 54(2):119-125
Given a run-length coded text of length 2n and a run-length coded pattern of length 2m, m ≤ n commonly, this paper first presents an O(n+m) time sequential algorithm for string matching, then presents an O(1) time parallel algorithm on a two-dimensional m×n mesh with a reconfigurable bus system.

16.
Let X and Y be two strings of lengths n and m, respectively, and k and l, respectively, be the numbers of runs in their corresponding run-length encoded forms. We propose a simple algorithm for computing the longest common subsequence of two given strings X and Y in O(kl + min{p_1, p_2}) time, where p_1 and p_2 denote the numbers of elements in the bottom and right boundaries of the matched blocks, respectively. It improves the previously known time bound O(min{nl, km}) and outperforms the time bounds O(kl log kl) or O((k+l+q) log(k+l+q)) for some cases, where q denotes the number of matched blocks.
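For reference, the classical O(nm) dynamic program computes the same quantity on the uncompressed strings; the run-length-based algorithm above improves on this baseline when the inputs compress well.

def lcs_length(x, y):
    # Classical O(len(x)*len(y)) dynamic program on the uncompressed strings.
    n, m = len(x), len(y)
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dp[i][j] = (dp[i - 1][j - 1] + 1 if x[i - 1] == y[j - 1]
                        else max(dp[i - 1][j], dp[i][j - 1]))
    return dp[n][m]

print(lcs_length("aaabbbccc", "aabbcc"))   # 6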

17.
Information Sciences, 1987, 43(3):169-184
We give an algorithm for two-dimensional pattern matching in the presence of errors. We find that the complexity of our algorithm is O(k n_1 n_2 log n_2 + n_1² n_2 + k n_1 m_1 m_2), where the pattern is an n_1 × n_2 array, the text is an m_1 × m_2 array, and k is the number of mismatches allowed.
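The problem itself can be stated with a naive check that counts mismatches at every placement of the pattern array; this sketch is only the specification, not the algorithm analysed above.

def matches_2d(text, pattern, k):
    # All top-left corners (r, c) where the n1 x n2 pattern occurs in the
    # m1 x m2 text with at most k mismatches (naive check).
    n1, n2 = len(pattern), len(pattern[0])
    m1, m2 = len(text), len(text[0])
    hits = []
    for r in range(m1 - n1 + 1):
        for c in range(m2 - n2 + 1):
            mism = sum(pattern[i][j] != text[r + i][c + j]
                       for i in range(n1) for j in range(n2))
            if mism <= k:
                hits.append((r, c))
    return hits

print(matches_2d(["abab", "cdcd", "abab"], ["ab", "cd"], 0))   # [(0, 0), (0, 2)]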

18.
Consider a text string of length n, a pattern string of length m, and a match vector of length n which declares each location in the text to be either a mismatch (the pattern does not occur beginning at that location in the text) or a potential match (the pattern may occur beginning at that location in the text). Some of the potential matches could be false, i.e., the pattern may not occur beginning at some location in the text declared to be a potential match. We investigate the complexity of two problems in this context, namely, checking if there is any false match, and identifying all the false matches in the match vector. We present an algorithm on the CRCW PRAM that checks if there exists a false match in O(1) time using O(n) processors. This algorithm does not require preprocessing the pattern. Therefore, checking for false matches is provably simpler than string matching, since string matching takes time on the CRCW PRAM. We use this simple algorithm to convert the Karp-Rabin Monte Carlo type string-matching algorithm into a Las Vegas type algorithm without asymptotic loss in complexity. We also present an efficient algorithm for identifying all the false matches and, as a consequence, show that string-matching algorithms take time even given the flexibility to output a few false matches.
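Sequentially, the verification task is straightforward, which is what makes the constant-time CRCW PRAM result interesting; the sketch below simply re-checks each candidate ("potential match") position and reports the false ones.

def false_matches(text, pattern, candidates):
    # Candidate positions that do not actually match the pattern.
    m = len(pattern)
    return [i for i in candidates if text[i:i + m] != pattern]

print(false_matches("abcabd", "abc", [0, 3]))   # [3]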

19.
We present an algorithm to find a proper fraction in simplest reduced terms between two reduced proper fractions. A proper fraction is a rational number m/n with m<n and n>1. A fraction m/n is simpler than p/q if m ≤ p and n ≤ q, with at least one inequality strict. The algorithm operates by walking a Farey tree in maximum steps down each branch. Through Monte Carlo simulation, we find that the present algorithm finds a simpler interpolation of two fractions than using the Euclidean-Convergent [D.W. Matula, P. Kornerup, Foundations of finite precision rational arithmetic, Computing 2 (Suppl.) (1980) 85-111] walk of a Farey tree and terminating at the first fraction satisfying the bound. Analysis shows that the new algorithms, with very high probability, will find an interpolation that is simpler than at least one of the bounds, and thus take less storage space than at least one of the bounds.
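A standard way to find a simplest fraction strictly between two fractions is a Stern-Brocot (Farey tree) walk that takes maximal steps via integer division; the sketch below follows that well-known scheme and is offered only as an illustration, not as the paper's exact procedure or its exact notion of "simpler".

from fractions import Fraction

def simplest_between(lo, hi):
    # Smallest-denominator fraction strictly between lo and hi (lo < hi),
    # found by walking the Stern-Brocot tree with maximal integer-division steps.
    lo, hi = Fraction(lo), Fraction(hi)
    m = lo.numerator // lo.denominator        # floor(lo)
    if Fraction(m + 1) < hi:                  # an integer lies strictly inside
        return Fraction(m + 1)
    f_lo, f_hi = lo - m, hi - m               # both bounds now share integer part m
    if f_lo == 0:                             # interval (0, f_hi): answer is 1/q
        return m + Fraction(1, f_hi.denominator // f_hi.numerator + 1)
    # Recurse on reciprocals of the fractional parts (the interval flips).
    return m + 1 / simplest_between(1 / f_hi, 1 / f_lo)

print(simplest_between(Fraction(2, 7), Fraction(1, 3)))   # 3/10 (the mediant)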

20.
Design and Analysis of a Parallel String Matching Algorithm for Distributed Memory (cited 7 times: 0 self-citations, 7 by others)
陈国良 (Chen Guoliang), 林洁 (Lin Jie), 顾乃杰 (Gu Naijie). 《软件学报》 (Journal of Software), 2000, 11(6):771-778
Research on parallel string matching algorithms has largely concentrated on the PRAM (parallel random access machine) model; work on other, more realistic models is comparatively scarce. By applying the technique of parallelizing an optimal sequential algorithm and exploiting the periodicity of the pattern, this paper cleverly parallelizes an improved KMP (Knuth-Morris-Pratt) algorithm and proposes a simple, efficient, and highly scalable distributed string matching algorithm whose computation complexity is O(n/p + m) and whose communication complexity is O(u log p).
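The serial building block being parallelized is the classical KMP algorithm; for reference, a short sketch of its failure function and scan (the standard sequential KMP, not the distributed algorithm of the paper):

def kmp_search(text, pattern):
    # Classical Knuth-Morris-Pratt: failure function, then one left-to-right scan.
    m = len(pattern)
    fail = [0] * m                    # fail[j]: length of longest proper border of pattern[:j+1]
    k = 0
    for j in range(1, m):
        while k and pattern[j] != pattern[k]:
            k = fail[k - 1]
        if pattern[j] == pattern[k]:
            k += 1
        fail[j] = k
    hits, k = [], 0
    for i, c in enumerate(text):
        while k and c != pattern[k]:
            k = fail[k - 1]
        if c == pattern[k]:
            k += 1
        if k == m:                    # full match ending at position i
            hits.append(i - m + 1)
            k = fail[k - 1]
    return hits

print(kmp_search("ababcababa", "aba"))   # [0, 5, 7]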
