共查询到19条相似文献,搜索用时 140 毫秒
1.
2.
一种时间复杂度最优的精确串匹配算法 总被引:14,自引:2,他引:12
现有的串匹配算法通常以模式长度作为滑动窗口大小.在窗口移动后,往往会丢弃掉一些已扫描正文的信息.提出了LDM(linear DAWG matching)串匹配算法,该算法将正文分为[n/m]个相互重叠、大小为2m-1的扫描窗口.在每个扫描窗口内,算法批量地尝试m个可能位置,首先使用反向后缀自动机从窗口中间位置向前扫描模式前缀;若成功,则再使用正向有限状态自动机从中间位置向后扫描剩余的模式后缀.分析证明,LDM算法的最差、最好、平均时间复杂度分别达到了理论最好结果:O(n),O(n/m),O(n(1ogσm)/m).实际性能测试也验证了平均时间复杂度最优这一理论结果.而且,对于在较大字母表下查找短模式的情况,LDM算法速度在被测试算法中最快.总之,LDM算法不但适合进行离线模式匹配,而且还特别适合需要进行在线高速匹配的应用. 相似文献
3.
4.
5.
分别在两种重要并行计算模型中给出计算有向基因组排列的反转距离新的并行算法.基于Hannenhalli和Pevzner理论,分3个主要部分设计并行算法:构建断点图、计算断点图中圈数、计算断点图中障碍的数目.在CREW-PRAM模型上,算法使用O(n2)处理器,时间复杂度为O(log2n);在基于流水光总线的可重构线性阵列系统(linear array with a reconfigurable pipelined bus system, LARPBS)模型上,算法使用O(n3)处理器,计算时间复杂度为O(logn). 相似文献
6.
在消息传递并行机上的高效的最小生成树算法 总被引:5,自引:0,他引:5
基于传统的Borǔ vka串行最小生成树算法,提出了一个在消息传递并行机上的高效的最小生成树算法.并且采用3种方法来提高该算法的效率,即通过两趟合并及打包收缩的方法来减少通信开销,通过平衡数据分布的办法使各个处理器的计算量平衡.该算法的计算和通信复杂度分别为O(n2/p)和O((tsp+twn)n/p).在曙光-1000并行机上运行的实际效果是,对于有10 000个顶点的稀疏图,通过16个节点的运行加速比是12. 相似文献
7.
Salsa20 流密码算法是Estream 最终胜出的7 个算法之一.结合非线性方程的求解及Salsa20 的两个3 轮高概率差分传递链,对5 轮Salsa20 算法进行了代数-截断差分攻击.计算复杂度不大于O(2105),数据复杂度为O(211),存储复杂度为O(211),成功率为97.72%.到目前为止,该攻击结果是对5 轮Salsa20 算法攻击最好的结果. 相似文献
8.
9.
10.
将串行动态二表算法应用于并行三表算法的设计中,提出一种求解背包、精确的可满足性和集覆盖等背包类NP完全问题的并行三表六子表算法.基于EREW-PRAM模型,该算法可使用O(2n/8)的处理机在O(27n/16)的时间和O(213n/48)的空间求解n维背包类问题,其时间-空间-处理机折衷为O(25n/6).与现有文献的性能对比分析表明,该算法极大地提高了并行求解背包类问题的时间-空间-处理机折衷性能.由于该算法能够破解更高维数的背包类公钥和数字水印系统,其结论在密钥分析领域具有一定的理论和实际意义. 相似文献
11.
Dr. K. -L. Chung 《Computing》1995,54(2):119-125
Given a run-length coded text of length 2n and a run-length coded pattern of length 2m,m?n commonly, this paper first presents anO(n+m) time sequential algorithm for string matching, then presents anO(1) time parallel algorithm on a two-dimensionalm×n mesh with a reconfigurable bus system. 相似文献
12.
There is no known algorithm that solves the general case of theapproximate string matching problem with the extended edit distance, where the edit operations are: insertion, deletion, mismatch and swap, in timeo(nm), wheren is the length of the text andm is the length of the pattern. In an effort to study this problem, the edit operations were analysed independently. It turns
out that the approximate matching problem with only the mismatch operation can be solved in timeO(n √m logm). If the only edit operation allowed is swap, then the problem can be solved in timeO(n logm logσ), whereσ=min(m, |Σ|). In this paper we show that theapproximate string matching problem withswap andmismatch as the edit operations, can be computed in timeO(n √m logm).
Amihood Amir was partially supported by NSF Grant CCR-01-04494 and ISF Grant 35/05. This work is part of Estrella Eisenberg’s
M.Sc. thesis. Ely Porat was partially supported by GIF Young Scientists Program Grant 2055-1168.6/2002. 相似文献
13.
Given an alphabet Σ={1,2,…,|Σ|} text string T∈Σ
n
and a pattern string P∈Σ
m
, for each i=1,2,…,n−m+1 define L
p
(i) as the p-norm distance when the pattern is aligned below the text and starts at position i of the text. The problem of pattern matching with L
p
distance is to compute L
p
(i) for every i=1,2,…,n−m+1. We discuss the problem for d=1,2,∞. First, in the case of L
1 matching (pattern matching with an L
1 distance) we show a reduction of the string matching with mismatches problem to the L
1 matching problem and we present an algorithm that approximates the L
1 matching up to a factor of 1+ε, which has an
O(\frac1e2nlogmlog|S|)O(\frac{1}{\varepsilon^{2}}n\log m\log|\Sigma|)
run time. Then, the L
2 matching problem (pattern matching with an L
2 distance) is solved with a simple O(nlog m) time algorithm. Finally, we provide an algorithm that approximates the L
∞ matching up to a factor of 1+ε with a run time of
O(\frac1enlogmlog|S|)O(\frac{1}{\varepsilon}n\log m\log|\Sigma|)
. We also generalize the problem of String Matching with mismatches to have weighted mismatches and present an O(nlog 4
m) algorithm that approximates the results of this problem up to a factor of O(log m) in the case that the weight function is a metric. 相似文献
14.
Theapproximate string matching problem is, given a text string, a pattern string, and an integerk, to find in the text all approximate occurrences of the pattern. An approximate occurrence means a substring of the text with edit distance at mostk from the pattern. We give a newO(kn) algorithm for this problem, wheren is the length of the text. The algorithm is based on the suffix automaton with failure transitions and on the diagonalwise monotonicity of the edit distance table. Some experiments showing that the algorithm has a small overhead are reported. 相似文献
15.
A recent trend in stringology explores the possibility of utilizing text compression to speed up similarity computation between strings. In this line of investigation, run-length encoding is one of the earliest studied compression schemes. Despite its simple coding nature, the only positive result before this work is the computation of the in-del distance (dual of longest common subsequence), which requires O(mnlogmn) time, where m and n denote the number of runs of the input strings. The worst-case time complexity of computing the edit distance between two run-length encoded strings still depends on the uncompressed string lengths. In this paper, we break the foundational gap by providing its first “fully compressed” algorithm whose running time depends solely on the compressed string lengths. Specifically, given two strings, compressed into m and n runs, m≤n, we present an O(mn 2)-time algorithm for computing the edit distance of the strings. Our approach also yields the first fully compressed solution to approximate matching of a pattern of m runs in a text of n runs in O(mn 2) time. 相似文献
16.
17.
The classical pattern matching paradigm is that of seeking occurrences of one string in another, where both strings are drawn from an alphabet set Σ. In the parameterized pattern matching model, a consistent renaming of symbols from Σ is allowed in a match. The parameterized matching paradigm has proven useful in problems in software engineering, computer vision, and other applications. In classical pattern matching, both the text and pattern are strings. Applications such as searching in xml or searching in hypertext require searching strings in non-linear structures such as trees or graphs. There has been work in the literature on exact and approximate parameterized matching, as well as work on exact and approximate string matching on non-linear structures. In this paper we explore parameterized matching in non-linear structures. We prove that exact parameterized matching on trees can be computed in linear time for alphabets in an O(n)-size integer range, and in time O(nlogm) in general, where n is the tree size and m the pattern length. These bounds are optimal in the comparison model. We also show that exact parameterized matching on directed acyclic graphs (DAGs) is NP-complete. 相似文献
18.
In this paper we present a new algorithm for adaptive prefix coding. Our algorithm encodes a text S of m symbols in O(m) time, i.e., in O(1) amortized time per symbol. The length of the encoded string is bounded above by (H+1)m+O(nlog 2
m) bits where n is the alphabet size and H is the entropy.
This is the first algorithm that adaptively encodes a text in O(m) time and achieves an almost optimal bound on the encoding length in the worst case. Besides that, our algorithm does not
depend on an explicit code tree traversal.
A preliminary version of this paper appeared in the Proceedings of the 2006 IEEE International Symposium on Information Theory
(ISIT 2006).
M. Karpinski’s work partially supported by a DFG grant, Max-Planck Research Prize, and IST grant 14036 (RAND-APX).
Y. Nekrich’s work partially supported by IST grant 14036 (RAND-APX). 相似文献
19.
We present an algorithm to approximate edit distance between two ordered and rooted trees of bounded degree. In this algorithm,
each input tree is transformed into a string by computing the Euler string, where labels of some edges in the input trees
are modified so that structures of small subtrees are reflected to the labels. We show that the edit distance between trees
is at least 1/6 and at most O(n
3/4) of the edit distance between the transformed strings, where n is the maximum size of two input trees and we assume unit cost edit operations for both trees and strings. The algorithm
works in O(n
2) time since computation of edit distance and reconstruction of tree mapping from string alignment takes O(n
2) time though transformation itself can be done in O(n) time. 相似文献