共查询到20条相似文献,搜索用时 281 毫秒
1.
2.
3.
D. Julian M. Davies 《Software》1982,12(8):709-717
Proposals have been made for fast string search algorithms that can search a string for a given pattern without having to examine every character of the text passed over. These algorithms are effective only if search patterns are in practice long enough, and enough text is traversed in a search to justify the costs of initializing the sophisticated search algorithm. We report observations of routine uses of an editor instrumented to determine the characteristics of use. It appears that many searches are indeed for patterns of 3,5 or more characters, and that these searches cover substantial amounts of text. However, searches for patterns of a single character are very common, while only moving a short distance on average. The implementation of a sophisticated search algorithm must take care to deal with short single-character searches efficiently. 相似文献
4.
Faster Approximate String Matching 总被引:11,自引:0,他引:11
We present a new algorithm for on-line approximate string matching. The algorithm is based on the simulation of a nondeterministic
finite automaton built from the pattern and using the text as input. This simulation uses bit operations on a RAM machine
with word length w =
Ω
(log n) bits, where n is the text size. This is essentially similar to the model used in Wu and Manber's work, although we improve the search
time by packing the automaton states differently. The running time achieved is O(n) for small patterns (i.e., whenever mk = O(log n)) , where m is the pattern length and k<m is the number of allowed errors. This is in contrast with the result of Wu and Manber, which is O(kn) for m=O(log n) . Longer patterns can be processed by partitioning the automaton into many machine words, at O(mk/w n) search cost. We allow generalizations in the pattern, such as classes of characters, gaps, and others, at essentially the
same search cost.
We then explore other novel techniques to cope with longer patterns. We show how to partition the pattern into short subpatterns
which can be searched with less errors using the simple automaton, to obtain an average cost close to . Moreover, we allow the superimposition of many subpatterns in a single automaton, obtaining near average complexity (σ is the alphabet size).
We perform a complete analysis of all the techniques and show how to combine them in an optimal form, also obtaining new
tighter bounds for the probability of an approximate occurrence in random text. Finally, we show experimental results comparing
our algorithms against previous work. These experiments show that our algorithms are among the fastest for typical text searching,
being the fastest in some cases. Although we aim mainly at text searching, we believe that our ideas can be successfully applied
to other areas such as computational biology.
Received November 22, 1996; revised October 13 and December 5, 1997. 相似文献
5.
一种改进的Wu-Manber多模式匹配算法及应用 总被引:8,自引:0,他引:8
本文针对Wu-Manber多模式匹配算法在处理后缀模式情况下的不足,给出了一种改进的后缀模式处理算法,减少了匹配过程中字符比较的次数,提高了算法的运行效率。本文在随机选择的TREC2000的52,067篇文档上进行了全文检索实验, 对比了Wu-Manber算法、使用后缀模式的改进算法、不使用后缀模式的简单改进等三种算法的匹配过程中字符比较的次数。实验结果说明,本文的改进能够比较稳定的减少匹配过程中字符比较的次数,提高匹配的速度和效率。 相似文献
6.
求解NP难问题一直是计算机科学技术中的一个瓶颈任务。自20世纪70年代以来的研究表明,不存在求解此类问题的完整严格的有效算法。因此用启发式方法求解成为当今研究的一个热点。圆形packing问题是一个有着很高理论和实用价值的NP难问题。该文提出了一些有效的搜索策略,得到了一个求解它的快速有效启发式算法。最后用计算实例验证了此算法的有效性,计算结果表明此算法明显优于已有快速算法。 相似文献
7.
近年来,随着生物信息学、信息检索等领域的发展,串模式匹配问题被不断扩展.其中,具有代表性的是在模式中引入可变长度的通配符而形成带有通配符的模式匹配(PMWL).该问题定义的灵活性给用户提供了方便,却也造成了求解上的困难.因此,如何在多项式时间内得到更好的匹配解成为研究的焦点.提出了一种启发式的小兵算法.小兵算法通过将PMWL问题转化为路径搜索问题,并借鉴动态剪枝思想,在算法搜索的过程中动态地将不可能的匹配位置剪枝,从而提高解的质量.实验在真实DNA序列上进行,并人工生成了196个模式.结果表明,相比于目前最有效的SAIL算法,小兵算法在绝大多数的尾部有重复字符的模式中可以获得更好的匹配解. 相似文献
8.
T-shape patterns are often used in dividing stock plates into rectangular pieces, because they make good balance between plate cost and cutting complexity. A dividing cut separates the plate into two segments, each of which contains parallel strips, and the strip orientations of the two segments are perpendicular to each other. This paper presents a heuristic algorithm for constrained T-shape patterns, where the optimization objective is to maximize the pattern value, and the frequency of each piece type does not exceed the demand. The algorithm considers many dividing-cut positions, determines the pattern value associated to each position using a layout-generation procedure, and selects the one with the maximum pattern value as the solution. Pseudo upper bounds are used to skip some non-promising positions. The computational results show that the algorithm is fast and able to get solutions better than those of the optimal two-staged patterns in terms of material utilization. 相似文献
9.
10.
A geometric reasoning based algorithm for point pattern matching 总被引:1,自引:0,他引:1
Point pattern matching (PPM) is an important topic in computer vision and pattern recog-nition . It can be widely used in many areas such as image registration, object recognition, motion de-tection, target tracking, autonomous navigation, and pose estimation. This paper discusses the in-complete matching problem of two point sets under Euclidean transformation. According to geometric reasoning, some definitions for matching clique, support point pair, support index set, and support in-dex matrix, etc. are given. Based on the properties and theorems of them, a novel reasoning algo-rithm is presented, which searches for the optimal solution from top to bottom and could find out as many consistent corresponding point pairs as possible. Theoretical analysis and experimental results show that the new algorithm is very effective, and could be, under some conditions, applied to the PPM problem under other kind of transformations. 相似文献
11.
Simple deterministic wildcard matching 总被引:2,自引:0,他引:2
Peter Clifford 《Information Processing Letters》2007,101(2):53-54
We present a simple and fast deterministic solution to the string matching with don't cares problem. The task is to determine all positions in a text where a pattern occurs, allowing both pattern and text to contain single character wildcards. Our algorithm takes O(nlogm) time for a text of length n and a pattern of length m and in our view the algorithm is conceptually simpler than previous approaches. 相似文献
12.
Thierry Lecroq 《Information Processing Letters》2007,102(6):229-235
String matching is the problem of finding all the occurrences of a pattern in a text. We propose a very fast new family of string matching algorithms based on hashing q-grams. The new algorithms are the fastest on many cases, in particular, on small size alphabets. 相似文献
13.
Due to the huge size of patterns to be searched,multiple pattern searching remains a challenge to several newly-arising applications like network intrusion detection.In this paper,we present an attempt to design efficient multiple pattern searching algorithms on multi-core architectures.We observe an important feature which indicates that the multiple pattern matching time mainly depends on the number and minimal length of patterns.The multi-core algorithm proposed in this paper leverages this feature to decompose pattern set so that the parallel execution time is minimized.We formulate the problem as an optimal decomposition and scheduling of a pattern set,then propose a heuristic algorithm,which takes advantage of dynamic programming and greedy algorithmic techniques,to solve the optimization problem.Experimental results suggest that our decomposition approach can increase the searching speed by more than 200% on a 4-core AMD Barcelona system. 相似文献
14.
分析了中英文混合环境下多模式匹配的特点,以及已有多模式匹配算法应用于中英文混合环境时的不足,给出并证明了中英文混合环境下多模式匹配算法的性能定理,提出了一种适合于中英文混合环境的基于线索完全哈希Trie结构的多模式匹配算法.该算法扩展了标准Trie结构,以中英文字符内码为键值构造完全哈希Trie匹配机,并利用模式串之间的关系对Trie匹配机进行线索化.理论分析与实验结果表明,所提出的算法在匹配中无需复杂的哈希运算,不需要回溯匹配指针,在中英文混合环境下能够进行正确、高效的匹配,而且不存在空间膨胀问题,具有较低的空间与时间复杂度,有较大理论与应用价值. 相似文献
15.
运动估计是帧间视频编码中的关键技术,但现有的快速搜索算法中大都是次优算法,且易陷于局部极小点,针对此问题,提出了将一种遗传算法应用于块运动估计中的遗传搜索匹配估计算法(GSAME),该方法把块运动向量作为遗传染色体,经过杂交、变异等操作,以便得到全局意义上的最优解,并与经典的全局搜索法和三步搜索法进行了比较,实验结果显示,该算法不仅有效地解决了局部极小问题,而且计算量也较少。 相似文献
16.
We give a class of heuristic algorithms for minimum weight perfect matching on a complete edge-weighted graph K(V) satisfying the triangle inequality, where V is a set of an even number, n, of vertices. This class is a generalization of the Onethird heuristics, the hypergreedy heuristic, and it possibly employs
any given exact or approximate perfect matching algorithm as an auxiliary heuristic to an appropriate subgraph of K(V). In particular, by using the heuristic of Gabow et al. [3] as its auxiliary heuristic, our algorithm can obtain a solution whose weight is at most times the weight of the optimal solution in time, or a solution with an error of in O(n
2
) time.
Received July 21, 1994; revised November 28, 1995. 相似文献
17.
18.
Kam A.C. Kopec G.E. 《IEEE transactions on pattern analysis and machine intelligence》1996,18(9):945-950
This correspondence describes an approach to reducing the computational cost of document image decoding by viewing it as a heuristic search problem. The kernel of the approach is a modified dynamic programming (DP) algorithm, called the iterated complete path (ICP) algorithm, that is intended for use with separable source models. A set of heuristic functions are presented for decoding formatted text with ICP. Speedups of 3-25 over DP have been observed when decoding text columns and telephone yellow pages using ICP and the proposed heuristics 相似文献
19.
M. J. Showalter 《Computers & Operations Research》1977,4(4):257-269
The work force tour assignment problem in a U.S. Postal Service Sectional Center Facility consists of specifying the weekly tour start time, work center assignment, and mail class responsibility for each postal employee. Any efficient work force assignment must be made subject to a number of constraints including tour schedule requirements, work center capacities, mail arrival volumes, and mail flow patterns through the processing system. A heuristic algorithm of a building, or construction, nature is described which evaluates these many constraints in developing efficient work force tour assignment solutions. Once the problem setting and the heuristic algorithm are described, the use of the procedure is demonstrated by analyzing a hypothetical problem. 相似文献