首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 281 毫秒
1.
以词间空格作为自然分隔符,非常容易获取维吾尔文中的词,但又很难获取结构完整的语义词,因此多种文本处理效果总是很不理想。提出维吾尔文组词的新概念,将数据挖掘中的频繁模式挖掘方法引入到维吾尔文组词中,再结合维吾尔文的语言文字特点,将无先验知识的模式挖掘问题转化为特定模式的匹配问题,提出了一种快速高效的频繁模式挖掘算法,来获取语义完整的维吾尔文词。实验结果表明,通过该算法获取的维吾尔文词,在结构上是稳定的,语义上是完整而独立的。  相似文献   

2.
讨论了带有通配符和长度约束的模式匹配(PMWL)问题,其中模式由子模式序列集组成,两个相邻子模式的间隔在一定长度范围内。针对PMWL问题,已有工作包括设计启发式求解算法和对特殊情况进行完备性分析,然而还需要构建问题的基础求解模型。借鉴约束可满足问题框架,构建了由变量、值域和约束组成的三元组求解模型,对PMWL问题的基本概念和基本性质给出了形式化描述。最后,给出了算法求解PMWL问题的特定条件下的完备解。  相似文献   

3.
Proposals have been made for fast string search algorithms that can search a string for a given pattern without having to examine every character of the text passed over. These algorithms are effective only if search patterns are in practice long enough, and enough text is traversed in a search to justify the costs of initializing the sophisticated search algorithm. We report observations of routine uses of an editor instrumented to determine the characteristics of use. It appears that many searches are indeed for patterns of 3,5 or more characters, and that these searches cover substantial amounts of text. However, searches for patterns of a single character are very common, while only moving a short distance on average. The implementation of a sophisticated search algorithm must take care to deal with short single-character searches efficiently.  相似文献   

4.
Faster Approximate String Matching   总被引:11,自引:0,他引:11  
We present a new algorithm for on-line approximate string matching. The algorithm is based on the simulation of a nondeterministic finite automaton built from the pattern and using the text as input. This simulation uses bit operations on a RAM machine with word length w = Ω (log n) bits, where n is the text size. This is essentially similar to the model used in Wu and Manber's work, although we improve the search time by packing the automaton states differently. The running time achieved is O(n) for small patterns (i.e., whenever mk = O(log n)) , where m is the pattern length and k<m is the number of allowed errors. This is in contrast with the result of Wu and Manber, which is O(kn) for m=O(log n) . Longer patterns can be processed by partitioning the automaton into many machine words, at O(mk/w n) search cost. We allow generalizations in the pattern, such as classes of characters, gaps, and others, at essentially the same search cost. We then explore other novel techniques to cope with longer patterns. We show how to partition the pattern into short subpatterns which can be searched with less errors using the simple automaton, to obtain an average cost close to . Moreover, we allow the superimposition of many subpatterns in a single automaton, obtaining near average complexity (σ is the alphabet size). We perform a complete analysis of all the techniques and show how to combine them in an optimal form, also obtaining new tighter bounds for the probability of an approximate occurrence in random text. Finally, we show experimental results comparing our algorithms against previous work. These experiments show that our algorithms are among the fastest for typical text searching, being the fastest in some cases. Although we aim mainly at text searching, we believe that our ideas can be successfully applied to other areas such as computational biology. Received November 22, 1996; revised October 13 and December 5, 1997.  相似文献   

5.
一种改进的Wu-Manber多模式匹配算法及应用   总被引:8,自引:0,他引:8  
本文针对Wu-Manber多模式匹配算法在处理后缀模式情况下的不足,给出了一种改进的后缀模式处理算法,减少了匹配过程中字符比较的次数,提高了算法的运行效率。本文在随机选择的TREC2000的52,067篇文档上进行了全文检索实验, 对比了Wu-Manber算法、使用后缀模式的改进算法、不使用后缀模式的简单改进等三种算法的匹配过程中字符比较的次数。实验结果说明,本文的改进能够比较稳定的减少匹配过程中字符比较的次数,提高匹配的速度和效率。  相似文献   

6.
求解NP难问题一直是计算机科学技术中的一个瓶颈任务。自20世纪70年代以来的研究表明,不存在求解此类问题的完整严格的有效算法。因此用启发式方法求解成为当今研究的一个热点。圆形packing问题是一个有着很高理论和实用价值的NP难问题。该文提出了一些有效的搜索策略,得到了一个求解它的快速有效启发式算法。最后用计算实例验证了此算法的有效性,计算结果表明此算法明显优于已有快速算法。  相似文献   

7.
王海平  戴玮  郭丹 《计算机科学》2015,42(4):244-248
近年来,随着生物信息学、信息检索等领域的发展,串模式匹配问题被不断扩展.其中,具有代表性的是在模式中引入可变长度的通配符而形成带有通配符的模式匹配(PMWL).该问题定义的灵活性给用户提供了方便,却也造成了求解上的困难.因此,如何在多项式时间内得到更好的匹配解成为研究的焦点.提出了一种启发式的小兵算法.小兵算法通过将PMWL问题转化为路径搜索问题,并借鉴动态剪枝思想,在算法搜索的过程中动态地将不可能的匹配位置剪枝,从而提高解的质量.实验在真实DNA序列上进行,并人工生成了196个模式.结果表明,相比于目前最有效的SAIL算法,小兵算法在绝大多数的尾部有重复字符的模式中可以获得更好的匹配解.  相似文献   

8.
T-shape patterns are often used in dividing stock plates into rectangular pieces, because they make good balance between plate cost and cutting complexity. A dividing cut separates the plate into two segments, each of which contains parallel strips, and the strip orientations of the two segments are perpendicular to each other. This paper presents a heuristic algorithm for constrained T-shape patterns, where the optimization objective is to maximize the pattern value, and the frequency of each piece type does not exceed the demand. The algorithm considers many dividing-cut positions, determines the pattern value associated to each position using a layout-generation procedure, and selects the one with the maximum pattern value as the solution. Pseudo upper bounds are used to skip some non-promising positions. The computational results show that the algorithm is fast and able to get solutions better than those of the optimal two-staged patterns in terms of material utilization.  相似文献   

9.
提出了一种全新的迁移蜂群优化算法,并应用到电力系统无功优化问题.利用Q学习的试错与奖励机制构造蜂群的学习模式,并采用强化学习的行为迁移技术实现蜂群的迁移学习.为解决算法求解多变量优化问题遇到的维数灾难,提出了状态-组合动作链的方式将状态-动作空间分解成若干低维空间,明显降低算法的计算难度.仿真结果表明:本文所提算法可以保证最优解质量的同时,寻优速度能提高到传统启发式智能算法的4~67倍左右,非常适用于大规模复杂系统非线性规划问题的快速求解.  相似文献   

10.
A geometric reasoning based algorithm for point pattern matching   总被引:1,自引:0,他引:1  
Point pattern matching (PPM) is an important topic in computer vision and pattern recog-nition . It can be widely used in many areas such as image registration, object recognition, motion de-tection, target tracking, autonomous navigation, and pose estimation. This paper discusses the in-complete matching problem of two point sets under Euclidean transformation. According to geometric reasoning, some definitions for matching clique, support point pair, support index set, and support in-dex matrix, etc. are given. Based on the properties and theorems of them, a novel reasoning algo-rithm is presented, which searches for the optimal solution from top to bottom and could find out as many consistent corresponding point pairs as possible. Theoretical analysis and experimental results show that the new algorithm is very effective, and could be, under some conditions, applied to the PPM problem under other kind of transformations.  相似文献   

11.
Simple deterministic wildcard matching   总被引:2,自引:0,他引:2  
We present a simple and fast deterministic solution to the string matching with don't cares problem. The task is to determine all positions in a text where a pattern occurs, allowing both pattern and text to contain single character wildcards. Our algorithm takes O(nlogm) time for a text of length n and a pattern of length m and in our view the algorithm is conceptually simpler than previous approaches.  相似文献   

12.
String matching is the problem of finding all the occurrences of a pattern in a text. We propose a very fast new family of string matching algorithms based on hashing q-grams. The new algorithms are the fastest on many cases, in particular, on small size alphabets.  相似文献   

13.
Due to the huge size of patterns to be searched,multiple pattern searching remains a challenge to several newly-arising applications like network intrusion detection.In this paper,we present an attempt to design efficient multiple pattern searching algorithms on multi-core architectures.We observe an important feature which indicates that the multiple pattern matching time mainly depends on the number and minimal length of patterns.The multi-core algorithm proposed in this paper leverages this feature to decompose pattern set so that the parallel execution time is minimized.We formulate the problem as an optimal decomposition and scheduling of a pattern set,then propose a heuristic algorithm,which takes advantage of dynamic programming and greedy algorithmic techniques,to solve the optimization problem.Experimental results suggest that our decomposition approach can increase the searching speed by more than 200% on a 4-core AMD Barcelona system.  相似文献   

14.
孙钦东  黄新波  王倩 《软件学报》2008,19(3):674-686
分析了中英文混合环境下多模式匹配的特点,以及已有多模式匹配算法应用于中英文混合环境时的不足,给出并证明了中英文混合环境下多模式匹配算法的性能定理,提出了一种适合于中英文混合环境的基于线索完全哈希Trie结构的多模式匹配算法.该算法扩展了标准Trie结构,以中英文字符内码为键值构造完全哈希Trie匹配机,并利用模式串之间的关系对Trie匹配机进行线索化.理论分析与实验结果表明,所提出的算法在匹配中无需复杂的哈希运算,不需要回溯匹配指针,在中英文混合环境下能够进行正确、高效的匹配,而且不存在空间膨胀问题,具有较低的空间与时间复杂度,有较大理论与应用价值.  相似文献   

15.
一种遗传搜索块匹配运动估计算法   总被引:2,自引:0,他引:2       下载免费PDF全文
运动估计是帧间视频编码中的关键技术,但现有的快速搜索算法中大都是次优算法,且易陷于局部极小点,针对此问题,提出了将一种遗传算法应用于块运动估计中的遗传搜索匹配估计算法(GSAME),该方法把块运动向量作为遗传染色体,经过杂交、变异等操作,以便得到全局意义上的最优解,并与经典的全局搜索法和三步搜索法进行了比较,实验结果显示,该算法不仅有效地解决了局部极小问题,而且计算量也较少。  相似文献   

16.
We give a class of heuristic algorithms for minimum weight perfect matching on a complete edge-weighted graph K(V) satisfying the triangle inequality, where V is a set of an even number, n, of vertices. This class is a generalization of the Onethird heuristics, the hypergreedy heuristic, and it possibly employs any given exact or approximate perfect matching algorithm as an auxiliary heuristic to an appropriate subgraph of K(V). In particular, by using the heuristic of Gabow et al. [3] as its auxiliary heuristic, our algorithm can obtain a solution whose weight is at most times the weight of the optimal solution in time, or a solution with an error of in O(n 2 ) time. Received July 21, 1994; revised November 28, 1995.  相似文献   

17.
一种改进遗传搜索块匹配运动估计算法   总被引:1,自引:1,他引:1  
运动估计是帧间视频编码中的关键技术,但现有的快速搜索算法中大都是次优算法,且易陷于局部极小点。针对此问题,提出了一种改进型遗传算法应用于块运动估计中的遗传搜索块匹配运动估计算法(MGSAME)。该方法把块运动向量作为遗传染色体,经过杂交、变异等操作,以便得到全局意义上的最优解,并与经典的全局搜索法、三步搜索法和传统遗传算法(SGA)进行了比较。实验结果显示,该算法不仅有效地解决了局部极小问题,而且计算量也较少。  相似文献   

18.
This correspondence describes an approach to reducing the computational cost of document image decoding by viewing it as a heuristic search problem. The kernel of the approach is a modified dynamic programming (DP) algorithm, called the iterated complete path (ICP) algorithm, that is intended for use with separable source models. A set of heuristic functions are presented for decoding formatted text with ICP. Speedups of 3-25 over DP have been observed when decoding text columns and telephone yellow pages using ICP and the proposed heuristics  相似文献   

19.
The work force tour assignment problem in a U.S. Postal Service Sectional Center Facility consists of specifying the weekly tour start time, work center assignment, and mail class responsibility for each postal employee. Any efficient work force assignment must be made subject to a number of constraints including tour schedule requirements, work center capacities, mail arrival volumes, and mail flow patterns through the processing system. A heuristic algorithm of a building, or construction, nature is described which evaluates these many constraints in developing efficient work force tour assignment solutions. Once the problem setting and the heuristic algorithm are described, the use of the procedure is demonstrated by analyzing a hypothetical problem.  相似文献   

20.
在边缘计算环境中,为用户匹配合适的服务器是一个关键问题,可以有效提升服务质量.文中将边缘用户分配问题转换为一个受距离和服务器资源约束的二分图匹配问题,并将其建模为一个0-1整数规划问题进行优化.在离线状态下,基于精确式算法的优化模型可以求得最优分配策略,但其求解时间过长,无法处理规模较大的数据,不适用于现实服务环境.因...  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号