Similar documents
 Found 20 similar documents; search took 15 ms
1.
New Techniques for Regular Expression Searching   Total citations: 1, self-citations: 0, cited by others: 1
We present two new techniques for regular expression searching and use them to derive faster practical algorithms. Based on the specific properties of Glushkov's nondeterministic finite automaton construction algorithm, we show how to encode a deterministic finite automaton (DFA) using $O(m2^m)$ bits, where $m$ is the number of characters, excluding operator symbols, in the regular expression. This compares favorably against the worst case of $O(m2^m|\Sigma|)$ bits needed by a classical DFA representation (where $\Sigma$ is the alphabet) and $O(m2^{2m})$ bits needed by the Wu and Manber approach implemented in Agrep. We also present a new way to search for regular expressions, which is able to skip text characters. The idea is to determine the minimum length $\ell$ of a string matching the regular expression, manipulate the original automaton so that it recognizes all the reverse prefixes of length up to $\ell$ of the strings originally accepted, and use it to skip text characters as done for exact string matching in previous work. We combine these techniques into two algorithms, one able and one unable to skip text characters. The algorithms are simple to implement, and our experiments show that they permit fast searching for regular expressions, normally faster than any existing algorithm.
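The skipping technique in this abstract starts from the minimum length of a string matching the regular expression. The following is a minimal sketch of that one step only, not the authors' implementation. It assumes a hypothetical tuple-based syntax tree with node kinds `char`, `cat`, `alt`, `star`, `plus`, and `opt`, none of which come from the paper.

```python
def min_match_length(node):
    """Return the length of the shortest string matched by a regex tree.

    Hypothetical node shapes (an assumption for this sketch):
      ("char", c)        a single character
      ("cat", a, b)      concatenation
      ("alt", a, b)      alternation a|b
      ("star", a)        zero or more repetitions
      ("plus", a)        one or more repetitions
      ("opt", a)         zero or one occurrence
    """
    kind = node[0]
    if kind == "char":
        return 1
    if kind == "cat":
        # Both parts must appear, so their minimum lengths add up.
        return min_match_length(node[1]) + min_match_length(node[2])
    if kind == "alt":
        # Either branch may be taken, so the shorter one wins.
        return min(min_match_length(node[1]), min_match_length(node[2]))
    if kind in ("star", "opt"):
        # The empty string is always accepted.
        return 0
    if kind == "plus":
        # At least one repetition is required.
        return min_match_length(node[1])
    raise ValueError(f"unknown node kind: {kind!r}")


# (ab|c)d+e*  has shortest match "cd", of length 2.
example = ("cat",
           ("cat",
            ("alt", ("cat", ("char", "a"), ("char", "b")), ("char", "c")),
            ("plus", ("char", "d"))),
           ("star", ("char", "e")))
print(min_match_length(example))  # 2
```

A minimum length of 0 means the expression matches the empty string, in which case a window of that length gives no room to skip text.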

2.
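An Efficient Regular Expression Compression Algorithm for Deep Packet Inspection   Total citations: 4, self-citations: 2, cited by others: 4
徐乾, 鄂跃鹏, 葛敬国, 钱华林. 《软件学报》 (Journal of Software), 2009, 20(8): 2214-2226
A regular expression compression algorithm based on deterministic finite automata (DFA) is proposed. First, a distending rate (DR) is defined to describe the expansion characteristics of a regular expression. Then, based on DR, a splitting algorithm called RECCADR (regular expressions cut and combine algorithm based on DR) is proposed; it selects and isolates the fragments that cause DFA state blow-up, reducing the storage required by a single regular expression. In addition, based on the combination relationships among regular expressions, a selective grouping algorithm called REGADR (regular expressions group algorithm based on DR) is proposed; within an acceptable total storage requirement, selective grouping greatly reduces the number of state machines and lowers the complexity of the matching algorithm.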

3.
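Current deep packet inspection algorithms usually need to convert regular expressions into NFAs or DFAs. As network bandwidth keeps increasing, however, the storage occupied by NFA and DFA states grows ever larger and severely tests the storage capacity of the system. To address this problem, a grouping algorithm based on the compatibility of regular expressions is proposed for grouping the expressions. Experiments show that the algorithm reduces the number of NFA and DFA states and improves matching efficiency.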

4.
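金军航, 张大方, 黄昆. 《计算机工程》 (Computer Engineering), 2010, 36(19): 269-271
To compare and analyze existing high-performance regular expression matching algorithms comprehensively, recent algorithms such as DFA, D2FA, CD2FA, mDFA, and XFA are implemented, and the Snort rule set is used to evaluate their storage space and matching time. The experimental results show that XFA uses 84.9% to 89.9% less storage than mDFA, while its matching time is 38.9% to 174.6% longer than that of mDFA. XFA scales well in both storage and matching efficiency: when the number of rules grows 8-fold, the storage of mDFA grows 64-fold, whereas the storage of XFA grows only 16-fold and its matching time increases by only 61.3%.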

5.
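Deterministic finite automata (DFA) match much faster than non-deterministic finite automata (NFA), but converting a large number of regular expressions into a DFA causes state explosion and consumes a huge amount of storage. An expansion coefficient (EC) is first defined to describe the expansion characteristics of regular expressions. Building on this concept, an efficient grouping algorithm, IGA (Improved Grouping Algorithm), is proposed to group regular expressions, isolating from one another the expressions that are prone to causing state explosion and thereby saving storage. The experimental results show that, compared with the original algorithm, IGA reduces the number of states by 25% on average for the same number of groups.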
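The abstract does not define the expansion coefficient or the steps of IGA, so the following is only an illustrative sketch of the general idea of keeping explosion-prone expressions apart. It is not the IGA algorithm. The scores in `ec`, the `threshold` parameter, and the load-balancing rule are all assumptions made for this example.

```python
def group_by_expansion(ec, threshold):
    """Greedy grouping sketch.

    ec        -- dict mapping each regex to a hypothetical expansion score
    threshold -- hypothetical cut-off above which a regex counts as
                 explosion-prone

    Each explosion-prone regex seeds its own group, so no two of them are
    compiled together. Every remaining regex joins the group whose total
    score is currently smallest.
    """
    explosive = sorted((r for r in ec if ec[r] >= threshold),
                       key=lambda r: -ec[r])
    calm = [r for r in ec if ec[r] < threshold]

    groups = [[r] for r in explosive] or [[]]
    loads = [sum(ec[r] for r in g) for g in groups]

    for r in calm:
        # Pick the lightest group so far (the first one on ties).
        i = min(range(len(groups)), key=lambda k: loads[k])
        groups[i].append(r)
        loads[i] += ec[r]
    return groups


scores = {".*a.{8}b": 9.0, ".*c.{6}d": 7.0, "foo": 1.0, "bar": 1.0, "baz": 1.5}
for group in group_by_expansion(scores, threshold=5.0):
    print(group)
```

In this sketch the number of groups equals the number of explosion-prone expressions. A real grouping algorithm would also have to respect a memory budget per group, which the sketch leaves out.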

6.
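Deterministic finite automata (DFA) are an effective means of implementing regular expression matching, but DFA state transitions are serial, which makes matching slow and unable to meet the performance requirements of deep packet inspection (DPI) in high-speed backbone networks. A fine-grained parallel state-transition method called LBDFA (Loopback DFA) is proposed; it increases matching speed by parallelizing consecutive transitions on Loopback states. In addition, a Bloom filter is used to eliminate temporary deviations in these parallel transitions, further increasing the potential for parallelism. Tests on the L7-filter and Snort rule sets show that LBDFA can meet regular expression matching requirements above 10 Gbps.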

7.
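Regular expressions are the most powerful input-control technique in data validation. Traditional NFA-based regular expression engines match slowly. Using the principle that regular expressions and automata are equivalent, this work studies a mechanism that implements regular expression data validation in .NET through an equivalent minimized deterministic finite automaton (DFA), with the aim of increasing matching speed.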

8.
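This paper describes the design and implementation of algorithms, drawn from compiler construction techniques, for converting a regular expression into a minimized DFA, as well as a method for converting an automaton into a regular expression. Regular expressions and automata express the same languages in different ways; conversion between the two plays a vital role in compiler construction and is widely applied across computer science.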

9.
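Automata theory is the basic theory behind token recognition in compilers. The paper analyzes the equivalence theorem between automata and regular expressions, points out problems in the rules for reconstructing a regular expression from a deterministic finite automaton, gives a theorem for reconstructing a regular expression from a finite automaton that contains a cycle made up of multiple nodes, and discusses in detail, through examples, how to apply the method set out in the theorem.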

10.
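魏强, 李云照, 褚衍杰. 《计算机工程》 (Computer Engineering), 2012, 38(18): 137-139
To address the state-space blow-up caused by converting multiple regular expressions into a deterministic finite automaton, an improved grouping algorithm is proposed that borrows ideas from graph partitioning. Compared with the original grouping algorithm, it reduces the number of states by 30% on average for the same number of groups, and in some cases it achieves fewer groups. The experimental results show that the algorithm effectively reduces the complexity of the matching algorithm.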

11.
A sensing strategy for the reverse engineering of machined parts   Total citations: 1, self-citations: 0, cited by others: 1
The reverse engineering of machined parts requires sensing an existing part and producing a design (and perhaps a manufacturing process) for it. We have developed a reverse engineering system that has proven effective with a set of machined parts. This paper describes the system, presents some results, and discusses strategy for a new system. This work was supported by ARPA under ARO grant number DAAH04-93-G-0420, DARPA grant N00014-91-J-4123, NSF grant CDA 9024721, and a University of Utah Research Committee grant. All opinions, findings, conclusions or recommendations expressed in this document are those of the authors and do not necessarily reflect the views of the sponsoring agencies.

12.
An Improved Algorithm for Fast Similarity Search in Time Series   Total citations: 1, self-citations: 0, cited by others: 1
This paper introduces a new method for finding all subsequences similar to a given time series sequence. The method takes into account noise, offset translation, and amplitude scaling. Because it is based on a piecewise linear representation, it is exceptionally fast.
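The abstract names the ingredients (offset translation, amplitude scaling, a piecewise linear representation) but not the algorithm itself. The sketch below is therefore a generic illustration under stated assumptions, not the paper's method: z-normalization stands in for offset and amplitude handling, evenly spaced breakpoints stand in for the piecewise linear representation, and the function names and the `segments` parameter are hypothetical. Noise handling is not covered.

```python
import math


def z_normalize(xs):
    """Remove offset (mean) and amplitude (standard deviation) from a series."""
    mean = sum(xs) / len(xs)
    std = math.sqrt(sum((x - mean) ** 2 for x in xs) / len(xs))
    if std == 0:
        return [0.0] * len(xs)
    return [(x - mean) / std for x in xs]


def piecewise_linear(xs, segments):
    """Keep segments + 1 evenly spaced breakpoints of the series.

    Consecutive breakpoints are understood to be joined by straight lines,
    so the list of breakpoint values is the whole reduced representation.
    """
    n = len(xs)
    indices = [round(i * (n - 1) / segments) for i in range(segments + 1)]
    return [xs[i] for i in indices]


def pla_distance(a, b, segments=4):
    """Euclidean distance between the reduced forms of two equal-length series."""
    pa = piecewise_linear(z_normalize(a), segments)
    pb = piecewise_linear(z_normalize(b), segments)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(pa, pb)))


query = [1, 3, 2, 5, 4, 6, 5, 8, 7]
shifted_and_scaled = [2 * x + 10 for x in query]
print(pla_distance(query, shifted_and_scaled) < 1e-9)  # True
```

Comparing only the breakpoints is what makes this kind of search fast: each window of the long series is reduced to a handful of numbers before any distance is computed.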

13.
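A Search-Based Attribute Reduction Algorithm Using Power Graphs   Total citations: 7, self-citations: 0, cited by others: 7
Rough set theory is a new mathematical tool for handling imprecise, incomplete, and inconsistent data. Attribute reduction is one of the important research topics in rough set theory, and existing attribute reduction algorithms are mainly based on the algebraic representation and the information representation. The difficulty of solving the same problem differs under different knowledge representations. Starting from a change in the knowledge representation of the attribute reduction problem, this paper proposes a new representation of the problem, the power graph, and gives a search-based attribute reduction algorithm built on it, which turns the computation of attribute reductions into a search problem in the power graph. Theoretical analysis shows that the new algorithm is effective and offers a new approach to attribute reduction research.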

14.
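eCos is a tailorable, configurable real-time embedded system, but its support for x86-architecture CPUs is limited. The authors add a CF boot method to eCos, providing a general boot method for x86.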

15.
Predicting the fold, or approximate 3D structure, of a protein from its amino acid sequence is an important problem in biology. The homology modeling approach uses a protein database to identify fold-class relationships by sequence similarity. The main limitation of this method is that some proteins with similar structures appear to have very different sequences, which we call the hidden-homology problem. As in other real-world domains for machine learning, this difficulty may be caused by a low-level representation. Learning in such domains can be improved by using domain knowledge to search for representations that better match the inductive bias of a preferred algorithm. In this domain, knowledge of amino acid properties can be used to construct higher-level representations of protein sequences. In one experiment using a 179-protein data set, the accuracy of fold-class prediction was increased from 77.7% to 81.0%. The search results are analyzed to refine the grouping of small residues suggested by Dayhoff. Finally, an extension to the representation incorporates sequential context directly into the representation, which can express finer relationships among the amino acids. The methods developed in this domain are generalized into a framework that suggests several systematic roles for domain knowledge in machine learning. Knowledge may define both a space of alternative representations, as well as a strategy for searching this space. The search results may be summarized to extract feedback for revising the domain knowledge.

16.
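To address the calibration of laser strapdown inertial measurement units under field conditions, a scheme suited to field calibration is designed. Based on an observability analysis of strapdown inertial navigation errors at a single position, a field calibration method based on a platform and a regular hexahedron frame is proposed. The method cancels errors at symmetric positions simply by flipping the hexahedron, obtains attitude information during alignment, and at the same time accurately calibrates gyro drift and accelerometer bias. Finally, the theoretical analysis is verified by simulation, and the simulation results show that the...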

17.
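A Survey of Simplification and Multiresolution Techniques for Large Mesh Models   Total citations: 3, self-citations: 0, cited by others: 3
Mesh simplification and multiresolution rendering are two very effective techniques for improving rendering performance, but for large mesh models the design and implementation of these techniques pose many difficulties of their own. This paper surveys research progress in simplification and multiresolution techniques for large mesh models. It first analyzes and compares simplification methods based on mesh segmentation, on out-of-core data structures, and on streaming strategies, and then introduces and compares techniques for designing, constructing, and rendering multiresolution representations of large mesh models. Finally, it summarizes the field and looks ahead to its development trends.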

18.
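A method is proposed for computing eigenrays in a stratified ocean of constant depth. Building on the span model, the method represents an eigenray as a combination of three ray-span segments and sets up a system of span equations to determine the eigenray. It can compute the launch angle, arrival angle, arrival time, and transmission loss of a number of eigenrays. Given the radiation frequency of the source and information about the seabed, it can also compute the intensity and phase of the sound wave carried by each ray. The sound field distribution can thus be described concisely on the basis of classical ray theory, which lays a foundation for analyzing the time-varying characteristics of underwater acoustic communication and for designing communication systems.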

19.
A new high spectral accuracy compact difference scheme is proposed here. It has been obtained by constrained optimization of the error in spectral space for discretizing the first derivative in problems with non-periodic boundary conditions. This produces a scheme with the highest spectral accuracy among all known compact schemes, although it is formally only second-order accurate. Solutions of the Navier-Stokes equations for incompressible flows are reported here, using this scheme to solve two fluid flow instability problems that are difficult to solve using explicit schemes. The first problem investigates the effect of wind shear past a bluff body, and the second involves predicting a vortex-induced instability.

20.
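This paper proposes an algorithm for evaluating the coverage capability of a regular expression. We define the coverage capability of a regular expression as the number of instances it can cover. The algorithm first splits the complete regular expression into several fragments, then analyzes the number of string instances each fragment can cover, and finally, by the multiplication principle, multiplies the per-fragment counts to obtain the number of instances covered by the regular expression.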
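The split, count, and multiply steps described in this abstract can be sketched as follows. This is a minimal illustration, not the paper's algorithm: the fragment kinds (`literal`, `class`, `optional`, `repeat`) and their tuple encoding are hypothetical, the expression is assumed to be already split into fragments, and the counts are exact only when different choices always yield different strings and the inner fragment of `optional` never matches the empty string.

```python
from math import prod


def fragment_count(fragment):
    """Number of distinct strings one fragment can cover.

    Hypothetical fragment shapes (an assumption for this sketch):
      ("literal", text)            a fixed string
      ("class", chars)             one character out of a set
      ("optional", inner)          inner or nothing
      ("repeat", inner, lo, hi)    inner repeated lo..hi times
    """
    kind = fragment[0]
    if kind == "literal":
        return 1
    if kind == "class":
        return len(set(fragment[1]))
    if kind == "optional":
        # Either the empty string or any instance of the inner fragment.
        return 1 + fragment_count(fragment[1])
    if kind == "repeat":
        inner = fragment_count(fragment[1])
        lo, hi = fragment[2], fragment[3]
        # k repetitions give inner ** k strings; sum over all allowed k.
        return sum(inner ** k for k in range(lo, hi + 1))
    raise ValueError(f"unknown fragment kind: {kind!r}")


def coverage(fragments):
    """Multiply the per-fragment counts (multiplication principle)."""
    return prod(fragment_count(f) for f in fragments)


# ab[0-9]{1,2}c?  ->  1 * (10 + 100) * 2 = 220 covered strings
fragments = [
    ("literal", "ab"),
    ("repeat", ("class", "0123456789"), 1, 2),
    ("optional", ("literal", "c")),
]
print(coverage(fragments))  # 220
```

Unbounded operators such as `*` and `+` cover infinitely many strings, so a count of this kind only makes sense for bounded expressions or after capping the repetition length.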
