首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 203 毫秒
1.
基于正则表达式进行深度报文检测在IDS/IPS、应用层协议识别等网络应用中具有重要作用。然而,采用DFA实现正则表达式需要大量的存储空间,限制了它的实际应用。将DFA状态转换表拆分成3个表,使用run-length编码进行压缩,并对压缩方法进行了优化。采用l7-filter中几个常用应用程序的正则表达式进行测试,结果表明该方法压缩效果一般在90%以上。  相似文献   

2.
针对现有模式匹配算法无法实现大容量模式集快速搜索的不足,提出了一种基于TCAM多字节状态机的模式匹配算法。利用TCAM的掩码特性,切分具有相同匹配字符串的状态集,提出了一种编号编码压缩机制。通过理论证明,集合切分编码利用状态机的已匹配信息,将编号存储改变为编号段存储,大幅压缩了具有相同转移字符串和目的状态的交叉转移路径,减少TCAM表项数目。经理论分析和实验仿真,该算法不仅具有高搜索速率,而且可以减少大量相似表项,降低TCAM存储资源消耗,从而支持大容量的模式集。  相似文献   

3.
一组提高存储效率的深度包检测算法   总被引:2,自引:0,他引:2  
于强  霍红卫 《软件学报》2011,22(1):149-163
随着深度包检测规则数目的剧烈增长,为了适应网络处理的需求,必须对表示正则表达式的DFA(deterministic finite automata,确定的有限自动机)进行高效的存储.一方面,对DFA的状态点数目进行压缩,提出了一种复合的FSM(有限自动机)的构造方法,通过对正则表达转化成DFA的状态点数目复杂度的分析,将不同复杂度的正则表达式采用不同的方式构建DFA,使得所有平方级和指数级复杂度的状态点数目降低到了线性级.另一方面,对DFA的状态转移数目进行压缩,给出了一种高效的压缩算法,即WD2FA(weighted delayed input DFA,带权延迟DFA)算法,对于任意复杂度的正则表达式都可以将状态转移数目压缩为原来的5%左右,相对于D2FA(delayed input DFA,延迟的DFA)有更好的压缩能力,并且使得D2FA是WD2FA在权值为0情况下的特例.实验结果表明,有限自动机的状态点数目能够控制在线性级,并且在状态点压缩的基础上将状态转移数目压缩为原来的7%.  相似文献   

4.
基于簇聚类的确定型有穷自动机(DFA)压缩算法,即ClusterFA算法,解决了正则表达式匹配中的空间爆炸问题,但该算法的分组个数取理想值较为困难,且其类中心向量表的每一行中连续重复转移状态出现频率较高。针对该问题,提出一种改善ClusterFA算法的方案En_ClusterFA。提取类中心向量表行与行之间相同的首尾部分,并对其进行游程编码以建立索引表,对类中心向量表余下部分的转移状态进行游程编码。利用该方案对Bro,Snort和L7-filter规则集进行测试,实验结果表明,除了L7_2和L7_6规则集的压缩率分别提高到96.1%和98.1%之外,其他规则集的压缩率都提高到99%以上。与ClusterFA算法的压缩率相比,En_ClusterFA平均提高了4%,证明En_ClusterFA能够有效地提高DFA的压缩效率。  相似文献   

5.
针对经典Aho-Corasick算法存在空间开销大,存储效率低的问题,提出一种改进的空间高效Aho-Corasick算法。新算法在预处理阶段根据状态转移函数、输出函数的不同特性,灵活选择不同的方式存储状态结点,实现对Aho-Corasick算法状态机的压缩。实验表明,新算法与经典Aho-Corasick算法、Bitmapped AC算法相比,以匹配阶段较小的时间性能为代价,极大幅度地压缩状态机的存储空间。  相似文献   

6.
多模式匹配算法经常使用有限自动状态机来实现多个模式串的并行匹配。针对基于自动状态机的多模式匹配算法在应用于中文编码时存在的存储空间膨胀问题,使用中文字符的拆分编码构造自动状态机,以优化算法自动状态机的存储空间,并利用中文编码的编码关联性,设计了一种基于编码关联跳转的失效跳转表,使用启发式跳跃规则提升匹配算法的时间性能。最后通过实验证明,中文编码环境下,相比于其它使用自动状态机的多模式匹配算法,改良算法拥有更小的空间消耗与更快的运行速度。  相似文献   

7.
深度包检测采用简单的字符串匹配技术将报文内容与一组固定字符串进行匹配,基于正则表达式匹配算法能提供更强的表达能力和灵活性,而复杂的正则表达式结构可能引起DFA的状态数膨胀,导致存储代价巨大;DFA拆分算法将DFA转换表拆分为三个表:间接索引表,转换输出表,直接转换表,实验结果表明DFA所占空间大大减小,实现了DFA的压缩存储。  相似文献   

8.
进行网络稳定性测试时,对网络协议状态机制进行检测可以有效提高测试的全面性.基于有限状态机思想提出了一种协议状态机制检测方法.建立待测协议特定消息发送实体的有限状态机模型,确定输入集合;测试并监测实体的状态转移情况,生成状态转移图;根据状态转移图判定该消息的状态机制,确定有状态协议消息的触发条件,对消息进行归纳分类实现协议状态机制的判定.搭建实验环境,验证了该方法的有效性.  相似文献   

9.
针对目前硬件正则表达式匹配算法在存储空间以及吞吐量等方面面临的挑战,结合扩展有限自动机(XFA)正则表达式匹配算法,提出了一种预定义类的压缩自动机匹配算法(Pre-Class CFA)。通过预定义类,算法既可以实现正则表达式中类字符匹配,又能够通过优先级的设定匹配特殊字符集,并在XFA消除确定性有限状态机(DFA)状态爆炸问题的基础上进一步压缩了迁移边数目;同时算法根据现场可编程门阵列(FPGA)和迁移边的特征,设计了一种基于并联只读存储器(ROM)结构的迁移边存取方法,可以实现同一状态多条迁移边的并行读取和匹配。在中低性能FPGA平台ALTERA DE2-70上对算法进行测试,实验中系统吞吐量为1.3 Gb/s,可实现千兆网络下的入侵检测和垃圾过滤。  相似文献   

10.
运用状态机提高嵌入式软件效率   总被引:2,自引:0,他引:2  
有限状态机是根据当前状态以及触发条件进行状态转换的一种机制,包含一组状态集(state)、一个起始状态(start state)、一组输入符号集(alphabet)、一个映射输入符号和当前状态到下一状态的转换函数(transition function)的计算模型。当输入符号串时,模型随即进入起始状态。要让状态机改变到新的状态,依赖于系统的转换函  相似文献   

11.
Signature-based intrusion detection is required to inspect network traffic at wire-speed. Matching packet payloads against patterns specified with regular expression is a computation intensive task. Hence, the design of hardware accelerator to speed up regular expression matching has been an active research area. A systematic approach to detect regular expression is based on finite automaton. The space-time trade-off between deterministic finite automaton (DFA) and non-deterministic finite automaton (NFA) is well-known. DFA can offer constant throughput but it may suffer from the state explosion problem. Hence, implementation of DFA for large pattern sets on embedded device with limited on-chip memory may not be viable. NFA requires linear space but the throughput can be very low. Implementations of NFA with hardwired circuits can overcome the speed deficiency by exploiting the massive parallelism offered by dedicated hardware circuitries, but this approach does not support efficient dynamic updates. In this paper, we shall present a memory-based architecture for the implementation of NFA to speed up regular expression matching for signature-based intrusion detection. The proposed method supports dynamic updates and offers constant throughput so that it can be used to supplement the existing DFA-based methods in handling large pattern sets.  相似文献   

12.
A mutating finite automaton (MFA) is a nondeterministic finite automaton (NFA) that changes its morphology over discrete time by a sequence of mutations. This results in a sequence of NFAs, the initial NFA, and one mutated NFA for each mutation. Some application domains, including model-based diagnosis of discrete-event systems in artificial intelligence and model-based testing in software engineering, require temporal determinization of MFAs. Determinizing an MFA temporally means generating a deterministic finite automaton (DFA) that is equivalent to the mutated NFA as soon as a mutation occurs. Since, in computation time, the classical Subset Construction determinization algorithm may be less than optimal when applied to MFAs, a conservative algorithm is proposed, called Subset Restructuring, which, instead of constructing the new DFA from scratch based on the mutated NFA, generates the new DFA by updating the previous DFA based on the mutation occurred. Subset Restructuring is sound and complete, thereby yielding the same DFA generated by Subset Construction. Results from massive experimentation indicate the viability of Subset Restructuring, especially so when large MFAs change by small mutations.  相似文献   

13.
本文提出了一种新的用于构造入侵检测模式匹配自动机的方法。该方法从构造判定单个模式的NFA自动机入手,通过集成单个的NFA而得到全集的NFA,并将全集NFA转换为与之等价的DFA并化简,从而可得到全集的确定型模式匹配有限自动机。由于该方法可以完全自动完成,从而可以方便地为入侵检测系统构造模式匹配自动机。  相似文献   

14.
NFA的确定化具有重要的理论和实际意义.迄今为止,普遍采用子集构造法将一个NFA(非确定性自动机)转化为DFA(确定性自动机),但这种方法需要引入空输入ε及状态子集I的ε-闭包,其计算过程相对繁琐.而且在确定化过程中对于NFA状态集存在ε-closure重复计算和由于对非ε转换的判断而引起的重复计算等问题.本文描述了一种将一类NFA直接转化为DFA的方法.在本方法中,不需要引入空输入ε,可根据原始的NFA状态图或状态转移表直接得出等价的DFA状态图或状态转移表,而且所有状态都是单一的状态而非集合状态,便于软硬件实现与测试.  相似文献   

15.
16.
Summary The amount of nondeterminism in a nondeterministic finite automaton (NFA) is measured by counting the minimal number of guessing points a string w has to pass through on its way to an accepting state. NFA's with more nondeterminism can achieve greater savings in the number of states over their deterministic counterparts than NFA's with less nondeterminism. On the other hand, for some nontrivial infinite regular languages a deterministic finite automaton (DFA) can already be quite succinct in the sense that NFA's need as many states (and even context-free grammars need as many nonterminals) as the minimal DFA has states.This research was supported in part by the National Science Foundation under Grant No. MCS 76-10076  相似文献   

17.
There is an increasing demand for network devices to perform deep packet inspection (DPI) in order to enhance network security. In DPI, the packet payload is compared against a set of predefined patterns that can be specified using regular expressions (regexes). It is well-known that mapping regexes to deterministic finite automaton (DFA) may suffer from the state explosion problem. Through observation, we attribute DFA explosion to the necessity of remembering matching history. In this paper, we investigate how to manage matching history efficiently and propose an extended DFA approach for regex matching called fcq-FA, which can make a memory size reduction of about 1,000 times with a fully automated approach. In fcq-FA, we use pipeline queues and counters to help record the matching history. Hence, state explosion caused by Kleene closure and length restriction can be completely avoided. Furthermore, it achieves a fully automated signature compilation with polynomial running time and space. The equivalence between fcq-FA and the traditional DFA is guaranteed by a strict theoretical proof, which means fcq-FA can process all the regexes supported by the traditional DFA.  相似文献   

18.
This article describes an improvement of the brute force determinization algorithm in the case of homogeneous nondeterministic finite automata (NFAs), as well as its application to pattern matching. Brute force determinization with limited memory may provide a partially determinized automaton, but its bounded complexity makes it a safe procedure contrary to the classical subset construction. Actually, our algorithm is inspired by both recent results of Champarnaud concerning the subset automaton of a homogeneous NFA and the algorithm recently designed by Navarro and Raffinot to implement the brute force determinization of the Glushkov NFA of a regular pattern. Our algorithm significantly improves Navarro–Raffinot's one since it has an average exponentially smaller memory requirement for a given level of determinization, which, considering a bounded memory, implies a quadratically smaller parsing time. This algorithm has been implemented in CCP software (http://www.univ-rouen.fr/LIFAR/aia/ccp.html). Tests have been carried out in the field of text processing and biology. Experimental results are reported.  相似文献   

19.
Closure underlength-preserving homomorphisms is interesting because of its similarity tonondeterminism. We give a characterization of NP in terms of length-preserving homomorphisms and present related complexity results. However, we mostly study the case of two-way finite automata: Let l.p.hom[n state 2DFA] denote the class of languages that are length-preserving homomorphic images of languages recognized byn-state 2DFAs. We give a machine characterization of this class. We show that any language accepted by ann-state two-wayalternating finite automaton (2AFA), or by a l-pebble 2NFA, belongs to l.p.hom[O(n 2) state 2DFA]. Moreover, there are languages in l.p.hom[n state 2DFA] whose smallest accepting 2NFA has at leastc n states (for some constantc > 1). So for two-way finite automata, the closure under length-preserving homomorphisms is much more powerful than nondeterminism. We disprove two conjectures (of Meyer and Fischer, and of Chrobak) about the state-complexity of unary languages. Finally, we show that the equivalence problems for 2AFAs (resp. 1-pebble 2NFAs) are in PSPACE, and that the equivalence problem for 1-pebble 2AFAs is in ExpSPACE (thus answering a question of Jiang and Ravikumar); it was known that these problems are hard in these two classes. We also give a new proof that alternating 1-pebble machines recognize only regular languages (which was first proved by Goralčíket al.). This research was supported in part by N.S.F. Grant DMS 8702019.  相似文献   

20.
当前深度包检测算法通常需要将正则表达式转换为NFA或者DFA.但是随着网络带宽的不断增加.NFA和DFA状态占用的存储空间越来越大,极大地考验着系统的存储能力。为了应对这个问题.提出一种基于正则表达式相性的分组算法来对表达式进行分组,实验证明该算法能减少NFA和DFA状态的数量,提高匹配的效率。  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号