首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 31 毫秒
1.
This article describes an improvement of the brute force determinization algorithm in the case of homogeneous nondeterministic finite automata (NFAs), as well as its application to pattern matching. Brute force determinization with limited memory may provide a partially determinized automaton, but its bounded complexity makes it a safe procedure contrary to the classical subset construction. Actually, our algorithm is inspired by both recent results of Champarnaud concerning the subset automaton of a homogeneous NFA and the algorithm recently designed by Navarro and Raffinot to implement the brute force determinization of the Glushkov NFA of a regular pattern. Our algorithm significantly improves Navarro–Raffinot's one since it has an average exponentially smaller memory requirement for a given level of determinization, which, considering a bounded memory, implies a quadratically smaller parsing time. This algorithm has been implemented in CCP software (http://www.univ-rouen.fr/LIFAR/aia/ccp.html). Tests have been carried out in the field of text processing and biology. Experimental results are reported.  相似文献   

2.
NFA的确定化具有重要的理论和实际意义.迄今为止,普遍采用子集构造法将一个NFA(非确定性自动机)转化为DFA(确定性自动机),但这种方法需要引入空输入ε及状态子集I的ε-闭包,其计算过程相对繁琐.而且在确定化过程中对于NFA状态集存在ε-closure重复计算和由于对非ε转换的判断而引起的重复计算等问题.本文描述了一种将一类NFA直接转化为DFA的方法.在本方法中,不需要引入空输入ε,可根据原始的NFA状态图或状态转移表直接得出等价的DFA状态图或状态转移表,而且所有状态都是单一的状态而非集合状态,便于软硬件实现与测试.  相似文献   

3.
For a given weighted finite automaton over a strong bimonoid we construct its reduced Nerode automaton, which is crisp-deterministic and equivalent to the original weighted automaton with respect to the initial algebra semantics. We show that the reduced Nerode automaton is even smaller than the Nerode automaton, which was previously used in determinization related to this semantics. We determine necessary and sufficient conditions under which the reduced Nerode automaton is finite and provide an efficient algorithm which computes the reduced Nerode automaton whenever it is finite. In determinization of weighted finite automata over semirings and fuzzy finite automata over lattice-ordered monoids this algorithm gives smaller crisp-deterministic automata than any other known determinization algorithm.  相似文献   

4.
Signature-based intrusion detection is required to inspect network traffic at wire-speed. Matching packet payloads against patterns specified with regular expression is a computation intensive task. Hence, the design of hardware accelerator to speed up regular expression matching has been an active research area. A systematic approach to detect regular expression is based on finite automaton. The space-time trade-off between deterministic finite automaton (DFA) and non-deterministic finite automaton (NFA) is well-known. DFA can offer constant throughput but it may suffer from the state explosion problem. Hence, implementation of DFA for large pattern sets on embedded device with limited on-chip memory may not be viable. NFA requires linear space but the throughput can be very low. Implementations of NFA with hardwired circuits can overcome the speed deficiency by exploiting the massive parallelism offered by dedicated hardware circuitries, but this approach does not support efficient dynamic updates. In this paper, we shall present a memory-based architecture for the implementation of NFA to speed up regular expression matching for signature-based intrusion detection. The proposed method supports dynamic updates and offers constant throughput so that it can be used to supplement the existing DFA-based methods in handling large pattern sets.  相似文献   

5.
一种基于深度报文检测的FSM状态表压缩技术   总被引:6,自引:0,他引:6  
针对深度报文检测中正则表达式模式匹配的状态表爆炸问题,提出并实现了一种集合交割的预编码方法(SI-precode),在正则表达式转换成DFA前对所有输入符号进行预编码,通过压缩输入,减少FSM中输入符号的种类,从而压缩状态转移表的空间.证明了预编码生成的状态机的正确性及其与原状态机的同态性.采用L7-filter模式进行实验表明SI-precode不仅提高了正则表达式的编译速度,针对单模式状态机,其状态转移表空间比不进行预编码压缩了87%~97%,50个模式的多模式状态机可压缩59%.预编码在软硬件结合体系结构下进行协议识别时不会对性能造成影响;对纯软件结构性能降低2%~4%.  相似文献   

6.
7.
The determinization of a nondeterministic finite automaton (FA) is the process of generating a deterministic FA (DFA) equivalent to (sharing the same regular language of) . The minimization of is the process of generating the minimal DFA equivalent to . Classical algorithms for determinization and minimization are available in the literature for several decades. However, they operate monolithically, assuming that the FA to be either determinized or minimized is given once and for all. By contrast, we consider determinization and minimization in a dynamic context, where augments over time: after each augmentation, determinization and minimization of into is required. Using classical monolithic algorithms to solve this problem is bound to poor performance. An algorithm for incremental determinization and minimization of acyclic finite automata, called IDMA, is proposed. Despite being conceived within the narrow domain of model‐based diagnosis and monitoring of active systems, the algorithm is general‐purpose in nature. Experimental evidence indicates that IDMA is far more efficient than classical algorithms in solving incremental determinization and minimization problems. Copyright © 2015 John Wiley & Sons, Ltd.  相似文献   

8.
This paper investigates the transition function and the reachability conditions of finite automata by using a semitensor product of matrices, which is a new powerful matrix analysis tool. The states and input symbols are first expressed in vector forms, then the transition function is described in an algebraic form. Using this algebraic representation, a sufficient and necessary condition of the reachability of any two states is proposed, based on which an algorithm is developed for discovering all the paths from one state to another. Furthermore, a mechanism is established to recognize the language acceptable by a finite automaton. Finally, illustrative examples show that the results/algorithms presented in this paper are suitable for both deterministic finite automata (DFA) and nondeterministic finite automata (NFA).  相似文献   

9.
We revisit the problem of deciding by means of a finite automaton whether a string is uniquely decodable from its bigram counts. An efficient algorithm for constructing a polynomial-size Nondeterministic Finite Automaton (NFA) that decides unique decodability is given. This NFA may be simulated efficiently in time and space. Conversely, we show that the minimum deterministic finite automaton for deciding unique decodability has exponentially many states in alphabet size, and compute the correct order of magnitude of the exponent.  相似文献   

10.
Mutation systems     
We propose mutation systems as a model of the evolution of a string subject to the effects of mutations and a fitness function. One fundamental question about such a system is whether knowing the rules for mutations and fitness, we can predict whether it is possible for one string to evolve into another. To explore this issue, we define a specific kind of mutation system with point mutations and a fitness function based on conserved strongly k-testable string patterns. We show that for any k greater than 1, such systems can simulate computation by both finite state machines (FSMs) and asynchronous cellular automata. The cellular automaton simulation shows that in this framework, universal computation is possible and the question of whether one string can evolve into another is undecidable. We also analyse the efficiency of the FSM simulation assuming random point mutations.  相似文献   

11.
A word w is called synchronizing (recurrent, reset, directable) word of deterministic finite automata (DFA) if w brings all states of the automaton to a unique state. According to the famous conjecture of Cerny from 1964, every n-state synchronizing automaton possesses a synchronizing word of length at most (n - 1)2. The problem is still open. It will be proved that the Cerny conjecture holds good for synchronizing DFA with transition monoid having no involutions and for every n-state (n 〉 2) synchronizing DFA with transition monoid having only trivial subgroups the minimal length of synchronizing word is not greater than (n - 1)2/2. The last important class of DFA involved and studied by Schutzenberger is called aperiodic; its automata accept precisely star-free languages. Some properties of an arbitrary synchronizing DFA were established.  相似文献   

12.
Howard Chivers 《Software》2017,47(5):669-688
The jsre regular expression library was designed to provide fast matching of complex expressions over large input streams using user‐selectable character encodings. An established design approach was used: a simulated non‐deterministic automaton (NFA) implemented as a virtual machine, avoiding exponential cost functions in either space or time. A deterministic automaton (DFA) was chosen as a general dispatching mechanism for Unicode character classes, and this also provided the opportunity to use compact DFAs in various optimization strategies. The result was the development of a regular expression Preview which provides a summary of all the matches possible from a given point in a regular expression in a form that can be implemented as a compact DFA and can be used to further improve the performance of the standard NFA simulation algorithm. This paper formally defines a preview and describes and evaluates several optimizations using this construct. They provide significant speed improvements accrued from fast scanning of anchor positions, avoiding retesting of repeated strings in unanchored searches and efficient searching of multiple alternate expressions which in the case of keyword searching has a time complexity which is logarithmic in the number of words to be searched. Copyright © 2016 John Wiley & Sons, Ltd.  相似文献   

13.
Nondeterministic weighted finite-state automata are a key abstraction in automatic speech recognition systems. The efficiency of automatic speech recognition depends directly on the sizes of these automata and the degree of nondeterminism present, so recent research has studied ways to determinize and minimize them, using analogues of classical automata determinization and minimization. Although, as we describe here, determinization can in the worst case cause poly-exponential blowup in the number of states of a weighted finite-state automaton, in practice it is remarkably successful. In extensive experiments in automatic speech recognition systems, deterministic weighted finite-state automata tend to be smaller than the corresponding nondeterministic inputs. Our observations show that these size reductions depend critically on the interplay between weights and topology in nondeterministic weighted finite-state automata. We exploit these observations to design a new approximate determinization algorithm, which produces a deterministic weighted finite-state automaton that preserves the strings of a weighted language but not necessarily their weights. We apply our algorithm to two different types of weighted finite-state automata that occur in automatic speech recognition systems and in each case provide extensive experimental results showing that, compared with current techniques, we achieve significant size reductions without affecting performance. In particular, for a standard test bed, we can reduce automatic speech recognition memory requirements by 25—35\percent with negligible effects on recognition time and accuracy. Received March 31, 1998; revised January 29, 1999.  相似文献   

14.
Summary The amount of nondeterminism in a nondeterministic finite automaton (NFA) is measured by counting the minimal number of guessing points a string w has to pass through on its way to an accepting state. NFA's with more nondeterminism can achieve greater savings in the number of states over their deterministic counterparts than NFA's with less nondeterminism. On the other hand, for some nontrivial infinite regular languages a deterministic finite automaton (DFA) can already be quite succinct in the sense that NFA's need as many states (and even context-free grammars need as many nonterminals) as the minimal DFA has states.This research was supported in part by the National Science Foundation under Grant No. MCS 76-10076  相似文献   

15.
Closure underlength-preserving homomorphisms is interesting because of its similarity tonondeterminism. We give a characterization of NP in terms of length-preserving homomorphisms and present related complexity results. However, we mostly study the case of two-way finite automata: Let l.p.hom[n state 2DFA] denote the class of languages that are length-preserving homomorphic images of languages recognized byn-state 2DFAs. We give a machine characterization of this class. We show that any language accepted by ann-state two-wayalternating finite automaton (2AFA), or by a l-pebble 2NFA, belongs to l.p.hom[O(n 2) state 2DFA]. Moreover, there are languages in l.p.hom[n state 2DFA] whose smallest accepting 2NFA has at leastc n states (for some constantc > 1). So for two-way finite automata, the closure under length-preserving homomorphisms is much more powerful than nondeterminism. We disprove two conjectures (of Meyer and Fischer, and of Chrobak) about the state-complexity of unary languages. Finally, we show that the equivalence problems for 2AFAs (resp. 1-pebble 2NFAs) are in PSPACE, and that the equivalence problem for 1-pebble 2AFAs is in ExpSPACE (thus answering a question of Jiang and Ravikumar); it was known that these problems are hard in these two classes. We also give a new proof that alternating 1-pebble machines recognize only regular languages (which was first proved by Goralčíket al.). This research was supported in part by N.S.F. Grant DMS 8702019.  相似文献   

16.
马子睿 《数字社区&智能家居》2009,5(9):7273-7273,7297
主要介绍了有穷自动机的基础知识,研究了有穷自动机的等价性,并在确定型有穷自动机的状态集上引入等价关系,给出了自动机的最小化过程。利用等价归并算法,可以将某一给定的确定型有穷自动机状态集上的等价状态归并掉.生成与其等价的最小化的确定型有穷自动机。  相似文献   

17.
《国际计算机数学杂志》2012,89(9):1097-1106
The problem of converting a regular expression to nondeterministic finite automaton (NFA) is a fundamental problem that has been well studied. However, the two basic construction algorithms: (1) Thompson, (2) McNaughton–Yamada and Glushkov, both have disadvantages. In this article: first, a ‘smart’ parsing algorithm is developed which constructs a parse tree with at most (3l???1) nodes form a regular expression with l literals; second, we propose an algorithm that works on the resulting NFA from Thompson's construction, eliminating as many auxiliary states as possible while maintaining Thompson's properties. It is shown that the resulting NFA is minimized. This means that no auxiliary states can be eliminated without violating the defining properties of Thompson NFA. The time and space requirements for the above algorithms are linear with respect to the length of the regular expression.  相似文献   

18.
We investigate the learning problem of two-tape deterministic finite automata (2-tape DFAs) from queries and counterexamples. Instead of accepting a subset of ∑*, a 2-tape DFA over an alphabet ∑ accepts a subset of ∑* × ∑*, and therefore, it can specify a binary relation on ∑*. In [3] Angluin showed that the class of deterministic finite automata (DFAs) is learnable in polynomial time from membership queries and equivalence queries, namely, from a minimally adequate teacher (MAT). In this article we show that the class of 2-tape DFAs is learnable in polynomial time from MAT. More specifically, we show an algorithm that, given any languageL accepted by an unknown 2-tape DFAM, learns from MAT a two-tape nonde-terministic finite automaton (2-tape NFA)M′ acceptingL in time polynomial inn andl, wheren is the size ofM andl is the maximum length of any counterexample provided during the learning process. This work was supported in part by Grants-in-Aid for Scientific Research No. 04229105 from the Ministry of Education, Science, and Culture, Japan.  相似文献   

19.
In this paper, we analyze the problem of state minimization in a class of Finite State Automata called Two Start State Deterministic Finite State Automata (2-MDFAs). A 2-MDFA is similar to a deterministic finite state automaton (DFA), in that on a given input, each state has precisely one destination state; however, it differs from a DFA in that there are two start states. A string is accepted by a 2-MDFA if and only if there exists a transitional path from either start state to a finish state, on that string. Observe that 2-MDFAs provide a limited amount of non-determinism and hence investigating their properties from the perspective of state minimization is a worthwhile pursuit. In case of unbounded non-determinism, i.e., Non-deterministic finite state automata (NFAs), it is well-known that the state minimization problem is PSPACE-complete [Jiang and Ravikumar in Proceedings of the 18th International Colloquium on Automata, Languages and Programming, ICALP’91, Madrid, Spain, July 8–12, 1991] and further that such automata can be exponentially more succinct than DFAs [Meyer and Fischer in Proceedings of the 12th SWAT(Annual Symposium on switching and automata theory), pp 188-191, 1971]. Even in the case of 2-MDFAs, the minimization problem remains non-trivial; indeed, Malcher in Theor Comput Sci 327(3):375–390, 2004 shows that the corresponding decision problem is NP-complete. We focus on deriving approximability bounds for the state minimization problem in 2-MDFAs. Our main contribution in the current paper, is the design of an n-approximation algorithm for state minimization in 2-MDFAs, where n denotes the minimum number of states required to represent the input language as a 2-MDFA. We also present a proof that this bound is tight for our algorithm.The work of K. Subramani was supported in part by the Air-Force office of Scientific Research under Grant FA9550-06-1-0050  相似文献   

20.
We develop a novel learning algorithm RTI for identifying a deterministic real-time automaton (DRTA) from labeled time-stamped event sequences. The RTI algorithm is based on the current state of the art in deterministic finite-state automaton (DFA) identification, called evidence-driven state-merging (EDSM). In addition to having a DFA structure, a DRTA contains time constraints between occurrences of consecutive events. Although this seems a small difference, we show that the problem of identifying a DRTA is much more difficult than the problem of identifying a DFA: identifying only the time constraints of a DRTA given its DFA structure is already NP-complete. In spite of this additional complexity, we show that RTI is a correct and complete algorithm that converges efficiently (from polynomial time and data) to the correct DRTA in the limit. To the best of our knowledge, this is the first algorithm that can identify a timed automaton model from time-stamped event sequences.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号