We consider some natural variations on the following classic pattern-matching problem: given an NFA M over the alphabet Σ and a pattern p over some alphabet Δ, does there exist a word x ∈ L(M) such that x matches p? We consider the restricted problem where M only accepts a finite language. We also consider the variation where only some factor of x is required to match the pattern p. We show that both of these problems are NP-complete. We also consider the same problems for context-free grammars; in this case the problems become PSPACE-complete.
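
The two variations in the abstract above can be illustrated with a small brute-force sketch. It assumes that "x matches p" means some non-erasing substitution of words for the letters of p yields x, and it takes the finite language as an explicit set of strings rather than as an NFA. The function names `matches`, `some_word_matches`, and `some_factor_matches` are hypothetical. The search is exponential in the worst case, which is consistent with (but does not prove) the NP-completeness stated above.

```python
def matches(pattern, word, binding=None):
    """Return True if a non-erasing substitution maps `pattern` onto `word`.

    Each character of `pattern` is treated as a variable (assumption).
    """
    binding = {} if binding is None else binding
    if not pattern:
        return not word
    var, rest = pattern[0], pattern[1:]
    if var in binding:
        # Variable already bound: its image must be a prefix of the word.
        image = binding[var]
        return word.startswith(image) and matches(rest, word[len(image):], binding)
    # Try every nonempty prefix of the word as the image of this variable.
    for cut in range(1, len(word) + 1):
        binding[var] = word[:cut]
        if matches(rest, word[cut:], binding):
            return True
        del binding[var]
    return False


def some_word_matches(language, pattern):
    """Does some word of the finite language match the pattern as a whole?"""
    return any(matches(pattern, w) for w in language)


def some_factor_matches(language, pattern):
    """Does some factor (contiguous substring) of some word match the pattern?"""
    return any(
        matches(pattern, w[i:j])
        for w in language
        for i in range(len(w))
        for j in range(i + 1, len(w) + 1)
    )
```

For example, with the pattern `"xx"` (a square), the word `cabab` does not match as a whole, but its factor `abab` does.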

Courcelle introduced the study of regular words, i.e., words isomorphic to frontiers of regular trees. Heilbrunner showed that a nonempty word is regular iff it can be generated from the singletons by the operations of concatenation, omega power, omega-op power, and the infinite family of shuffle operations. We prove that the algebra of nonempty regular words on the set A, equipped with these operations, is freely generated by A in a variety which is axiomatizable by an infinite collection of some natural equations. We also show that this variety has no finite equational basis and that its equational theory is decidable in polynomial time.

Several subclasses of regular languages are known to be inferable from positive data only. This paper surveys classes of languages originating from the class of reversible languages. We define the classes by using a uniform grammatical notation.

In this paper we consider two questions. First we consider whether every pattern language which is regular can be generated by a regular pattern. We show that this is indeed the case for extended (erasing) pattern languages if alphabet size is at least four. In all other cases, we show that there are patterns generating a regular language which cannot be generated by a regular pattern. Next we consider whether there are pattern languages which are context-free but not regular. We show that, for alphabet size 2 and 3, there are both erasing and non-erasing pattern languages which are context-free but not regular. On the other hand, for alphabet size at least 4, every erasing pattern language which is context-free is also regular. It is open at present whether there exist non-erasing pattern languages which are context-free but not regular for alphabet size at least 4.

Drawn symbolic pictures are an extension of drawn pictures obtained by associating a symbol from an alphabet with each point of the picture. In this paper we address some new issues arising from the introduction of the symbols, and we identify the conditions that ensure that properties holding for drawn pictures are preserved in the proposed extension.

This article describes an improvement of the brute force determinization algorithm in the case of homogeneous nondeterministic finite automata (NFAs), as well as its application to pattern matching. Brute force determinization with limited memory may provide only a partially determinized automaton, but its bounded complexity makes it a safe procedure, unlike the classical subset construction. Our algorithm is inspired both by recent results of Champarnaud concerning the subset automaton of a homogeneous NFA and by the algorithm recently designed by Navarro and Raffinot to implement the brute force determinization of the Glushkov NFA of a regular pattern. Our algorithm significantly improves on the Navarro–Raffinot algorithm, since on average it has an exponentially smaller memory requirement for a given level of determinization, which, for a bounded memory, implies a quadratically smaller parsing time. This algorithm has been implemented in the CCP software (http://www.univ-rouen.fr/LIFAR/aia/ccp.html). Tests have been carried out in the fields of text processing and biology. Experimental results are reported.
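
For context, the classical subset construction that the abstract above contrasts with can be sketched as follows. This is the textbook construction, not the improved brute force algorithm of the article, and it assumes an NFA without epsilon moves given as a dictionary from (state, symbol) pairs to sets of states. The names `subset_construction` and `accepts` are hypothetical. The number of reachable subsets is not bounded in advance, which is the memory risk the abstract refers to.

```python
from collections import deque


def subset_construction(delta, start, finals, alphabet):
    """Classical subset construction for an NFA without epsilon moves.

    delta: dict mapping (state, symbol) -> set of successor states.
    Returns (dfa_states, dfa_delta, dfa_start, dfa_finals), where each
    DFA state is a frozenset of NFA states.
    """
    start_set = frozenset([start])
    dfa_delta = {}
    seen = {start_set}
    queue = deque([start_set])
    while queue:
        current = queue.popleft()
        for symbol in alphabet:
            # Union of the successors of every NFA state in the subset.
            target = frozenset(
                t for s in current for t in delta.get((s, symbol), ())
            )
            dfa_delta[(current, symbol)] = target
            if target not in seen:
                seen.add(target)
                queue.append(target)
    # A subset is accepting if it contains at least one accepting NFA state.
    dfa_finals = {s for s in seen if s & set(finals)}
    return seen, dfa_delta, start_set, dfa_finals


def accepts(dfa_delta, dfa_start, dfa_finals, word):
    """Run the determinized automaton on `word`."""
    state = dfa_start
    for symbol in word:
        state = dfa_delta[(state, symbol)]
    return state in dfa_finals
```

As a usage example, the three-state NFA for words over {a, b} ending in `ab` (transitions 0 to {0, 1} on a, 0 to {0} on b, 1 to {2} on b, with 2 accepting) determinizes to three reachable subsets.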

This paper is concerned with an algorithm for identifying an unknown regular language from examples of its members and non-members. The algorithm is based on the model inference algorithm given by Shapiro. In our setting, however, a given first-order language for describing a target logic program has countably many unary predicate symbols: $q_0, q_1, q_2, \ldots$. On the other hand, the oracle which gives information about the unknown regular language to the inference algorithm has no interpretation for predicates other than the predicate $q_0$. In such a setting, we cannot directly take advantage of the contradiction backtracing algorithm, which is one of the most important parts for the efficiency of the model inference algorithm. In order to overcome this disadvantage, we develop a method for indirectly giving an interpretation for predicates other than the predicate $q_0$, which is based on the idea of using the oracle and a one-to-one mapping from a set of predicates to a set of strings. Furthermore, we propose a model inference algorithm for regular languages using the method, and then argue the correctness and the time complexity of the algorithm.

In this paper we study formal power series over a quantale with coefficients in the algebra of all languages over a given alphabet, and the representation of fuzzy languages by these formal power series. This representation generalizes the well-known representation of fuzzy languages by their cut and kernel languages. We show that regular operations on fuzzy languages can be represented by regular operations on power series which are defined by means of operations on ordinary languages. We use power series in the study of fuzzy languages which are recognized by fuzzy finite automata and deterministic finite automata, and we study closure properties of the set of polynomials and the set of polynomials with regular coefficients under regular operations on power series.

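The Myhill-Nerode theorem uses equivalence relations to describe an important characterization of regular languages; it is a classical and elegant result in the theory of finite automata. To extend the Myhill-Nerode theorem to a more general setting, the state transition semigroup of a finite automaton M and the M-semigroup on Σ* are introduced, and several of their properties are discussed. On this basis, the equivalence relation in the Myhill-Nerode theorem is generalized and a new characterization theorem for regular languages is given, of which the Myhill-Nerode theorem becomes a corollary. The most general characterization of regular languages is discussed, and problems for further study are posed.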

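Testing programs built on complex data structures requires enumerating the sentences of a context-free language. Based on a layered construction of a context-free language by derivation-tree height, a reverse natural enumeration algorithm for sentences is proposed. Sentences are partitioned into a sequence of finite sets by means of heaps, layers, clusters, and cuboids; the algorithm runs in O(n) time, where n is the length of the sentence being enumerated. Experimental data show that the algorithm is efficient and more convenient to apply.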

We consider equality sets of prefix morphisms, that is, sets $E(g_1, g_2) = \{w \mid g_1(w) = g_2(w)\}$, where $g_1$ and $g_2$ are prefix morphisms. Recall that a morphism $g$ is prefix if, for all different letters $a$ and $b$, $g(a)$ is not a prefix of $g(b)$. We prove a rather surprising equality on families of languages, namely, that the family of regular star languages coincides with the family of languages of the form $\pi_A(E(g_1, g_2))$ for some prefix morphisms $g_1$ and $g_2$, and a projection $\pi_A$ which deletes the letters not in $A$.
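
The two definitions in the abstract above can be made concrete with a short sketch. It assumes single-character letters and represents a morphism as a dictionary from letters to their images. The names `apply_morphism`, `is_prefix_morphism`, and `equality_set_up_to` are hypothetical. Since an equality set is in general infinite, the sketch only enumerates its members up to a given length.

```python
from itertools import product


def apply_morphism(g, word):
    """Apply a morphism given as a dict letter -> image, letter by letter."""
    return "".join(g[letter] for letter in word)


def is_prefix_morphism(g):
    """True if no image g(a) is a prefix of g(b) for distinct letters a, b."""
    letters = list(g)
    return all(
        not g[b].startswith(g[a])
        for a in letters
        for b in letters
        if a != b
    )


def equality_set_up_to(g1, g2, max_len):
    """List all words w with len(w) <= max_len and g1(w) == g2(w)."""
    alphabet = sorted(g1)
    result = []
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            w = "".join(letters)
            if apply_morphism(g1, w) == apply_morphism(g2, w):
                result.append(w)
    return result
```

For instance, taking g1 with a mapped to `ab` and b mapped to `b`, and g2 with a mapped to `a` and b mapped to `bb`, both morphisms are prefix, and the words of length at most 4 on which they agree are the empty word, `ab`, and `abab`.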

Finite-state transducers are models that are being used in different areas of pattern recognition and computational linguistics. One of these areas is machine translation, where the approaches that are based on building models automatically from training examples are becoming more and more attractive. Finite-state transducers are well suited to constrained tasks where training samples of pairs of sentences are available. A technique to infer finite-state transducers is proposed in this work. This technique is based on formal relations between finite-state transducers and finite-state grammars. Given a training corpus of input-output pairs of sentences, the proposed approach uses statistical alignment methods to produce a set of conventional strings from which a stochastic finite-state grammar is inferred. This grammar is finally transformed into a resulting finite-state transducer. The proposed methods are assessed through a series of machine translation experiments within the framework of the EUTRANS project.

In this note we prove that the equations satisfied by one-letter regular languages are exactly those satisfied by commutative regular languages. This answers a problem raised by Arto Salomaa.

A structural characterization of reflexive splicing languages has been recently given in [P. Bonizzoni, C. De Felice, R. Zizza, The structure of reflexive regular splicing languages via Schützenberger constants, Theoretical Computer Science 334 (2005) 71-98] and [P. Bonizzoni, G. Mauri, Regular splicing languages and subclasses, Theoretical Computer Science 340 (2005) 349-363], showing surprising connections between long-standing notions in formal language theory, namely the syntactic monoid and Schützenberger constants, and the splicing operation. In this paper, we provide a procedure to decide whether a regular language is a reflexive splicing language, based on the above-mentioned characterization, which is given in terms of a finite set of constants for the language. The procedure relies on the notion of label-equivalence, which induces a finite refinement of the syntactic monoid of a regular language L. A finite set of representatives for label-equivalent classes of constant words in L is defined, and it is proved that such a finite set provides the splice sites of splicing rules generating the language L.

A word which is equal to its mirror image is called a palindrome word. Any language consisting of palindrome words is called a palindrome language. In this paper we investigate properties of palindrome words and languages. We show that there is no dense regular language consisting of palindrome words. A language that contains all the mirror images of its elements is called a reverse closed language. Clearly, every palindrome language is reverse closed. We show that it is decidable whether a given regular or context-free language is reverse closed. We also study certain properties of reverse closed finite maximal prefix codes, and we investigate properties of languages that commute with reverse closed languages.
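
The definitions in the abstract above are easy to check directly when the language is finite and given as an explicit set of strings. The sketch below covers only that finite case, which is an assumption of this illustration; the decidability results of the paper concern regular and context-free languages and are not implemented here. The function names are hypothetical.

```python
def is_palindrome(word):
    """A word is a palindrome if it equals its mirror image."""
    return word == word[::-1]


def is_palindrome_language(language):
    """A language is a palindrome language if all its words are palindromes."""
    return all(is_palindrome(w) for w in language)


def is_reverse_closed(language):
    """A language is reverse closed if it contains the mirror image of each word."""
    words = set(language)
    return all(w[::-1] in words for w in words)
```

The set {`ab`, `ba`} is reverse closed without being a palindrome language, while any finite palindrome language, such as {`aba`, `cc`}, passes both checks.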

We examine decision problems for various classes of convex languages, previously studied by Ang and Brzozowski, originally under the name “continuous languages”. We can decide whether a language L is prefix-, suffix-, factor-, or subword-convex in polynomial time if L is represented by a DFA, but these problems become PSPACE-complete if L is represented by an NFA. If a regular language is not convex, we find tight upper bounds on the length of the shortest words demonstrating this fact, in terms of the number of states of an accepting DFA. Similar results are proved for some subclasses of convex languages: the prefix-, suffix-, factor-, and subword-closed languages, and the prefix-, suffix-, factor-, and subword-free languages. Finally, we briefly examine these questions where L is represented by a context-free grammar.

A Teacher knows a regular language L(G), in the form of a finite state acceptor. A method is described for selecting a set of examples, strings X, each in L(G), as inputs to the Pupil. The set X is mapped into a lattice W (in the Pupil) of finite state machines. A mapping is defined from pairs of machines in the lattice W into strings y, each of which serves as a "crucial experiment". The Teacher is asked to decide if the string y belongs to L(G). The process then repeats or terminates. This procedure is shown to converge (if the Teacher answers truthfully) to a finite state acceptor accepting only strings of L(G) (which obviously may be brought into canonical, minimal state form). However, this process does not depend on state minimization as an inference method. The only necessary condition for the inference process is that every move (edge) of the finite state acceptor U(X) chosen to correspond to L(G) must be applied at least once in generating some string x in X. A proof is given that if the Teacher answers correctly, the Pupil will infer a machine behaviorally equivalent to the original acceptor U. Elements of the lattice W are constructed by successive refinement of the partitions of the state set of the initial finite state machine U(X). Pairs are chosen in an ordered process and converted to deterministic and completely specified machines, if necessary. The two machines are tested for behavioral equivalence. If they are equivalent, one is eliminated. If not, a testing string y belonging to one machine, but not the other, is constructed and output to the Teacher. If y belongs to L(G), one machine is eliminated. If not, y is tested by the Pupil against a sequence of machines generated internally. If only one machine is left, the process terminates; otherwise two new candidate machines are chosen. The algorithm described is relatively simple and easy to understand, but does not necessarily produce a minimal time solution.

A new and simple method for the inference of regular grammars or finite automata is presented. It is based on the analysis of the successive appearances of the terminal symbols in the learning strings. It is shown that for syntactic pattern recognition applications, this method is more efficient than other algorithms already proposed.

Imposing constraints is an effective means to incorporate biological knowledge into alignment procedures. As in the PROSITE database, functional sites of proteins can be effectively described as regular expressions. In an alignment of protein sequences it is natural to expect that functional motifs should be aligned together. Due to this motivation, Arslan introduced the regular expression constrained sequence alignment problem and proposed an algorithm which, if implemented naïvely, can take time and space up to $O(|\Sigma|^2 |V|^4 n^2)$ and $O(|\Sigma|^2 |V|^4 n)$, respectively, where $\Sigma$ is the alphabet, $n$ is the sequence length, and $V$ is the set of states in an automaton equivalent to the input regular expression. In this paper we propose a more efficient algorithm solving this problem which takes $O(|V|^3 n^2)$ time and $O(|V|^2 n)$ space in the worst case. If $|V| = O(\log n)$ we propose another algorithm with time complexity $O(|V|^2 \log|V| \, n^2)$. The time complexity of our algorithms is independent of $\Sigma$, which is desirable in protein applications where the formulation of this problem originates; a factor of $|\Sigma|^2 = 400$ in the time complexity of the previously proposed algorithm would significantly affect the efficiency in practice.
