Similar articles
 20 similar articles found
1.
We present a novel algorithm using new hypothesis representations for learning context-free grammars from a finite set of positive and negative examples. We propose an efficient hypothesis representation method consisting of a table-like data structure similar to the parse table used in efficient parsing algorithms for context-free grammars, such as the Cocke-Younger-Kasami algorithm. By employing this representation method, the problem of learning context-free grammars from examples can be reduced to the problem of partitioning the set of nonterminals. We use genetic algorithms to solve this partitioning problem. Further, we incorporate partially structured examples to improve the efficiency of our learning algorithm, where a structured example is represented by a string with some parentheses inserted to indicate the shape of the derivation tree of the unknown grammar. We present experimental results using these algorithms and theoretically analyse the completeness of the search space using the tabular method for context-free grammars.
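The parse table that the abstract above compares its representation to can be illustrated with a minimal Cocke-Younger-Kasami recognizer. This is only a sketch of the standard CYK table for a grammar in Chomsky normal form, not the learning algorithm of the paper; the function name `cyk_recognize`, the rule encoding, and the toy grammar are all assumptions made for illustration.

```python
from itertools import product


def cyk_recognize(word, unary_rules, binary_rules, start="S"):
    """Return True if `word` is derivable from `start` in a CNF grammar.

    unary_rules:  dict terminal -> set of nonterminals A with A -> terminal
    binary_rules: dict (B, C)   -> set of nonterminals A with A -> B C
    (Both encodings are hypothetical, chosen for this sketch.)
    """
    n = len(word)
    if n == 0:
        return False  # this sketch does not handle the empty string
    # table[i][j] holds the nonterminals that derive word[i : i + j + 1]
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, symbol in enumerate(word):
        table[i][0] = set(unary_rules.get(symbol, ()))
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            for split in range(1, length):
                left = table[i][split - 1]
                right = table[i + split][length - split - 1]
                for pair in product(left, right):
                    table[i][length - 1] |= binary_rules.get(pair, set())
    return start in table[0][n - 1]


# Toy grammar for { a^n b^n : n >= 1 }:
#   S -> A B | A T,  T -> S B,  A -> a,  B -> b
UNARY = {"a": {"A"}, "b": {"B"}}
BINARY = {("A", "B"): {"S"}, ("A", "T"): {"S"}, ("S", "B"): {"T"}}

print(cyk_recognize("aabb", UNARY, BINARY))
print(cyk_recognize("abab", UNARY, BINARY))
```

Each table cell is a set of nonterminals, which is why merging or partitioning nonterminals, as the abstract describes, changes which strings the table accepts.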

2.
《Pattern recognition》1988,21(6):623-629
An edNLC-graph grammar, introduced by Janssens (4), is a strong formalism for generating scene representations. This grammar generates directed node- and edge-labelled graphs (EDG-graphs). A method for constructing an unambiguous string representation of an EDG-graph is briefly described. The characteristics of edNLC-graph grammars for syntactic pattern recognition allow us to construct a parsing algorithm. A deterministic top-down syntax analyzer is constructed for a subfamily of edNLC-graph grammars called ETL/1-graph grammars. An ETL/1-graph grammar is analogous to a finite-state string grammar. The notions introduced in the paper are useful for research on less restricted edNLC-graph grammars, for example grammars analogous to context-free string grammars.

3.
An unsupervised incremental algorithm for grammar inference and its application to domain-specific language development are described. Grammatical inference is the process of learning a grammar from a set of positive and, optionally, negative sentences. Learning general context-free grammars is still considered a hard problem in machine learning and is not yet completely solved. The main contribution of the paper is a newly developed memetic algorithm, which is a population-based evolutionary algorithm enhanced with local search and a generalization process. The learning process is incremental, since a new grammar is obtained from the current grammar and the false negative samples, that is, those not parsed by the current grammar. Despite being incremental, the learning process is not sensitive to the order of samples. All important parts of this algorithm are explained and discussed. Finally, a case study of a domain-specific language for rendering graphical objects is used to show the applicability of this approach.

4.
This paper describes the winning entry to the Omphalos context-free grammar learning competition. We describe a context-free grammatical inference algorithm operating on positive data only, which integrates an information-theoretic constituent likelihood measure with more traditional heuristics based on substitutability and frequency. The competition is discussed from the perspective of a competitor. We discuss a class of deterministic grammars, the Non-terminally Separated (NTS) grammars, that have a property relied on by our algorithm, and consider the possibilities of extending the algorithm to larger classes of languages. Editors: Georgios Paliouras and Yasubumi Sakakibara

5.
VanLehn, Kurt; Ball, William 《Machine Learning》1987,2(1):39-74
In principle, the version space approach can be applied to any induction problem. However, in some cases the representation language for generalizations is so powerful that (1) some of the update functions for the version space are not effectively computable, and (2) the version space contains infinitely many generalizations. The class of context-free grammars is a simple representation that exhibits these problems. This paper presents an algorithm that solves both problems for this domain. Given a sequence of strings, the algorithm incrementally constructs a data structure that has nearly all the beneficial properties of a version space. The algorithm is fast enough to solve small induction problems completely, and it serves as a framework for biases that permit the solution of larger problems heuristically. The same basic approach may be applied to representations that include context-free grammars as special cases, such as And-Or graphs, production systems, and Horn clauses.

6.
This paper describes an evolutionary approach to the problem of inferring stochastic context-free grammars from finite language samples. The approach employs a distributed, steady-state genetic algorithm, with a fitness function incorporating a prior over the space of possible grammars. Our choice of prior is designed to bias learning towards structurally simpler grammars. Solutions to the inference problem are evolved by optimizing the parameters of a covering grammar for a given language sample. Full details are given of our genetic algorithm (GA) and of our fitness function for grammars. We present the results of a number of experiments in learning grammars for a range of formal languages. Finally, we compare the grammars induced using the GA-based approach with those found using the inside-outside algorithm. We find that our approach learns grammars that are both compact and fit the corpus data well.

7.
Fractal encoding of context-free grammars in connectionist networks   Cited by: 1 (self-citations: 0, citations by others: 1)
Connectionist network learning of context-free languages has so far been applied only to very simple cases and has often made use of an external stack. Learning complex context-free languages with a homogeneous neural mechanism looks like a much harder problem. The current paper takes a step toward solving this problem by analyzing context-free grammar computation (without addressing learning) in a class of analog computers called dynamical automata, which are naturally implemented in connectionist networks. The result is a widely applicable method of using fractal sets to organize infinite-state computations in a bounded state space. An appealing consequence is the development of parameter-space maps, which locate various complex computers in spatial relationships to one another. An example suggests that such a global perspective on the organization of the parameter space may be helpful for solving the hard problem of getting connectionist networks to learn complex grammars from examples.

8.
With the introduction of the Regular Membership Constraint, a new line of research has opened in which constraints are based on formal languages. This paper takes the next step, namely to investigate constraints based on grammars higher up in the Chomsky hierarchy. We devise a time- and space-efficient incremental arc-consistency algorithm for context-free grammars, investigate when logic combinations of grammar constraints are tractable, show how to exploit non-constant size grammars and reorderings of languages, and study where the boundaries run between regular, context-free, and context-sensitive grammar filtering.

9.
Summary: Specializing an existing graph grammar model, we look in detail at node context-free graph grammars. With a slight generalization, the parse trees for context-free Chomsky grammars can be used to describe derivations of these graph grammars. As already shown in earlier work, precedence graph grammars are defined as a subclass of context-free graph grammars by certain algebraic restrictions on the form of the rules. We can then prove that every precedence grammar is unambiguous and, additionally, that the reduction process in such a grammar, read as a replacement system, is finite. The most important aim in defining the precedence relations was a simple parsing method. This is achieved: it is shown that syntactic analysis for precedence graph grammars can be done in time linear in the size of the input graph. The whole method has been implemented, and documentation is available.

10.
Suppes, Patrick; Böttner, Michael; Liang, Lin 《Machine Learning》1995,19(2):133-152
We are developing a theory of probabilistic language learning in the context of robotic instruction in elementary assembly actions. We describe the process of machine learning in terms of the various events that happen on a given trial, including the crucial association of words with internal representations of their meaning. Of central importance in learning is the generalization from utterances to grammatical forms. Our system derives a comprehension grammar for a superset of a natural language from pairs of verbal stimuli like "Go to the screw!" and corresponding internal representations of coerced actions. For the derivation of a grammar, no knowledge of the language to be learned is assumed, only knowledge of an internal language. We present grammars for English, Chinese, and German generated from a finite sample of about 500 commands that are roughly equivalent across the three languages. All three grammars, which are context-free in form, accept an infinite set of commands in the given language.

11.
Grammar binarization is the process and result of transforming a grammar to an equivalent form whose rules contain at most two symbols in their right-hand side. Binarization is used, explicitly or implicitly, by a wide range of parsers for context-free grammars and other grammatical formalisms. Non-trivial grammars can be binarized in multiple ways, but in order to optimize the parser's computational cost, it is convenient to choose a binarization that is as small as possible. While several authors have explored heuristics to obtain compact binarizations, none of them guarantee that the resulting grammar has minimum size. However, to our knowledge, no hardness results for this problem have been published. In this article, we address this issue and prove that the problem of finding a minimum binarization of a given context-free grammar is NP-hard, by reduction from vertex cover. We also provide a lower bound on the approximability of this problem.
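To make the notion of binarization in the abstract above concrete, here is a minimal sketch of one simple heuristic: right-branching binarization that reuses a fresh nonterminal whenever two rules share the same right-hand-side suffix. It is not the method of the article and does not produce a minimum binarization (which the abstract states is NP-hard to find); the function `binarize`, the rule encoding, and the `<...>` naming of fresh nonterminals are assumptions for illustration.

```python
def binarize(rules):
    """Right-branching binarization with shared suffixes (a heuristic).

    rules: list of (lhs, rhs) pairs, where rhs is a tuple of symbols.
    Returns an equivalent list of rules whose right-hand sides contain
    at most two symbols. Assumes no existing symbol is named "<...>".
    """
    fresh = {}   # suffix tuple -> name of the fresh nonterminal for it
    output = []

    def name_for(suffix):
        # Create (once) a fresh nonterminal that derives exactly `suffix`.
        if suffix not in fresh:
            fresh[suffix] = "<" + " ".join(suffix) + ">"
            emit(fresh[suffix], suffix)
        return fresh[suffix]

    def emit(lhs, rhs):
        if len(rhs) <= 2:
            output.append((lhs, rhs))
        else:
            # Keep the first symbol, delegate the rest to a fresh symbol.
            output.append((lhs, (rhs[0], name_for(rhs[1:]))))

    for lhs, rhs in rules:
        emit(lhs, tuple(rhs))
    return output


# Two long rules that share the suffix "B C d".
RULES = [("S", ("a", "B", "C", "d")), ("T", ("x", "B", "C", "d"))]
BINARIZED = binarize(RULES)
for lhs, rhs in BINARIZED:
    print(lhs, "->", " ".join(rhs))
```

Binarizing each rule independently would need six rules here; sharing the suffix brings that down to four, which shows why different binarizations of the same grammar can differ in size.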

12.
Fuzzy context-free max-∗ grammar (or FCFG, for short), as a straightforward extension of context-free grammar, has been introduced to express uncertainty, imprecision, and vagueness in natural language fragments. Li recently proposed the approximation of fuzzy finite automata, which may effectively deal with the practical problems of fuzziness, impreciseness and vagueness. In this paper, we further develop the approximation of fuzzy context-free grammars. In particular, we show that a fuzzy context-free grammar under max-∗ compositional inference can be approximated by some fuzzy context-free grammar under max-min compositional inference with any given accuracy. In addition, some related properties of fuzzy context-free grammars and the fuzzy languages generated by them are studied. Finally, the sensitivity of fuzzy context-free grammars is also discussed.
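As a rough illustration of max-min compositional inference as mentioned in the abstract above: the degree of a string is commonly taken as the maximum, over its derivations, of the combined degree of the rules used, where the rules are combined with `min`. The sketch below assumes the derivations are already enumerated as lists of rule degrees, and uses the product as an example of an alternative combining operation; both the helper `membership` and the numbers are hypothetical and not taken from the paper.

```python
from functools import reduce
import operator


def membership(derivations, combine=min):
    """Degree of a string given the rule degrees of each of its derivations.

    derivations: list of non-empty lists of rule degrees in [0, 1],
                 one inner list per derivation (assumed to be given).
    combine:     binary operation applied along a derivation,
                 e.g. min (max-min) or operator.mul (max-product).
    """
    return max(
        (reduce(combine, degrees) for degrees in derivations),
        default=0.0,  # a string with no derivation has degree 0
    )


# Two hypothetical derivations of the same string.
DERIVATIONS = [[0.9, 0.5], [0.7, 0.7]]
print(membership(DERIVATIONS, combine=min))
print(membership(DERIVATIONS, combine=operator.mul))
```

The two choices of `combine` generally give different degrees for the same string, which is the gap that an approximation result between inference modes has to control.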

13.
Summary: Attribute grammars are a value-oriented, non-procedural extension to context-free grammars that facilitate the specification of translations whose domain is described by the underlying context-free grammar. Just as parsers for context-free languages can be automatically constructed from a context-free grammar, so can translators, called attribute evaluators, be automatically generated from an attribute grammar. A major obstacle to generating efficient attribute evaluators is that they typically use large amounts of memory to represent the attributed parse tree. In this report we investigate the problem of efficient representation of the attributed parse tree by analyzing and comparing the strategies of two systems that have been used to automatically generate a translator from an attribute grammar: the GAG system developed at the Universität Karlsruhe and the LINGUIST-86 system written at Intel Corporation. Our analysis characterizes the two strategies and highlights their respective strengths and weaknesses. Drawing on the insights given by this analysis, we propose a strategy for storage optimization in automatically generated attribute evaluators that not only incorporates the best features of both GAG and LINGUIST-86, but also contains novel features that address aspects of the problem that are handled poorly by both systems. This research was partially supported by the National Science Foundation under grant DCR-83-10930, and partially supported by the Defense Advanced Research Projects Agency under contract number N00039-84-C-0165.

14.
A new interactive evolutionary 3D design system is presented. The representation is based on graph grammars, a powerful formalism in which nodes and edges are iteratively rewritten by rules analogous to those of context-free grammars and shape grammars. The nodes of the resulting derived graph are labelled with Euclidean coordinates, so the graph fully represents a 3D beam design. Results from user-guided runs are presented, demonstrating the flexibility of the representation. Comparison with results using an alternative graph representation demonstrates that the graph grammar search space is richer in organised designs. A set of numerical features is defined over designs. They are shown to be effective in distinguishing between the designs produced by the two representations, and between designs labelled by users as good or bad. The features allow the definition of a non-interactive fitness function in terms of proximity to target feature vectors. In non-interactive experiments with this fitness function, the graph grammar representation out-performs the alternative graph representation, and evolution out-performs random search.

15.
It has been known since 1962 that the ambiguity problem for context-free grammars is undecidable. Ambiguity in context-free grammars is a recurring problem in language design and parser generation, as well as in applications where grammars are used as models of real-world physical structures. We observe that there is a simple linguistic characterization of the grammar ambiguity problem, and we show how to exploit this by presenting an ambiguity analysis framework based on conservative language approximations. As a concrete example, we propose a technique based on local regular approximations and grammar unfoldings. We evaluate the analysis using grammars that occur in RNA analysis in bioinformatics, and we demonstrate that it is sufficiently precise and efficient to be practically useful.

16.
A number of grammatical formalisms have been proposed to describe the syntax of natural languages, and the universal recognition problems for some of those classes of grammars have been studied. The universal recognition problem for a class 𝒢 of grammars is to decide, given a grammar G ∈ 𝒢 and a string w as input, whether G can generate w. In this paper, the computational complexities of the universal recognition problems for parallel multiple context-free grammars, multiple context-free grammars, and their subclasses are discussed.

17.
A new class of grammars (precedence-regular grammars) is obtained as a proper extension of the class of weak precedence grammars. A parsing algorithm for this class of grammars is described, using Dömölki's algorithm. Finally, a criterion is obtained for deciding whether a given context-free grammar belongs to this class.

18.
Trees can be conveniently compressed with linear straight-line context-free tree grammars. Such grammars generalize straight-line context-free string grammars which are widely used in the development of algorithms that execute directly on compressed structures (without prior decompression). It is shown that every linear straight-line context-free tree grammar can be transformed in polynomial time into a monadic (and linear) one. A tree grammar is monadic if each nonterminal uses at most one context parameter. Based on this result, polynomial time algorithms are presented for testing whether a given (i) nondeterministic tree automaton or (ii) nondeterministic tree automaton with sibling-constraints or (iii) nondeterministic tree walking automaton, accepts a tree represented by a linear straight-line context-free tree grammar. It is also shown that if tree grammars are nondeterministic or non-linear, then reducing their numbers of parameters cannot be done without an exponential blow-up in grammar size.
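The idea of running algorithms directly on a grammar-compressed structure, mentioned in the abstract above, can be sketched for the simpler string case. The example below assumes a straight-line string grammar in which every nonterminal has exactly one rule, either a single terminal or a pair of nonterminals, listed so that each symbol is defined before it is used; it computes expansion lengths and reads a single character without decompressing. The names `expansion_lengths` and `char_at` and the toy grammar are hypothetical, and the tree-grammar algorithms of the paper are considerably more involved.

```python
def expansion_lengths(slp):
    """Length of the string each nonterminal expands to.

    slp: dict nonterminal -> terminal character, or (left, right) pair
         of nonterminals. Entries must appear in dependency order.
    """
    lengths = {}
    for nonterminal, rhs in slp.items():
        if isinstance(rhs, str):
            lengths[nonterminal] = 1
        else:
            lengths[nonterminal] = lengths[rhs[0]] + lengths[rhs[1]]
    return lengths


def char_at(slp, lengths, nonterminal, index):
    """Character at `index` of the expansion, found by walking the rules.

    The caller must ensure 0 <= index < lengths[nonterminal].
    """
    while not isinstance(slp[nonterminal], str):
        left, right = slp[nonterminal]
        if index < lengths[left]:
            nonterminal = left
        else:
            index -= lengths[left]
            nonterminal = right
    return slp[nonterminal]


# X3 expands to "abababab" using only five rules.
SLP = {
    "A": "a",
    "B": "b",
    "X1": ("A", "B"),
    "X2": ("X1", "X1"),
    "X3": ("X2", "X2"),
}
LENGTHS = expansion_lengths(SLP)
print(LENGTHS["X3"], char_at(SLP, LENGTHS, "X3", 5))
```

Each doubling rule doubles the expanded length, so the grammar can be exponentially smaller than the string it represents while lookups still take time proportional to the grammar's depth.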

19.
Summary: There is no algorithm for deciding whether two linear context-free grammars generate the same sentential forms. The equivalence problem for propagating OL-systems is undecidable. The finiteness problem for OL-systems is decidable. SF-languages, i.e., languages which equal the set of sentential forms of a context-free grammar, possess some of the properties of context-free languages, but their family is not closed under any of the ordinary operations.

20.
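A special kind of context-free grammar and its parsing   Cited by: 4 (self-citations: 0, citations by others: 4)
Zhang Ruiling (张瑞岭) 《Journal of Software (软件学报)》1998,9(12):904-910
The SAQ system is an experimental system for acquiring, checking, and reusing software specifications, in which concepts represented by context-free grammars form part of a specification. SAQ requires the lexical and syntactic definitions of a concept to be combined in a single context-free grammar. If a conventional context-free grammar is used to describe the syntax of complex concepts such as programming languages and natural languages, separators with no substantive meaning, such as spaces and carriage returns, have to be included in the grammar (this style of description is called the naive representation), which makes the grammar description cumbersome. To address this, the author designed a special kind of context-free grammar that refines the nonterminal set and the terminal set of the usual context-free grammar definition. With this grammar, complex concepts such as programming languages and natural languages can be described relatively concisely ...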
