Similar literature
 Found 10 similar articles; search took 140 ms
1.
This paper describes an improved version of the TBL algorithm [Y. Sakakibara, Learning context-free grammars using tabular representations, Pattern Recognition 38 (2005) 1372–1383; Y. Sakakibara, M. Kondo, GA-based learning of context-free grammars using tabular representations, in: Proceedings of 16th International Conference in Machine Learning (ICML-99), Morgan-Kaufmann, Los Altos, CA, 1999] for the inference of context-free grammars in Chomsky Normal Form. The TBL algorithm is a novel approach to overcoming the hardness of learning context-free grammars from examples when no structural information is available. The algorithm represents grammars by parsing tables, and thanks to this tabular representation the problem of grammar learning is reduced to the problem of partitioning the set of nonterminals. A genetic algorithm is used to solve the NP-hard partitioning problem. The improved version applies a modified fitness function and a new specialized delete operator. Computer simulations were performed to determine the efficiency of the improved tabular representation. The experiments were divided into two groups: in the first, the unknown context-free grammar is learned without any extra information about grammatical structure; in the second, learning is supported by partial knowledge of the structure. Each experiment tested the influence of the partition block size in the initial population and of the population size on grammar induction. The new version of the TBL algorithm was shown experimentally to be less sensitive to block size and population size and to find solutions faster than the standard one.
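
The reduction to partitioning described in this abstract can be illustrated with a small sketch. It is not the TBL algorithm itself: the rule set, the nonterminal names (`S`, `S2`, `X`, `A`, `B`) and the helper names `cyk_accepts` and `merge_nonterminals` are assumptions made for illustration, and the partition is chosen by hand rather than searched for by a genetic algorithm. The sketch only shows how merging the blocks of a partition of nonterminals can generalize an overly specific grammar in Chomsky Normal Form.

```python
def cyk_accepts(rules, start, word):
    """CYK membership test for a grammar in Chomsky Normal Form.

    rules: set of (lhs, rhs) pairs, where rhs is either a 1-tuple
    (terminal,) or a 2-tuple (nonterminal, nonterminal).
    """
    n = len(word)
    if n == 0:
        return False
    # table[i][k] holds the nonterminals that derive word[i : i + k + 1]
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, ch in enumerate(word):
        table[i][0] = {lhs for lhs, rhs in rules if rhs == (ch,)}
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            for split in range(1, length):
                left = table[i][split - 1]
                right = table[i + split][length - split - 1]
                for lhs, rhs in rules:
                    if len(rhs) == 2 and rhs[0] in left and rhs[1] in right:
                        table[i][length - 1].add(lhs)
    return start in table[0][n - 1]


def merge_nonterminals(rules, partition):
    """Rename every nonterminal to the first member of its partition block."""
    rep = {nt: block[0] for block in partition for nt in block}
    return {
        (rep.get(lhs, lhs), tuple(rep.get(sym, sym) for sym in rhs))
        for lhs, rhs in rules
    }


# Hypothetical grammar that covers only the examples "ab" and "aabb".
rules = {
    ("S", ("A", "B")),
    ("S", ("A", "X")),
    ("X", ("S2", "B")),
    ("S2", ("A", "B")),
    ("A", ("a",)),
    ("B", ("b",)),
}

# Hand-picked partition: S and S2 share a block; other nonterminals stay alone.
merged = merge_nonterminals(rules, [["S", "S2"]])
```

With these assumed rules, the unmerged grammar rejects `aaabbb`, while the merged grammar accepts it because `X -> S B` becomes recursive through `S`. A search procedure such as a genetic algorithm would explore many candidate partitions and score each one against positive and negative examples.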

2.
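Acquiring context-free grammars by stepwise refinement   Total citations: 3, self-citations: 0, citations by others: 3
Grammatical inference studies how to obtain the grammar definition of a language from a finite set of its instances by inductive inference. This paper proposes a context-free grammar inference method based on stepwise refinement, in an attempt to use grammatical inference to replace or assist traditional manual grammar construction. The method uses Angluin's interactive learning model as its framework and takes stepwise refinement and reuse as its main strategies; its distinguishing feature is that it incrementally acquires grammars with a natural structure.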

3.
The grammar of the language in which some given code is written is essential for developing automated tools for maintenance, reengineering, and program analysis. Frequently a grammar is available for a language but not for its variants that are implemented by various vendors and in which the given code may be written. In this work we address the problem of obtaining the grammar from source code, which can then be used for generating tools for the programs. We propose an incremental method for obtaining a grammar for a particular language variant from a set of programs written in that variant and an approximate grammar (presumably of the standard language), with some user interaction. We also present the design of a tool for implementing this approach and our experience in working with grammars of C, C++ and COBOL. Copyright © 2004 John Wiley & Sons, Ltd.

4.
An unsupervised incremental algorithm for grammar inference and its application to domain-specific language development are described. Grammatical inference is the process of learning a grammar from a set of positive and, optionally, negative sentences. Learning general context-free grammars is still considered a hard problem in machine learning and is not completely solved yet. The main contribution of the paper is a newly developed memetic algorithm, which is a population-based evolutionary algorithm enhanced with local search and a generalization process. The learning process is incremental, since a new grammar is obtained from the current grammar and the false negative samples, that is, the samples the current grammar fails to parse. Despite being incremental, the learning process is not sensitive to the order of samples. All important parts of this algorithm are explained and discussed. Finally, a case study of a domain-specific language for rendering graphical objects is used to show the applicability of this approach.

5.
Inference of high-dimensional grammars is discussed. Specifically, techniques for inferring tree grammars are briefly presented. The problem of inferring a stochastic grammar to model the behavior of an information source is also introduced and techniques for carrying out the inference process are presented for a class of stochastic finite-state and context-free grammars. The possible practical application of these methods is illustrated by examples.

6.
R. Lämmel, C. Verhoef. Software, 2001, 31(15): 1395-1438
We propose an approach to the construction of grammars for existing languages. The main characteristic of the approach is that the grammars are not constructed from scratch but they are rather recovered by extracting them from language references, compilers and other artifacts. We provide a structured process to recover grammars including the adaptation of raw extracted grammars and the derivation of parsers. The process is applicable to possibly all existing languages for which business critical applications exist. We illustrate the approach with a non-trivial case study. Using our process and some basic tools, we constructed in a few weeks a complete and correct VS COBOL II grammar specification for IBM mainframes. In addition, we constructed a parser for VS COBOL II, and were the first to publish a (Web-enabled) grammar specification so that others can use this result to construct their own grammar-based tools for VS COBOL II or derivatives. Copyright © 2001 John Wiley & Sons, Ltd.

7.
The high complexity of natural language and the huge amount of human and temporal resources necessary for producing grammars have led several researchers in the area of Natural Language Processing to investigate various solutions for automating the grammar generation and updating processes. Many algorithms for context-free grammar inference have been developed in the literature. This paper provides a survey of the methodologies for inferring context-free grammars from examples developed by researchers in the last decade. After introducing some preliminary definitions and notations concerning learning and inductive inference, some of the most relevant existing grammatical inference methods for natural language are described and classified according to the kind of presentation (text or informant) and the type of information (supervised, unsupervised, or semi-supervised). Moreover, the state of the art of the strategies for evaluating and comparing different grammar inference methods is presented. The goal of the paper is to provide the reader with an introduction to the major concepts and current approaches in Natural Language Learning research.

8.
Grammar convergence is a method that helps in discovering relationships between different grammars of the same language or different language versions. The key element of the method is the operational, transformation-based representation of those relationships. Given input grammars for convergence, they are transformed until they are structurally equal. The transformations are composed from primitive operators; properties of these operators and the composed chains provide quantitative and qualitative insight into the relationships between the grammars at hand. We describe a refined method for grammar convergence, and we use it in a major study, where we recover the relationships between all the grammars that occur in the different versions of the Java Language Specification (JLS). The relationships are represented as grammar transformation chains that capture all accidental or intended differences between the JLS grammars. This method is mechanized and driven by nominal and structural differences between pairs of grammars that are subject to asymmetric, binary convergence steps. We present the underlying operator suite for grammar transformation in detail, and we illustrate the suite with many examples of transformations on the JLS grammars. We also describe the extraction effort, which was needed to make the JLS grammars amenable to automated processing. We include substantial metadata about the convergence process for the JLS so that the effort becomes reproducible and transparent.
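
The idea of transforming one grammar until it is structurally equal to another can be sketched in a few lines. This is an illustration under stated assumptions, not the operator suite from the paper: the two operators `rename` and `add_alternative`, the helper `apply_chain`, and the toy grammars `g1` and `g2` are all hypothetical. A grammar is modelled here as a dictionary from each nonterminal to a set of alternatives, each alternative being a tuple of symbols.

```python
def rename(grammar, old, new):
    """Hypothetical operator: rename nonterminal `old` to `new` everywhere."""
    result = {}
    for lhs, alts in grammar.items():
        key = new if lhs == old else lhs
        renamed = {tuple(new if sym == old else sym for sym in alt) for alt in alts}
        result.setdefault(key, set()).update(renamed)
    return result


def add_alternative(grammar, lhs, alt):
    """Hypothetical operator: add one alternative to the definition of `lhs`."""
    result = {key: set(alts) for key, alts in grammar.items()}
    result.setdefault(lhs, set()).add(tuple(alt))
    return result


def apply_chain(grammar, chain):
    """Apply a chain of (operator, *arguments) steps from left to right."""
    for op, *args in chain:
        grammar = op(grammar, *args)
    return grammar


# Two toy grammars that differ nominally (Stmt vs. Statement) and
# structurally (g2 has an extra empty-statement alternative).
g1 = {
    "Stmt": {("if", "Expr", "Stmt"), ("Block",)},
    "Block": {("{", "Stmt", "}")},
}
g2 = {
    "Statement": {("if", "Expr", "Statement"), ("Block",), (";",)},
    "Block": {("{", "Statement", "}")},
}

chain = [
    (rename, "Stmt", "Statement"),
    (add_alternative, "Statement", (";",)),
]
converged = apply_chain(g1, chain)
```

In this toy setting `converged == g2` holds, and the chain itself records the differences between the two grammars: one nominal step and one structural step. Its length and the kinds of operators it contains are the sort of quantitative and qualitative information the abstract refers to.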

9.
While grammar inference (or grammar induction) has found extensive application in the areas of robotics, computational biology, and speech recognition, its application to problems in programming language and software engineering domains has been limited. We have found a new application area for grammar inference which intends to make domain-specific language development easier for domain experts not well versed in programming language design, and finds a second application in construction of renovation tools for legacy software systems. As a continuation of our previous efforts to infer context-free grammars (CFGs) for domain-specific languages which previously involved a genetic-programming based CFG inference system, we discuss extensions to the inference capabilities of GenInc, an incremental learning algorithm for inferring CFGs. We show that these extensions enable GenInc to infer more comprehensive grammars, discuss the results of applying GenInc to various domain-specific languages and evaluate the results using a comprehensive suite of grammar metrics.

10.
Summary: The present categorial approach to bottom-up parsing of context-free grammars treats two aspects of determinism. One is an abstraction of grammatical determinism from actual parsing strategies. The other is the transfer of determinism under grammar transformations. The approach is based on the characterization of a parse step as a categorial limit, which on the one hand yields a convenient pattern for grammar type definition, and on the other hand leads in a transparent way to invariance results on deterministic grammars under homomorphic transformations.
