Similar Literature
20 similar documents found
1.
The grammar of the language in which some given code is written is essential for developing automated tools for maintenance, reengineering, and program analysis. Frequently a grammar is available for a language but not for the variants implemented by various vendors, in which the given code may actually be written. In this work we address the problem of obtaining the grammar from source code, which can then be used for generating tools for the programs. We propose an incremental method for obtaining the grammar of a particular language variant from a set of programs written in that variant and an approximate grammar (presumably of the standard language), with some user interaction. We also present the design of a tool implementing this approach and our experience in working with grammars of C, C++ and COBOL. Copyright © 2004 John Wiley & Sons, Ltd.
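To make the incremental recovery loop concrete, here is a minimal Python sketch of the idea; the "grammar" is reduced to a set of statement keywords and the propose_rule callback stands in for real parsing and user interaction, so every name below is illustrative rather than taken from the paper's tool.

```python
# Toy sketch: iterate over the corpus, and whenever the current grammar cannot
# explain a construct, extend the grammar (in the real tool this is where the
# approximate grammar is adjusted with user interaction).

def parse_failures(grammar, program):
    """Return the lines of `program` that the current grammar cannot explain."""
    return [ln for ln in program.splitlines()
            if ln.strip() and ln.split()[0] not in grammar]

def recover_grammar(grammar, corpus, propose_rule):
    grammar = set(grammar)
    for program in corpus:
        while True:
            failures = parse_failures(grammar, program)
            if not failures:
                break
            grammar.add(propose_rule(failures[0]))   # user interaction goes here
    return grammar

standard = {"IF", "MOVE", "PERFORM"}                 # approximate (standard) grammar
corpus = ["MOVE A TO B\nEXEC SQL SELECT 1 END-EXEC", "IF X\nMOVE Y TO Z"]
dialect = recover_grammar(standard, corpus, propose_rule=lambda line: line.split()[0])
print(sorted(dialect))   # the vendor-specific EXEC statement is now covered
```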

2.
We establish a semantics for building grammars from a modularised specification in which modules are able to delete productions from imported nonterminals. Modules have import lists of nonterminals; some or all of an imported nonterminal's productions may be suppressed at import time. There are two basic import mechanisms which (a) reference or (b) clone an imported nonterminal's productions. One of our goals is to allow a precise answer to the question: ‘what character level language does this grammar generate’ in the face of difficult issues such as the mutual embedding of languages that have different whitespace and commenting conventions. Our technique is to automatically generate a character level grammar from grammars written at token level in the conventional way; the grammar is constructed from modules each of which may have its own whitespace convention.
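A rough Python sketch of the import-with-suppression idea; the dictionary encoding, the clone_as renaming and the example modules are simplifications of our own, not the paper's formal semantics.

```python
# A grammar module is modelled as: nonterminal -> list of productions
# (each production a tuple of symbols).

def import_nonterminal(target, source, nt, suppress=(), clone_as=None):
    """Copy nt's productions from `source` into `target`, dropping any production
    listed in `suppress`. With clone_as, the productions are copied under a new
    name so the importer can edit them independently (a full clone would also
    rename references inside the productions)."""
    name = clone_as or nt
    kept = [p for p in source[nt] if p not in set(suppress)]
    target.setdefault(name, []).extend(kept)
    return target

expr_module = {"Expr": [("Expr", "+", "Term"), ("Expr", "-", "Term"), ("Term",)]}
calc_module = {"Stmt": [("print", "Expr")]}

# Import Expr while suppressing the subtraction production, cloning it as CalcExpr.
import_nonterminal(calc_module, expr_module, "Expr",
                   suppress=[("Expr", "-", "Term")], clone_as="CalcExpr")
print(calc_module["CalcExpr"])   # [('Expr', '+', 'Term'), ('Term',)]
```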

3.
4.
Regular grammars are an important tool in the study of automata. We introduce weighted regular grammars and weighted quasi-regular grammars whose weights are taken from a valuation monoid, and discuss the relationship between weighted regular grammars, weighted quasi-regular grammars and weighted finite automata (WFA) over valuation monoids. We prove that, over a valuation monoid, for any given weighted regular grammar or weighted quasi-regular grammar there exists an equivalent WFA. We then define distributive valuation monoids and prove that, over a distributive valuation monoid, for any given WFA there exist an equivalent weighted regular grammar and an equivalent weighted quasi-regular grammar; that is, over distributive valuation monoids, weighted regular grammars, weighted quasi-regular grammars and WFA are equivalent with respect to the languages they generate. An example shows that distributivity of the valuation monoid is not a necessary condition for a given WFA to have an equivalent weighted regular grammar or weighted quasi-regular grammar.
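The grammar-to-automaton direction can be pictured with a small Python sketch; the rule format, the fresh final state and the use of plain floats combined by + and × (in place of a general valuation monoid) are simplifying assumptions, not the paper's construction.

```python
from collections import defaultdict

# Weighted right-linear rules: (lhs, terminal, rhs nonterminal or None, weight).
rules = [
    ("S", "a", "S", 1.0),
    ("S", "b", "T", 2.0),
    ("T", "b", None, 3.0),
]
FINAL = "$"  # fresh accepting state for terminating rules

def to_wfa(rules):
    """Each rule A -w-> a B becomes a WFA transition (A, a, B) with weight w."""
    delta = defaultdict(lambda: defaultdict(list))
    for lhs, term, rhs, w in rules:
        delta[lhs][term].append((rhs if rhs is not None else FINAL, w))
    return delta

def weight_of(delta, word, start="S"):
    """Sum over runs of the product of transition weights (a plain semiring,
    standing in for the valuation monoid's aggregation)."""
    frontier = {start: 1.0}                      # state -> accumulated weight
    for sym in word:
        nxt = defaultdict(float)
        for state, acc in frontier.items():
            for target, w in delta[state].get(sym, []):
                nxt[target] += acc * w
        frontier = nxt
    return frontier.get(FINAL, 0.0)

print(weight_of(to_wfa(rules), "abb"))   # 1.0 * 2.0 * 3.0 = 6.0
```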

5.
Stream X-machines are a general and powerful computational model. By coupling the control structure of a stream X-machine with a set of formal grammars, a new machine called a generalised stream X-machine with underlying distributed grammars, acting as a translator, is obtained. By introducing this new mechanism a hierarchy of computational models is provided. If the grammars are of a particular class, say regular or context-free, then finite sets are translated into finite sets when the ≤k and =k derivation strategies are used, and regular or context-free sets, respectively, are obtained for the ≥k, * and terminal (t) derivation strategies. In both cases, regular or context-free grammars, the regular sets are translated into non-context-free languages. Moreover, any language accepted by a Turing machine may be written as a translation of a regular set performed by a generalised stream X-machine with underlying distributed grammars based on context-free rules, under the =k derivation strategy. On the other hand, the languages generated by some classes of cooperating distributed grammar systems may be obtained as images of regular sets through some X-machines with underlying distributed grammars. Other relations of the families of languages computed by generalised stream X-machines with the families of languages generated by cooperating distributed grammar systems are established. At the end, an example dealing with the specification of a scanner system illustrates the use of the introduced mechanism as a formal specification model. Received September 1999 / Accepted in revised form October 2000

6.
In this paper, we describe our experience in grammar engineering to construct multiple parsers and front ends for the Python language. We present a metrics-based study of the evolution of the Python grammars through the multiple versions of the language in an effort to distinguish and measure grammar evolution and to provide a basis of comparison with related research in grammar engineering. To conduct this research, we have built a toolkit, pygrat, which builds on tools developed in other research. We use pygrat to build a system that automates much of the process needed to translate the Python grammars from EBNF to a formalism acceptable to the bison parser generator. We exploit the suite of Python test cases, used by the Python developers, to validate our parser generation. Finally, we describe our use of the menhir parser generator to facilitate the parser and front-end construction, eliminating some of the transformations and providing practical support for grammar modularisation.
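The kind of rewriting such a translation must automate can be sketched as follows; the helper-nonterminal naming scheme and the example rule are hypothetical, and a real EBNF-to-bison conversion handles many more constructs.

```python
def star_to_bnf(lhs, item):
    """Rewrite the EBNF rule  lhs: item*  into plain BNF acceptable to bison."""
    helper = f"{lhs}_items"
    return [
        f"{lhs}: {helper} ;",
        f"{helper}: /* empty */ | {helper} {item} ;",
    ]

# EBNF:  stmt_list: stmt*
for rule in star_to_bnf("stmt_list", "stmt"):
    print(rule)
# stmt_list: stmt_list_items ;
# stmt_list_items: /* empty */ | stmt_list_items stmt ;
```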

7.
A wide range of parser generators are used to generate parsers for programming languages. The grammar formalisms that come with parser generators provide different approaches for defining operator precedence. Some generators (e.g. YACC) support precedence declarations; others require the grammar to be unambiguous, thus encoding the precedence rules. Even if the grammar formalism provides precedence rules, a particular grammar might not use them. The result is grammar variants implementing the same language. For the C language, the GNU Compiler uses YACC with precedence rules, C-Transformers uses SDF without priorities, while the SDF library does use priorities. For PHP, Zend uses YACC with precedence rules, whereas PHP-front uses SDF with priority and associativity declarations. The variance between grammars raises the question of whether the precedence rules of one grammar are compatible with those of another. This is usually not obvious, since some languages have complex precedence rules. Also, for some parser generators the semantics of precedence rules is defined operationally, which makes it hard to reason about their effect on the defined language. We present a method and tool for comparing the precedence rules of different grammars and parser generators. Although it is undecidable whether two grammars define the same language, this tool provides support for comparing and recovering precedence rules, which is especially useful for reliable migration of a grammar from one grammar formalism to another. We evaluate our method by applying it to non-trivial mainstream programming languages, such as PHP and C.
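The compatibility question can be made concrete with a toy precedence-climbing parser: the same token sequence parsed under two precedence/associativity tables yields different trees. The parser and the tables below are only an illustration of the problem, not the paper's comparison method.

```python
def parse(tokens, prec, min_bp=0):
    """Minimal precedence-climbing parser over single-character operands/operators.
    `prec` maps operator -> (binding power, 'left' | 'right')."""
    lhs = tokens.pop(0)                               # operand
    while tokens and tokens[0] in prec:
        op = tokens[0]
        bp, assoc = prec[op]
        if bp < min_bp:
            break
        tokens.pop(0)
        rhs = parse(tokens, prec, bp + 1 if assoc == "left" else bp)
        lhs = (op, lhs, rhs)
    return lhs

table_a = {"-": (10, "left")}    # what a YACC %left declaration states
table_b = {"-": (10, "right")}   # a grammar that accidentally encodes right-associativity

print(parse(list("a-b-c"), table_a))   # ('-', ('-', 'a', 'b'), 'c')
print(parse(list("a-b-c"), table_b))   # ('-', 'a', ('-', 'b', 'c'))
```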

8.
Grammar deployment is the process of turning a given grammar specification into a working parser. The Grammar Deployment Kit (for short, GDK) provides tool support in this process based on grammar engineering methods. We are mainly interested in the deployment of grammars for software renovation tools, that is, tools for software re- and reverse engineering. The current version of GDK is optimized for Cobol. We assume that grammar deployment starts from an initial grammar specification which may still be ambiguous or even incomplete. In practice, grammar deployment binds unaffordable human resources because of the unavailability of suitable grammar specifications, the diversity of parsing technology as well as its limitations, integration problems regarding the development of software renovation functionality, and the lack of tools and of adherence to firm methods for grammar engineering. GDK helps to largely automate grammar deployment because tool support for grammar adaptation and parser generation is provided. We support different parsing technologies, among them btyacc, that is, yacc with backtracking. GDK is free software.

9.
This paper describes an evolutionary approach to the problem of inferring stochastic context-free grammars from finite language samples. The approach employs a distributed, steady-state genetic algorithm, with a fitness function incorporating a prior over the space of possible grammars. Our choice of prior is designed to bias learning towards structurally simpler grammars. Solutions to the inference problem are evolved by optimizing the parameters of a covering grammar for a given language sample. Full details are given of our genetic algorithm (GA) and of our fitness function for grammars. We present the results of a number of experiments in learning grammars for a range of formal languages. Finally we compare the grammars induced using the GA-based approach with those found using the inside-outside algorithm. We find that our approach learns grammars that are both compact and fit the corpus data well.
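A minimal sketch, under the assumption of a CNF stochastic grammar, of the kind of fitness the abstract describes: the log-likelihood of the sample (computed by a CYK-style inside pass) plus a size penalty acting as the structural prior. The toy grammar and the prior weight are our own choices, not the paper's exact formulation.

```python
import math
from collections import defaultdict

# CNF stochastic grammar: unary[(A, 'a')] = p, binary[(A, 'B', 'C')] = p.
unary = {("S", "a"): 0.3, ("S", "b"): 0.2}
binary = {("S", "S", "S"): 0.5}

def inside_prob(word, unary, binary, start="S"):
    """Inside (CYK) probability that `start` derives `word`."""
    n = len(word)
    chart = defaultdict(float)                        # (i, j, A) -> P(A =>* word[i:j])
    for i, ch in enumerate(word):
        for (A, a), p in unary.items():
            if a == ch:
                chart[(i, i + 1, A)] += p
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for (A, B, C), p in binary.items():
                    chart[(i, j, A)] += p * chart[(i, k, B)] * chart[(k, j, C)]
    return chart[(0, n, start)]

def fitness(sample, unary, binary, prior_weight=1.0):
    loglik = sum(math.log(inside_prob(w, unary, binary) or 1e-300) for w in sample)
    size_prior = -prior_weight * (len(unary) + len(binary))   # bias towards smaller grammars
    return loglik + size_prior

print(fitness(["ab", "aab"], unary, binary))
```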

10.
A special context-free grammar and its parsing   Cited by: 4 (self-citations: 0, citations by others: 4)
张瑞岭 《软件学报》1998,9(12):904-910
SAQ is an experimental system for acquiring, checking and reusing software specifications, in which concepts represented by context-free grammars form part of the specification. SAQ requires the lexical and syntactic definitions of a concept to be combined in a single context-free grammar. If a conventional context-free grammar is used to describe the syntax of complex concepts such as programming languages and natural languages, separators of no substantive meaning, such as spaces and line breaks, have to be included in the grammar (a style referred to as the naive representation), which makes the grammar description very cumbersome. To address this, the author designs a special kind of context-free grammar that refines the nonterminal and terminal sets of the usual context-free grammar definition. With this grammar, the syntax of complex concepts such as programming languages and natural languages can be described relatively concisely.

11.
In this paper we provide an implementation strategy to map a functional specification of an utterance into a syntactically well-formed sentence. We do this by integrating the functional and the syntactic perspectives on language, which we take to be exemplified by systemic grammars and tree adjoining grammars (TAGs) respectively. From systemic grammars we borrow the use of networks of choices to classify the set of possible constructions. The choices expressed in an input are mapped by our generator to a syntactic structure as defined by a TAG. We argue that the TAG structures can be appropriate structural units of realization in an implementation of a generator based on systemic grammar and also that a systemic grammar provides an effective means of deciding between various syntactic possibilities expressed in a TAG grammar. We have developed a generation strategy which takes advantage of what both paradigms offer to generation, without compromising either.

12.
We study regularly controlled bidirectional (RCB) grammars from the viewpoint of time-bounded grammars. RCB-grammars are context-free grammars whose rules can be used in a productive and in a reductive fashion, while the application of these rules is controlled by a regular language. Several modes of derivation can be distinguished for this kind of grammar. A time bound on such a grammar is a measure of its derivational complexity. For some families of time bounds and for some modes of derivation we establish closure properties and a normal form theorem. In addition, parsing algorithms are given for some modes of derivation. We conclude by considering generalizations with respect to the family of control languages and the family of bounding functions.

13.
14.
刘禹锋  杨帆 《软件学报》2021,32(12):3669-3683
As a two-dimensional formal method, graph grammars provide an intuitive and rigorous means of describing visual languages. However, most graph grammar formalisms fall short in their ability to handle spatial semantics, which limits their expressive power and practical applicability. To address these problems, a new spatial graph grammar formalism, vCGG (virtual-node based coordinate graph grammar), is constructed. Unlike other spatial graph grammars, vCGG defines the notion of virtual nodes in productions to describe the syntactic structure and the spatial semantic relations between productions and the host graph, improving its spatial semantics configuration capability while retaining abstraction power. Compared with several typical spatial graph grammar formalisms, vCGG performs well in intuitiveness, rigor, expressive power, and analysis efficiency.

15.
邹阳  吕建  曹春  胡昊  宋巍  杨启亮 《软件学报》2012,23(7):1635-1655
Context-sensitive graph grammars are a formal tool for describing visual languages. To characterize visual languages intuitively and analyze them efficiently, existing graph grammar formalisms have concentrated on grammar forms and parsing algorithms, while neglecting the analysis of their relative expressive power. Based on an analysis and summary of the key features of existing context-sensitive graph grammar formalisms, this work reveals and formally proves the relationships between their expressive power by constructing conversion algorithms between the different formalisms. Moreover, the conversion algorithms establish connections between the formalisms, so that applications of graph grammars need no longer be confined to a single framework; different frameworks can be chosen for describing and for analyzing graphs respectively, thereby improving the usability of context-sensitive graph grammars.

16.
A technique that represents derivations of a context-free grammar G over a semiring and that obtains for a word w in L(G) the set of all canonical parses for w has previously been described. A state grammar is one of a collection of grammars that place restrictions on the manner of application of context-free-like productions and that generate a non-context-free language. The context-free properties of a state grammar have been used to extend the algebraic parsing technique for languages generated by state grammars, viz., context-sensitive languages. The extension for state grammars is not unlike that required for other types of grammars in whose collection state grammars are representative.

17.
Jean Bovet  Terence Parr 《Software》2008,38(12):1305-1332
Programmers tend to avoid using language tools, resorting to ad hoc methods, because tools can be hard to use, their parsing strategies can be difficult to understand and debug, and their generated parsers can be opaque black‐boxes. In particular, there are two very common difficulties encountered by grammar developers: understanding why a grammar fragment results in a parser non‐determinism and determining why a generated parser incorrectly interprets an input sentence. This paper describes ANTLRWorks, a complete development environment for ANTLR grammars that attempts to resolve these difficulties and, in general, make grammar development more accessible to the average programmer. The main components are a grammar editor with refactoring and navigation features, a grammar interpreter, and a domain‐specific grammar debugger. ANTLRWorks' primary contributions are a parser non‐determinism visualizer based on syntax diagrams and a time‐traveling debugger that pays special attention to parser decision‐making by visualizing lookahead usage and speculative parsing during backtracking. Copyright © 2008 John Wiley & Sons, Ltd.

18.
Toward a lexicalized grammar for interlinguas   Cited by: 1 (self-citations: 0, citations by others: 1)
In this paper we present one aspect of our research on machine translation (MT): capturing the grammatical and computational relation between (i) the interlingua (IL) as defined declaratively in the lexicon and (ii) the IL as defined procedurally by way of algorithms that compose and decompose pivot IL forms. We begin by examining the interlinguas in the lexicons of a variety of current IL-based approaches to MT. This brief survey makes it clear that no consensus exists among MT researchers on the level of representation for defining the IL. In the section that follows, we explore the consequences of this missing formal framework for MT system builders who develop their own lexical-IL entries. The lack of software tools to support rapid IL respecification and testing greatly hampers their ability to modify representations to handle new data and new domains. Our view is that IL-based MT research needs both (a) the formal framework to specify possible IL grammars and (b) the software support tools to implement and test these grammars. With respect to (a), we propose adopting a lexicalized grammar approach, tapping research results from the study of tree grammars for natural language syntax. With respect to (b), we sketch the design and functional specifications for parts of ILustrate, the set of software tools that we need to implement and test the various IL formalisms that meet the requirements of a lexicalized grammar. In this way, we begin to address a basic issue in MT research, how to define and test an interlingua as a computational language — without building a full MT system for each possible IL formalism that might be proposed.

19.
Hui Wu  Jeff Gray  Marjan Mernik 《Software》2008,38(10):1073-1103
Domain‐specific languages (DSLs) assist a software developer (or end‐user) in writing a program using idioms that are similar to the abstractions found in a specific problem domain. Tool support for DSLs is lacking when compared with the capabilities provided for standard general‐purpose languages (GPLs), such as Java and C++. For example, support for debugging a program written in a DSL is often non‐existent. The lack of a debugger at the proper abstraction level limits an end‐user's ability to discover and locate faults in a DSL program. This paper describes a grammar‐driven technique to build a debugging tool generation framework from existing DSL grammars. The DSL grammars are used to generate the hooks needed to interface with a supporting infrastructure constructed for an integrated development environment that assists in debugging a program written in a DSL. The contribution represents a coordinated approach to bring essential software tools (e.g. debuggers) to different types of DSLs (e.g. imperative, declarative, and hybrid). This approach hides from the end‐users the accidental complexities associated with expanding the focus of a language environment to include debuggers. The research described in this paper addresses a long‐term goal of empowering end‐users with development tools for particular DSL problem domains at the proper level of abstraction without depending on a specific GPL. Copyright © 2007 John Wiley & Sons, Ltd.
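One way to picture the generated hooks is as wrappers around each rule's semantic action that report rule entry and exit to a debugging back end at DSL source positions; the sketch below is our own illustration of that idea, not the framework's actual API.

```python
def with_debug_hook(rule_name, action, debugger):
    """Wrap a rule's semantic action so a debugger can observe it."""
    def wrapped(node):
        debugger.on_enter(rule_name, node["pos"])     # breakpoint / step support
        result = action(node)
        debugger.on_exit(rule_name, result)
        return result
    return wrapped

class TraceDebugger:
    def on_enter(self, rule, pos):
        print(f"enter {rule} at DSL line {pos}")
    def on_exit(self, rule, value):
        print(f"exit  {rule} -> {value!r}")

# Hypothetical generated semantic actions for two DSL rules.
actions = {
    "assignment": lambda node: f"{node['name']} = {node['value']}",
    "print_stmt": lambda node: f"print({node['value']})",
}
debugger = TraceDebugger()
hooked = {r: with_debug_hook(r, a, debugger) for r, a in actions.items()}

hooked["assignment"]({"pos": 3, "name": "x", "value": 42})
```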

20.
Building parsers is an essential task for the development of many tools, from software maintenance tools to any kind of business-specific, programmable environment having a command-line interface. Whilst grammars for many programming languages are available, these are very often almost useless because of the large diffusion of dialects and variants not contemplated by standard grammars. Writing a grammar by hand is clearly feasible; however, it can be a tedious and error-prone task requiring appropriate skills that are not always available. Grammar inference is a possible, challenging approach for obtaining suitable grammars from program examples. However, inference from scratch poses serious scalability issues and tends to produce correct but meaningless grammars, hard to understand and to use to build tools. This paper describes an approach, based on genetic algorithms, for evolving existing grammars towards target (dialect) grammars, inferring changes from examples written using the dialect. Results obtained experimenting with the inference of C dialect rules show that the algorithm is able to successfully evolve the grammar. Inspections indicated that the changes automatically made to the grammar during its evolution preserved its meaningfulness, and were comparable to what a developer could have done by hand.
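The following toy genetic algorithm conveys the flavour of evolving a grammar towards a dialect: individuals are (drastically simplified) grammars, fitness rewards dialect statements that parse, and mutation adds patterns drawn from still-unparsed statements. The representation and operators are our own, chosen only to keep the sketch runnable, and are not the paper's encoding.

```python
import random

def pattern(stmt):
    """Abstract a concrete statement into a 'production': digits become NUM."""
    return tuple("NUM" if t.isdigit() else t for t in stmt.split())

def parses(grammar, program):
    return all(pattern(s) in grammar for s in program)

def fitness(grammar, corpus):
    ok = sum(pattern(s) in grammar for prog in corpus for s in prog)
    return ok - 0.01 * len(grammar)                   # slight bias towards smaller grammars

def mutate(grammar, corpus):
    g = set(grammar)
    failing = [s for prog in corpus for s in prog if pattern(s) not in g]
    if failing and random.random() < 0.7:
        g.add(pattern(random.choice(failing)))        # grow towards the dialect
    elif g:
        g.discard(random.choice(sorted(g)))           # occasionally shrink
    return g

def evolve(seed_grammar, corpus, generations=200, pop_size=10):
    population = [set(seed_grammar) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=lambda g: fitness(g, corpus), reverse=True)
        population[pop_size // 2:] = [mutate(g, corpus) for g in population[:pop_size // 2]]
    return max(population, key=lambda g: fitness(g, corpus))

# Standard-language grammar plus dialect examples that also use a 'loop' statement.
seed = {("set", "x", "NUM"), ("print", "x")}
corpus = [["set x 1", "print x"], ["set y 2", "loop 10", "print y"]]
random.seed(0)
best = evolve(seed, corpus)
print(parses(best, corpus[1]))   # the evolved grammar now covers the dialect example
```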
