Similar Documents
20 similar documents found.
1.
Jean Bovet  Terence Parr 《Software》2008,38(12):1305-1332
Programmers tend to avoid using language tools, resorting to ad hoc methods, because tools can be hard to use, their parsing strategies can be difficult to understand and debug, and their generated parsers can be opaque black-boxes. In particular, there are two very common difficulties encountered by grammar developers: understanding why a grammar fragment results in a parser non-determinism and determining why a generated parser incorrectly interprets an input sentence. This paper describes ANTLRWorks, a complete development environment for ANTLR grammars that attempts to resolve these difficulties and, in general, make grammar development more accessible to the average programmer. The main components are a grammar editor with refactoring and navigation features, a grammar interpreter, and a domain-specific grammar debugger. ANTLRWorks' primary contributions are a parser non-determinism visualizer based on syntax diagrams and a time-traveling debugger that pays special attention to parser decision-making by visualizing lookahead usage and speculative parsing during backtracking. Copyright © 2008 John Wiley & Sons, Ltd.
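The non-determinism ANTLRWorks visualizes arises when two alternatives of a rule can begin with the same token, so the parser cannot choose between them on lookahead alone. The Python sketch below is not ANTLR or ANTLRWorks; it is a minimal illustration, on a made-up toy grammar, of how overlapping FIRST sets reveal exactly this kind of conflict.

```python
# Minimal sketch: detect an LL alternative conflict via FIRST sets.
# Toy grammar (hypothetical, not from the paper); nonterminals are keys,
# anything else is a terminal. No epsilon rules, to keep FIRST simple.
GRAMMAR = {
    "stmt": [["ID", "=", "expr"], ["ID", "(", ")"]],  # both start with ID
    "expr": [["ID"], ["NUM"]],
}

def first_of(symbol, grammar):
    """FIRST set of one grammar symbol."""
    if symbol not in grammar:                    # terminal
        return {symbol}
    firsts = set()
    for alternative in grammar[symbol]:
        firsts |= first_of(alternative[0], grammar)
    return firsts

def report_conflicts(grammar):
    for lhs, alternatives in grammar.items():
        firsts = [first_of(alt[0], grammar) for alt in alternatives]
        for i in range(len(firsts)):
            for j in range(i + 1, len(firsts)):
                overlap = firsts[i] & firsts[j]
                if overlap:
                    print(f"{lhs}: alternatives {i + 1} and {j + 1} "
                          f"both begin with {sorted(overlap)}")

report_conflicts(GRAMMAR)  # -> stmt: alternatives 1 and 2 both begin with ['ID']
```

A tool like ANTLRWorks goes further by showing the conflicting paths on the rule's syntax diagram, but the underlying question is the one this sketch answers.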

2.
S. Glass  D. Ince  E. Fergus 《Software》2001,31(10):983-1001
Parser generators such as yacc have been used in a large number of applications, not just those that involve compiler writing. This has meant that these tools are being used increasingly by nonspecialist developers. A consequence of this is that good support is required for debugging a grammar and its generated parser(s). This paper describes Llun, a debugging tool that visualizes the operation of a generated parser at both a high level and a low level. Llun is superior to other parser visualization products by virtue of the high-level facilities it offers. The paper describes some of the problems encountered using parser generators, outlines a visualization system which addresses a number of the problems, and uses a taxonomy developed by Price to categorize the system. Copyright © 2001 John Wiley & Sons, Ltd.

3.
Visual YACC is a tool that automatically creates visualizations of the YACC LR parsing process and synthesized attribute computation. The Visual YACC tool works by instrumenting a standard YACC grammar with graphics calls that draw the appropriate data structures given the current actions by the parser. The new grammar is processed by the YACC tools and the resulting parser displays the parse stack and parse tree for every step of the parsing process of a given input string. Visual YACC was initially designed to be used in compiler construction courses to supplement the teaching of parsing and syntax directed evaluation. We have also found it to be useful in the difficult task of debugging YACC grammars. In this paper, we describe this tool and how it is used in both contexts. We also detail two different implementations of this tool: one that produces a parser written in C with calls to Motif; and a second implementation that generates Java source code. Copyright © 1999 John Wiley & Sons, Ltd.
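To make concrete what "the parse stack for every step" looks like, here is a small self-contained Python sketch, not Visual YACC itself (which instruments real YACC parsers with Motif or Java graphics calls), that runs a hand-wired shift-reduce loop for the toy grammar E -> E '+' n | n and prints the stack after every action.

```python
# Hand-wired shift-reduce trace for:  E -> E '+' n  |  n
# Prints the parse stack after each step, i.e. the data a tool like
# Visual YACC renders graphically.
def parse_and_trace(tokens):
    stack, i = [], 0                    # stack holds tokens and tree nodes

    def label(item):                    # tree nodes are ('E', children)
        return item[0] if isinstance(item, tuple) else item

    def show(action):
        print(f"{action:<14} stack: {[label(x) for x in stack]}")

    while True:
        top = [label(x) for x in stack]
        if top[-3:] == ["E", "+", "n"]:             # reduce E -> E '+' n
            rhs = [stack.pop(), stack.pop(), stack.pop()][::-1]
            stack.append(("E", rhs))
            show("reduce E+n")
        elif top[-1:] == ["n"]:                     # reduce E -> n
            stack.append(("E", [stack.pop()]))
            show("reduce n")
        elif i < len(tokens):                       # shift next token
            stack.append(tokens[i]); i += 1
            show("shift")
        else:
            break
    return stack[0]                                 # the finished parse tree

parse_tree = parse_and_trace(["n", "+", "n"])
```

Visual YACC's contribution is generating this instrumentation automatically from the grammar; the trace above only shows the kind of state it displays.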

4.
Syntactic analysis forms a foundation of many source analysis and reverse engineering tools. However, a single standard grammar is not always appropriate for all source analysis and manipulation tasks. Small custom modifications to the grammar can make the programs used to implement these tasks simpler, clearer and more efficient. This leads to a new paradigm for programming these tools: agile parsing. In agile parsing the effective grammar used by a particular tool is a combination of two parts: the standard base grammar for the input language, and a set of explicit grammar overrides that modify the parse to support the task at hand. This paper introduces the basic techniques of agile parsing in TXL and discusses several industry-proven techniques for exploiting agile parsing in software source analysis and transformation.
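The effective-grammar idea is easy to sketch outside TXL. The Python below is a hypothetical illustration, not TXL's redefine mechanism: a tool combines the standard base grammar with its own overrides before parsing, so each tool parses with a grammar tailored to its task.

```python
# Sketch of agile parsing's grammar assembly: effective grammar =
# standard base grammar + task-specific overrides. (Toy rules; the
# names are made up, and real TXL overrides are far richer.)
BASE = {
    "function": [["header", "body"]],
    "body":     [["{", "stmts", "}"]],
    "stmts":    [["stmt", "stmts"], []],
}

# A header-extraction tool does not care about statements, so it
# overrides 'body' to a form the parser can skip over cheaply.
OVERRIDES = {"body": [["{", "skip_to_matching_brace"]]}

def effective_grammar(base, overrides):
    grammar = {nt: list(alts) for nt, alts in base.items()}
    grammar.update(overrides)           # an override replaces the rule
    return grammar

for nt, alts in effective_grammar(BASE, OVERRIDES).items():
    print(nt, "->", " | ".join(" ".join(a) if a else "ε" for a in alts))
```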

5.
Koen De Bosschere 《Software》1996,26(7):763-779
Prolog is a language with a dynamic grammar which is the result of embedded operator declarations. The parsing of such a language cannot be done easily by means of standard tools. Most often, an existing parsing technique for a static grammar is adapted to deal with the dynamic constructs. This paper uses the syntax definition as defined by the ISO standard for the Prolog language. It starts with a brief discussion of the standard, highlighting some aspects that are important for the parser, such as the restrictions on the use of operators as imposed by the standard in order to make the parsing deterministic. Some possible problem areas are also indicated. As output is closely related to input in Prolog, both are treated in this paper. Some parsing techniques are compared and an operator precedence parser is chosen to be modified to deal with the dynamic operator declarations. The necessary modifications are discussed and an implementation in C is presented. Performance data are collected and compared with a public domain Prolog parser written in Prolog. It is the first efficient public domain parser for Standard Prolog that actually works and deals with all the details of the syntax.
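The dynamic-grammar problem is concrete: Prolog's op/3 lets a program declare new operators while it is being read, so the reader's precedence table must be mutable. The sketch below is a deliberately simplified Python illustration of an operator precedence reader driven by such a table, with binary infix operators only; it is not the paper's C implementation, and it does not enforce the restriction that xfx operators must not chain.

```python
# Simplified Prolog-style term reader: a mutable operator table
# (the analogue of op/3 declarations) drives precedence climbing.
OPS = {}                                   # name -> (priority, type)

def op(priority, type_, name):             # analogue of Prolog's op/3
    OPS[name] = (priority, type_)

op(500, "yfx", "+"); op(400, "yfx", "*"); op(200, "xfy", "^")

def read_term(tokens, pos=0, max_prec=1200):
    left = tokens[pos]; pos += 1           # primaries are bare tokens here
    while pos < len(tokens) and tokens[pos] in OPS:
        name = tokens[pos]
        prec, typ = OPS[name]
        if prec > max_prec:
            break
        # xfy: right argument may have equal priority (right-associative);
        # yfx/xfx: right argument must bind strictly tighter.
        right_max = prec if typ == "xfy" else prec - 1
        right, pos = read_term(tokens, pos + 1, right_max)
        left = (name, left, right)
    return left, pos

print(read_term("a + b * c ^ d ^ e".split())[0])
# ('+', 'a', ('*', 'b', ('^', 'c', ('^', 'd', 'e'))))

op(700, "xfx", "==")                       # a runtime declaration...
print(read_term("a == b + c".split())[0])  # ...changes how input parses
```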

6.
R. Lämmel  C. Verhoef 《Software》2001,31(15):1395-1438
We propose an approach to the construction of grammars for existing languages. The main characteristic of the approach is that the grammars are not constructed from scratch but rather recovered by extracting them from language references, compilers and other artifacts. We provide a structured process to recover grammars, including the adaptation of raw extracted grammars and the derivation of parsers. The process is applicable to virtually all existing languages for which business-critical applications exist. We illustrate the approach with a non-trivial case study. Using our process and some basic tools, we constructed in a few weeks a complete and correct VS COBOL II grammar specification for IBM mainframes. In addition, we constructed a parser for VS COBOL II, and were the first to publish a (Web-enabled) grammar specification so that others can use this result to construct their own grammar-based tools for VS COBOL II or derivatives. Copyright © 2001 John Wiley & Sons, Ltd.

7.
GRK, the Grammar Recovery Kit, illustrates options for automation and corresponding tool support in the context of developing quality language references that readily cater for the derivation of parsers. GRK provides the proof-of-concept for two notions: (i) semi-automatic grammar recovery; (ii) language-reference re-engineering. GRK's support for semi-automatic grammar recovery means that GRK can be used to obtain a relatively correct and complete as well as implementable grammar from a language reference. GRK's support for language-reference re-engineering means that GRK can be used to update the original language reference such that it reflects the completed and corrected grammar knowledge. As of today, GRK is particularly fit for Cobol archaeology, more specifically for IBM's VS Cobol II. That is, GRK offers a fully mechanised process, where IBM's reference is used as an input, and the output is a transformed language reference whose grammar portions are correct and complete. (The recovery required several hundred simple transformation steps in order to deliver a grammar that is fit for parser derivation.) As a byproduct, GRK also generates a slow, Prolog-based parser. Via export to GRK's sibling, GDK (the Grammar Deployment Kit), a reasonably fast, btyacc-based parser can be generated as well. Both parsers accept all of the VS Cobol II code at our disposal (several millions of lines of code).

8.
The term grammar-based software describes software whose input can be specified by a context-free grammar. This grammar may occur explicitly in the software, in the form of an input specification to a parser generator, or implicitly, in the form of a hand-written parser. Grammar-based software includes not only programming language compilers, but also tools for program analysis, reverse engineering, software metrics and documentation generation. Hence, ensuring their completeness and correctness is a vital prerequisite for their use. In this paper we propose a strategy for the construction of test suites for grammar-based software, and illustrate this strategy using the ISO C++ grammar. We use the concept of grammar-rule coverage as a pivot for the reduction of an implementation-based test suite, and demonstrate a significant decrease in the size of this suite. The effectiveness of this reduced test suite is compared to the original test suite with respect to code coverage and, more importantly, fault detection. This work greatly expands upon previous work in this area and utilises large-scale mutation testing to compare the effectiveness of grammar-rule coverage to that of statement coverage as a reduction criterion for test suites of grammar-based software. It finds that when grammar-rule coverage is used as the sole criterion for reducing such test suites, the fault detection capability of the reduced suite is greatly diminished compared to other coverage criteria such as statement coverage.
James F. Power
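The reduction strategy is straightforward to sketch: keep only enough tests to cover the same grammar rules as the full suite. The Python below is a hypothetical greedy illustration with made-up toy data, not the paper's ISO C++ suite or its measurement infrastructure.

```python
# Greedy test-suite reduction using grammar-rule coverage as the
# criterion. COVERAGE maps each test to the rules its parse exercises
# (hypothetical data; in practice this comes from an instrumented parser).
COVERAGE = {
    "t1": {"expr->term", "term->factor", "factor->NUM"},
    "t2": {"expr->expr+term", "term->factor", "factor->NUM"},
    "t3": {"factor->(expr)", "expr->term", "term->factor"},
    "t4": {"expr->expr+term", "factor->(expr)", "factor->NUM"},
}

def reduce_suite(coverage):
    target = set().union(*coverage.values())
    kept, covered = [], set()
    while covered != target:
        # greedily take the test contributing the most uncovered rules
        best = max(coverage, key=lambda t: len(coverage[t] - covered))
        kept.append(best)
        covered |= coverage[best]
    return kept

kept = reduce_suite(COVERAGE)
print(f"kept {len(kept)} of {len(COVERAGE)} tests: {kept}")  # kept 2 of 4
```

The paper's finding is the caveat to this picture: a suite reduced this way preserves rule coverage by construction, but its fault detection can drop sharply relative to reduction by statement coverage.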

9.
The paper is the second in a series of three papers devoted to a detailed study of LR(k) parsing with error recovery and correction. Error recovery in LR(k) parsing of a context-free grammar is formalized by extending an LR(k) parser of the grammar such that it accepts all strings over the terminal vocabulary. The parse produced by this extension for a terminal string is a right parse if the string is in the language. In the case of a string not in the language, the parse produced by the extension contains so-called error productions, which represent the error recovery actions performed by the extension. The treatment is based on the formalization of LR(k) parsing presented in the first paper in the series, and it covers practically all error recovery methods designed for LR(k) parsing.

10.
The paper is the third in a series of three papers devoted to a detailed study of LR(k) parsing with error recovery and correction. A new class of syntax errors is introduced, called (k)-local parser-defined errors, which are better suited than conventional minimum-distance errors to characterizing error detection and recovery in LR(k) parsing. The question of whether a given string has n (k)-local parser-defined errors for some integer n is shown to be decidable. Using the formalization of LR(k) parsing and error recovery presented in the first and second papers in the series, it is shown that the canonical LR(k) parser of an LR(k) grammar always has an error-recovering extension which is able to produce a correction for any terminal string containing only (k)-local parser-defined errors.

11.
The importance of the parsing task for NLP applications is well understood. However, developing parsers remains difficult because of the complexity of the Arabic language. Most parsers are based on syntactic grammars that describe the syntactic structures of a language. The development of these grammars is laborious and time consuming. In this paper we present our method for building an Arabic parser based on an induced probabilistic context-free grammar (PCFG). We first induce the PCFG from an Arabic Treebank. Then, we implement the parser that assigns a syntactic structure to each input sentence. The parser is tested on sentences extracted from the treebank (1650 sentences), and we calculate precision, recall and f-measure. Our experimental results show the effectiveness of the proposed parser for parsing Modern Standard Arabic sentences (precision: 83.59%, recall: 82.98%, f-measure: 83.23%).
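Inducing the PCFG amounts to counting productions in the treebank and normalizing per left-hand side, P(A -> α) = count(A -> α) / count(A). The Python below sketches that estimation step on two made-up bracketed trees; the labels are toy stand-ins, not the Arabic Treebank's annotation scheme.

```python
from collections import Counter

# Tiny stand-in treebank: nodes are (label, children), leaves are strings.
TREES = [
    ("S", [("NP", [("N", ["ali"])]),
           ("VP", [("V", ["reads"]), ("NP", [("N", ["book"])])])]),
    ("S", [("NP", [("N", ["book"])]),
           ("VP", [("V", ["falls"])])]),
]

def count_rules(node, rule_counts, lhs_counts):
    label, children = node
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    rule_counts[(label, rhs)] += 1
    lhs_counts[label] += 1
    for child in children:
        if not isinstance(child, str):
            count_rules(child, rule_counts, lhs_counts)

rule_counts, lhs_counts = Counter(), Counter()
for tree in TREES:
    count_rules(tree, rule_counts, lhs_counts)

# Maximum-likelihood estimates: P(A -> rhs) = count(A -> rhs) / count(A)
for (lhs, rhs), n in sorted(rule_counts.items()):
    print(f"{lhs} -> {' '.join(rhs)}    p = {n / lhs_counts[lhs]:.2f}")
```

A parser then scores each candidate parse by the product of its rule probabilities and returns the highest-scoring tree, typically via probabilistic CKY.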

12.
Analyses of C source code usually ignore the C preprocessor because of its complexity. Instead, these analyses either define their own approximate parser or scanner, or else they require that their input already be preprocessed. Neither approach is entirely satisfactory: the first gives up accuracy (or incurs large implementation costs), while the second loses the preprocessor-based abstractions. We describe a framework that permits analyses to be expressed in terms of both preprocessing and parsing actions, allowing the implementer to focus on the analysis. We discuss an implementation of such a framework that embeds a C preprocessor, a parser, and a Perl interpreter for the action 'hooks'. Many common software engineering analyses can be written surprisingly easily using our implementation, replacing numerous ad-hoc tools. The framework's integration of the preprocessor and the parser further enables some analyses that otherwise would be especially difficult to perform. Copyright © 2000 John Wiley & Sons, Ltd.

13.
The analysis of programming languages is a fundamental activity in the process of comprehending, maintaining, documenting and reengineering old software. However, the analysis may sometimes be inefficient because of the parsing methodology adopted. Prolog parsers, for instance, have two main drawbacks: the time spent on extracting information as a result of the backtracking mechanism, and the large memory occupation involved. This paper presents an algorithm that optimizes a top-down parser by imposing a set of rules on the grammar to be analyzed. A methodology is then defined for rewriting a context-free grammar (LL(n), n ≥ 1), and a tool that can achieve this is illustrated. This tool, called FACTOTUM, has been implemented in LPA Prolog and runs on a PC (IBM MS-DOS) under Windows. FACTOTUM is composed of nearly 250 productions, and is an important part of another reverse engineering tool called RE_Tool (Giannone and Maresca, 1995), which can analyze code in a multilanguage environment. RE_Tool has been implemented in the software engineering laboratories of DIS at the University of Naples Federico II, where it is currently under experimentation.
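The abstract does not spell out the rewriting rules, but the classic transformation for removing top-down backtracking is left factoring: alternatives sharing a prefix are merged, and the point of divergence moves into a fresh nonterminal, so the parser commits after one token. A minimal Python sketch (hypothetical grammar; single-symbol prefixes and a single pass only, where a full tool would iterate to a fixpoint):

```python
# Left-factoring sketch:  A -> x B | x C   becomes   A -> x A';  A' -> B | C
def left_factor(grammar):
    out, fresh = {}, 0
    for lhs, alts in grammar.items():
        groups = {}                        # group alternatives by first symbol
        for alt in alts:
            groups.setdefault(alt[0], []).append(alt)
        new_alts = []
        for prefix, group in groups.items():
            if len(group) == 1:
                new_alts.append(group[0])
            else:
                fresh += 1
                tail = f"{lhs}_tail{fresh}"
                new_alts.append([prefix, tail])
                out[tail] = [alt[1:] or ["ε"] for alt in group]  # ε = empty tail
        out[lhs] = new_alts
    return out

G = {"stmt": [["if", "expr", "then", "stmt"],
              ["if", "expr", "then", "stmt", "else", "stmt"],
              ["id", ":=", "expr"]]}
for lhs, alts in left_factor(G).items():
    print(lhs, "->", " | ".join(" ".join(a) for a in alts))
# stmt       -> if stmt_tail1 | id := expr
# stmt_tail1 -> expr then stmt | expr then stmt else stmt   (factor again)
```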

14.
Code transformation and analysis tools provide support for software engineering tasks such as style checking, testing, calculating software metrics as well as reverse- and re-engineering. In this paper we describe the architecture and the applications of JTransform, a general Java source code processing and transformation framework. It consists of a Java parser generating a configurable parse tree and various visitors (transformers, tree evaluators) which produce different kinds of outputs. While our framework is written in Java, the paper further opens an opportunity for a new generation of XML-based source code tools. Copyright © 2004 John Wiley & Sons, Ltd.

15.
A deterministic parallel LL parsing algorithm is presented. The algorithm is based on a transformation from a parsing problem to parallel reduction. First, a nondeterministic version of a parallel LL parser is introduced. Then, it is transformed into the deterministic version, the LLP parser. The deterministic LLP(q,k) parser uses two kinds of information to select the next operation: a lookahead string of length up to k symbols and a lookback string of length up to q symbols. Deterministic parsing is available for LLP grammars, a subclass of LL grammars. Since the presented deterministic and nondeterministic parallel parsers are both based on parallel reduction, they are suitable for most parallel architectures.

16.
Most applications of parsing require that the parser call semantic action routines while processing the input. For LR(k) parsers it is well known that a semantic action routine can be called when the end of a production is recognized. Often, however, it is desirable to call routines at other times. This paper presents fast algorithms that determine, for an LR(k) (or SLR(k)) grammar, which positions are suitable for calling routines. The algorithms are practical for use with LR(1) (SLR(1)) parser-building programs, because the worst-case running time is dominated by the time required to build the LR(1) (SLR(1)) parser. Applications of the algorithms to attribute grammars and automatic indentation are discussed.

17.
In this paper, we describe our experience in grammar engineering to construct multiple parsers and front ends for the Python language. We present a metrics-based study of the evolution of the Python grammars through the multiple versions of the language, in an effort to distinguish and measure grammar evolution and to provide a basis of comparison with related research in grammar engineering. To conduct this research, we have built a toolkit, pygrat, which builds on tools developed in other research. We use pygrat to build a system that automates much of the process needed to translate the Python grammars from EBNF to a formalism acceptable to the bison parser generator. We exploit the suite of Python test cases, used by the Python developers, to validate our parser generation. Finally, we describe our use of the menhir parser generator to facilitate the parser and front-end construction, eliminating some of the transformations and providing practical support for grammar modularisation.
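The core of such an EBNF-to-bison translation is mechanical: bison accepts only plain BNF, so EBNF's repetition and option operators must each be expanded into helper rules. The Python below is a hypothetical sketch of those two expansions (the names and the grammar encoding are made up, not pygrat's):

```python
import itertools

_ids = itertools.count(1)   # fresh-name supply for helper nonterminals

def expand(symbols, rules):
    """Rewrite one right-hand side: ('X', '*') and ('X', '?') become
    helper rules; plain strings pass through unchanged."""
    out = []
    for sym in symbols:
        if isinstance(sym, tuple):
            base, suffix = sym
            helper = f"{base}_{'list' if suffix == '*' else 'opt'}{next(_ids)}"
            if suffix == "*":            # X*  :  h -> ε | h X
                rules[helper] = [[], [helper, base]]
            else:                        # X?  :  h -> ε | X
                rules[helper] = [[], [base]]
            out.append(helper)
        else:
            out.append(sym)
    return out

rules = {}
# EBNF-ish:  args -> '(' (arg (',' arg)*)? ')'   encoded in two steps
rules["arg_seq"] = [expand(["arg", ("comma_arg", "*")], rules)]
rules["comma_arg"] = [[",", "arg"]]
rules["args"] = [expand(["(", ("arg_seq", "?"), ")"], rules)]

for lhs, alts in rules.items():
    print(lhs, "->", " | ".join(" ".join(a) if a else "ε" for a in alts))
```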

18.
LALR(1) parser generators are studied and used by many ordinary software developers outside the field of compiler construction. To help users understand the LALR(1) parsing method and write grammar specifications that are correct, complete and free of parsing conflicts, this paper rigorously defines several classes of grammar problems users may encounter when working with an LALR(1) parser generator, and describes VPGE, an LALR(1) parser visualization and breakpoint-debugging system developed to help users solve them. VPGE displays the data structures of the LALR(1) parser in multiple views, including the state stack, the symbol stack, the input string, the parse tree and the underlying automaton, and it supports single-stepping and breakpoint debugging of LR parsing actions. Performance experiments show that VPGE generates parsers faster than GNU Bison, yielding a fast interactive debugging environment for LALR(1) grammars and their parsers.

19.
Since many domains are constantly evolving, the associated domain-specific languages (DSLs) inevitably have to evolve too, to retain their value. But the evolution of a DSL can be very expensive, since existing words of the language (i.e. programs) and tools have to be adapted according to the changes of the DSL itself. In such cases, these costs seriously limit the adoption of DSLs. This paper presents Lever, a tool for the evolutionary development of DSLs. Lever aims at making evolutionary changes to a DSL much cheaper by automating the adaptation of the DSL parser as well as existing words, and providing additional support for the correct adaptation of existing tools (e.g. program generators). This way, Lever simplifies DSL maintenance and paves the way for bottom-up DSL development.

20.
Aho and Ullman give an algorithm in [1] for eliminating reductions of the form A → B, where A and B are nonterminals, from a set of LR(k) parsing tables, thus increasing the speed of the parser and reducing its size. We present a modification of their algorithm and show that after reductions by A → B have been eliminated, the symbols A and B can be equated and the associated columns of the parsing table merged, further reducing the size of the parser. The new set of tables parses according to an “abridged” grammar with fewer nonterminal symbols than the original. Finally, we give an algorithm which, for certain LR(1) grammars, constructs a set of LR(1) tables with no reductions by single productions, directly from the grammar.
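Their table-level construction has a grammar-level counterpart that is easier to show compactly: compute the unit pairs (A, B) with A ⇒* B through single productions, then give A all of B's non-unit alternatives directly. The Python below sketches that standard elimination on a toy expression grammar; it illustrates the effect of removing single productions, not Aho and Ullman's table algorithm itself.

```python
# Unit-production elimination (the grammar-level analogue of removing
# single-production reductions from LR tables).
G = {
    "E": [["E", "+", "T"], ["T"]],     # unit chain E -> T -> F
    "T": [["T", "*", "F"], ["F"]],
    "F": [["(", "E", ")"], ["id"]],
}

def is_unit(alt, grammar):
    return len(alt) == 1 and alt[0] in grammar

def eliminate_units(grammar):
    # unit_pairs[A] = every B reachable from A via single productions
    unit_pairs = {a: {a} for a in grammar}
    changed = True
    while changed:
        changed = False
        for a in grammar:
            for b in list(unit_pairs[a]):
                for alt in grammar[b]:
                    if is_unit(alt, grammar) and alt[0] not in unit_pairs[a]:
                        unit_pairs[a].add(alt[0])
                        changed = True
    return {a: [alt for b in sorted(unit_pairs[a])
                for alt in grammar[b] if not is_unit(alt, grammar)]
            for a in grammar}

for lhs, alts in eliminate_units(G).items():
    print(lhs, "->", " | ".join(" ".join(a) for a in alts))
# E -> E + T | ( E ) | id | T * F   -- no E -> T or T -> F reductions remain
```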
