Similar Documents
20 similar documents found (search time: 31 ms)
1.
Methods for the automatic construction of error-handling parsers are presented. The resulting parsers are capable of correcting all syntax errors by insertion and/or deletion of terminal symbols to the right of the error location, so the output of the parser always corresponds to a syntactically valid program. This contributes significantly to the reliability and robustness of a compiler. The speed of parsing correct parts of a program is not affected by the presence of the error-handling capability. The correction algorithm is easy to implement: apart from the parsing tables, only one character per parser state is required to control the correction process. The method is applicable to a wide class of stack automata, including LL(k), LR(k), SLR(k), and LALR(k) parsers. It is shown that for LL(k) grammars, error correction can be obtained as a byproduct of canonical LL(k) parser generation; a similar result holds for LR(k) grammars if the parser generator is slightly modified. The method has been successfully added to an LALR(1) parser generator.
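The insert/delete correction idea can be illustrated with a toy recursive-descent parser. This is only a sketch of the general idea, not the paper's table-driven algorithm; the grammar S -> '(' S ')' | 'a' and all names are invented for illustration:

```python
# Toy sketch of correction by insertion/deletion of terminals (illustration
# only, not the paper's table-driven method). Grammar: S -> '(' S ')' | 'a'.
# Every mismatch is repaired by deleting the offending input symbol or by
# inserting the expected terminal, so the emitted stream is always valid.

def parse(tokens):
    pos = 0
    repairs = []     # log of ('insert', sym) / ('delete', sym) repairs
    output = []      # corrected token stream, always a valid sentence

    def peek():
        return tokens[pos] if pos < len(tokens) else '$'

    def match(expected):
        nonlocal pos
        while peek() not in (expected, '$'):
            repairs.append(('delete', peek()))    # drop unexpected symbol
            pos += 1
        if peek() == expected:
            pos += 1
        else:
            repairs.append(('insert', expected))  # synthesize missing symbol
        output.append(expected)

    def S():
        if peek() == '(':                         # LL(1) production choice
            match('('); S(); match(')')
        else:
            match('a')

    S()
    while peek() != '$':                          # delete trailing garbage
        repairs.append(('delete', peek()))
        pos += 1
    return ''.join(output), repairs
```

Because every mismatch is repaired by deleting the offending symbol or synthesizing the expected one, the emitted token stream is always a valid sentence of the grammar, mirroring the guarantee that the parser's output corresponds to a syntactically valid program.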

2.
The paper is the second in a series of three papers devoted to a detailed study of LR(k) parsing with error recovery and correction. Error recovery in LR(k) parsing of a context-free grammar is formalized by extending an LR(k) parser of the grammar such that it accepts all strings over the terminal vocabulary. The parse produced by this extension for a terminal string is a right parse if the string is in the language. In the case of a string not in the language the parse produced by the extension contains so-called error productions which represent the error recovery actions performed by the extension. The treatment is based on the formalization of LR(k) parsing presented in the first paper in the series and it covers practically all error recovery methods designed for LR(k) parsing.

3.
Riad Jabri, Computing 92(2):123–167, 2011
Despite all advances in parsing, parser size, conflict resolution and error recovery remain important considerations. In this research, we propose a predictive bottom-up parser, implemented in two versions. Both versions constitute an algorithm that simulates the run of a shift–reduce automaton, defined and constructed in a way that integrates its parsing actions with reduction prediction, conflict resolution and error recovery. The first version performs explicit shift–reduce parsing actions based on implicit prediction of the reduction sequences; the second performs parsing actions based on explicit prediction of the reduction sequences, with the shift–reduce actions implied. The proposed parser has been evaluated against parsers based on similar approaches: a 10–20% reduction in parser size was achieved, with parsing behaviour proportional to a factor reflecting the ambiguity of the grammar.

4.
The paper is the third in a series of three papers devoted to a detailed study of LR(k) parsing with error recovery and correction. A new class of syntax errors is introduced, called (k)-local parser defined errors, which are better suited than conventional minimum-distance errors to characterizing error detection and recovery in LR(k) parsing. The question of whether a given string has n k-local parser defined errors, for some integer n, is shown to be decidable. Using the formalization of LR(k) parsing and error recovery presented in the first and second papers in the series, it is shown that the canonical LR(k) parser of an LR(k) grammar always has an error recovering extension which is able to produce a correction for any terminal string containing only (k)-local parser defined errors.

5.
LALR(1) parser generator systems are studied and used by many ordinary software developers outside the field of compiler construction. To help users understand the LALR(1) parsing method and write correct, complete grammar specifications free of parsing conflicts, this paper rigorously defines several classes of grammar problems that users may encounter when using an LALR(1) parser generator, and describes VPGE, an LALR(1) parser visualization and breakpoint-debugging system developed to help users solve these problems. VPGE displays the data structures of the LALR(1) parser in multiple views, including the state stack, the symbol stack, the input string, the parse tree and the underlying automaton, and supports single-stepping and breakpoint debugging of LR parsing actions. Performance experiments show that VPGE generates parsers faster than GNU Bison, thus providing a fast interactive debugging environment for LALR(1) grammars and parsers.

6.
Even faster generalized LR parsing
We prove a property of generalized LR (GLR) parsing – if the grammar is without right and hidden left recursions, then the number of consecutive reductions between the shifts of two adjacent symbols cannot be greater than a constant. Further, we show that this property can be used for constructing an optimized version of our GLR parser. Compared with a standard GLR parser, our optimized parser reads one symbol on every transition and performs significantly fewer stack operations. Our timings show that, especially for highly ambiguous grammars, our parser is significantly faster than a standard GLR parser. Received: 9 May 2000 / 5 March 2001

7.
This paper presents a simple connectionist approach to parsing a subset of sentences in the Hindi language, using Rule Based Connectionist Networks (RBCN) as suggested by Fu in 1993. The basic grammar rules representing kernel Hindi sentences are used to determine the initial topology of the RBCN. The RBCN is based on a multilayer perceptron trained with the backpropagation algorithm. The terminal symbols defined in the language structure are mapped onto the input nodes, the non-terminals onto hidden nodes, and the start symbol onto the single output node of the network structure. The training instances are sentences of arbitrary, but fixed maximum, length and fixed word order. A neural-network-based recognizer is used to perform grammaticality determination and parse-tree generation for a given sentence. The network is exposed to both positive and negative training instances, derived from a simple context-free grammar (CFG), during the training phase. The trained network recognizes seen sentences (sentences present in the training set) with 98–100% accuracy. Since a neural-net-based recognizer is trainable, it can be trained to recognize any other CFG simply by changing the training set. This reduces the programming effort involved in parser development compared with the conventional AI approach. Parsing time is also greatly reduced compared with a conventional parser, as a result of the inherent parallelism of the neural-net architecture.

8.
Incremental parsing is widely used in language-based editors and incremental compiling and interpreting environments. Re-parsing of modified input strings is the most frequently performed operation in these environments. Its efficiency can greatly affect the success of these environments. This paper describes the introduction of a threaded link in parse trees and an additional distance entry in LL parse tables to support minimal LL(1) re-parsing. These enhancements are then used to produce an efficient incremental LL parser.

9.
10.
Phil Cook, Jim Welsh, Software 31(15):1461–1486, 2001
Incremental parsing has long been recognized as a technique of great utility in the construction of language‐based editors, and correspondingly, the area currently enjoys a mature theory. Unfortunately, many practical considerations have been largely overlooked in previously published algorithms. Many user requirements for an editing system necessarily impact on the design of its incremental parser, but most approaches focus only on one: response time. This paper details an incremental parser based on LR parsing techniques and designed for use in a modeless syntax recognition editor. The nature of this editor places significant demands on the structure and quality of the document representation it uses, and hence, on the parser. The strategy presented here is novel in that both the parser and the representation it constructs are tolerant of the inevitable and frequent syntax errors that arise during editing. This is achieved by a method that differs from conventional error repair techniques, and that is more appropriate for use in an interactive context. Furthermore, the parser aims to minimize disturbance to this representation, not only to ensure other system components can operate incrementally, but also to avoid unfortunate consequences for certain user‐oriented services. The algorithm is augmented with a limited form of predictive tree‐building, and a technique is presented for the determination of valid symbols for menu‐based insertion. Copyright © 2001 John Wiley & Sons, Ltd.

11.
Koen De Bosschere, Software 26(7):763–779, 1996
Prolog is a language with a dynamic grammar which is the result of embedded operator declarations. The parsing of such a language cannot be done easily by means of standard tools. Most often, an existing parsing technique for a static grammar is adapted to deal with the dynamic constructs. This paper uses the syntax definition as defined by the ISO standard for the Prolog language. It starts with a brief discussion of the standard, highlighting some aspects that are important for the parser, such as the restrictions on the use of operators as imposed by the standard in order to make the parsing deterministic. Some possible problem areas are also indicated. As output is closely related to input in Prolog, both are treated in this paper. Some parsing techniques are compared and an operator precedence parser is chosen to be modified to deal with the dynamic operator declarations. The necessary modifications are discussed and an implementation in C is presented. Performance data are collected and compared with a public domain Prolog parser written in Prolog. It is the first efficient public domain parser for Standard Prolog that actually works and deals with all the details of the syntax.
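The core difficulty, operator declarations that change the grammar at run time, can be sketched with a small precedence-climbing parser whose operator table is an ordinary mutable dictionary. This is a toy illustration only; the priorities and types below imitate Prolog's conventions but are not the ISO standard table or the paper's C implementation:

```python
# Toy precedence-climbing parser with a dynamic operator table, loosely in
# the spirit of Prolog's user-declared operators. The table entries and
# priorities are illustrative, not the ISO Prolog standard table.
# Prolog convention: a LOWER priority binds TIGHTER; 'yfx' = left-associative.

ops = {'+': (500, 'yfx'), '-': (500, 'yfx'), '*': (400, 'yfx')}

def parse(tokens, max_prio=1200):
    def expr(pos, max_p):
        left, pos = tokens[pos], pos + 1          # operands are plain atoms here
        while pos < len(tokens):
            op = tokens[pos]
            if op not in ops:
                break
            prio, typ = ops[op]
            if prio > max_p:                      # operator binds too loosely
                break
            # yfx: the right argument must have strictly smaller priority,
            # which yields left associativity for equal-priority operators
            right_max = prio - 1 if typ == 'yfx' else prio
            right, pos = expr(pos + 1, right_max)
            left = (op, left, right)
        return left, pos

    tree, _ = expr(0, max_prio)
    return tree
```

A run-time declaration such as `ops['**'] = (200, 'xfy')` immediately extends the accepted syntax without regenerating any tables, which is the property that makes operator-precedence methods attractive for Prolog's dynamic grammar.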

12.
A Chinese Parser Based on Syntactic and Semantic Features
Syntactic analysis is not mere symbol manipulation; it should be a form of entity-based inference, and adding semantic information is an effective means of achieving this. The parser introduced in this paper has two distinctive features: first, it uses word-based rules for handling part-of-speech ambiguity, which greatly improves parsing efficiency; second, it uses both static and dynamic syntactic-semantic features of words to restrain the overly strong generative power of the syntax rules, with good results.

13.
Error detection in arithmetic coding is usually achieved by inserting markers in the source sequence during encoding. Transmission errors can then be detected during decoding if the inserted markers do not appear at the expected positions. Unlike existing approaches, in which the marker symbol is selected from the set of source symbols, we propose that the marker be created artificially so as not to affect the original distribution of the source symbols. Our scheme is proved to achieve a better compression ratio than existing marker approaches at the same error misdetection probability. The relationship between codeword-length expansion and error misdetection probability within a coded block is formulated precisely, which makes it easy to adapt to channels with different bit error rates. Simulation results show that, for adaptive arithmetic coding implemented using finite-precision computation, the distribution of error detection delay has a peak at a value slightly larger than the length of the decoding register. With a sufficiently long register, our approach can detect most error patterns in long source sequences with high probability.
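The marker mechanism itself, independent of the arithmetic coder, can be sketched as follows. The marker symbol, the interval `k`, and the function names are illustrative; a real implementation would place the marker outside the source alphabet exactly as the paper proposes:

```python
# Sketch of marker-based error detection (the framing only; no arithmetic
# coder is included). MARKER, k and the function names are illustrative.
# The marker is meant to be an artificial symbol from outside the source
# alphabet, so it does not disturb the source statistics.

MARKER = '#'

def insert_markers(symbols, k):
    """Insert MARKER after every k source symbols, before encoding."""
    out = []
    for i, s in enumerate(symbols, 1):
        out.append(s)
        if i % k == 0:
            out.append(MARKER)
    return out

def check_markers(decoded, k):
    """Return the index of the first marker violation, or -1 if the
    decoded stream has markers exactly where they are expected."""
    for i, s in enumerate(decoded):
        expect_marker = (i + 1) % (k + 1) == 0    # every (k+1)-th position
        if expect_marker != (s == MARKER):
            return i                              # error detected here
    return -1
```

A transmission error that scrambles the decoder's state tends to shift or destroy the decoded markers, so the first position where `check_markers` fails bounds where decoding went wrong; the paper's analysis relates this detection delay to the length of the decoding register.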

14.
A deterministic parallel LL parsing algorithm is presented. The algorithm is based on a transformation from a parsing problem to parallel reduction. First, a nondeterministic version of a parallel LL parser is introduced. Then, it is transformed into the deterministic version—the LLP parser. The deterministic LLP(q,k) parser uses two kinds of information to select the next operation — a lookahead string of length up to k symbols and a lookback string of length up to q symbols. Deterministic parsing is available for LLP grammars, a subclass of LL grammars. Since the presented deterministic and nondeterministic parallel parsers are both based on parallel reduction, they are suitable for most parallel architectures.
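The underlying idea of turning parsing into an associative parallel reduction can be sketched on the simplest possible case, bracket matching. This is a toy analogy, not the LLP algorithm: each chunk is summarized independently, and the summaries are combined by an associative operator, so the combine step could run as a tree-shaped parallel reduction:

```python
# Toy analogy for "parsing as parallel reduction" (not the LLP algorithm):
# bracket matching. Each chunk of the input is summarized independently as
# (unmatched_closes, unmatched_opens); the summaries combine with an
# associative operator, so they could be merged by a tree-shaped reduction.
from functools import reduce

def summarize(chunk):
    closes = opens = 0
    for c in chunk:
        if c == '(':
            opens += 1
        elif c == ')':
            if opens:
                opens -= 1                 # matched within the chunk
            else:
                closes += 1                # must match in some chunk to the left
    return (closes, opens)

def combine(left, right):
    # associative: the left chunk's open brackets absorb the right's closes
    lc, lo = left
    rc, ro = right
    matched = min(lo, rc)
    return (lc + rc - matched, lo + ro - matched)

def balanced(s, chunks=4):
    n = max(1, len(s) // chunks)
    parts = [s[i:i + n] for i in range(0, len(s), n)]   # independent workers
    return reduce(combine, map(summarize, parts), (0, 0)) == (0, 0)
```

Because `combine` is associative, the per-chunk summaries can be merged in any grouping, which is what makes this style of formulation suitable for most parallel architectures.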

15.
John Tobin, Carl Vogel, Knowledge 22(7):516–522, 2009
Some parsers need to be very precise and strict when parsing, yet must allow users to easily adapt or extend the parser to parse new inputs, without requiring that the user have an in-depth knowledge and understanding of the parser’s internal workings. This paper presents a novel parsing architecture, designed for parsing Postfix log files, that aims to make the process of parsing new inputs as simple as possible, enabling users to trivially add new rules (to parse variants of existing inputs) and relatively easily add new actions (to process a previously unknown category of input). The architecture scales linearly or better as the number of rules and size of input increases, making it suitable for parsing large corpora or months of accumulated data.

16.
This paper proposes an error recovery method for Chinese syntactic parsing based on a BP (backpropagation) network. Exploiting the self-learning and self-organizing properties of neural networks, it replaces the parse table of Tomita's algorithm with a neural network model that simulates the algorithm's shift and reduce actions. The method is robust: it can recover a number of erroneous sentences in the output of Tomita's parsing algorithm, improving its fault tolerance. Experimental results show that the method achieves satisfactory recovery of erroneous sentences.

17.
A Shallow Parser for English–Chinese Machine Translation Based on the GLR Algorithm
Shallow parsing refers to phrase-level syntactic analysis of natural language. In the course of developing the MatLink English–Chinese machine translation system, an extended CFG was proposed to describe English phrase syntax, the GLR algorithm was improved, and an English shallow parser for English–Chinese translation was designed and implemented. The parser uses a parse table structure with multiple exits, introduces a symbol mapping function to recognize phrase boundaries automatically, describes the syntactic structure of phrases with child–sibling trees, and performs phrase-level transfer from the source language to the target language through phrase transformation patterns. Finally, the design and operation of the shallow parser are illustrated through the analysis of an example sentence.

18.
Parallel parsing is currently receiving attention but there is little discussion about the adaptation of sequential error handling techniques to these parallel algorithms. We describe a noncorrecting error handler implemented with a parallel LR substring parser. The parser used is a parallel version of Cormack's LR substring parser. The applicability of noncorrecting error handling for parallel parsing is discussed. The error information provided for a standard set of 118 erroneous Pascal programs is analysed. The programs are run on the sequential LR substring parser.

19.
Patent claim parsing can contribute to many patent-related applications, such as patent search, information extraction, machine translation and summarization. However, patent claim parsing is difficult owing to the special structure of patent claims. To overcome this difficulty, the challenges facing patent claim parsing were first investigated, and the peculiarities of claim syntax that obstruct dependency parsing were highlighted. To handle these peculiarities, this study proposes a new two-level parser in which a conventional parser is embedded. A patent claim is pre-processed to remove peculiarities before being passed to the conventional parser. The process is based on a new dependency-based syntax called Independent Claim Segment Dependency Syntax (ICSDS). This two-level parser has demonstrated promising improvements over the conventional parser in both the effectiveness and the efficiency of patent claim parsing.


