Similar Literature
A total of 20 similar documents were found (search time: 359 ms).
1.
Implementation of a new compiler usually requires making frequent adjustments to grammar definitions. An incremental technique for updating the parser tables after a minor change to the grammar could potentially save much computational effort. More importantly, debugging a grammar is made easier if the grammar is re-checked for correctness after each small change. The basic design philosophy of an incremental parser generator, and incremental algorithms for LR(0), SLR(1) and LALR(1) parser generation, are discussed in this paper. Some of these algorithms have been incorporated into an implementation of an incremental LALR(1) parser generator.
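The paper itself supplies the incremental algorithms; as background only, the following is a minimal Python sketch (with an invented toy grammar and function names) of the LR(0) item-set operations, closure and goto, whose results an incremental generator would update selectively after a grammar change rather than recompute from scratch. The incremental update logic is not shown.

```python
# Illustrative LR(0) item-set construction; grammar and names are invented.
GRAMMAR = {                      # nonterminal -> list of right-hand sides
    "S": [("E",)],
    "E": [("E", "+", "T"), ("T",)],
    "T": [("id",), ("(", "E", ")")],
}
NONTERMINALS = set(GRAMMAR)

def closure(items):
    """Items are (lhs, rhs, dot). Add items for nonterminals after the dot."""
    items = set(items)
    changed = True
    while changed:
        changed = False
        for lhs, rhs, dot in list(items):
            if dot < len(rhs) and rhs[dot] in NONTERMINALS:
                for prod in GRAMMAR[rhs[dot]]:
                    new = (rhs[dot], prod, 0)
                    if new not in items:
                        items.add(new)
                        changed = True
    return frozenset(items)

def goto(items, symbol):
    """Advance the dot over `symbol` and take the closure of the result."""
    moved = {(l, r, d + 1) for l, r, d in items if d < len(r) and r[d] == symbol}
    return closure(moved) if moved else None
```

For example, `closure({("S", ("E",), 0)})` yields the initial item set, and `goto(initial, "id")` yields its successor state under the token `id`.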

2.
Shift-reduce parsing systems run in linear time and are therefore of particular practical value in large-scale parsing tasks. However, current shift-reduce parsers perform well below the best parsers in the field, such as the Berkeley parser. This paper studies how to use uptraining and unlabeled data to improve a shift-reduce parser so that its performance approaches that of the Berkeley parser as closely as possible. We first apply the Berkeley parser to automatically analyze large amounts of unlabeled data, and then use the resulting automatically annotated data as additional training data to improve both the part-of-speech tagger and the shift-reduce parser. Experimental results show that uptraining and unlabeled data improve shift-reduce parsing performance by 2.3%, reaching 82.4%, which is comparable to the Berkeley parser. At the same time, the final parsing system has a clear speed advantage (seven times faster than the Berkeley parser).
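A hedged sketch of the uptraining loop described above: a slow but accurate parser labels raw text, and the fast shift-reduce parser and its POS tagger are then retrained on gold plus automatically parsed data. All class and method names here are placeholders, not a real API.

```python
def uptrain(gold_treebank, raw_sentences, accurate_parser, fast_parser, tagger):
    # 1. Parse the unlabeled corpus with the accurate (Berkeley-style) parser.
    auto_treebank = [accurate_parser.parse(sent) for sent in raw_sentences]

    # 2. Mix gold trees and automatically parsed trees as training data.
    training_data = list(gold_treebank) + auto_treebank

    # 3. Retrain the POS tagger and the linear-time shift-reduce parser.
    tagger.train([tree.pos_tagged_sentence() for tree in training_data])
    fast_parser.train(training_data, tagger=tagger)
    return fast_parser
```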

3.
This paper describes the implementation of a constraint-based parser, PARSEC (Parallel ARchitecture SEntence Constrainer), which has the required flexibility that a user may easily construct a custom grammar and test it. Once the user designs grammar parameters, constraints, and a lexicon, our system checks them for consistency and creates a parser for the grammar. The parser has an X-windows interface that allows a user to view the state of a parse of a sentence, test new constraints, and dump the constraint network to a file. The parser has an option to perform the computationally expensive constraint propagation steps on the MasPar MP-1. Stream and socket communication was used to interface the MasPar constraint parser with a standard X-windows interface on our Sun Sparcstation. The design of our heterogeneous parser has benefitted from the use of object-oriented techniques. Without these techniques, it would have been more difficult to combine the processing power of the MasPar with a Sun Sparcstation. Also, these techniques allowed the parser to gracefully evolve from a system that operated on single sentences, to one capable of processing word graphs containing multiple sentences, consistent with speech processing. This system should provide an important component of a real-time speech understanding system.
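PARSEC's constraint network and its MasPar propagation step are not reproduced here; as a rough illustration of the kind of constraint propagation involved, below is a generic arc-consistency (AC-3-style) sketch over nodes whose domains are candidate labels such as grammatical roles. The data representation and names are invented for illustration and are not PARSEC's.

```python
from collections import deque

def ac3(domains, constraints):
    """Generic arc-consistency sketch.

    domains:     {node: set of candidate labels}, e.g. word -> possible roles
    constraints: {(x, y): predicate(label_x, label_y) -> bool}
    Returns False if some domain becomes empty (the parse is inconsistent).
    """
    queue = deque(constraints)                    # all arcs (x, y)
    while queue:
        x, y = queue.popleft()
        pred = constraints[(x, y)]
        # Keep only labels for x that some label of y supports.
        revised = {a for a in domains[x] if any(pred(a, b) for b in domains[y])}
        if revised != domains[x]:
            domains[x] = revised
            if not revised:
                return False                      # over-constrained: no parse
            # Re-examine arcs whose consistency depends on x.
            queue.extend(arc for arc in constraints if arc[1] == x)
    return True
```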

4.
Methods for the automatic construction of error handling parsers are presented. The resulting parsers are capable of correcting all syntax errors by insertion and/or deletion of terminal symbols to the right of the error location. Thus, the output of the parser always corresponds to a syntactically valid program. This contributes significantly to the reliability and robustness of a compiler. The speed of parsing correct parts of a program is not affected by the presence of the error handling capability. The correction algorithm is easy to implement. Apart from the parsing tables, only one character per parser state is required to control the correction process. The method is applicable to a wide class of stack automata including LL(k), LR(k), SLR(k), and LALR(k) parsers. It is shown that for LL(k) grammars error correction can be obtained as a byproduct of the canonical LL(k) parser generation. A similar result can be obtained for LR(k) grammars if the parser generator is slightly modified. The method has been successfully added to an LALR(1) parser generator.

5.
Visual YACC is a tool that automatically creates visualizations of the YACC LR parsing process and synthesized attribute computation. The Visual YACC tool works by instrumenting a standard YACC grammar with graphics calls that draw the appropriate data structures given the current actions by the parser. The new grammar is processed by the YACC tools and the resulting parser displays the parse stack and parse tree for every step of the parsing process of a given input string. Visual YACC was initially designed to be used in compiler construction courses to supplement the teaching of parsing and syntax directed evaluation. We have also found it to be useful in the difficult task of debugging YACC grammars. In this paper, we describe this tool and how it is used in both contexts. We also detail two different implementations of this tool: one that produces a parser written in C with calls to Motif; and a second implementation that generates Java source code. Copyright © 1999 John Wiley & Sons, Ltd.

6.
In its recogniser form, Earley's algorithm for testing whether a string can be derived from a grammar is worst case cubic on general context free grammars (CFG). Earley gave an outline of a method for turning his recognisers into parsers, but it turns out that this method is incorrect. Tomita's GLR parser returns a shared packed parse forest (SPPF) representation of all derivations of a given string from a given CFG but is worst case unbounded polynomial order. We have given a modified worst-case cubic version, the BRNGLR algorithm, that, for any string and any CFG, returns a binarised SPPF representation of all possible derivations of a given string. In this paper we apply similar techniques to develop two versions of an Earley parsing algorithm that, in worst-case cubic time, return an SPPF representation of all derivations of a given string from a given CFG.
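The SPPF-producing, worst-case-cubic parsers are the paper's contribution and are not reproduced here. For orientation, below is a standard Earley recognizer sketch in Python (recognition only, no derivation forest); the grammar format, with right-hand sides as tuples, is illustrative.

```python
def earley_recognize(grammar, start, tokens):
    """Standard Earley recognizer; an item is (lhs, rhs, dot, origin)."""
    chart = [set() for _ in range(len(tokens) + 1)]
    chart[0] = {(start, rhs, 0, 0) for rhs in grammar[start]}
    for i in range(len(tokens) + 1):
        changed = True
        while changed:                                        # fixpoint at set i
            changed = False
            for lhs, rhs, dot, origin in list(chart[i]):
                if dot < len(rhs) and rhs[dot] in grammar:    # predictor
                    for prod in grammar[rhs[dot]]:
                        changed |= _add(chart[i], (rhs[dot], prod, 0, i))
                elif dot == len(rhs):                         # completer
                    for l2, r2, d2, o2 in list(chart[origin]):
                        if d2 < len(r2) and r2[d2] == lhs:
                            changed |= _add(chart[i], (l2, r2, d2 + 1, o2))
        if i < len(tokens):                                   # scanner
            for lhs, rhs, dot, origin in chart[i]:
                if dot < len(rhs) and rhs[dot] == tokens[i]:
                    chart[i + 1].add((lhs, rhs, dot + 1, origin))
    return any(lhs == start and dot == len(rhs) and origin == 0
               for lhs, rhs, dot, origin in chart[-1])

def _add(item_set, item):
    if item in item_set:
        return False
    item_set.add(item)
    return True
```

For example, with `grammar = {"S": [("S", "+", "S"), ("a",)]}`, the call `earley_recognize(grammar, "S", ["a", "+", "a"])` returns True.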

7.
Discourse relation recognition is a core component of discourse analysis. In Chinese, implicit discourse relations lacking explicit connectives account for a large proportion of cases, which makes relation recognition especially challenging. This paper presents a joint method for recognizing Chinese discourse relations and nuclearity based on multi-layer local inference. The method uses a bidirectional LSTM and multi-head self-attention to represent the arguments of a discourse relation; it then uses soft alignment to obtain inference weights over the local semantics between arguments, producing a representation of the interactive semantic information between them; finally, the two kinds of information are combined for local inference over discourse relations, and a joint recognition framework for Chinese discourse relations and nuclearity is built by stacking multiple local-inference components, achieving a relation recognition F1 of 67.0% on the CDTB corpus. The paper further embeds this joint recognition module into a transition-based discourse parser, performing joint analysis of discourse relations and nuclearity over automatically generated discourse structures and thereby forming a complete Chinese discourse parser.
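A rough PyTorch sketch of one representation-and-local-inference layer of the kind described above (not the authors' code; dimensions, class names, pooling, and the classification head are invented, and the multi-layer stacking is omitted):

```python
import torch
import torch.nn as nn

class LocalInferenceBlock(nn.Module):
    """One local-inference layer over an argument pair (hedged sketch)."""
    def __init__(self, emb_dim=300, hidden=256, heads=4, num_relations=4):
        super().__init__()
        self.encoder = nn.LSTM(emb_dim, hidden, batch_first=True,
                               bidirectional=True)          # output dim 2*hidden
        self.self_attn = nn.MultiheadAttention(2 * hidden, heads,
                                               batch_first=True)
        self.classifier = nn.Linear(8 * hidden, num_relations)

    def forward(self, arg1, arg2):                 # each (batch, length, emb_dim)
        h1, _ = self.encoder(arg1)                 # contextual argument encoding
        h2, _ = self.encoder(arg2)
        h1, _ = self.self_attn(h1, h1, h1)         # multi-head self-attention
        h2, _ = self.self_attn(h2, h2, h2)
        # Soft alignment: attention weights between the two arguments.
        scores = torch.bmm(h1, h2.transpose(1, 2))         # (batch, L1, L2)
        a1 = torch.bmm(torch.softmax(scores, dim=2), h2)   # arg2 aligned to arg1
        a2 = torch.bmm(torch.softmax(scores, dim=1).transpose(1, 2), h1)
        # Combine original and aligned representations, pool, classify.
        v1 = torch.cat([h1, a1], dim=-1).mean(dim=1)       # (batch, 4*hidden)
        v2 = torch.cat([h2, a2], dim=-1).mean(dim=1)
        return self.classifier(torch.cat([v1, v2], dim=-1))
```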

8.
Phil Cook, Jim Welsh. Software, 2001, 31(15): 1461-1486.
Incremental parsing has long been recognized as a technique of great utility in the construction of language‐based editors, and correspondingly, the area currently enjoys a mature theory. Unfortunately, many practical considerations have been largely overlooked in previously published algorithms. Many user requirements for an editing system necessarily impact on the design of its incremental parser, but most approaches focus only on one: response time. This paper details an incremental parser based on LR parsing techniques and designed for use in a modeless syntax recognition editor. The nature of this editor places significant demands on the structure and quality of the document representation it uses, and hence, on the parser. The strategy presented here is novel in that both the parser and the representation it constructs are tolerant of the inevitable and frequent syntax errors that arise during editing. This is achieved by a method that differs from conventional error repair techniques, and that is more appropriate for use in an interactive context. Furthermore, the parser aims to minimize disturbance to this representation, not only to ensure other system components can operate incrementally, but also to avoid unfortunate consequences for certain user‐oriented services. The algorithm is augmented with a limited form of predictive tree‐building, and a technique is presented for the determination of valid symbols for menu‐based insertion. Copyright © 2001 John Wiley & Sons, Ltd.

9.
Masahiko Haruno, Satoshi Shirai, Yoshifumi Ooyama. Machine Learning, 1999, 34(1-3): 131-149.
This paper describes a novel and practical Japanese parser that uses decision trees. First, we construct a single decision tree to estimate modification probabilities, that is, how one phrase tends to modify another. Next, we introduce a boosting algorithm in which several decision trees are constructed and then combined for probability estimation. The constructed parsers are evaluated using the EDR Japanese annotated corpus. The single-tree method significantly outperforms the conventional Japanese stochastic methods. Moreover, the boosted version of the parser is shown to have great advantages: (1) a better parsing accuracy than its single-tree counterpart for any amount of training data and (2) no over-fitting to data for various iterations. The presented parser, the first non-English stochastic parser with practical performance, should tighten the coupling between natural language processing and machine learning.
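As a rough scikit-learn analogue (not the authors' feature set, learner, or boosting variant), estimating attachment/modification probabilities with a single decision tree versus a boosted ensemble might look like the sketch below; the feature encoding is a placeholder.

```python
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import AdaBoostClassifier

# X: feature vectors for candidate (modifier phrase, head phrase) pairs,
#    e.g. head words, distance, punctuation -- placeholder encoding.
# y: 1 if the modifier actually attaches to that head, else 0.
def train_attachment_models(X, y):
    single = DecisionTreeClassifier(max_depth=10).fit(X, y)
    boosted = AdaBoostClassifier(
        estimator=DecisionTreeClassifier(max_depth=3),
        n_estimators=50,
    ).fit(X, y)
    return single, boosted

# The modification probability for a new candidate pair comes from
# single.predict_proba(x_new)[:, 1] or boosted.predict_proba(x_new)[:, 1].
```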

10.
Parallel parsing is currently receiving attention but there is little discussion about the adaptation of sequential error handling techniques to these parallel algorithms. We describe a noncorrecting error handler implemented with a parallel LR substring parser. The parser used is a parallel version of Cormack's LR substring parser. The applicability of noncorrecting error handling for parallel parsing is discussed. The error information provided for a standard set of 118 erroneous Pascal programs is analysed. The programs are run on the sequential LR substring parser.

11.
Thomas R. Kramer. Software, 2010, 40(5): 387-404.
A method is described for repairing some shift/reduce conflicts caused by limited lookahead in LALR(1) parsers such as those built by bison. Also, six types of Extended BNF (EBNF) construct are identified that cause a shift/reduce conflict when a Yet Another Compiler Compiler (YACC) file translated directly from EBNF is processed by bison. For each type, a replacement EBNF construct is described representing the same grammar and causing no shift/reduce conflict when its YACC equivalent is processed by bison. Algorithms are given for identifying instances of each type and transforming them into their replacements. The algorithms are implemented in an automatic parser builder that builds parsers for subsets of the DMIS language. The parser builder reads an EBNF file and writes C++ classes, a YACC file, and a Lex file, which are then processed to build a parser. The parsers build a parse tree using the C++ classes. The EBNF for DMIS is written in natural terms so that natural C++ classes are generated. However, if translated directly into YACC, the natural EBNF leads to 22 shift/reduce conflicts that fall into the six types. The parser builder recognizes the six constructs and replaces them automatically before generating YACC. The YACC that is generated parses in terms of unnatural constructs while building a parse tree using natural C++ classes. The six types of construct may occur in any statement‐based language that uses a minor separator, such as a comma; hence knowing how to recognize and replace them may be broadly useful. Published in 2010 by John Wiley & Sons, Ltd.

12.
This paper describes a bidirectional head-corner parser for (unification-based versions of) lexicalized tree-adjoining grammars.

13.
Analysis
This paper describes the parser, especially its mapping rule interpreter, used in KBMT-89. The interpreter is characterized by its ability to produce semantic and syntactic structures of a parse simultaneously and therefore more efficiently than other kinds of analyzers. Applicable forms of parser mapping rules, which map syntactic structures to semantic structures, are introduced. The parser, a modified version of Tomita's universal parser, is briefly described. Sample traces illustrate the functioning of the parser and mapping rule interpreter.
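As a toy illustration of what a mapping-rule interpreter does (not KBMT-89's actual rule formalism), a rule can pair a syntactic pattern with a semantic-frame template whose slots are filled from the semantics of the matched constituents as the syntactic structure is completed. The rule format below is invented.

```python
# Toy mapping-rule interpreter sketch; rule format and slot names are invented.
MAPPING_RULES = [
    # (syntactic pattern,   semantic frame template)
    (("S", ("NP", "VP")),   {"frame": "event", "agent": "<NP>", "action": "<VP>"}),
    (("VP", ("V", "NP")),   {"frame": "action", "predicate": "<V>", "theme": "<NP>"}),
]

def apply_mapping(category, children_syntax, children_semantics):
    """Build the semantic structure for a node as its syntactic node is built."""
    for (cat, pattern), template in MAPPING_RULES:
        if cat == category and pattern == tuple(children_syntax):
            sem = {}
            for slot, value in template.items():
                if isinstance(value, str) and value.startswith("<"):
                    # Fill the slot from the matching child's semantics.
                    sem[slot] = children_semantics[pattern.index(value.strip("<>"))]
                else:
                    sem[slot] = value
            return sem
    return {"frame": category, "args": children_semantics}   # default mapping
```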

14.
A Chinese Parser Based on Syntactic and Semantic Features
Syntactic parsing should not be simple symbolic inference but a form of entity-based inference, and adding semantic information is an effective way to achieve this. The parser introduced in this paper has two distinctive features: first, it uses word-based rules for handling part-of-speech ambiguity, which greatly improves parsing efficiency; second, it uses static and dynamic syntactic-semantic features of words to constrain the over-generation of syntactic rules, with good results.

15.
A probabilistic parser (PCFG parser) is a context-free-grammar parser based on a set of probabilistic rules, where the rules are mainly defined over part-of-speech and phrase categories. In reality, however, words with the same part of speech but different lexical identities often favor different syntactic rules, and lexicalized parsing has become a trend and focus of current NLP research. To address the lack of lexicalization in the independence assumptions of probabilistic parsing, this paper combines the subcategorization information of predicate verbs with probabilistic parsing and proposes a lexicalized probabilistic parsing method based on verb subcategorization. The paper builds a statistical parsing model based on Chinese verb subcategorization frames and, since such frames are difficult to acquire, proposes an interactive procedure that couples lexicalized probabilistic parsing with subcategorization-frame acquisition. Using this procedure, probabilistic subcategorization information for ten common high-frequency Chinese verbs was acquired, and a prototype parser based on verb subcategorization, S-PCFG, was implemented on top of the original PCFG parser. Experiments show that probabilistic parsing based on verb subcategorization improves both the accuracy and the speed of natural language parsing. The shortcomings of the new parser are also analyzed as a basis for further improvement.
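A hedged sketch of the kind of conditioning involved: rule probabilities are estimated given the head verb's subcategorization frame, backing off to the plain PCFG estimate when the lexicalized counts are too sparse. The counts, threshold, and backoff scheme below are illustrative, not the paper's model.

```python
from collections import Counter

# Counts would be collected from a treebank; a rule is (lhs, rhs).
rule_counts = Counter()          # rule                        -> count
lhs_counts = Counter()           # lhs                         -> count
subcat_rule_counts = Counter()   # (verb, subcat_frame, rule)  -> count
subcat_lhs_counts = Counter()    # (verb, subcat_frame, lhs)   -> count

def rule_probability(rule, verb=None, subcat=None, min_count=5):
    """P(rule | lhs), conditioned on the verb's subcategorization frame
    when enough lexicalized counts are available, else plain PCFG backoff."""
    lhs = rule[0]
    if verb is not None and subcat_lhs_counts[(verb, subcat, lhs)] >= min_count:
        return (subcat_rule_counts[(verb, subcat, rule)] /
                subcat_lhs_counts[(verb, subcat, lhs)])
    return rule_counts[rule] / max(lhs_counts[lhs], 1)
```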

16.
A theory and an algorithm for the optimization of a directed, labeled tree are presented, and their application to optimizing any finite pattern grammar represented in the form of a tree is discussed. Tree optimization leads to a loss of information that is essential for the identification of patterns, so a special technique for preserving this information is suggested. Finally, outlines of two different algorithms for the parsing of patterns are included: the tree parser uses the optimized tree, and the table-driven parser uses the optimized syntax stored in four separate tables.

17.
Discourse parsing has become an inevitable task to process information in the natural language processing arena. Parsing complex discourse structures beyond the sentence level is a significant challenge. This article proposes a discourse parser that constructs rhetorical structure (RS) trees to identify such complex discourse structures. Unlike previous parsers that construct RS trees using lexical features, syntactic features and cue phrases, the proposed discourse parser constructs RS trees using high‐level semantic features inherited from the Universal Networking Language (UNL). The UNL also adds a language‐independent quality to the parser, because the UNL represents texts in a language‐independent manner. The parser uses a naive Bayes probabilistic classifier to label discourse relations. It has been tested using 500 Tamil‐language documents and the Rhetorical Structure Theory Discourse Treebank, which comprises 21 English‐language documents. The performance of the naive Bayes classifier has been compared with that of the support vector machine (SVM) classifier, which has been used in the earlier approaches to build a discourse parser. It is seen that the naive Bayes probabilistic classifier is better suited for discourse relation labeling when compared with the SVM classifier, in terms of training time, testing time, and accuracy.
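In scikit-learn terms (not the paper's implementation), labeling a discourse relation from extracted features with a naive Bayes classifier, compared against an SVM, might look like the sketch below; the feature dictionaries and labels are invented placeholders standing in for UNL-derived features.

```python
from sklearn.naive_bayes import MultinomialNB
from sklearn.svm import LinearSVC
from sklearn.feature_extraction import DictVectorizer
from sklearn.pipeline import make_pipeline

# Each instance is a dict of illustrative features drawn from the two spans.
train_features = [{"span1_rel": "agt", "span2_rel": "obj", "cue": "because"},
                  {"span1_rel": "tim", "span2_rel": "agt", "cue": "none"}]
train_labels = ["Cause", "Background"]

nb_model = make_pipeline(DictVectorizer(), MultinomialNB()).fit(train_features, train_labels)
svm_model = make_pipeline(DictVectorizer(), LinearSVC()).fit(train_features, train_labels)

# Label a new relation instance with the naive Bayes model.
print(nb_model.predict([{"span1_rel": "agt", "span2_rel": "obj", "cue": "because"}]))
```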

18.
This paper presents a new approach to the design and the proof of bypassed LR(k) parsers; a bypassed LR(k) parser is an LR(k) parser which does not perform any reductions caused by insignificant unit productions. We show that bypassed LR(k) parsers having a minimum set of reduction contexts always exist for Knuth LR(k) parsers, but do not necessarily exist for SLR(k) and LALR(k) parsers. As byproducts of our approach, we naturally derive Anderson, Eve and Horning's algorithm [5] and a sufficient condition for the existence of bypassed SLR(1) parsers.

19.
Design and Implementation of the Campus Navigation System EasyNav
This paper describes the design and implementation of EasyNav, a spoken dialogue system for campus navigation. After analyzing the characteristics and requirements of spoken dialogue systems, we propose a rule-based language understanding pipeline suited to dialogue systems. In this pipeline, syntactic analysis uses a GLR parser over a context-free grammar (CFG) to extract sentence structure features for semantic analysis, with the syntactic rules balancing coverage against accuracy. Semantic analysis uses template matching under syntactic constraints, aiming to capture the speaker's intention and to resolve ambiguities introduced by syntactic analysis. The advantage of this design is that the system is easy to build and easy to extend.
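A minimal illustration of template matching for intent extraction (the templates, intents, and slot names are invented, not EasyNav's, and a real system would also check the syntactic constraints supplied by the GLR parse before accepting a match):

```python
import re

# Invented templates: each pairs a surface pattern with an intent label.
TEMPLATES = [
    (re.compile(r"how do i get to (?P<dest>.+)"),             "route_query"),
    (re.compile(r"where is (?P<dest>.+)"),                    "location_query"),
    (re.compile(r"how far is (?P<dest>.+) from (?P<src>.+)"), "distance_query"),
]

def understand(utterance):
    text = utterance.lower().rstrip("?!. ")
    for pattern, intent in TEMPLATES:
        match = pattern.search(text)
        if match:
            return {"intent": intent, **match.groupdict()}
    return {"intent": "unknown"}

print(understand("How do I get to the main library?"))
# {'intent': 'route_query', 'dest': 'the main library'}
```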

20.
A Statistical Model for Chinese Semantic Dependency Analysis
李明琴, 李涓子, 王作英, 陆大. 计算机学报 (Chinese Journal of Computers), 2004, 27(12): 1679-1687.
This paper presents a statistical semantic analyzer that discovers semantic dependency relations in Chinese sentences; these relations can be used to represent the meaning and structure of a sentence. The analyzer is trained and tested on a one-million-word corpus annotated with semantic dependency relations (the Semantic Dependency Network corpus, SDN), and several experiments are designed and carried out to analyze its performance. The results show that the analyzer performs well in unrestricted domains, with an accuracy roughly comparable to that of Chinese syntactic parsers.

