Similar Documents
20 similar documents found (search time: 750 ms)
1.
Jean Bovet  Terence Parr 《Software》2008,38(12):1305-1332
Programmers tend to avoid using language tools, resorting to ad hoc methods, because tools can be hard to use, their parsing strategies can be difficult to understand and debug, and their generated parsers can be opaque black‐boxes. In particular, there are two very common difficulties encountered by grammar developers: understanding why a grammar fragment results in a parser non‐determinism and determining why a generated parser incorrectly interprets an input sentence. This paper describes ANTLRWorks, a complete development environment for ANTLR grammars that attempts to resolve these difficulties and, in general, make grammar development more accessible to the average programmer. The main components are a grammar editor with refactoring and navigation features, a grammar interpreter, and a domain‐specific grammar debugger. ANTLRWorks' primary contributions are a parser non‐determinism visualizer based on syntax diagrams and a time‐traveling debugger that pays special attention to parser decision‐making by visualizing lookahead usage and speculative parsing during backtracking. Copyright © 2008 John Wiley & Sons, Ltd.

2.
Design and Implementation of the EasyNav Campus Navigation System (cited 10 times: 0 self-citations, 10 by others)
This paper describes the design and implementation of EasyNav, a spoken dialogue system for campus navigation. After analyzing the characteristics and requirements of spoken dialogue systems, we propose a rule-based language understanding pipeline suited to dialogue systems. In this pipeline, syntactic analysis uses a GLR parser over a context-free grammar (CFG) to extract the structural features of a sentence for subsequent semantic analysis; the syntactic rules balance coverage against accuracy. Semantic analysis uses template matching subject to syntactic constraints, aiming to capture the speaker's intent while resolving ambiguities introduced by syntactic analysis. The advantage of this design is that the system is easy to build and easy to extend.
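The abstract does not give EasyNav's template formalism, so the following is only a minimal sketch of semantic template matching under syntactic constraints; the parsed-sentence format, template names, and slot names (`ASK_ROUTE`, `destination`, etc.) are invented for illustration.

```python
# Minimal sketch of template matching with syntactic constraints for
# speaker-intent extraction. Template names, slots, and the toy
# "parsed sentence" format are hypothetical, not EasyNav's actual design.

# A parsed sentence is a list of (phrase, syntactic_category) pairs,
# standing in for the structural features produced by the GLR parser.
PARSED = [("how", "QWORD"), ("do", "AUX"), ("I", "PRON"),
          ("get", "VERB"), ("to", "PREP"), ("the library", "PLACE_NP")]

# Each template names an intent, constrains which syntactic categories
# must appear, and says which category fills which slot.
TEMPLATES = [
    {"intent": "ASK_ROUTE", "require": {"QWORD", "VERB", "PLACE_NP"},
     "slots": {"destination": "PLACE_NP"}},
    {"intent": "ASK_LOCATION", "require": {"QWORD", "PLACE_NP"},
     "slots": {"target": "PLACE_NP"}},
]

def match_intent(parsed):
    cats = {cat for _, cat in parsed}
    # Prefer the most specific satisfied template (largest constraint set).
    for tpl in sorted(TEMPLATES, key=lambda t: -len(t["require"])):
        if tpl["require"] <= cats:
            slots = {name: next(w for w, c in parsed if c == cat)
                     for name, cat in tpl["slots"].items()}
            return {"intent": tpl["intent"], **slots}
    return {"intent": "UNKNOWN"}

print(match_intent(PARSED))  # {'intent': 'ASK_ROUTE', 'destination': 'the library'}
```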

3.
A 3D Interaction Platform for Virtual Assembly (cited 14 times: 0 self-citations, 14 by others)
Addressing the characteristics of current assembly work, this paper presents a 3D interaction platform for virtual assembly, the Virtual Assembly Toolkit (VAT). VAT embodies new ideas for 3D interaction, defines the 3D interaction primitives commonly found in the assembly domain, and designs and implements a framework for capturing, interpreting, and handling these primitives. VAT also encapsulates functions such as 3D graphics construction, inter-part constraints, and collision detection. VAT greatly simplifies the construction of virtual assembly applications and supports their rapid development.

4.
Discourse parsing has become an inevitable task to process information in the natural language processing arena. Parsing complex discourse structures beyond the sentence level is a significant challenge. This article proposes a discourse parser that constructs rhetorical structure (RS) trees to identify such complex discourse structures. Unlike previous parsers that construct RS trees using lexical features, syntactic features and cue phrases, the proposed discourse parser constructs RS trees using high‐level semantic features inherited from the Universal Networking Language (UNL). The UNL also adds a language‐independent quality to the parser, because the UNL represents texts in a language‐independent manner. The parser uses a naive Bayes probabilistic classifier to label discourse relations. It has been tested using 500 Tamil‐language documents and the Rhetorical Structure Theory Discourse Treebank, which comprises 21 English‐language documents. The performance of the naive Bayes classifier has been compared with that of the support vector machine (SVM) classifier, which has been used in the earlier approaches to build a discourse parser. It is seen that the naive Bayes probabilistic classifier is better suited for discourse relation labeling when compared with the SVM classifier, in terms of training time, testing time, and accuracy.
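The paper's UNL features are not listed in the abstract, so the sketch below only shows the naive Bayes relation-labeling step itself, with made-up feature names and relation labels and add-one smoothing; it is not the paper's feature set or training data.

```python
import math
from collections import Counter, defaultdict

# Toy training data: each item is (set_of_semantic_features, relation_label).
# Feature names and labels are hypothetical stand-ins for UNL-derived features.
TRAIN = [
    ({"rel:reason", "attr:past"}, "Cause"),
    ({"rel:reason", "attr:present"}, "Cause"),
    ({"rel:time", "attr:past"}, "Sequence"),
    ({"rel:and", "attr:present"}, "Joint"),
]

def train_nb(data):
    label_counts = Counter(lbl for _, lbl in data)
    feat_counts = defaultdict(Counter)          # label -> feature -> count
    vocab = set()
    for feats, lbl in data:
        feat_counts[lbl].update(feats)
        vocab |= feats
    return label_counts, feat_counts, vocab

def classify(feats, label_counts, feat_counts, vocab):
    total = sum(label_counts.values())
    best, best_lp = None, -math.inf
    for lbl, lc in label_counts.items():
        lp = math.log(lc / total)               # log prior
        denom = sum(feat_counts[lbl].values()) + len(vocab)
        for f in feats:                         # likelihood with add-one smoothing
            lp += math.log((feat_counts[lbl][f] + 1) / denom)
        if lp > best_lp:
            best, best_lp = lbl, lp
    return best

model = train_nb(TRAIN)
print(classify({"rel:reason", "attr:past"}, *model))   # -> 'Cause'
```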

5.
A language implementation with proper compositionality enables a compiler developer to divide-and-conquer the complexity of building a large language by constructing a set of smaller languages. Ideally, these small language implementations should be independent of each other such that they can be designed, implemented and debugged individually, and later be reused in different applications (e.g., building domain-specific languages). However, the language composition offered by several existing parser generators resides at the grammar level, which means all the grammar modules need to be composed together and all corresponding ambiguities have to be resolved before generating a single parser for the language. This produces tight coupling between grammar modules, which harms information hiding and affects independent development of language features. To address this problem, we have developed a novel parsing algorithm that we call Component-based LR (CLR) parsing, which provides code-level compositionality for language development by producing a separate parser for each grammar component. In addition to shift and reduce actions, the algorithm extends general LR parsing by introducing switch and return actions to empower the parsing action to jump from one parser to another. Our experimental evaluation demonstrates that CLR increases the comprehensibility, reusability, changeability and independent development ability of the language implementation. Moreover, the loose coupling among parser components enables CLR to describe grammars that contain LR parsing conflicts or require ambiguous token definitions, such as island grammars and embedded languages.
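The CLR table construction is not reproduced here; the toy sketch below only illustrates the control flow the abstract describes — separately written parser components that transfer control to each other (a switch-like jump into another component, a return-like hand-back of its result) — using hand-written recursive code rather than generated LR tables.

```python
# Toy illustration of component-level parser composition: an outer
# "statement" component switches to a separately written "expression"
# component when it reaches an expression position, and the expression
# component returns its result to whoever switched to it. This mimics
# only the switch/return control flow described in the abstract; it is
# hand-written recursive code, not generated CLR parse tables.

def parse_expression(toks, pos):
    """Expression component: NUMBER ('+' NUMBER)*  -> (tree, next_pos)."""
    node = ("num", toks[pos]); pos += 1
    while pos < len(toks) and toks[pos] == "+":
        node = ("+", node, ("num", toks[pos + 1]))
        pos += 2
    return node, pos               # "return" action: hand control back

def parse_statement(toks, pos=0):
    """Statement component: 'print' <expression> ';'."""
    assert toks[pos] == "print"
    expr, pos = parse_expression(toks, pos + 1)   # "switch" action: jump to a component
    assert toks[pos] == ";"
    return ("print", expr), pos + 1

tree, _ = parse_statement(["print", "1", "+", "2", "+", "3", ";"])
print(tree)  # ('print', ('+', ('+', ('num', '1'), ('num', '2')), ('num', '3')))
```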

6.
It is shown that if the basic method for eliminating single productions from canonical LR parsers developed by Pager is applied to an SLR parser and the resulting parser is free of conflicts, then the resulting parser is a valid parser which accepts exactly the strings in the language. However, if the elimination process is performed during the construction of an SLR parser, then the resulting parser may be invalid even if it were free of conflicts.

7.
The importance of the parsing task for NLP applications is well understood. However, developing parsers remains difficult because of the complexity of the Arabic language. Most parsers are based on syntactic grammars that describe the syntactic structures of a language. The development of these grammars is laborious and time consuming. In this paper we present our method for building an Arabic parser based on an induced grammar, a PCFG grammar. We first induce the PCFG grammar from an Arabic Treebank. Then, we implement the parser that assigns a syntactic structure to each input sentence. The parser is tested on sentences extracted from the treebank (1650 sentences). We calculate the precision, recall and F-measure. Our experimental results showed the efficiency of the proposed parser for parsing modern standard Arabic sentences (precision: 83.59%, recall: 82.98% and F-measure: 83.23%).
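As a sketch of the two quantitative steps the abstract mentions, the snippet below estimates PCFG rule probabilities by maximum likelihood from a toy treebank (the tree format and labels are invented, not the Arabic Treebank's) and recomputes the F-measure from the reported precision and recall.

```python
from collections import Counter, defaultdict

# Maximum-likelihood estimation of PCFG rule probabilities from a toy
# treebank. Trees are nested tuples (label, child, ...); leaves are
# plain strings. The tiny treebank and labels are illustrative only.
TREEBANK = [
    ("S", ("NP", "he"), ("VP", ("V", "reads"), ("NP", "books"))),
    ("S", ("NP", "she"), ("VP", ("V", "writes"))),
]

def count_rules(tree, counts):
    label, children = tree[0], tree[1:]
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    counts[label][rhs] += 1
    for c in children:
        if not isinstance(c, str):
            count_rules(c, counts)

counts = defaultdict(Counter)
for t in TREEBANK:
    count_rules(t, counts)

# P(lhs -> rhs) = count(lhs -> rhs) / count(lhs)
pcfg = {(lhs, rhs): n / sum(c.values())
        for lhs, c in counts.items() for rhs, n in c.items()}
print(pcfg[("VP", ("V", "NP"))])   # 0.5
print(pcfg[("S", ("NP", "VP"))])   # 1.0

# F-measure from the precision/recall reported in the abstract
# (the rounded inputs give ~0.8328, matching the reported 83.23%).
p, r = 0.8359, 0.8298
print(round(2 * p * r / (p + r), 4))
```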

8.
A parser for natural language is proposed which (a) concentrates on methodological procedures rather than grammatical details of a natural language; (b) is based on the claim that, in order to exploit the full power of computers, models of human language processing should not necessarily be designed to simulate human behavior; and (c) is designed for the advanced technology of the computers of the future. The parser is a component of a language understanding machine which is only sketched in this article.

9.
Koen De Bosschere 《Software》1996,26(7):763-779
Prolog is a language with a dynamic grammar which is the result of embedded operator declarations. The parsing of such a language cannot be done easily by means of standard tools. Most often, an existing parsing technique for a static grammar is adapted to deal with the dynamic constructs. This paper uses the syntax definition as defined by the ISO standard for the Prolog language. It starts with a brief discussion of the standard, highlighting some aspects that are important for the parser, such as the restrictions on the use of operators as imposed by the standard in order to make the parsing deterministic. Some possible problem areas are also indicated. As output is closely related to input in Prolog, both are treated in this paper. Some parsing techniques are compared and an operator precedence parser is chosen to be modified to deal with the dynamic operator declarations. The necessary modifications are discussed and an implementation in C is presented. Performance data are collected and compared with a public domain Prolog parser written in Prolog. It is the first efficient public domain parser for Standard Prolog that actually works and deals with all the details of the syntax.
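The paper's C implementation is not shown here; the sketch below is a heavily simplified illustration of precedence parsing over a *dynamic* operator table in the spirit of Prolog's op/3 — binary infix operators only, with ISO-style argument-priority limits, and none of the full term syntax.

```python
# Simplified sketch of precedence parsing with a dynamic operator table,
# in the spirit of Prolog's op/3 (binary infix operators only; prefix and
# postfix operators, and the full ISO term syntax, are omitted).

OPS = {}                                  # name -> (priority, type)

def op(priority, op_type, name):          # analogue of op/3: declared at runtime
    OPS[name] = (priority, op_type)

def parse_primary(toks, pos):
    if toks[pos] == "(":
        term, pos = parse(toks, pos + 1, 1200)
        assert toks[pos] == ")"
        return term, pos + 1
    return toks[pos], pos + 1             # atom or number

def parse(toks, pos, max_prec=1200):
    left, pos = parse_primary(toks, pos)
    left_prec = 0                         # priority of the term built so far
    while pos < len(toks) and toks[pos] in OPS:
        prec, typ = OPS[toks[pos]]
        # ISO argument-priority limits: xfx -> (p-1, p-1), xfy -> (p-1, p), yfx -> (p, p-1)
        left_max = prec if typ == "yfx" else prec - 1
        right_max = prec if typ == "xfy" else prec - 1
        if prec > max_prec or left_prec > left_max:
            break
        name = toks[pos]
        right, pos = parse(toks, pos + 1, right_max)
        left, left_prec = (name, left, right), prec
    return left, pos

op(500, "yfx", "+"); op(400, "yfx", "*"); op(200, "xfy", "^")
print(parse("a + b * c ^ d ^ e".split(), 0)[0])
# ('+', 'a', ('*', 'b', ('^', 'c', ('^', 'd', 'e'))))
```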

10.
11.
Real-world natural language sentences are often long and complex, and contain unexpected grammatical constructions. They even include noise and ungrammaticality. This paper describes the Controlled Skip Parser, a program that parses such real-world sentences by skipping some of the words in the sentence. The new feature of this parser is that it controls its behavior by finding out which words to skip, without using domain-specific knowledge. The parser is a priority-based chart parser. By assigning appropriate priority levels to the constituents in the chart, the parser's behavior is controlled. Statistical information is used for assigning priority levels. The statistical information (n-grams) can be thought of as a generalized approximation of the grammar learned from past successful experiences. The control mechanism gives a great speed-up and reduction in memory usage. Experiments on real newspaper articles are shown, and our experience with this parser in a machine translation system is described.
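The chart parser itself is not reproduced; the sketch below only shows the controlling idea of using n-gram (here bigram) statistics to rank words as skip candidates. The bigram table and sentence are toy data, and the real parser folds such scores into chart priorities rather than pre-selecting words as done here.

```python
import math

# Sketch of using bigram statistics to rank skip candidates in a noisy
# sentence. Toy data only; the real parser integrates such scores into
# a priority-based chart parser instead of pre-selecting words.
BIGRAMS = {("the", "market"): 0.02, ("market", "rose"): 0.01,
           ("rose", "sharply"): 0.03, ("the", "uh"): 0.0001,
           ("uh", "market"): 0.0001}
UNSEEN = 1e-6

def bigram(a, b):
    return BIGRAMS.get((a, b), UNSEEN)

def skip_candidates(words, k=1):
    """Score each word by the log-probability of the bigrams it takes part in;
    the k lowest-scoring words are the first candidates to skip."""
    scores = []
    for i, w in enumerate(words):
        s = 0.0
        if i > 0:
            s += math.log(bigram(words[i - 1], w))
        if i + 1 < len(words):
            s += math.log(bigram(w, words[i + 1]))
        scores.append((s, w, i))
    return sorted(scores)[:k]

print(skip_candidates(["the", "uh", "market", "rose", "sharply"]))
# the filler word 'uh' gets the lowest score and is skipped first
```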

12.
In this paper, we describe our experience in grammar engineering to construct multiple parsers and front ends for the Python language. We present a metrics-based study of the evolution of the Python grammars through the multiple versions of the language in an effort to distinguish and measure grammar evolution and to provide a basis of comparison with related research in grammar engineering. To conduct this research, we have built a toolkit, pygrat, which builds on tools developed in other research. We use pygrat to build a system that automates much of the process needed to translate the Python grammars from EBNF to a formalism acceptable to the bison parser generator. We exploit the suite of Python test cases, used by the Python developers, to validate our parser generation. Finally, we describe our use of the menhir parser generator to facilitate the parser and front-end construction, eliminating some of the transformations and providing practical support for grammar modularisation.
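pygrat's actual transformation is not shown here; the sketch below illustrates one standard way to lower EBNF repetition and optionality into the plain BNF that a bison-style generator accepts. The helper-rule naming scheme and the Python-like example rule are invented.

```python
# One standard way to lower EBNF repetition/optionality into the plain
# BNF that a bison-style generator accepts (not pygrat's actual code).
# 'X*' becomes a left-recursive helper rule and '[X]' becomes a rule
# with an empty alternative; helper rule names are invented here.

def lower_ebnf(name, items):
    """items: list of symbols, where 'X*' marks repetition and '[X]' optionality.
    Returns a list of (lhs, [rhs symbols]) BNF productions."""
    rules, rhs = [], []
    for it in items:
        if it.endswith("*"):
            helper = f"{it[:-1]}_list"
            rules += [(helper, []), (helper, [helper, it[:-1]])]
            rhs.append(helper)
        elif it.startswith("[") and it.endswith("]"):
            helper = f"{it[1:-1]}_opt"
            rules += [(helper, []), (helper, [it[1:-1]])]
            rhs.append(helper)
        else:
            rhs.append(it)
    return [(name, rhs)] + rules

# EBNF (Python-like): while_stmt: 'while' test ':' suite [else_clause]
for lhs, rhs in lower_ebnf("while_stmt",
                           ["'while'", "test", "':'", "suite", "[else_clause]"]):
    print(f"{lhs}: {' '.join(rhs) if rhs else '/* empty */'} ;")
```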

13.
A wide range of parser generators are used to generate parsers for programming languages. The grammar formalisms that come with parser generators provide different approaches for defining operator precedence. Some generators (e.g. YACC) support precedence declarations, others require the grammar to be unambiguous, thus encoding the precedence rules. Even if the grammar formalism provides precedence rules, a particular grammar might not use them. The result is grammar variants implementing the same language. For the C language, the GNU Compiler uses YACC with precedence rules, the C-Transformers grammar uses SDF without priorities, while the SDF library does use priorities. For PHP, Zend uses YACC with precedence rules, whereas PHP-front uses SDF with priority and associativity declarations. The variance between grammars raises the question whether the precedence rules of one grammar are compatible with those of another. This is usually not obvious, since some languages have complex precedence rules. Also, for some parser generators the semantics of precedence rules is defined operationally, which makes it hard to reason about their effect on the defined language. We present a method and tool for comparing the precedence rules of different grammars and parser generators. Although it is undecidable whether two grammars define the same language, this tool provides support for comparing and recovering precedence rules, which is especially useful for reliable migration of a grammar from one grammar formalism to another. We evaluate our method by applying it to non-trivial mainstream programming languages, such as PHP and C.

14.
金蓓弘  曹冬磊  任鑫  余双  戴蓓洁 《软件学报》2008,19(10):2728-2738
An XML (extensible markup language) parser is the foundational software for analyzing and processing XML documents. This work studies the implementation of a high-performance validating XML parser. We developed OnceXMLParser, an XML parser supporting three parsing models, which passed strict XML conformance tests and API compatibility tests. OnceXMLParser has a lightweight architecture and incorporates several performance optimizations, including efficient lexical analysis, an automaton implementation based on statistical analysis, a sound resource-allocation strategy, and language-level optimizations. Performance test results show that OnceXMLParser delivers excellent parsing performance.
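The abstract does not name the three parsing models; a common trio is tree-based (DOM), push/event-based (SAX), and streaming pull parsing. The sketch below consumes the same document under each model using only Python's standard library and says nothing about OnceXMLParser's own API.

```python
# Consuming one document under three common XML parsing models
# (DOM, SAX, pull/streaming) with the Python standard library.
import io
import xml.dom.minidom as minidom
import xml.sax
import xml.etree.ElementTree as ET

DOC = "<catalog><book id='1'>Parsing</book><book id='2'>XML</book></catalog>"

# 1. DOM: the whole tree is built in memory, then queried.
dom = minidom.parseString(DOC)
print([b.firstChild.data for b in dom.getElementsByTagName("book")])

# 2. SAX: the parser pushes events to handler callbacks.
class BookHandler(xml.sax.ContentHandler):
    def startElement(self, name, attrs):
        if name == "book":
            print("book id =", attrs["id"])

xml.sax.parseString(DOC.encode(), BookHandler())

# 3. Pull/streaming: the application iterates over events as it needs them.
for event, elem in ET.iterparse(io.BytesIO(DOC.encode()), events=("end",)):
    if elem.tag == "book":
        print(elem.get("id"), elem.text)
```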

15.
We present MARS (Multilingual Automatic tRanslation System), a research prototype speech-to-speech translation system. MARS is aimed at two-way conversational spoken language translation between English and Mandarin Chinese for limited domains, such as air travel reservations. In MARS, machine translation is embedded within a complex speech processing task, and the translation performance is highly affected by the performance of other components, such as the recognizer and semantic parser. All components in the proposed system are statistically trained using an appropriate training corpus. The speech signal is first recognized by an automatic speech recognizer (ASR). Next, the ASR-transcribed text is analyzed by a semantic parser, which uses a statistical decision-tree model that does not require hand-crafted grammars or rules. Furthermore, the parser provides semantic information that helps further re-scoring of the speech recognition hypotheses. The semantic content extracted by the parser is formatted into a language-independent tree structure, which is used for an interlingua-based translation. A maximum entropy based sentence-level natural language generation (NLG) approach is used to generate sentences in the target language from the semantic tree representations. Finally, the generated target sentence is synthesized into speech by a speech synthesizer. Many new features and innovations have been incorporated into MARS: the translation is based on understanding the meaning of the sentence; the semantic parser uses a statistical model and is trained from a semantically annotated corpus; the output of the semantic parser is used to select a more specific language model to refine the speech recognition performance; the NLG component uses a statistical model and is also trained from the same annotated corpus. These features give MARS the advantages of robustness to speech disfluencies and recognition errors, tighter integration of semantic information into speech recognition, and portability to new languages and domains. These advantages are verified by our experimental results.

16.
《Software, IEEE》2006,23(4):62-63
This paper evaluates the use of a functional language for implementing domain-specific functionality. The factors we consider when choosing a programming language are programmer productivity, maintainability, efficiency, portability, tool support, and software and hardware interfaces. The choice of programming language is a fine balancing act. Modern object-oriented languages such as Java and C# are more orthogonal and hide fewer surprises for the programmer, although the inevitable accumulation of features makes this statement less true with every new version of each language.

17.
A General Method for Realistic Geometric Modeling of Plants (cited 4 times: 0 self-citations, 4 by others)
To address the problem that the concrete process of plant geometric modeling with L-systems changes whenever the rule definitions change, this paper proposes a more general solution based on a parser for an L-system rule language. Through induction and abstraction, a language called L-plants is defined that can express many kinds of L-system rules; a parser is constructed for it that recognizes the L-system axiom and rules, performs rule rewriting to produce the final string, and finally interprets the string with a shape grammar to build the geometric model of the plant. Experiments show that this method substantially improves the efficiency of plant geometric modeling.
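The L-plants language itself is not specified in the abstract; the sketch below only shows the core rewrite-then-interpret loop of an L-system (axiom, production rules, iterated string rewriting, then a turtle-style geometric interpretation) using a textbook bracketed L-system as the example.

```python
import math

# Core L-system loop: start from an axiom, rewrite with production rules
# for n generations, then interpret the final string geometrically. The
# rules below are a textbook bracketed L-system, not the L-plants
# language defined in the paper.
AXIOM = "X"
RULES = {"X": "F[+X]F[-X]+X", "F": "FF"}

def rewrite(s, rules, generations):
    for _ in range(generations):
        s = "".join(rules.get(ch, ch) for ch in s)
    return s

def interpret(s, step=1.0, angle=20.0):
    """Turtle-style interpretation: F draws, +/- turn, [ ] push/pop state.
    Returns the list of line segments making up the plant skeleton."""
    x, y, heading = 0.0, 0.0, 90.0
    stack, segments = [], []
    for ch in s:
        if ch == "F":
            nx = x + step * math.cos(math.radians(heading))
            ny = y + step * math.sin(math.radians(heading))
            segments.append(((x, y), (nx, ny)))
            x, y = nx, ny
        elif ch == "+":
            heading += angle
        elif ch == "-":
            heading -= angle
        elif ch == "[":
            stack.append((x, y, heading))
        elif ch == "]":
            x, y, heading = stack.pop()
    return segments

final = rewrite(AXIOM, RULES, 3)
print(len(final), "symbols,", len(interpret(final)), "segments")
```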

18.
Industrial robots are usually programmed and controlled through teach programming in a vendor-specific robot language, which demands considerable expertise and skill from operators, and the long teaching cycle lowers productivity. To improve the efficiency and usability of industrial robots, this paper proposes a design method based on a parser for a restricted natural language. The system performs lexical, syntactic, and semantic analysis of restricted natural language input to obtain the intended task, matches it against a 3D spatial semantic map generated in real time, and, combined with manipulator trajectory planning, generates a robot program that can accomplish the task; it also implements the parsing of the robot program and the control of a real manipulator. Experiments show that the restricted-natural-language parser designed for the sorting robot correctly parses natural language commands and controls the manipulator.
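The paper's command grammar and map representation are not given in the abstract; the sketch below is a toy version of the staged pipeline it describes (lexical, then syntactic/semantic analysis, then grounding against a semantic map), with an invented vocabulary, command pattern, and map format.

```python
import re

# Toy lexical -> syntactic -> semantic pipeline over a restricted command
# language, grounded against a small "semantic map". Vocabulary, pattern,
# and map format are invented; they only illustrate the staged design.

LEXICON = {"pick": "ACTION", "up": "PARTICLE", "the": "DET",
           "red": "COLOR", "green": "COLOR", "block": "OBJECT",
           "put": "ACTION", "it": "PRON", "in": "PREP", "bin": "PLACE"}

def lex(sentence):                        # lexical analysis: word -> tag
    return [(w, LEXICON.get(w, "NUM" if w.isdigit() else "UNK"))
            for w in re.findall(r"\w+", sentence.lower())]

def parse(tokens):                        # syntactic/semantic analysis for one pattern:
    tags = " ".join(t for _, t in tokens)  #   ACTION (PARTICLE)? DET COLOR OBJECT
    if re.fullmatch(r"ACTION( PARTICLE)? DET COLOR OBJECT", tags):
        words = [w for w, _ in tokens]
        return {"action": words[0], "color": words[-2], "object": words[-1]}
    return None

SEMANTIC_MAP = [                          # toy 3D semantic map: object -> position
    {"object": "block", "color": "red", "position": (0.42, 0.10, 0.05)},
    {"object": "block", "color": "green", "position": (0.30, -0.20, 0.05)},
]

def ground(intent, semantic_map):         # match the intent against the map
    for entry in semantic_map:
        if entry["object"] == intent["object"] and entry["color"] == intent["color"]:
            return {"action": intent["action"], "target": entry["position"]}
    return None

intent = parse(lex("pick up the red block"))
print(ground(intent, SEMANTIC_MAP))
# {'action': 'pick', 'target': (0.42, 0.1, 0.05)}
```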

19.
A Chinese Understanding System Based on a Computational Model of Natural Language (cited 9 times: 0 self-citations, 9 by others)
周经野 《软件学报》1993,4(6):41-46
This paper first presents a computational model of natural language that divides the process of natural language communication into three levels: linguistic form, surface semantics, and deep semantics, thereby abstracting natural language understanding as a composite function UP(s,k). Based on this model, we designed a Chinese understanding system with good extensibility and portability. The system uses a Chinese semantic structure grammar to analyze Chinese sentences, organically combining syntactic and semantic analysis. The paper formally defines the deep semantics of words and the basic operations on deep semantics, and gives the algorithms for the parser, the understander, and the generator.
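The paper's formal definition of UP(s,k) is not reproduced here; the sketch below only expresses the three-level view as a composition of placeholder mappings (linguistic form, surface semantics, deep semantics consulting knowledge k), with a toy knowledge base standing in for the real representations.

```python
# Minimal sketch of the three-level view: understanding UP(s, k) as a
# composition of a form-level analysis, a surface-semantic mapping, and
# a deep-semantic mapping that consults knowledge k. The representations
# and the tiny knowledge base are placeholders, not the paper's formalism.

def form_analysis(s):                      # linguistic form: tokenised words
    return s.split()

def surface_semantics(words):              # surface semantics: role -> word
    # assume a rigid subject-verb-object order for this toy example
    return {"agent": words[0], "predicate": words[1], "patient": words[2]}

def deep_semantics(frame, k):              # deep semantics: resolve against knowledge k
    return {role: k.get(word, word) for role, word in frame.items()}

def UP(s, k):                              # UP(s, k) as the composite of the three levels
    return deep_semantics(surface_semantics(form_analysis(s)), k)

KNOWLEDGE = {"he": "PERSON#zhang_san", "reads": "EVENT#read", "books": "OBJECT#book"}
print(UP("he reads books", KNOWLEDGE))
# {'agent': 'PERSON#zhang_san', 'predicate': 'EVENT#read', 'patient': 'OBJECT#book'}
```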

20.
Natural language interface for an expert system (cited 2 times: 0 self-citations, 2 by others)
This paper provides a complete analysis of a natural language interface, from the screen manager to the parser/understander, even though the interface has some limitations. The main focus is on the design and development of a subsystem for understanding natural language input in an expert system. Fast response time and user-friendliness were the most important considerations in the design. The screen manager provides an easy editing capability for the user. The spelling correction system can detect most spelling errors and correct them automatically, quickly and effectively. The Lexical Functional Grammar (LFG) parser and the understander are designed to handle most types of simple sentences, fragments, and ellipses. As a result, this system, as applied to a man-machine dialogue, gives the expert system improved handling of ill-formed natural language input.
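The abstract does not describe the spelling corrector's algorithm, so the sketch below shows only one minimal way such automatic correction can work: pick the in-lexicon word with the smallest edit (Levenshtein) distance. The lexicon is a toy stand-in, not the system's vocabulary.

```python
# Minimal sketch of automatic spelling correction against a lexicon:
# pick the known word with the smallest edit (Levenshtein) distance.
# The lexicon is a toy stand-in; it does not reproduce the paper's
# own corrector, which the abstract does not describe in detail.

def edit_distance(a, b):
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

LEXICON = {"diagnose", "symptom", "patient", "fever", "pressure"}

def correct(word, max_dist=2):
    if word in LEXICON:
        return word
    best = min(LEXICON, key=lambda w: edit_distance(word, w))
    return best if edit_distance(word, best) <= max_dist else word

print([correct(w) for w in ["diagnse", "symtom", "fevre", "patient"]])
# ['diagnose', 'symptom', 'fever', 'patient']
```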
