Similar literature (20 results)
1.
DeConverter is core software in a Universal Networking Language (UNL) system. A UNL system has EnConverter and DeConverter as its two major components. EnConverter is used to convert a natural language sentence into an equivalent UNL expression, and DeConverter is used to generate a natural language sentence from an input UNL expression. This paper presents design and development of a Punjabi DeConverter. It describes five phases of the proposed Punjabi DeConverter, i.e., UNL parser, lexeme selection, morphology generation, function word insertion, and syntactic linearization. This paper also illustrates all these phases of the Punjabi DeConverter with a special focus on syntactic linearization issues of the Punjabi DeConverter. Syntactic linearization is the process of defining arrangements of words in generated output. The algorithms and pseudocodes for implementation of syntactic linearization of a simple UNL graph, a UNL graph with scope nodes and a node having un-traversed parents or multiple parents in a UNL graph have been discussed in this paper. Special cases of syntactic linearization with respect to Punjabi language for UNL relations like 'and', 'or', 'fmt', 'cnt', and 'seq' have also been presented in this paper. This paper also provides implementation results of the proposed Punjabi DeConverter. The DeConverter has been tested on 1000 UNL expressions by considering a Spanish UNL language server and agricultural domain threads developed by Indian Institute of Technology (IIT), Bombay, India, as gold standards. The proposed system generates 89.0% grammatically correct sentences, 92.0% faithful sentences to the original sentences, and has a fluency score of 3.61 and an adequacy score of 3.70 on a 4-point scale. The system is also able to achieve a bilingual evaluation understudy (BLEU) score of 0.72.

2.
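Li Haijun, Zhang Lei. Microcomputer Development (《微机发展》), 2006, 16(1): 41-43
Syntactic parsing is a key step in natural language processing. Current parsers generally rely on the part-of-speech tags of the words in a sentence. For Chinese, however, part of speech alone is rarely enough to determine the correct syntactic relations between words. Parsing at the level of word classes suffers from structural ambiguities that are hard to eliminate, so methods that use semantic knowledge to resolve structural ambiguity become all the more important. This paper proposes a new parsing method based on the ATN (Augmented Transition Network): the ATN is modified accordingly, and its components are annotated with various features using the semantic knowledge resources of HowNet (《知网》).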

3.
An intelligent machine can be thought of as a human friendly machine system that identifies or understands the problems of generating tasks, developing plans, compiling and executing the tasks automatically. High performance dependable intelligent systems must understand and translate natural languages. The translation of natural languages for intelligent systems has been one of the most challenging problems in intelligent systems from the very beginning. It is the responsibility of a translation system to assign the responsibility of task generation ability of the machine to automate a program generation.

In this paper, the problem of advanced machine translation capabilities is approached by examining the Sinhala natural language. Sinhalese has not been analyzed using computational linguistics. Our earlier system on Sinhalese morphology is the first attempt at such a study. This paper extends it to syntactic and semantic analysis. We formalize grammar rules for unit, phrase, clause and sentence, and develop a semantically characteristic Sinhalese dictionary and a conceptual dictionary based on English, Japanese, and Sinhalese. Syntactic and semantic analyses are implemented on the computer and sound experimental results are obtained.

4.
In this paper we describe a language for reasoning about actions that can be used for modelling and for programming rational agents. We propose a modal approach for reasoning about dynamic domains in a logic programming setting. Agent behavior is specified by means of complex actions which are defined using modal inclusion axioms. The language is able to handle knowledge producing actions as well as actions which remove information. The problem of reasoning about complex actions with incomplete knowledge is tackled and the temporal projection and planning problems are addressed; more specifically, a goal directed proof procedure is defined, which allows agents to reason about complex actions and to generate conditional plans. We give a non-monotonic solution for the frame problem by making use of persistency assumptions in the context of an abductive characterization. The language has been used for implementing an adaptive web-based system.

5.
The goal of this paper is to deal with a problem hardly ever addressed in natural language generation: conceptual input. In order to be able to express something, one needs to have something to express to begin with: ideas, concepts and thoughts. The question is how to access thoughts and build their representation in the form of messages. What are the building blocks? How to organize and index them in order to allow for quick and intuitive access later on? It is generally believed that ideas precede expressions. Indeed, meanings, imprecise as they may be, tend to precede their expression in language. Yet, message creation is hardly ever a one-step process. Conceptual inputs are generally abstract and underspecified, which implies that they need to get refined later on, possibly even during the expression phase. In this paper we investigate interactive sentence generation, the focus being on conceptual input, a major component of language generation. Our views will be illustrated via three systems: ILLICO, a system for analyzing/generating sentences and guiding their composition; SPB, a multilingual phrase-book with on-the-fly generated audio output; and Drill Tutor (DT), an exercise generator. While ILLICO is a message-understanding system with a message-completion functionality, SPB and DT are message-specification systems. The user works quite early with fairly complete structures (sentences or patterns), making basically only local changes: replacing words in the case of SPB, and choosing them to instantiate pattern variables in the case of DT.

6.
We present the Intelligent Thai text – Thai sign translation for language learning (IT3STL). IT3STL is able to translate Thai text into Thai sign language simply and conveniently anytime, anywhere. Thai sign language is the language of the deaf in Thailand. In the translation process, the differences between Thai text and Thai sign language in both grammar and vocabulary are taken into account at each processing step to ensure the accuracy of translation. IT3STL was designed not only to be an automatic interpreter but also to be a language tutor assistant. It provides the meaning of each word and describes the structure formation and word order of the translated sentence. With IT3STL, the deaf and hearing-impaired are able to enhance their communication ability and to improve their knowledge and learning skills. Moreover, IT3STL has increased motivation and opportunity for them to access multimedia and e-learning.

7.
Artificial Intelligence, 1987, 33(3): 325-378
The development of natural language interfaces to artificial intelligence systems is dependent on the representation of knowledge. A major impediment to building such systems has been the difficulty in adding sufficient linguistic and conceptual knowledge to extend and adapt their capabilities. This difficulty has been apparent in systems that perform the task of language production, i.e., the generation of natural language output to satisfy the communicative requirements of a system. A uniform, parsimonious representation of knowledge about language can increase extensibility and efficiency as well as simplify the generation process. The Ace framework applies knowledge representation fundamentals to the task of encoding knowledge about language. Within this framework, linguistic and conceptual knowledge are organized into hierarchies, and structured associations are used to join knowledge structures that are metaphorically related or otherwise used in linguistic expression. These structured associations permit specialized linguistic knowledge to derive partially from more abstract knowledge, facilitating the use of abstractions in generating specialized phrases. A simple generator called KING (Knowledge INtensive Generator) uses an Ace knowledge base to produce utterances from their conceptual representation.

8.
Many automatic speech recognition (ASR) systems rely solely on pronunciation dictionaries and language models to take into account information about language. Implicitly, morphology and syntax are to a certain extent embedded in the language models, but the richness of such linguistic knowledge is not exploited. This paper studies the use of morpho-syntactic (MS) information in a post-processing stage of an ASR system, by reordering N-best lists. Each sentence hypothesis is first part-of-speech tagged. A morpho-syntactic score is computed over the tag sequence with a long-span language model and combined with the acoustic and word-level language model scores. This new sentence-level score is finally used to rescore N-best lists by reranking or consensus. Experiments on a French broadcast news task show that morpho-syntactic knowledge improves the word error rate and confidence measures. In particular, it was observed that the errors corrected are not only agreement errors and errors on short grammatical words but also other errors on lexical words where the hypothesized lemma was modified.
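The reranking step described in this abstract can be illustrated with a minimal Python sketch. It assumes a plain weighted sum of log-scores; the abstract does not give the actual combination formula or weights, so the `Hypothesis` fields, the weights `w_lm` and `w_ms`, and the example scores are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    text: str
    acoustic: float  # log-score from the acoustic model
    word_lm: float   # log-score from the word-level language model
    morpho: float    # log-score of the POS-tag sequence (hypothetical tag-level LM)


def combined_score(h, w_lm=1.0, w_ms=1.0):
    # Assumed combination: weighted sum of the three log-scores.
    return h.acoustic + w_lm * h.word_lm + w_ms * h.morpho


def rerank(nbest, w_lm=1.0, w_ms=1.0):
    # Highest combined score first.
    return sorted(nbest, key=lambda h: combined_score(h, w_lm, w_ms), reverse=True)


# Invented scores: the first hypothesis contains an agreement error,
# so its tag sequence receives a poor morpho-syntactic score.
nbest = [
    Hypothesis("les enfant jouent", acoustic=-10.0, word_lm=-5.0, morpho=-9.0),
    Hypothesis("les enfants jouent", acoustic=-10.5, word_lm=-5.2, morpho=-3.0),
]
best_with_ms = rerank(nbest)[0].text
best_without_ms = rerank(nbest, w_ms=0.0)[0].text
```

With the morpho-syntactic weight set to zero the ungrammatical hypothesis wins on acoustic and word-level scores alone; adding the tag-sequence score moves the grammatical one to the top.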

9.
An implemented model of language processing has been developed that views the propositional components of a sentence as neural units. The propositional sentence units are linked through symbolic, reified representations of subordinate sentence parts. Large numbers of these highly standardized propositional units are encoded in a manner that interconnects propositional data through the declarative knowledge base structures, thus minimizing the importance of the procedural component and the need for backward chaining and inference generation. The introduction of new sentence information triggers a connectionist-like flurry of activity in which constantly changing propositional weights and reification strengths effect changes in the belief states encoded within the knowledge base. ©1999 John Wiley & Sons, Inc.

10.
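Gong Longchao, Guo Junjun, Yu Zhengtao. Journal of Computer Applications (《计算机应用》), 2022, 42(11): 3386-3394
Transformer, currently one of the best-performing machine translation models, is based on the standard end-to-end architecture, relies only on parallel sentence pairs, and assumes that the model can automatically learn the knowledge in the corpus. However, this modeling approach lacks explicit guidance and cannot effectively mine deep linguistic knowledge. In particular, in low-resource settings where corpus size and quality are limited, sentence decoding lacks prior constraints, which degrades translation quality. To alleviate these problems, a neural machine translation method based on source-language syntax enhanced decoding (SSED) is proposed, which explicitly introduces source-sentence syntactic information to guide decoding. The method first uses the syntactic information of the source sentence to construct a syntax-aware mask mechanism that guides the encoder self-attention to produce an additional syntax-related representation. It then uses this syntax-related representation as a supplement to the original sentence representation and integrates it into decoding through the attention mechanism, so that both jointly guide target-language generation and give the model a prior syntactic enhancement. Experimental results on the test sets of several IWSLT and WMT standard machine translation evaluation tasks show that, compared with the Transformer baseline, the proposed method improves BLEU by 0.84 to 3.41, reaching the state of the art among syntax-related studies. Fusing syntactic information with the self-attention mechanism is effective: source-language syntax can guide the decoding process of a neural machine translation system and significantly improve translation quality.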
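The idea of a syntax-aware mask over self-attention can be sketched in plain Python as follows. The abstract does not specify how the mask is built, so this sketch makes an explicit assumption: each token may attend only to itself, its dependency head, and its direct dependents. The function names and the toy parse are hypothetical and are not taken from the paper.

```python
import math


def syntax_mask(heads):
    # heads[i] is the index of token i's dependency head, or -1 for the root.
    # Assumed rule: a token may attend to itself, its head, and its children.
    n = len(heads)
    mask = [[i == j for j in range(n)] for i in range(n)]
    for child, head in enumerate(heads):
        if head >= 0:
            mask[child][head] = True
            mask[head][child] = True
    return mask


def masked_softmax(scores, mask_row):
    # Softmax over the allowed positions only; masked positions get weight 0.
    kept = [s for s, m in zip(scores, mask_row) if m]
    top = max(kept)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) if m else 0.0 for s, m in zip(scores, mask_row)]
    total = sum(exps)
    return [e / total for e in exps]


# Toy parse of "she eats apples": "eats" is the root, the other two depend on it.
heads = [1, -1, 1]
mask = syntax_mask(heads)
weights = masked_softmax([0.0, 0.0, 0.0], mask[0])  # attention weights for "she"
```

In this toy parse, "she" spreads its attention over itself and "eats" and ignores "apples", which is the kind of syntax-restricted representation the abstract describes as a supplement to ordinary self-attention.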

11.
The use of a statistical language model to improve the performance of an algorithm for recognizing digital images of handwritten or machine-printed text is discussed. A word recognition algorithm first determines a set of words (called a neighborhood) from a lexicon that are visually similar to each input word image. Syntactic classifications for the words and the transition probabilities between those classifications are input to the Viterbi algorithm. The Viterbi algorithm determines the sequence of syntactic classes (the states of an underlying Markov process) for each sentence that has the maximum a posteriori probability, given the observed neighborhoods. The performance of the word recognition algorithm is improved by removing words from neighborhoods with classes that are not included on the estimated state sequence. An experimental application is demonstrated with a neighborhood generation algorithm that produces a number of guesses about the identity of each word in a running text. The use of zero, first and second order transition probabilities and different levels of noise in estimating the neighborhood are explored.
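A minimal Python sketch of the pruning idea follows. It is restricted to first-order transition probabilities and makes a simplifying assumption the abstract does not state: every class that some candidate word can take is treated as equally likely given the neighborhood. The lexicon, the probabilities, and the function names are invented for illustration.

```python
import math

FLOOR = 1e-12  # stands in for unseen transitions so that log() is always defined


def viterbi_classes(neighborhoods, word_classes, start_p, trans_p):
    # Classes that at least one candidate word can take, per position.
    allowed = [sorted({c for w in nb for c in word_classes[w]}) for nb in neighborhoods]
    # best[c] = (log-probability, path) of the best class sequence ending in c.
    best = {c: (math.log(start_p.get(c, FLOOR)), [c]) for c in allowed[0]}
    for classes in allowed[1:]:
        best = {
            c: max(
                (lp + math.log(trans_p.get((prev, c), FLOOR)), path + [c])
                for prev, (lp, path) in best.items()
            )
            for c in classes
        }
    return max(best.values())[1]


def prune(neighborhoods, word_classes, path):
    # Keep only the words whose class lies on the estimated state sequence.
    return [{w for w in nb if cls in word_classes[w]} for nb, cls in zip(neighborhoods, path)]


neighborhoods = [{"the", "she"}, {"dog", "dug"}, {"barks", "banks"}]
word_classes = {
    "the": {"DET"}, "she": {"PRON"}, "dog": {"NOUN"},
    "dug": {"VERB"}, "barks": {"VERB"}, "banks": {"NOUN"},
}
start_p = {"DET": 0.6, "PRON": 0.4}
trans_p = {
    ("DET", "NOUN"): 0.9, ("DET", "VERB"): 0.1,
    ("PRON", "NOUN"): 0.1, ("PRON", "VERB"): 0.9,
    ("NOUN", "NOUN"): 0.2, ("NOUN", "VERB"): 0.8,
    ("VERB", "NOUN"): 0.6, ("VERB", "VERB"): 0.1,
}
path = viterbi_classes(neighborhoods, word_classes, start_p, trans_p)
pruned = prune(neighborhoods, word_classes, path)
```

Here the best class sequence is DET, NOUN, VERB (probability 0.6 × 0.9 × 0.8 = 0.432), so each neighborhood shrinks to the single word compatible with that sequence.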

12.
Although the concept of knowledge plays a central role in artificial intelligence, the theoretical foundations of knowledge representation currently rest on a very limited conception of what it means for a machine to know a proposition. In the current view, the machine is regarded as knowing a fact if its state either explicitly encodes the fact as a sentence of an interpreted formal language or if such a sentence can be derived from other encoded sentences according to the rules of an appropriate logical system. We contrast this conception, the interpreted-symbolic-structure approach, with another, the situated-automata approach, which seeks to analyze knowledge in terms of relations between the state of a machine and the state of its environment over time using logic as a metalanguage in which the analysis is carried out.

13.
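Research on a dynamic knowledge extraction method based on sentence clustering recognition   Cited by: 6 (self-citations: 0, citations by others: 6)
Su Mu, Xiao Renbin. Chinese Journal of Computers (《计算机学报》), 2001, 24(5): 487-495
Motivated by the clustering phenomenon in natural language and the need to update knowledge systems dynamically, this paper proposes a dynamic knowledge extraction method based on sentence clustering recognition. It first presents a research framework for dynamic knowledge extraction, which describes the conversion from natural-language documents to an object-oriented knowledge system. It then studies issues in sentence vectorization, giving several basic definitions and a decision theorem, and discusses the post-processing of sentence-element attribute vectors. A neural-network-based sentence clustering recognition method is proposed, with the concept of prior confidence (前信度) used as a measure of the credibility of sentence recognition results. An ART2 neural network simulation program was written in Matlab, and the network's sentence recognition performance is reported and analyzed. Based on the ART2 recognition results, each clustered sentence must be converted into a knowledge form; to this end, a breadth-first method for intermediate code generation is proposed, and posterior confidence (后信度) is defined as a final evaluation index for the credibility of sentence recognition and semantic model construction. The implementation steps of the method are then described in detail for conjunctive-rule sentence patterns. Finally, a new structural modeling method is used to generate structured derivation relations, completing the knowledge extraction process from natural-language documents to an object-oriented knowledge system. An application example set in mechanical CAD runs through the paper; its concrete implementation is demonstrated and confirms the effectiveness of the proposed method.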

14.
Aggregation in Natural Language Generation   Cited by: 2 (self-citations: 0, citations by others: 2)
The content of real‐world databases, knowledge bases, database models, and formal specifications is often highly redundant and needs to be aggregated before these representations can be successfully paraphrased into natural language. To generate natural language from these representations, a number of processes must be carried out, one of which is sentence planning where the task of aggregation is carried out. Aggregation, which has been called ellipsis or coordination in Linguistics, is the process that removes redundancies during generation of a natural language discourse, without losing any information.
The article describes a set of corpus studies that focus on aggregation, provides a set of aggregation rules, and finally, shows how these rules are implemented in a couple of prototype systems. We develop further the concept of aggregation and discuss it in connection with the growing literature on the subject. This work offers a new tool for the sentence planning phase of natural language generation systems.

15.
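Based on fuzzy theory, a fuzzy knowledge table built on general fuzzy types is established, a fuzzy condition format for the select statement of standard SQL is defined, and an interpretation function for fuzzy select statements is constructed. Users write fuzzy statements just as they would write ordinary crisp select statements, and the system recognizes and executes them automatically through the interpretation function, thereby implementing a fuzzy extension of the SQL select statement on top of a relational database.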
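One way such an interpretation function could work is sketched below in Python with the standard `sqlite3` module. Everything specific here is an assumption, since the abstract gives neither the fuzzy condition syntax nor the table layout: the fuzzy knowledge table is modeled as a dictionary of trapezoidal membership functions, and a fuzzy condition is rewritten into a crisp `BETWEEN` clause through an alpha-cut.

```python
import sqlite3

# Hypothetical fuzzy knowledge table: (column, term) -> trapezoid (a, b, c, d).
# Membership rises from a to b, equals 1 on [b, c], and falls from c to d.
FUZZY_TERMS = {("age", "young"): (0, 0, 25, 35)}


def alpha_cut(trapezoid, alpha):
    # Interval of values whose membership degree is at least alpha.
    a, b, c, d = trapezoid
    return a + alpha * (b - a), d - alpha * (d - c)


def interpret(column, term, alpha=0.5):
    # Rewrite the fuzzy condition "<column> IS <term>" as a crisp SQL condition.
    # The column name comes from a key of FUZZY_TERMS, never from raw user input.
    low, high = alpha_cut(FUZZY_TERMS[(column, term)], alpha)
    return f"{column} BETWEEN ? AND ?", (low, high)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?)",
    [("Ann", 22), ("Bo", 28), ("Cy", 33), ("Di", 40)],
)

where, params = interpret("age", "young")
rows = conn.execute(
    f"SELECT name FROM people WHERE {where} ORDER BY name", params
).fetchall()
```

With alpha = 0.5 the term "young" becomes the interval [0, 30], so the query returns Ann and Bo. Raising alpha narrows the interval and makes the fuzzy condition stricter.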

16.
This report describes the current state of our central research thrust in the area of natural language generation. We have already reported on our text-level theory of lexical selection in natural language generation ([59, 60]), on a unification-based syntactic processor for syntactic generation ([73]) and designed a relatively flexible blackboard-oriented architecture for integrating these and other types of processing activities in generation ([60]). We have implemented these ideas in our prototype generator, Diogenes (a DIstributed, Opportunistic GENEration System), and tested our lexical selection and syntactic generation modules in a comprehensive natural language processing project, the KBMT-89 machine translation system ([15]). At this stage we are developing a more comprehensive Diogenes system, concentrating on both the theoretical and the system-building aspects of a) formulating a more comprehensive theory of distributed natural language generation; b) extending current theories of text organization as they pertain to the task of planning natural language texts; c) improving and extending the knowledge representation and the actual body of background knowledge (both domain and discourse/pragmatic) required for comprehensive text planning; d) designing and implementing algorithms for dynamic realization of text structure and integrating them into the blackboard style of communication and control; e) designing and implementing control algorithms for distributed text planning and realization. In this document we describe our ideas concerning opportunistic control for a natural language generation planner and present a research and development plan for the Diogenes project.

Many people have contributed to the design and development of the Diogenes generation system over the last four years, especially Eric Nyberg, Rita McCardell, Donna Gates, Christine Defrise, John Leavitt, Scott Huffman, Ed Kenschaft and Philip Werner. Eric Nyberg and Masaru Tomita have created genkit, which is used as the syntactic component of Diogenes. A short version of this article appeared in Proceedings of IJCAI-89, co-authored with Victor Lesser and Eric Nyberg. To all the above many thanks. The remaining errors are the responsibility of this author.

17.
Abductive inferences are commonplace during natural language processing. Having identified some limitations of an existing parsimonious covering model of abductive diagnostic inference, we developed an extended, dual-route version to address issues in word sense disambiguation and logical form generation. The details of representing knowledge in this framework and the syntactic route of covering are described in a companion article [V. Dasigi, Int. J. Intell. Syst., 9, 571-608 (1994)]. Here, we describe the semantic covering process in detail. A dual-route algorithm that integrates syntactic and semantic covering is given. Taking advantage of the "transitivity" of irredundant syntactic covering, plausible semantic covers are searched for, based on some heuristics, in the space of irredundant syntactic covers. Syntactic covering identifies all possible candidates for semantic covering, which in turn, helps focus syntactic covering. Attributing both syntactic and semantic facets to "open-class" linguistic concepts makes this integration possible. An experimental prototype has been developed to provide a proof-of-concept for these ideas in the context of expert system interfaces. The prototype has at least some ability to handle ungrammatical sentences, to perform some nonmonotonic inferences, etc. We believe this work provides a starting point for a nondeductive inference method for logical form generation, exploiting the associative linguistic knowledge. © 1994 John Wiley & Sons, Inc.

18.
Knowledge, 2006, 19(7): 459-470
This paper presents a set of experiments we carried out with Divago, a system that is an attempt to implement our ideas towards a computational model of creativity. It is expected to be able to generate novel concepts out of previous knowledge. Here we show its behaviour with a large dataset constructed independently by other researchers consisting of over 170 nouns (for a project named C3). Each noun is represented with a syntax that is equivalent to the one adopted for Divago. We apply a two-step experimentation procedure, which starts by "training" the system with "preferred outcomes" and then allows it to do free generation, constrained by the pragmatic goal of a given query. We evaluate the results and briefly discuss them against well-defined criteria of novelty and usefulness. We also present a comparison with a similar experiment done with C3.

19.
An effective algorithm for extracting two useful features from text documents for analyzing word collocation habits, “Frequency Rank Ratio” (FRR) and “Intimacy”, is proposed. FRR is derived from a ranking index of a word according to its word frequency. Intimacy, computed by a compact language model called Influence Language Model (ILM), measures how close a word is to others within the same sentence. Using the proposed features, a visualization framework is developed for word collocation analysis. To evaluate our proposed framework, two corpora are designed and collected from the real-life data covering diverse topics and genres. Extensive simulations are conducted to illustrate the feasibility and effectiveness of our visualization framework. Our results demonstrate that the proposed features and algorithm are able to conduct reliable text analysis efficiently.

20.
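A Chinese question-answering algorithm based on the HNC natural language understanding framework is proposed and implemented as a system. Experiments show that, with a medium-scale common-sense knowledge base, the system performs well and achieves high accuracy.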
