Similar literature
 20 similar articles found (search time: 15 ms)
1.
This paper documents the development of an empirically-based system implemented in Prolog that automatically resolves several kinds of anaphora in Spanish texts. These are pronominal references, surface-count anaphora, one-anaphora and elliptical zero-subject constructions (i.e., sentences that omit their pronominal subject). The resolution is based on representations resulting from either partial or full parsing. The system developed can also work on the output of a POS tagger or with different dictionaries, without changing the grammar. This grammar represents the syntactic information of each language by means of the Slot Unification Grammar formalism. The different kinds of information used for anaphora resolution in full and partial parsing are shown, as well as evaluation results. The system has been adapted to English texts, obtaining encouraging results that prove that it can be applied with only a very few refinements to other languages as well as Spanish and English. In addition, the differences between English and Spanish anaphora are noted.

2.
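Research on Personal Pronoun Resolution in Chinese   Total citations: 15, self-citations: 0, citations by others: 15
Personal pronoun resolution is an important problem in natural language processing. It determines the relationship between a personal pronoun and its antecedent, and thereby establishes which entity the pronoun refers to. Many existing applications, such as text summarization and information extraction, extract sentences directly from the text, so their output may contain personal pronouns with no antecedent, which makes it very hard to understand. Personal pronoun resolution can clearly address such problems. Drawing on basic sentence-category knowledge, this paper gives basic rules for resolving personal pronouns based on the semantic role of the pronoun within its semantic chunk and the possible semantic roles of the corresponding antecedent. The authors also give preference rules from a syntactic perspective, combined with the local focus method.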

3.
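During information extraction, anaphora that cannot be resolved tend to leave the extracted information incomplete. Such an anaphoric relation can be judged from unique discriminating information generated from the anaphor, the antecedent candidate, the surrounding information, and the original text in the current context. To this end, a multi-layer attention model is built that performs attention-based probability computations on this information at different levels and uses the final result to decide whether the anaphoric relation holds. After the anaphor and the antecedent candidate are vectorized, four probability computations over two attention layers make each training result unique before the decision. Experiments on the OntoNotes 5.0 dataset show that the model achieves an F-score of 70.1% when both overt anaphora and zero anaphora are present and 60.7% when zero anaphora are present, higher than the model proposed by Yin Qingyu et al.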

4.
Anaphora is a discourse-level linguistic phenomenon. There is consensus that anaphora resolution should rely on prior sentences within the context of the discourse. We propose to cast anaphora resolution as a semantic inference process in which a combination of multiple strategies, each exploiting different aspects of linguistic knowledge, is employed to provide a coherent resolution of anaphora. A framework which encompasses several salient linguistic parameters such as grammatical role, proximity, repetition, sentence recency and semantic cues is demonstrated. This work also shows how an anaphora-resolution algorithm can be embedded within a framework which captures all the above salient parameters, as well as remedies some of the inadequacies found in any monolithic resolution system. A language-neutral semantic representation characterized by semantic cues is presented in order to capture the distilled information after resolution. The effectiveness of the language-neutral representation, both for machine translation and anaphora resolution, is demonstrated through a set of simulations and evaluations.

5.
In this paper, we look at the current scenario in multilingual documentation generation and the types of tools currently being used in support of the translation task, and discuss their shortcomings. We examine emergent trends in the document industry, observing a reorganisation of the workflow which mirrors a shift of attention from translating to authoring and from the ergonomics of post-editing the target text to the ergonomics of producing the source text. We argue that these trends invite the design and development of new tools for the task of producing multilingual texts, and that multilingual generation provides the appropriate technology, shifting attention to an even earlier stage in the authoring process, that of specifying the semantics of the text to be produced. We describe a prototype system which exploits this technology to meet the expressed needs of authors and translators by supporting them in the drafting of multilingual instructions. We suggest that, in the future, a single platform to support multilingual documentation should integrate translation-oriented tools and generation-based tools to be employed as appropriate by different types of users (translators and authors) in different circumstances.

6.
7.
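A coreference resolution method for information extraction systems is proposed, targeting coreference involving pronouns and definite noun phrases in Chinese discourse. The method has two components: rule-based resolution and statistical-factor resolution. For the rule-based component, the general form of the rules, the ways of acquiring them, and how they are applied are given, and several rule instances illustrate their effectiveness. The rules in the rule base are used to analyze and filter the candidate antecedents; if exactly one antecedent is filtered out, it is taken as the resolved antecedent. Otherwise, the statistical-factor stage is entered, in which the factors in the resolution factor base are evaluated and summed for each candidate, and the candidate with the largest sum is selected as the antecedent. The algorithm is readily portable across domains and easy to extend. The method has been implemented and tested, and the results show that it is effective.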
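
The two-stage procedure in the abstract above (filter candidates with rules, then fall back to summed factor scores) can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the `Mention` fields, the two agreement rules, the two factors and their weights are all hypothetical, and scoring the full candidate list when the rules eliminate every candidate is an assumption that the abstract does not specify.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Mention:
    """Hypothetical mention record; the field names are illustrative only."""
    text: str
    gender: str          # e.g. "m", "f", "n"
    number: str          # e.g. "sg", "pl"
    sentence_index: int  # position of the containing sentence
    is_subject: bool


Rule = Callable[[Mention, Mention], bool]     # (anaphor, candidate) -> keep?
Factor = Callable[[Mention, Mention], float]  # (anaphor, candidate) -> score


def agrees_in_number(anaphor: Mention, cand: Mention) -> bool:
    return anaphor.number == cand.number


def agrees_in_gender(anaphor: Mention, cand: Mention) -> bool:
    return anaphor.gender == cand.gender


def recency(anaphor: Mention, cand: Mention) -> float:
    # Closer candidates score higher (assumed weighting).
    return 1.0 / (1 + anaphor.sentence_index - cand.sentence_index)


def subjecthood(anaphor: Mention, cand: Mention) -> float:
    # Subjects get a fixed bonus (assumed weighting).
    return 0.5 if cand.is_subject else 0.0


RULES: List[Rule] = [agrees_in_number, agrees_in_gender]
FACTORS: List[Factor] = [recency, subjecthood]


def resolve(anaphor: Mention, candidates: List[Mention],
            rules: List[Rule], factors: List[Factor]) -> Optional[Mention]:
    # Stage 1: keep only candidates that pass every rule.
    survivors = [c for c in candidates if all(r(anaphor, c) for r in rules)]
    if len(survivors) == 1:
        return survivors[0]
    # Stage 2: sum the factor scores and pick the maximum.
    # Assumption: if the rules removed everything, score all candidates.
    pool = survivors or candidates
    if not pool:
        return None
    return max(pool, key=lambda c: sum(f(anaphor, c) for f in factors))
```

Keeping rules as boolean filters and factors as additive scores means new rules or factors can be appended to the lists without touching `resolve`, which is one plausible reading of the extensibility the abstract claims.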

8.
The relationship between Lexical-Functional Grammar (LFG) functional structures (f-structures) for sentences and their semantic interpretations can be formalized in linear logic in a way that correctly explains the observed interactions between quantifier scope ambiguity, bound anaphora and intensionality. Our linear-logic formalization of the compositional properties of quantifying expressions in natural language obviates the need for special mechanisms, such as Cooper storage, in representing the scoping possibilities of quantifying expressions. Instead, the semantic contribution of a quantifier is recorded as a linear-logic formula whose use in a proof will establish the scope of the quantifier. Different proofs can lead to different scopes. In each complete proof, the properties of linear logic ensure that quantifiers are properly scoped. The interactions between quantified NPs and intensional verbs such as seek are also accounted for in this deductive setting. A single specification in linear logic of the argument requirements of intensional verbs is sufficient to derive the correct reading predictions for intensional-verb clauses both with nonquantified and with quantified direct objects. In particular, both de dicto and de re readings are derived for quantified objects. The effects of type-raising or quantifying-in rules in other frameworks just follow here as linear-logic theorems. While our approach resembles current categorial approaches in important ways (Moortgat, 1988, 1992a; Carpenter, 1993; Morrill, 1994), it differs from them in allowing the greater compositional flexibility of categorial semantics (van Benthem, 1991) while maintaining a precise connection to syntax. As a result, we are able to provide derivations for certain readings of sentences with intensional verbs and complex direct objects whose derivation in purely categorial accounts of the syntax-semantics interface appears to require otherwise unnecessary semantic decompositions of lexical entries.

9.
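Dong Xinghua, Xu Chun, Wang Lei, Zhou Xi. Computer Engineering and Applications (《计算机工程与应用》), 2012, 48(15): 144-148, 200
Describes an online, high-performance multilingual translation engine built on an external knowledge base and a phrase-based translation model, using multithreading and task distribution. Translation for three language pairs, Uyghur-Chinese, Kazakh-Chinese, and Kyrgyz-Chinese, has been implemented in preliminary form. The engine is easy to extend to other language pairs and can translate words, phrases, sentences, files, and web pages.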

10.
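A Survey of Research on Anaphora Resolution   Total citations: 1, self-citations: 0, citations by others: 1
Anaphora resolution is a key step in natural language processing and one of the core tasks of information extraction. This paper discusses some basic issues in anaphora resolution, focusing on coreference resolution research that uses machine learning, and surveys related work in terms of coreference resolution models, common algorithms, corpora, features, and evaluation metrics.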

11.
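Online Automatic Extraction of Multilingual Translation Vocabulary   Total citations: 1, self-citations: 0, citations by others: 1
More and more web pages are distributed on the Internet in multiple languages, and extracting multilingual translation vocabulary from them has significant research value. Based on the characteristics of web pages, a new method for online automatic extraction of multilingual translation vocabulary is proposed. The method obtains translation vocabulary by computing the similarity of hyperlink information in bilingual web pages: the higher the similarity, the more likely the corresponding terms are translations of each other. Extraction from three kinds of bilingual web pages (Chinese-English, German-English, and French-English) shows that the method has high accuracy and is an efficient, language-independent way to extract multilingual word pairs.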

12.
The model of prosody used in the Aculab TTS system is unusual in several respects. Firstly, it is based firmly on current metrical theories of prosody. Secondly, it is entirely knowledge-based: there are no stochastic components in the model. Thirdly, it makes use of a quasi-random element to avoid the predictability of conventional synthetic prosody. Fourthly, it is specifically designed for multilingual use: it currently handles several Germanic and Romance languages.

13.
This paper examines the technologies that enable the representation of Hebrew on websites. Hebrew is written from right to left and in non‐Latin characters, issues shared by a number of languages which seem to be converging on a shared solution, Unicode. Regarding the case of Hebrew, I show how competing solutions have given way to one dominant technology. I link processes in the Israeli context with broader questions about the ‘multilingual Internet,’ asking whether the commonly accepted solution for representing non‐Latin texts on computer screens is an instance of cultural imperialism and convergence around a western artifact. It is argued that while minority languages are given an online voice by Unicode, the context is still one of western power.

14.
This paper describes our work on developing a language-independent technique for the discovery of implicit knowledge from multilingual information sources. Text mining has been gaining popularity in the knowledge discovery field, particularly with the increasing availability of digital documents in various languages from all around the world. However, most current text mining tools focus only on processing monolingual documents (particularly English documents): little attention has been paid to applying the techniques to documents in Asian languages, or to extending the mining algorithms to support multilingual information sources. In this work, we attempt to develop a language-neutral method to tackle the linguistic difficulties in the text mining process. Using a variation of automatic clustering techniques that applies a neural net approach, namely Self-Organizing Maps (SOM), we have conducted several experiments to uncover associated documents based on a Chinese corpus, Chinese-English bilingual parallel corpora, and a hybrid Chinese-English corpus. The experiments show some interesting results and a couple of potential paths for future work in the field of multilingual information discovery. In addition, this work is expected to act as a starting point for exploring the impact of linguistic issues on the machine-learning approach to mining sensible linguistic elements from multilingual text collections.

15.
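Basic Methods and Implementation Techniques of Anaphora Resolution   Total citations: 29, self-citations: 11, citations by others: 18
Anaphora is a common linguistic phenomenon in natural language and occurs frequently in discourse and dialogue. As discourse-processing applications become increasingly widespread, anaphora resolution has taken on unprecedented importance and has become an active research problem in natural language processing. This paper explains the basic concepts of anaphora and anaphora resolution, analyzes typical anaphoric phenomena and the basic linguistic knowledge that resolution requires, and introduces several representative computational models together with a number of implementation techniques adopted over the past ten years.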

16.
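Ellipsis is very common in dialogue and leaves sentence constituents missing. Question answering systems often fail to understand such elliptical utterances and therefore return wrong answers, so ellipsis recovery is essential in question answering. Ellipsis recovery is usually divided into two steps, zero pronoun category recovery and zero pronoun resolution. Existing work mainly runs the two in sequence, which causes errors to accumulate. To overcome this problem, a joint model of zero pronoun category recovery and zero pronoun resolution is proposed, which merges the two steps of ellipsis recovery in order to improve the recovery results. Experimental results show that, compared with existing methods, introducing the joint model significantly improves ellipsis recovery performance.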

17.
Anaphora resolution in machine translation involves two aspects: (1) the identification of the antecedent, i.e., the determination of co-reference relations between anaphor and antecedent; and (2) the translation of the anaphor, i.e., the selection of the appropriate target-language equivalent. The identification of the antecedent is essentially a monolingual, language-pair independent problem which is usually solved during analysis. The selection of the target-language equivalent, on the other hand, can be regarded as a language-pair dependent task which has to be tackled during transfer and generation. In this paper, the problems of anaphora translation are discussed for the language pair Russian–German. Although in most cases source-language anaphoric pronouns correspond to target-language anaphoric pronouns, in some cases this straightforward equation does not hold. Two cases of such translation discrepancies are treated here: zero anaphora and pronominal PPs. The differences in the distribution of zero anaphora and pronominal PPs in Russian and German are described, and solutions to these translation problems based on the Russian–German MT system T1 are presented.

18.
Under a statistical learning framework, the paper focuses on how to use traditional linguistic findings on anaphora resolution as a guide for mining and organizing contextual features for Chinese co-reference resolution. The main achievements are as follows. (1) In order to simulate the "syntactic and semantic parallelism factor", we extract a "bags of word form and POS" feature and a "bag of semes" feature from the contexts of the entity mentions and incorporate them into the baseline feature set. (2) Because bags of word forms, POS tags and semes are too coarse to determine the syntactic and semantic parallelism between two entity mentions, we propose a method for contextual feature reconstruction based on semantic similarity computation, so that the reconstructed contextual features better approximate the anaphora resolution factor of "Syntactic and Semantic Parallelism Preferences". (3) We use an entity-mention-based contextual feature representation instead of an isolated word-based one, and also expand the size of the contextual windows, in order to approximately simulate "the selectional restriction factor" for anaphora resolution. The experiments show that the multi-level contextual features are useful for co-reference resolution, and that the statistical system incorporating these features performs well on the standard ACE datasets.

19.
Critical path tracing, a fault simulation method for gate-level combinational circuits, is extended to parallel critical path tracing for functional block-level combinational circuits. If the word length of the host computer is m, then parallel critical path tracing will be approximately m times faster than the original method.
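
The m-fold speedup comes from packing one test pattern per bit of a machine word, so that a single bitwise operation handles m patterns at once. The sketch below illustrates that idea for gate-level critical path tracing only, on a tiny fanout-free circuit; it does not reproduce the paper's functional-block extension. The netlist, the signal names and the restriction to two-input AND/OR plus NOT are all assumptions made for illustration.

```python
from typing import Dict

# Hypothetical fanout-free netlist: out = (a AND b) OR (NOT c).
# Each gate is (output, op, inputs), listed in topological order.
GATES = [
    ("n1", "AND", ("a", "b")),
    ("n2", "NOT", ("c",)),
    ("out", "OR", ("n1", "n2")),
]


def simulate(inputs: Dict[str, int], mask: int) -> Dict[str, int]:
    """Evaluate all patterns at once; bit i of each word is pattern i."""
    values = dict(inputs)
    for out, op, ins in GATES:
        if op == "AND":
            values[out] = values[ins[0]] & values[ins[1]]
        elif op == "OR":
            values[out] = values[ins[0]] | values[ins[1]]
        else:  # NOT
            values[out] = ~values[ins[0]] & mask
    return values


def trace_critical(inputs: Dict[str, int], width: int) -> Dict[str, int]:
    """Return, per signal, a word whose bit i is 1 if the signal is
    critical (a flip would change `out`) under pattern i."""
    mask = (1 << width) - 1
    values = simulate(inputs, mask)
    critical = {"out": mask}  # the output is critical in every pattern
    for out, op, ins in reversed(GATES):
        c = critical[out]
        if op == "NOT":
            critical[ins[0]] = c  # an inverter input is always sensitive
            continue
        x, y = ins
        if op == "AND":
            # An AND input is sensitive when the other input is 1.
            sens_x, sens_y = values[y], values[x]
        else:
            # An OR input is sensitive when the other input is 0.
            sens_x, sens_y = ~values[y] & mask, ~values[x] & mask
        critical[x] = c & sens_x
        critical[y] = c & sens_y
    return critical


if __name__ == "__main__":
    # All 8 input combinations packed into 8-bit words.
    patterns = {"a": 0b11110000, "b": 0b11001100, "c": 0b10101010}
    for name, word in trace_critical(patterns, 8).items():
        print(name, format(word, "08b"))
```

Because the example circuit has no fanout, the backward trace is exact; circuits with reconvergent fanout need additional stem analysis, which this sketch omits.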

20.
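Coreference resolution, an important problem in natural language processing, has long received attention from the research community. Over more than twenty years, various rule-based and statistical methods have been proposed, which have advanced research on the problem to some extent and produced a large body of results. This paper first introduces the basic concepts of coreference resolution and gives a formal description of the problem; it then summarizes the methods used in coreference resolution research in China and abroad in recent years; next, it analyzes and discusses the important issue of features in coreference resolution; finally, it reviews the international evaluations of coreference resolution and discusses possible future research directions.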
