Similar literature
 20 similar documents found (search time: 295 ms)
1.
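Proposes a new classification of Chinese temporal information and the concept of a temporal pattern. Temporal information in Chinese sentences is formalized on the basis of temporal patterns, and temporal pattern libraries are built for the lexical and grammatical information of Chinese sentences. A multi-strategy method for the temporal analysis of Chinese sentences and their translation into English is proposed, combining a temporal analysis algorithm for simple Chinese sentences, a temporal analysis algorithm for sentences marked by connectives, a temporal analysis algorithm for subjunctive-like sentences, and rules for recognizing discourse information. Experiments show that the method effectively handles the temporal analysis and English rendering of Chinese sentences in Chinese-English machine translation.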

2.
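In machine translation, rule-based and example-based methods each have their own strengths and weaknesses. Combining the strengths of both, this paper proposes a machine translation method based on weakened grammar rules. A grammar rule base is built from extensive analysis of the grammatical features of sentences and from the knowledge of language experts. After the grammar has been used to recognize a sentence, its components are marked, and semantic chunks are then used for matching and inference to translate each component. Finally, the translated components are assembled according to the syntax to form the target text. Experiments show that the method achieves good translation results and has considerable potential for further development.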

3.
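Analysis of the difficulties in generating Chinese from a Japanese case grammar representation   Total citations: 2, self-citations: 1, citations by others: 1
Discusses Chinese generation in transfer-rule-based Japanese-Chinese machine translation, focusing on the difficulties of generating Chinese from a Japanese case grammar representation. These mainly include word sense selection, case phrase processing, word order adjustment based on Chinese grammatical and semantic chains, and the merging and generation of sentences. Surface-level handling of sentence mood, tense, aspect and voice, punctuation, and connectives is also discussed.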

4.
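This paper builds a dictionary model for Classical Chinese, constructs a knowledge base from the rules of the sentence-based (句本位) syntax proposed by Li Jinxi (黎锦熙), and applies a word sense disambiguation algorithm to study rule-based machine translation of Classical Chinese. The experiments use the Analects (《论语》), syntactically annotated under the sentence-based grammar, as the test corpus and translate it sentence by sentence. A set of candidate translations is generated by retrieving candidate senses, building a sense selection model, and adjusting the syntactic order; a bigram language model then selects the best candidate as the final machine translation output. Finally, the translation results are analyzed and evaluated.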

5.
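New word discovery based on probabilistic statistical techniques and rule-based methods   Total citations: 9, self-citations: 1, citations by others: 8
Jia Ziyan (贾自艳), Shi Zhongzhi (史忠植). Computer Engineering (《计算机工程》), 2004, 30(20): 19-21, 83
Recognizing new words and phrases is a fundamental research problem in natural language processing, information retrieval, machine translation, and related fields. This paper analyzes existing phrase extraction techniques and, taking the characteristics of Chinese into account, proposes a concept extraction method that combines probabilistic statistical techniques with rule-based methods. The method comprises an efficient "bigram" statistical model, a statistical algorithm, a statistical word selection strategy, a rich body of rule knowledge, and a rule filtering algorithm. Experiments show that the method is suited to discovering new words and phrases automatically and efficiently from large-scale corpora.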
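
The abstract above names a bigram statistical model followed by rule filtering but gives no formulas. The sketch below is only an illustration of that general shape: it scores adjacent character pairs by pointwise mutual information (PMI) and then drops pairs containing a function character. PMI, the thresholds, the `stop_chars` list, and the name `candidate_bigrams` are all assumptions made for this example, not details taken from the paper.

```python
import math
from collections import Counter


def candidate_bigrams(corpus, min_count=2, min_pmi=1.0,
                      stop_chars=frozenset("的了是在和")):
    """Return (bigram, pmi) pairs sorted by descending PMI.

    corpus is an iterable of unsegmented sentences (strings).
    stop_chars is a hypothetical rule filter: any candidate that
    contains one of these function characters is discarded.
    """
    unigrams = Counter()
    bigrams = Counter()
    for sentence in corpus:
        unigrams.update(sentence)  # counts single characters
        bigrams.update(sentence[i:i + 2] for i in range(len(sentence) - 1))

    total_uni = sum(unigrams.values())
    total_bi = sum(bigrams.values())
    results = []
    for bigram, count in bigrams.items():
        if count < min_count:
            continue  # statistical filter: too rare to trust
        if any(ch in stop_chars for ch in bigram):
            continue  # rule filter (assumed, see docstring)
        p_xy = count / total_bi
        p_x = unigrams[bigram[0]] / total_uni
        p_y = unigrams[bigram[1]] / total_uni
        pmi = math.log2(p_xy / (p_x * p_y))
        if pmi >= min_pmi:
            results.append((bigram, pmi))
    return sorted(results, key=lambda item: item[1], reverse=True)
```

On a realistic corpus the thresholds would need tuning, and a pure bigram score cannot find candidates longer than two characters. The paper's statistical word selection strategy presumably addresses points like these, but the abstract does not say how.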

6.
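Taking pattern grammar as its theoretical basis, this study formalizes verb patterns using the link grammar formalism and restructures the link grammar verb dictionary, with the aim of building a better system for checking verb form errors in the written English of Chinese students. Test results show that the restructured link grammar dictionary improves both error detection and parsing: compared with the original dictionary, recall on erroneous sentences rose by 4.5% and precision by 15.7%, and the accuracy of correctly parsing sentences written by native speakers rose by 12.2%. The study shows that the underlying linguistic theory (verb pattern grammar) and formal model (link grammar) are well suited to building a verb form error checking system for Chinese students' written English.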

7.
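Research on reordering methods in Chinese-Mongolian statistical machine translation   Total citations: 1, self-citations: 0, citations by others: 1
In research on a phrase-based Chinese-Mongolian statistical machine translation system, we found serious word order errors. Based on a study of sentence word order in Chinese and Mongolian, this paper proposes a method for reordering Chinese sentences according to Mongolian word order, describes the design of the reordering rules and the reordering algorithm, and finally reports concrete experiments. The experiments show that the method markedly improves the performance of the existing Chinese-Mongolian machine translation system.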

8.
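A method for computing the structural similarity of Chinese sentences based on part-of-speech strings   Total citations: 9, self-citations: 1, citations by others: 9
Measuring sentence similarity is one of the most important topics in example-based machine translation research. In example-based Chinese-English machine translation, the accuracy of Chinese sentence similarity measurement directly affects the final translation output. This paper proposes a method for computing the structural similarity of Chinese sentences. The method compares the part-of-speech strings of two sentences, finds the optimal match between them, and yields a structural similarity value. Preliminary experiments on a small set of sentences indicate that the method is feasible and effective and agrees with human intuition.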
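
The abstract above does not specify how the optimal match between two part-of-speech strings is found. One plausible stand-in, used here purely as an assumption, is the longest common subsequence (LCS) of the two tag sequences, normalized by their combined length. The function names and the one-letter tags in the test data are hypothetical.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two sequences."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def structural_similarity(tags_a, tags_b):
    """Similarity in [0, 1] between two part-of-speech tag sequences.

    Uses 2 * LCS / (len(a) + len(b)); this normalization is an
    assumption, not the formula from the paper.
    """
    if not tags_a and not tags_b:
        return 1.0
    return 2.0 * lcs_length(tags_a, tags_b) / (len(tags_a) + len(tags_b))
```

Because LCS preserves order, two sentences with the same tags in a different order score lower than two with the same tags in the same order, which is the kind of behavior a structural measure should have.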

9.
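Sentence-level neural machine translation ignores the context in which a sentence occurs, so the semantic representation of the sentence is often incomplete. Using dependency parsing, this paper extracts useful information from each sentence in a document and adds the extracted information to the source-side sentence, making its semantic representation more complete. Experiments are conducted on the Chinese-English language pair, and, to address the scarcity of document-level corpora, a method for training on large-scale sentence-level parallel corpora is proposed. Compared with the baseline system, the proposed method gains 1.47 BLEU points. The experiments show that document-level neural machine translation based on information completion effectively addresses the incomplete semantic representation of sentence-level neural machine translation.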

10.
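Li Jinxi's A New Grammar of the National Language (黎锦熙《新著国语文法》) is the foremost representative of traditional Chinese grammar. Li's grammar is a system characterized mainly by its treatment of sentence constituents and sentence patterns, and is known as the "sentence-based" (句本位) grammar. This paper first briefly reviews how Chinese grammar systems have changed and developed since Ma's Grammar (《马氏文通》) and outlines the main ideas and theoretical characteristics of the two major schools, traditional grammar and structural grammar. It then examines, from the perspective of Chinese treebanks, the strengths and weaknesses of the mainstream grammar systems in current Chinese information processing, compares them in depth with the traditional grammar system, and concludes that applying traditional grammar to Chinese information processing is necessary. Finally, it discusses several key issues that must be faced in applying traditional grammar to Chinese information processing.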

11.
Toward a lexicalized grammar for interlinguas   Total citations: 1, self-citations: 0, citations by others: 1
In this paper we present one aspect of our research on machine translation (MT): capturing the grammatical and computational relation between (i) the interlingua (IL) as defined declaratively in the lexicon and (ii) the IL as defined procedurally by way of algorithms that compose and decompose pivot IL forms. We begin by examining the interlinguas in the lexicons of a variety of current IL-based approaches to MT. This brief survey makes it clear that no consensus exists among MT researchers on the level of representation for defining the IL. In the section that follows, we explore the consequences of this missing formal framework for MT system builders who develop their own lexical-IL entries. The lack of software tools to support rapid IL respecification and testing greatly hampers their ability to modify representations to handle new data and new domains. Our view is that IL-based MT research needs both (a) the formal framework to specify possible IL grammars and (b) the software support tools to implement and test these grammars. With respect to (a), we propose adopting a lexicalized grammar approach, tapping research results from the study of tree grammars for natural language syntax. With respect to (b), we sketch the design and functional specifications for parts of ILustrate, the set of software tools that we need to implement and test the various IL formalisms that meet the requirements of a lexicalized grammar. In this way, we begin to address a basic issue in MT research: how to define and test an interlingua as a computational language without building a full MT system for each possible IL formalism that might be proposed.

12.
This paper presents the modules that comprise a knowledge-based sign synthesis architecture for Greek sign language (GSL). Such systems combine natural language (NL) knowledge, machine translation (MT) techniques and avatar technology in order to allow for dynamic generation of sign utterances. The NL knowledge of the system consists of a sign lexicon and a set of GSL structure rules, and is exploited in the context of typical natural language processing (NLP) procedures, which involve syntactic parsing of linguistic input as well as structure and lexicon mapping according to standard MT practices. The coding of the linguistic strings relevant to GSL provides instructions for the motion of a virtual signer that performs the corresponding signing sequences. Dynamic synthesis of GSL linguistic units is achieved by mapping written Greek structures to GSL, based on a computational grammar of GSL and a lexicon that contains lemmas coded as features of GSL phonology. This approach allows for robust conversion of written Greek to GSL, which is an essential prerequisite for access to e-content by the community of native GSL signers. The developed system is sublanguage oriented and performs satisfactorily as regards its linguistic coverage, allowing for easy extensibility to other language domains. However, its overall performance is subject to current well-known MT limitations.

13.
One may indicate the potentials of an MT system by stating what text genres it can process, e.g., weather reports and technical manuals. This approach is practical, but misleading, unless domain knowledge is highly integrated in the system. Another way to indicate which fragments of language the system can process is to state its grammatical potentials, or more formally, which languages the grammars of the system can generate. This approach is more technical and less understandable to the layman (customer), but it is less misleading, since it stresses the point that the fragments which can be translated by the grammars of a system need not necessarily coincide exactly with any particular genre. Generally, the syntactic and lexical rules of an MT system allow it to translate many sentences other than those belonging to a certain genre. On the other hand it probably cannot translate all the sentences of a particular genre. Swetra is a multilanguage MT system defined by the potentials of a formal grammar (standard referent grammar) and not by reference to a genre. Successful translation of sentences can be guaranteed if they are within a specified syntactic format based on a specified lexicon. The paper discusses the consequences of this approach (Grammatically Restricted Machine Translation, GRMT) and describes the limits set by a standard choice of grammatical rules for sentences and clauses, noun phrases, verb phrases, sentence adverbials, etc. Such rules have been set up for English, Swedish and Russian, mainly on the basis of familiarity (frequency) and computer efficiency, but restricting the grammar and making it suitable for several languages poses many problems for optimization. Sample texts (newspaper reports) illustrate the type of text that can be translated with reasonable success among Russian, English and Swedish.

14.
This paper presents a methodology for evaluating Arabic Machine Translation (MT) systems. We are specifically interested in evaluating lexical coverage, grammatical coverage, semantic correctness and pronoun resolution correctness. The methodology presented is statistical and is based on earlier work on evaluating MT lexicons. That work rests on the idea that the importance of a specific word sense to a given application domain, together with its presence or absence in the lexicon, affects the MT system's lexical quality, which in turn affects the overall system output quality. The same idea is used in this paper and generalized so as to apply to grammatical coverage, semantic correctness and correctness of pronoun resolution. The approach adopted in this paper has been implemented and applied to evaluating four English-Arabic commercial MT systems. The results of the evaluation of these systems are presented for the domain of the Internet and Arabization.

15.
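A survey of metamorphic testing techniques   Total citations: 4, self-citations: 0, citations by others: 4
Software testing is an important and indispensable software quality assurance technique, used to find and correct defects and errors in software, but in many cases the expected output of the program under test is hard to determine. Metamorphic testing tests a program by checking the relations among the results of multiple executions, and can effectively address this problem. After nearly a decade of research, metamorphic testing has made great progress in areas such as optimizing the testing process and combining with other verification or testing methods, and it has been widely applied in many fields. This paper surveys current research on metamorphic testing and, in view of the shortcomings of existing methods, outlines future research directions, including the adequacy of metamorphic testing, practical techniques for constructing metamorphic relations, practical techniques for selecting source test cases, metamorphic testing for new types of software, and the development of metamorphic testing tools.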
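
As a concrete illustration of checking a relation across several executions instead of comparing against a known expected output, the sketch below uses the textbook relation sin(x) = sin(π − x). The helper `check_metamorphic_relation`, the deliberately faulty `buggy_sin`, and the fixed input list are made up for this example and do not come from the survey.

```python
import math


def check_metamorphic_relation(program, source_inputs, tol=1e-9):
    """Return the source inputs x for which program(x) and
    program(pi - x) disagree by more than tol.

    No expected output is needed: only the relation between the
    source execution and the follow-up execution is checked.
    """
    violations = []
    for x in source_inputs:
        follow_up = math.pi - x  # follow-up test case derived from x
        if abs(program(x) - program(follow_up)) > tol:
            violations.append(x)
    return violations


def buggy_sin(x):
    # Deliberately faulty implementation: a two-term Taylor series,
    # which is only accurate close to zero.
    return x - x ** 3 / 6


# Source test cases: 0.1, 0.2, ..., 3.0
source_inputs = [0.1 * k for k in range(1, 31)]
```

Calling `check_metamorphic_relation(math.sin, source_inputs)` should report no violations, while the same call with `buggy_sin` flags inputs where the truncated series breaks the symmetry. A relation can only expose faults that violate it, which is why the survey lists constructing good relations as an open direction.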

16.
Guaranteed recall of all training pairs for bidirectional associative memory   Total citations: 1, self-citations: 0, citations by others: 1
Necessary and sufficient conditions are derived for the weights of a generalized correlation matrix of a bidirectional associative memory (BAM) which guarantee the recall of all training pairs. A linear programming/multiple training (LP/MT) method that determines weights which satisfy the conditions when a solution is feasible is presented. The sequential multiple training (SMT) method is shown to yield integers for the weights, which are multiplicities of the training pairs. Computer simulation results, including capacity comparisons of BAM, LP/MT BAM, and SMT BAM, are presented.
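
To make the role of the weights concrete, the sketch below builds a weighted correlation matrix M = Σ_k q_k · x_kᵀ y_k from bipolar (+1/−1) training pairs and tests whether each pair is a stable state in both directions. It assumes the usual Kosko-style BAM formulation and a strict sign condition; it does not implement the LP/MT or SMT methods from the abstract, and the function names and sample pairs are invented for illustration.

```python
def correlation_matrix(pairs, weights):
    """Weighted sum of outer products x^T y over bipolar training pairs."""
    n, m = len(pairs[0][0]), len(pairs[0][1])
    matrix = [[0] * m for _ in range(n)]
    for (x, y), q in zip(pairs, weights):
        for i in range(n):
            for j in range(m):
                matrix[i][j] += q * x[i] * y[j]
    return matrix


def is_recalled(matrix, x, y):
    """True if (x, y) is a stable state of the BAM in both directions.

    Assumed condition: every component of x*M has strictly the same
    sign as y, and every component of M*y has the same sign as x.
    """
    n, m = len(x), len(y)
    forward = [sum(x[i] * matrix[i][j] for i in range(n)) for j in range(m)]
    backward = [sum(matrix[i][j] * y[j] for j in range(m)) for i in range(n)]
    return (all(f * t > 0 for f, t in zip(forward, y))
            and all(b * t > 0 for b, t in zip(backward, x)))


# Two hypothetical training pairs (bipolar vectors).
pair_a = ((1, 1, 1), (1, 1))
pair_b = ((1, 1, -1), (1, -1))
pairs = [pair_a, pair_b]
```

With equal weights `[1, 1]` both pairs pass the check, whereas weights `[4, 1]` let the first pair dominate so that the second is no longer recalled. That sensitivity to the weights is what the necessary and sufficient conditions in the abstract characterize.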

17.
This paper describes developments in the area of machine translation (MT). First, the paper gives an overview of developments in Germany in general; then, special problems are discussed. The system taken as an example is METAL (Machine Translation and Analysis of Natural Language), where recent development work has centered around two main topics. (i) Efforts have been made to make the system really multilingual. The German-to-English prototype had to be expanded, some system components had to be readjusted, and additional problems had to be solved. Currently, analysis and synthesis components for German, English, French, Spanish, and Dutch are under development. All these languages use a common system kernel and a standard interface structure. (ii) The system had to be made user-friendly. This was an even more important task as, up to now, MT systems have not been well accepted by users. METAL tries to be more realistic, and also tries to support the main user interfaces in a much better way than has been done before. This is based on the conviction that there are several parameters which determine the real success of an MT system. It is not just translation quality which is decisive, it is also the integration of an MT system into the whole process of preparing and translating documents.
Gregor Thurmair is head of the Linguistics Department at Siemens Nixdorf Information Systems and project leader of the machine translation group, METAL. He is involved in projects in information retrieval (morphological analysis), speech understanding (parsing, semantics) and machine translation (METAL system). He has presented papers on morphology, semantics in speech understanding, transfer problems in MT, and grammar checking.

18.
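Research on implementation techniques for on-site processing of MT signals   Total citations: 2, self-citations: 0, citations by others: 2
To meet the requirements of on-site processing of magnetotelluric (MT) sounding signals, this paper analyzes and optimizes MT signal processing algorithms and, on that basis, proposes a master-slave signal processing system built around the TMS320C30 high-speed floating-point digital signal processor. Its computation is tens of times faster than programs written in Fortran and other languages running on an IBM PC, and its precision is higher than that of the fixed-point digital signal processor TMS32020. The paper discusses the software design method and gives test results, and offers an effective way to overcome the computational bottleneck of on-site processing during field acquisition of MT signals.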

19.
This paper discusses the evaluation of automated metrics developed for the purpose of evaluating machine translation (MT) technology. A general discussion of the usefulness of automated metrics is offered. The NIST MetricsMATR evaluation of MT metrology is described, including its objectives, protocols, participants, and test data. The methodology employed to evaluate the submitted metrics is reviewed. A summary is provided for the general classes of evaluated metrics. Overall results of this evaluation are presented, primarily by means of correlation statistics, showing the degree of agreement between the automated metric scores and the scores of human judgments. Metrics are analyzed at the sentence, document, and system level with results conditioned by various properties of the test data. This paper concludes with some perspective on the improvements that should be incorporated into future evaluations of metrics for MT evaluation.

20.
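System combination based on WordNet word sense disambiguation   Total citations: 3, self-citations: 3, citations by others: 0
Liu Yupeng (刘宇鹏), Li Sheng (李生), Zhao Tiejun (赵铁军). Acta Automatica Sinica (《自动化学报》), 2010, 36(11): 1575-1580
Confusion networks have recently shown strong performance in combining the outputs of multiple machine translation systems. However, because different translation systems produce different word orders, hypothesis alignment remains an important problem in building a confusion network, and previous alignment methods have not taken semantic information into account. To improve system combination performance, this paper proposes using word sense disambiguation (WSD) to guide alignment in the confusion network. The skeleton translation is selected by computing the similarity between sentences, using a maximum matching algorithm on bipartite graphs. To incorporate the WordNet-based word sense disambiguation method into the system, the translation error rate (TER) algorithm is modified. Experimental results show that the proposed method outperforms the classical TER algorithm.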
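
For orientation on the TER baseline mentioned above, the sketch below computes a simplified, shift-free variant: the word-level edit distance between a hypothesis and a single reference, divided by the reference length. Real TER also allows block shifts, and the paper's WSD-guided modification is not described in the abstract, so neither is attempted here. The name `shiftless_ter` is hypothetical.

```python
def word_edit_distance(hyp, ref):
    """Minimum number of word insertions, deletions and substitutions
    needed to turn the list hyp into the list ref."""
    prev = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, start=1):
        curr = [i]
        for j, r in enumerate(ref, start=1):
            cost = 0 if h == r else 1
            curr.append(min(prev[j] + 1,         # delete h
                            curr[j - 1] + 1,     # insert r
                            prev[j - 1] + cost)) # match or substitute
        prev = curr
    return prev[-1]


def shiftless_ter(hypothesis, reference):
    """Edits per reference word, ignoring TER's shift operation."""
    hyp, ref = hypothesis.split(), reference.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    return word_edit_distance(hyp, ref) / len(ref)
```

Exact word matching is the step a semantics-aware variant would relax, for example by treating two different words with the same disambiguated sense as a match. That reading is an interpretation of the abstract, not a confirmed description of the authors' algorithm.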
