Similar literature
20 similar documents found.
1.
We present TransType: a new approach to Machine-Aided Translation in which the human translator maintains control of the translation process while being helped by real-time completions proposed by a statistical translation engine. The TransType approach is first presented through a series of prototypes that illustrate their underlying translation model and graphical interface. The results of two rounds of in situ evaluation of TransType prototypes are discussed followed by a set of lessons learned in these experiments. It will be shown that this approach is valued by translators but given the short time allotted for the evaluation, translators were not able to quantitatively increase their productivity. TransType is compared with other approaches and new perspectives are elaborated for a new version being developed in the context of a Fifth Framework European Community Project. This revised version was published online in November 2006 with corrections to the Cover Date.

2.
The use of Machine Translation as a tool for professional or other highly skilled translators is for the most part currently limited to postediting arrangements in which the translator invokes MT when desired and then manually cleans up the results. A theoretically promising but hitherto largely unsuccessful alternative to postediting for this application is interactive machine translation (IMT), in which the translator and MT system work in tandem. We argue that past failures to make IMT viable as a tool for skilled translators have been the result of an infelicitous mode of interaction rather than any inherent flaw in the idea. As a solution, we propose a new style of IMT in which the target text under construction serves as the medium of communication between an MT system and its user. We describe the design, implementation, and performance of an automatic word completion system for translators which is intended to demonstrate the feasibility of the proposed approach, albeit in a very rudimentary form.
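
The word-completion idea in the abstract above can be illustrated with a minimal sketch. It assumes a table of target-word scores for the current source sentence; the function name `complete`, the table `word_probs`, and all numbers are hypothetical and are not taken from the paper, which does not specify its model here.

```python
def complete(prefix, word_probs):
    """Return the highest-scoring word that starts with `prefix`, or None."""
    candidates = {w: p for w, p in word_probs.items() if w.startswith(prefix)}
    if not candidates:
        return None
    return max(candidates, key=candidates.get)


# Hypothetical scores that a translation model might assign to target words
# for one source sentence; the values are invented for illustration.
word_probs = {"house": 0.40, "home": 0.35, "horse": 0.05, "the": 0.20}

print(complete("ho", word_probs))   # house
print(complete("hom", word_probs))  # home
```

Each keystroke narrows the candidate set, so the proposal can change as the translator types; the translator stays in control by simply continuing to type when the proposal is wrong.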

3.
With increasing human-machine interaction in the professional translator's work environment, more and more translator training programs are launching translation-specific computer studies. This paper focuses on the research-oriented, as opposed to the practically-oriented, translation program. We argue that computer studies in such a program should prepare students for research at either the receiving or production ends of machine translation systems, both of which require linguistic, computational and translational expertise. We discuss some general considerations for the design of such computer studies, based on a seminar given in the M.A. Translation program at the University of Ottawa, Canada. Ingrid Meyer is an assistant professor in the School of Translators and Interpreters, University of Ottawa. Her research interests are machine translation and computational lexicography.

4.
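Through Chinese-to-English translation experiments and an analysis of the resulting translations, this paper compares the translation performance of word-based, phrase-based, and syntax-based models. The results show that the phrase-based model outperforms the other two but uses more parameters. The syntax-based model, although its translation performance is unsatisfactory, can express richer information with fewer parameters and merits further study.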

5.
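A title reflects the essence of an article, and grasping the title precisely lets a reader quickly understand the article's central content. This paper builds a machine translation platform using statistical machine translation methods and uses the platform to translate titles in the aviation domain. The platform was evaluated with the NIST international evaluation tool in both open and closed tests, and the results show that the statistical method is effective for translating domain-specific titles.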

6.
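Research on a Realignment Method for Statistical Machine Translation   Total citations: 3, self-citations: 0, citations by others: 3
Word alignment is one of the key techniques in statistical machine translation. This paper proposes a realignment method that, starting from the bidirectional (source-to-target and target-to-source) word alignments produced by the IBM models, identifies the parts where the two directions disagree. It then realigns those inconsistent parts to obtain a better symmetrized word alignment. In addition, the proposed method can exploit large-scale monolingual corpora to strengthen the alignment results. Experiments show that, compared with the heuristic word-alignment symmetrization methods widely used in statistical machine translation, the proposed method gives a statistical machine translation system higher translation accuracy.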
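
The first step described above, separating the links on which the two alignment directions agree from those on which they disagree, can be sketched as follows. This is only an illustration under assumed data structures: alignments are represented as sets of index pairs, and the function name `split_alignments` is hypothetical. The paper's actual realignment of the disputed links is not described in the abstract and is not attempted here.

```python
def split_alignments(src2tgt, tgt2src):
    """Split bidirectional word alignments into agreed and disputed links.

    src2tgt: set of (source_index, target_index) links from the forward model.
    tgt2src: set of (target_index, source_index) links from the backward model.
    Returns (agreed, disputed), both as sets of (source_index, target_index).
    """
    forward = set(src2tgt)
    backward = {(s, t) for (t, s) in tgt2src}
    agreed = forward & backward    # links proposed in both directions
    disputed = forward ^ backward  # links proposed in only one direction
    return agreed, disputed


# Toy alignments for a three-word sentence pair (invented for illustration).
src2tgt = {(0, 0), (1, 1), (2, 1)}
tgt2src = {(0, 0), (1, 1), (2, 2)}

agreed, disputed = split_alignments(src2tgt, tgt2src)
print(sorted(agreed))    # [(0, 0), (1, 1)]
print(sorted(disputed))  # [(2, 1), (2, 2)]
```

Only the disputed links would then be passed to a realignment step, which keeps the high-confidence intersection untouched.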

7.
Current statistical machine translation systems are mainly based on statistical word lexicons. However, these models are usually context-independent; therefore, the disambiguation of the translation of a source word must be carried out using other probabilistic distributions (distortion distributions and statistical language models). One efficient way to add contextual information to the statistical lexicons is based on maximum entropy modeling. In that framework, the context is introduced through feature functions that allow us to automatically learn context-dependent lexicon models. In a first approach, maximum entropy modeling is carried out after a process of learning standard statistical models (alignment and lexicon). In a second approach, the maximum entropy modeling is integrated in the expectation-maximization process of learning standard statistical models. Experimental results were obtained for two well-known tasks, the French–English Canadian Parliament Hansards task and the German–English Verbmobil task. These results show that the use of maximum entropy models in both approaches can help to improve the performance of statistical translation systems. This work has been partially supported by the European Union under grant IST-2001-32091 and by the Spanish CICYT under project TIC-2003-08681-C02-02. The experiments on the Verbmobil task were done when the first author was a visiting scientist at RWTH Aachen, Germany. Editors: Dan Roth and Pascale Fung
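
For orientation, a context-dependent lexicon model of the maximum entropy kind mentioned above generally takes the following log-linear form. The symbols are assumptions chosen for this note rather than the paper's notation: $s$ is a source word, $t$ a candidate target word, $x$ the context, $h_k$ the feature functions, and $\lambda_k$ their learned weights.

```latex
p_{\lambda}(t \mid s, x) =
  \frac{\exp\left( \sum_{k} \lambda_k \, h_k(x, s, t) \right)}
       {\sum_{t'} \exp\left( \sum_{k} \lambda_k \, h_k(x, s, t') \right)}
```

The denominator sums over all candidate target words $t'$, so the scores form a proper distribution for each source word and context.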

8.
This paper sketches research in nine areas related to spoken language translation: interactive disambiguation (two demonstrations of highly interactive, broad-coverage speech translation are reported); system architecture; data structures; the interface between speech recognition and analysis; the use of natural pauses for segmenting utterances; example-based machine translation; dialogue acts; the tracking of lexical co-occurrences; and the resolution of translation mismatches.

9.
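To improve the accuracy of a translation system, template structures are extracted automatically by combining templates with a phrase-based approach. During decoding, template matching is performed first and the matched template structure is applied for translation; the remaining translation then proceeds with the Beam Search algorithm. The method can therefore effectively address the word-order errors of purely statistical translation. Taking Chinese-Mongolian translation as an example, experiments show that the method effectively improves translation quality, with translation efficiency 10% higher than that of the phrase-based statistical translation method.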

10.
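This paper studies and analyzes the word-deletion problem in phrase-based statistical machine translation models and, through human evaluation, classifies word-deletion errors into three types. Key words in source-language sentences are identified by two methods: a frequency-based method and a part-of-speech-tagging-based method. Word-deletion errors are mitigated by adding, to the conventional phrase-pair extraction algorithm, a constraint on source-language key words aligned to null. Automatic and human evaluations show that the method significantly reduces word-deletion errors in both Chinese-English and English-Chinese translation tasks while also producing a more compact phrase translation table.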

11.
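Liu Ying, Jiang Wei. 《计算机工程与应用》 (Computer Engineering and Applications), 2012, 48(32): 98-101, 146
Phrase alignment is a core module that determines the quality of a statistical machine translation system. This paper proposes a hierarchical phrase model based on phrase structure trees, which extends the hierarchical phrase model using the idea of the string-to-tree model. The model extracts translation rules by combining bilingually aligned phrases with English phrase structure trees and uses heuristic strategies to obtain extended syntactic labels for the rules. A statistical machine translation system using these rules gives stable translation results across different data sets, and its average BLEU score on the training and test sets is higher than those of the phrase model and the hierarchical phrase model.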

12.
A multiphase machine translation approach, Generate and Repair Machine Translation (GRMT), is proposed. GRMT is designed to generate accurate translations that focus primarily on retaining the linguistic meaning of the source language sentence. GRMT presently incorporates a limited multilingual translation capability. The central idea behind the GRMT approach is to generate a translation candidate (TC) by quick and dirty machine translation (QDMT), then investigate the accuracy of that TC by translation candidate evaluation (TCE), and, if necessary, revise the translation in the repair and iterate (RI) phase. To demonstrate the GRMT approach, a translation system that translates from English to Thai has been developed. This paper presents the design characteristics and some experimental results of QDMT and also the initial design, some experiments, and proposed ideas behind TCE and RI.

13.
In this paper, we describe a first version of a system for statistical translation and present experimental results. The statistical translation approach uses two types of information: a translation model and a language model. The language model used is a standard bigram model. The translation model is decomposed into lexical and alignment models. After presenting the details of the alignment model, we describe the search problem and present a dynamic programming-based solution for the special case of monotone alignments. So far, the system has been tested on two limited-domain tasks for which a bilingual corpus is available: the EuTrans traveller task (Spanish–English, 500-word vocabulary) and the Verbmobil task (German–English, 3000-word vocabulary). We present experimental results on these tasks. In addition to the translation of text input, we also address the problem of speech translation and suitable integration of the acoustic recognition process and the translation process.
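
The combination of a language model and a translation model described above is usually written as a noisy-channel decision rule. The following is a sketch in assumed notation, not the paper's own: $f = f_1 \dots f_J$ is the source sentence, $e = e_1 \dots e_I$ the target sentence, and $a = a_1 \dots a_J$ an alignment mapping each source position to a target position. The first-order alignment factorization in the last line is one common choice and is an assumption here, since the abstract does not state the exact decomposition.

```latex
\hat{e} = \arg\max_{e} \; \Pr(e) \cdot \Pr(f \mid e)

\Pr(e) \approx \prod_{i=1}^{I} p(e_i \mid e_{i-1})
  \qquad \text{(bigram language model)}

\Pr(f \mid e) \approx \sum_{a} \prod_{j=1}^{J}
  p(a_j \mid a_{j-1}) \; p(f_j \mid e_{a_j})
  \qquad \text{(alignment model times lexical model)}
```

Restricting the search to monotone alignments, as the abstract mentions, means $a_j$ never decreases as $j$ grows, which is what makes a dynamic programming solution tractable.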

14.
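The electronic dictionary is the component of a machine translation system that holds the most information, and its quality and size directly limit the quality and range of application of machine translation. Unlike a general electronic dictionary, each entry in a machine translation dictionary additionally carries part-of-speech information, semantic category information, idioms, and so on. This article adopts frequency statistics and frequency-distribution statistics as the criteria for including entries in a Uyghur-Chinese machine translation dictionary, counts the commonly used words in Uyghur text, discusses the design of the dictionary, and describes the entry information it should contain using BNF notation and Jackson diagrams. Finally, it presents the concrete construction method of the dictionary, the entry ordering principles, the data structures of the index table and attribute library, and the method for looking up dictionary information. Experiments show that the dictionary is fairly effective in resolving Uyghur lexical and structural ambiguity and in improving the accuracy of the Chinese translations.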

15.
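Knowledge Bases in the English-Chinese Machine Translation System ECT   Total citations: 1, self-citations: 0, citations by others: 1
The concept of the E-Chunk, a new form of knowledge representation, is proposed. An E-Chunk is an unambiguous translation unit; formally, it is a word or word string with no translation ambiguity. It is defined semantically and is characterized by unambiguity, recurrence, nestability, and syntactic self-sufficiency of its internal structure. This paper describes in detail the three kinds of knowledge base in the English-Chinese machine translation system ECT: the electronic dictionary, the E-Chunk base, and the rule base.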

16.
This article presents statistical language translation models, called dependency transduction models, based on collections of head transducers. Head transducers are middle-out finite-state transducers which translate a head word in a source string into its corresponding head in the target language, and further translate sequences of dependents of the source head into sequences of dependents of the target head. The models are intended to capture the lexical sensitivity of direct statistical translation models, while at the same time taking account of the hierarchical phrasal structure of language. Head transducers are suitable for direct recursive lexical translation, and are simple enough to be trained fully automatically. We present a method for fully automatic training of dependency transduction models for which the only input is transcribed and translated speech utterances. The method has been applied to create English–Spanish and English–Japanese translation models for speech translation applications. The dependency transduction model gives around 75% accuracy for an English–Spanish translation task (using a simple string edit-distance measure) and 70% for an English–Japanese translation task. Enhanced with target n-grams and a case-based component, English–Spanish accuracy is over 76%; for English–Japanese it is 73% for transcribed speech, and 60% for translation from recognition word lattices.
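
The "simple string edit-distance measure" mentioned above can be illustrated with a word-level Levenshtein distance. The sketch below assumes that accuracy is computed as one minus the edit distance divided by the reference length; that normalization, and the function names, are assumptions for illustration, since the abstract does not define the measure.

```python
def edit_distance(hyp, ref):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(ref) + 1))  # distance from empty hypothesis
    for i, h in enumerate(hyp, start=1):
        curr = [i]  # distance from hyp[:i] to empty reference
        for j, r in enumerate(ref, start=1):
            cost = 0 if h == r else 1
            curr.append(min(prev[j] + 1,          # delete h
                            curr[j - 1] + 1,      # insert r
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def accuracy(hyp, ref):
    """Assumed accuracy: 1 - edit_distance / reference length."""
    return 1.0 - edit_distance(hyp, ref) / len(ref)


hyp = "the cat sat on mat".split()
ref = "the cat sat on the mat".split()
print(edit_distance(hyp, ref))  # 1
```

With this definition a hypothesis missing one word out of a six-word reference scores 5/6, and a very poor hypothesis can score below zero.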

17.
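The phrase-based statistical translation model is one of the most widely used models in machine translation today. However, because decoding relies on exact phrase matching, it suffers from severe data sparsity, and a large number of phrases in the phrase table cannot be fully exploited. To address this, the paper proposes an interactive translation method based on human-machine cooperation. For a phrase that cannot be found in the translation phrase table, fuzzy matching is first used to find similar phrases in the table. A combined classifier then judges which of the similar phrases are likely to improve the translation quality of the sentence. Finally, through human-machine interaction, a phrase is selected that is likely to improve translation quality while preserving the meaning of the original sentence. Experiments on a spoken-language corpus show that the method effectively improves the output quality of the translation system.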
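
The fuzzy-matching step for phrases missing from the phrase table can be sketched with Python's standard `difflib` module. This is a stand-in only: the paper's similarity measure is not given in the abstract, and the function name `similar_phrases`, the cutoff value, and the toy phrase table are hypothetical.

```python
import difflib


def similar_phrases(phrase, phrase_table, n=3, cutoff=0.6):
    """Return the phrase itself if present, else up to n similar table entries."""
    if phrase in phrase_table:
        return [phrase]
    # get_close_matches ranks candidates by SequenceMatcher similarity ratio.
    return difflib.get_close_matches(phrase, list(phrase_table), n=n, cutoff=cutoff)


# Toy phrase table mapping source phrases to translations (values are placeholders).
phrase_table = {
    "book a room": "<translation 1>",
    "book a table": "<translation 2>",
    "cancel the room": "<translation 3>",
}

candidates = similar_phrases("book a rooms", phrase_table)
print(candidates[0])  # book a room
```

In the workflow the abstract describes, such candidates would next be filtered by a classifier and then offered to a human for the final choice.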

18.
Word reordering is one of the challenging problems of machine translation. It is an important factor in the quality and efficiency of machine translation systems. In this paper, we introduce a novel reordering model based on an innovative structure named the phrasal dependency tree. The phrasal dependency tree is a modern syntactic structure which is based on dependency relationships between contiguous non-syntactic phrases. The proposed model integrates syntactical and statistical information in the context of a log-linear model aimed at dealing with reordering problems. It benefits from phrase dependencies, translation directions (orientations) and translation discontinuity between translated phrases. In comparison with well-known and popular reordering models such as distortion, lexicalised and hierarchical models, the experimental study demonstrates the superiority of our model in terms of translation quality. Performance is evaluated for Persian → English and English → German translation tasks using Tehran parallel corpus and WMT07 benchmarks, respectively. The results report 1.54/1.7 and 1.98/3.01 point improvements over the baseline in terms of BLEU/TER metrics on Persian → English and German → English translation tasks, respectively. On average, our model shows a significant gain in precision with comparable recall relative to the lexicalised and distortion models.

19.
    
Nava Ehsan, Heshaam Faili. 《Software》, 2013, 43(2): 187-206
Producing electronic rather than paper documents has considerable benefits such as easier organization and data management. Automatic writing assistance tools such as spelling and grammar checkers/correctors can therefore increase the quality of electronic texts by removing noise and correcting erroneous sentences. Errors in a text can be categorized into spelling, grammatical and real-word errors. In this article, we present a language-independent approach based on a statistical machine translation framework to develop a proofreading tool, which detects grammatical errors as well as context-sensitive spelling mistakes (real-word errors). A hybrid model for grammar checking is suggested by combining this approach with an existing rule-based grammar checker. Experimental results on both English and Persian indicate that the proposed statistical method and the rule-based grammar checker are complementary in detecting and correcting syntactic errors. The results of the hybrid grammar checker, applied to some English texts, show an improvement of about 24% in recall with almost the same precision. Experiments on a real-world data set show that state-of-the-art results are achieved for grammar checking and context-sensitive spell checking for the Persian language. Copyright © 2012 John Wiley & Sons, Ltd.

20.
This paper proposes a novel method for phrase-based statistical machine translation based on the use of a pivot language. To translate between languages L_s and L_t with limited bilingual resources, we bring in a third language, L_p, called the pivot language. For the language pairs L_s-L_p and L_p-L_t, there exist large bilingual corpora. Using only L_s-L_p and L_p-L_t bilingual corpora, we can build a translation model for L_s-L_t. The advantage of this method lies in the fact that we can perform translation between L_s and L_t even if there is no bilingual corpus available for this language pair. Using BLEU as a metric, our pivot language approach significantly outperforms the standard model trained on a small bilingual corpus. Moreover, with a small L_s-L_t bilingual corpus available, our method can further improve translation quality by using the additional L_s-L_p and L_p-L_t bilingual corpora.
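
One way to build a source-target phrase table from two pivot-side tables is to marginalize over pivot phrases, p(t|s) = Σ_p p(p|s) · p(t|p). The sketch below assumes this triangulation formula and nested-dictionary phrase tables; both are assumptions for illustration, because the abstract does not state how the model is induced, and the function name `pivot_table` and all probabilities are invented.

```python
from collections import defaultdict


def pivot_table(src_to_pivot, pivot_to_tgt):
    """Combine p(pivot | src) and p(tgt | pivot) into p(tgt | src)."""
    table = defaultdict(lambda: defaultdict(float))
    for s, pivots in src_to_pivot.items():
        for p, prob_sp in pivots.items():
            # Pivot phrases absent from the second table contribute nothing.
            for t, prob_pt in pivot_to_tgt.get(p, {}).items():
                table[s][t] += prob_sp * prob_pt
    return {s: dict(ts) for s, ts in table.items()}


# Toy tables: French source, English pivot, Spanish target (invented numbers).
src_to_pivot = {"maison": {"house": 0.7, "home": 0.3}}
pivot_to_tgt = {
    "house": {"casa": 0.9, "edificio": 0.1},
    "home": {"casa": 0.4, "hogar": 0.6},
}

result = pivot_table(src_to_pivot, pivot_to_tgt)
print(max(result["maison"], key=result["maison"].get))  # casa
```

Here "casa" receives 0.7 × 0.9 + 0.3 × 0.4 = 0.75 because it is reachable through both pivot phrases, which is the effect that lets the pivot tables substitute for a direct bilingual corpus.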
