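Speech translation is the process of translating speech in a source language into text in a target language. When conventional sequence-to-sequence models are applied to speech translation, they are sensitive to sequence length, and the encoder bears a heavy burden in feature extraction and local-dependency modeling. To address this problem, this paper builds a speech translation model on the Transformer network and uses a deep convolutional network to pre-encode the audio spectral features; by downsampling the audio sequence, the time-frequency information in the audio spectrum is locally...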

The Diplomat rapid-deployment speech-translation system is intended to allow naïve users to communicate across a language barrier, without strong domain restrictions, despite the error-prone nature of current speech and translation technologies. In addition, it should be deployable for new languages an order of magnitude more quickly than traditional technologies. Achieving this ambitious set of goals depends in large part on allowing the users to correct recognition and translation errors interactively. We present the Multi-Engine Machine Translation (MEMT) architecture, describing how it is well suited for such an application. We then discuss our approaches to rapid-deployment speech recognition and synthesis. Finally we describe our incorporation of interactive error correction throughout the system design. We have already developed working bidirectional Croatian-English and Spanish-English systems, and have Haitian Creole-English and Korean-English versions under development.

This paper describes our initial effort in developing a trilingual speech interface for financial information inquiries. Our foreign exchange inquiry system consists of: (i) monolingual and trilingual speech recognizers, which receive the user's spoken input in the form of microphone speech; (ii) a real-time data capture component which continuously updates a relational database from a financial data satellite feed; and (iii) a trilingual speech generation component, which generates English and Chinese text based on the raw financial data. The generated text is then transformed into spoken presentations. English text is processed by the FESTIVAL synthesizer system. Chinese text is sent to our syllable-based synthesizer, which employs a concatenative resequencing technique to produce spoken presentations in Putonghua or Cantonese. The speech interface is augmented with a visual display which aims to provide feedback to the user at all times during an interaction. Within the restricted scope of foreign exchange (FOREX), our recognition performance accuracies remain above 93%. Confusions across languages contributed significantly to our recognition errors, but most are confusions between the same currency/country names spoken in different languages. These errors are not detrimental with respect to data retrieval. Our concatenative re-sequencing technique reports the date, time and exchange rates of the input currency pair. A demonstration can be found at http://www.se.cuhk.edu.hk/hccl/demos/.

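Liu Yuchen, Zong Chengqing. Journal of Software (软件学报), 2023, 34(4): 1837-1849
Speech translation aims to translate speech in one language into speech or text in another language. Compared with cascaded translation systems, end-to-end speech translation offers lower latency, less error accumulation, and a smaller storage footprint, and has therefore attracted growing attention from researchers. However, end-to-end speech translation must not only process long speech sequences and extract their acoustic information, but also learn the alignment between source-language speech and target-language text, which makes modeling difficult and performance unsatisfactory. This paper proposes an end-to-end speech translation method based on cross-modal information fusion, which deeply integrates text machine translation with the speech translation model. To address the mismatch between speech and text sequence lengths, redundant information in the acoustic representation is filtered out so that the length of the filtered acoustic state sequence is as close as possible to that of the corresponding text sequence. To address the difficulty of learning alignments, a parameter-sharing approach embeds the text machine translation model into the speech translation model, and multi-task training is used to learn the alignment between source-language speech and target-language text. Experiments on public speech translation datasets show that the proposed method significantly improves speech translation performance.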

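Automatic evaluation of translation quality is of great importance to machine translation research. Existing methods, however, mainly target written-language translation and do not take the characteristics of spoken-language translation into account. This paper therefore proposes a new automatic evaluation method for spoken language. By defining information segments, annotating weights, and designing multiple matching strategies, it brings automatic evaluation results closer to human scores and also improves the adaptability of the evaluation process to different output translations. Experiments show that the algorithm is highly sensitive to changes in translation quality and produces assessments of output translations that are close to manual judgments.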

Learning Translation Templates from Bilingual Translation Examples  
A mechanism for learning lexical correspondences between two languages from sets of translated sentence pairs is presented. These lexical level correspondences are learned using analogical reasoning between two translation examples. Given two translation examples, the similar parts of the sentences in the source language must correspond to the similar parts of the sentences in the target language. Similarly, the different parts must correspond to the respective parts in the translated sentences. The correspondences between similarities and between differences are learned in the form of translation templates. A translation template is a generalized translation exemplar pair where some components are generalized by replacing them with variables in both sentences and establishing bindings between these variables. The learned translation templates are obtained by replacing differences or similarities by variables. This approach has been implemented and tested on a set of sample training datasets and produced promising results for further investigation.
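
The idea of generalizing the differing parts of two translation examples into a variable can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `generalize` and `learn_template`, the toy English-Spanish examples, and the use of Python's `difflib` for finding shared tokens are all assumptions made for illustration, and only the simplest case (exactly one difference on each side) is handled.

```python
from difflib import SequenceMatcher

def generalize(tokens_1, tokens_2):
    """Split two token lists into a shared template and their differing parts.

    Hypothetical helper: every differing span is replaced by the variable "X".
    """
    matcher = SequenceMatcher(a=tokens_1, b=tokens_2, autojunk=False)
    template, diff_1, diff_2 = [], [], []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            template.extend(tokens_1[i1:i2])      # similar part: keep as-is
        else:
            template.append("X")                  # different part: variable
            diff_1.append(" ".join(tokens_1[i1:i2]))
            diff_2.append(" ".join(tokens_2[j1:j2]))
    return template, diff_1, diff_2

def learn_template(example_1, example_2):
    """Learn one template from two (source, target) pairs.

    Returns None unless there is exactly one difference on each side,
    because only then is the source/target correspondence unambiguous.
    """
    src_tpl, src_d1, src_d2 = generalize(example_1[0].split(), example_2[0].split())
    tgt_tpl, tgt_d1, tgt_d2 = generalize(example_1[1].split(), example_2[1].split())
    if len(src_d1) != 1 or len(tgt_d1) != 1:
        return None
    template = (" ".join(src_tpl), " ".join(tgt_tpl))
    # The differing source parts must correspond to the differing target parts.
    lexicon = [(src_d1[0], tgt_d1[0]), (src_d2[0], tgt_d2[0])]
    return template, lexicon

result = learn_template(("I drink water", "bebo agua"),
                        ("I drink milk", "bebo leche"))
print(result)
```

Here the shared parts ("I drink" and "bebo") form the template, while the differing parts yield the lexical correspondences water/agua and milk/leche. Examples with several differences would need an extra step to decide which source difference binds to which target difference.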

This paper presents a real-time system for human-machine spoken dialogue on the telephone in task-oriented domains. The system has been tested in a large trial with inexperienced users and it has proved robust enough to allow spontaneous interactions even for people with poor recognition performance. The robust behaviour of the system has been achieved by combining the use of specific language models during the recognition phase of analysis, the tolerance toward spontaneous speech phenomena, the activity of a robust parser, and the use of pragmatic-based dialogue knowledge. This integration of the different modules allows the system to deal with partial or total breakdowns at other levels of analysis. We report the field trial data of the system with respect to speech recognition metrics of word accuracy and sentence understanding rate, time-to-completion, time-to-acquisition of crucial parameters, and degree of success of the interactions in providing the speakers with the information they required. The evaluation data show that most of the subjects were able to interact fruitfully with the system. These results suggest that the design choices made to achieve robust behaviour are a promising way to create usable spoken language telephone systems.

Grammar-based parsing is a prevalent method for natural language understanding (NLU) and has been introduced into dialogue systems for spoken language processing (SLP). A robust parsing scheme is proposed in this paper to overcome the notorious phenomena, such as garbage, ellipsis, word disordering, fragment, and ill-form, which frequently occur in spoken utterances. Keyword categories are used as terminal symbols, and the definition of grammar is extended by introducing three new rule types, by-passing, up-messing and overcrossing, in addition to the general rules called up-tying in this paper, and the use of semantic items simplifies the semantics extraction. The corresponding parser marionette, which is essentially a partial chart parser, is enhanced to parse the semantic grammar. The robust parsing scheme integrating the above methods has been adopted in an air traveling information service system called EasyFlight, and has achieved a high performance when used for parsing spontaneous speeches.

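Through Chinese-to-English translation experiments and an analysis of the resulting translations, the translation performance of word-based, phrase-based, and syntax-based models is compared. The results show that the phrase-based model outperforms the other two but uses more parameters; the syntax-based model, although its translation performance is unsatisfactory, can express richer information with fewer parameters and merits further study.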

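An Instance-Pattern Generalized Matching Algorithm in the Multi-Strategy Machine Translation System IHSMTS
Exact-match EBMT is difficult to deploy in practice at scale because its translation coverage is too low. This paper proposes an instance-pattern generalized matching algorithm that attempts to improve EBMT translation coverage: guided by the input sentence to be translated, candidate translation instances are generalized in real time in a targeted way, so that the algorithm both meets the speed requirements of real-time document translation and makes full use of translation knowledge newly added or modified by users while the system is in use, thereby improving the system's overall translation coverage and quality. Experimental results show that, with a corpus of 160,000 sentence pairs, the system's translation coverage reaches about 75%, which demonstrates the effectiveness of the algorithm.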

In some cases, to make a proper translation of an utterance in a dialogue, different pieces of contextual information are needed. Interpreting such utterances often requires dialogue analysis including speech acts and discourse analysis. In this paper, a statistical dialogue analysis model for Korean–English dialogue machine translation based on speech acts is proposed. The model uses syntactic patterns and n-grams of speech acts. The syntactic patterns include surface syntactic features which are related to the language-dependent expressions of speech acts. Speech-act n-grams are used to approximate the context of utterances. The key feature is the use of speech-act n-grams based on hierarchical recency. Experimental results with trigrams show that the proposed model achieves an accuracy of 66.87% for the top candidate and 82.35% for the top three candidates. It indicates that the proposed model based on hierarchical recency outperforms the model based on linear recency.

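An Example-Based Chinese-English Machine Translation Strategy
This paper introduces an example-based Chinese-English machine translation strategy, focusing on the design of a Chinese-English bilingual corpus and on the algorithm for matching Chinese sentences against that corpus. When matching Chinese sentences, Chinese characters are matched directly, in keeping with the characteristics of Chinese, without word segmentation of the sentences. Determining the boundaries of matched fragments is also one of the difficulties of example-based machine translation, and a corresponding solution is adopted. The joining and assembly of translated sentences is not studied in depth, because this strategy is intended for a multi-engine translation system, where it works together with other translation strategies to improve the accuracy of the translation results. Example-based machine translation needs a large bilingual corpus as the basis for translation, and building a large corpus by hand is time-consuming and laborious, so automatic construction of the Chinese-English bilingual corpus by computer is attempted, including alignment at the document level and at the word level.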
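
Character-level matching without word segmentation can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the actual matching algorithm, so the longest shared character run (found with Python's `difflib`) stands in for it, and the function names and sample sentences are hypothetical.

```python
from difflib import SequenceMatcher

def longest_shared_fragment(sentence, example):
    """Return the longest run of characters that both strings share.

    Strings are compared character by character, so no word
    segmentation of the Chinese text is required.
    """
    matcher = SequenceMatcher(a=sentence, b=example, autojunk=False)
    match = matcher.find_longest_match(0, len(sentence), 0, len(example))
    return sentence[match.a:match.a + match.size]

def best_example(sentence, examples):
    """Pick the stored example sharing the longest fragment with the input.

    Assumes `examples` is non-empty; ties go to the earliest example.
    """
    candidates = ((ex, longest_shared_fragment(sentence, ex)) for ex in examples)
    return max(candidates, key=lambda pair: len(pair[1]))

corpus = ["我想订一个房间", "他买了两张机票"]
print(best_example("我想订两张机票", corpus))
```

The boundaries of the returned fragment are simply where the shared character run starts and stops; a real system would need a more careful boundary decision, which the abstract names as one of the hard parts.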

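Current research on Chinese-Tibetan machine translation concentrates on rule-based methods, mainly because basic resources such as Chinese-Tibetan parallel corpora are relatively scarce, making large-scale statistical Chinese-Tibetan machine translation experiments impractical. Based on the practical needs of a Chinese-Tibetan computer-aided translation project, and with limited parallel corpus resources, this paper proposes a machine translation method based on phrase-string examples that provides candidate translations for computer-aided translation. The method mainly uses word alignment information to fully exploit existing parallel corpus resources. Experimental results show that the proposed phrase-string example method outperforms conventional sentence-example-based translation and can retrieve phrase-string translation examples of arbitrary length. On the experimental test set, compared with Moses under default parameters, the method achieves a BLEU score close to that of Moses while improving the recall of phrase translation example strings by about 9.71%. On a test corpus with an average sentence length of 20 words, translation speed averages 0.175 s per sentence, which meets the real-time requirement of computer-aided translation.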

Development of a robust two-way real-time speech translation system exposes researchers and system developers to various challenges of machine translation (MT) and spoken language dialogues. The need for communicating in at least two different languages poses problems not present for a monolingual spoken language dialogue system, where no MT engine is embedded within the process flow. Integration of various component modules for real-time operation poses challenges not present for text translation. In this paper, we present the CCLINC (Common Coalition Language System at Lincoln Laboratory) English–Korean two-way speech translation system prototype trained on doctor–patient dialogues, which integrates various techniques to tackle the challenges of automatic real-time speech translation. Key features of the system include (i) language-independent meaning representation which preserves the hierarchical predicate–argument structure of an input utterance, providing a powerful mechanism for discourse understanding of utterances originating from different languages, word-sense disambiguation and generation of various word orders of many languages, (ii) adoption of the DARPA Communicator architecture, a plug-and-play distributed system architecture which facilitates integration of component modules and system operation in real time, and (iii) automatic acquisition of grammar rules and lexicons for easy porting of the system to different languages and domains. We describe these features in detail and present experimental results.

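Semantic Disambiguation Strategies in German-Chinese Machine Translation
This paper first analyzes semantic ambiguity in German and then proposes several disambiguation strategies that draw on valency and semantic information. These strategies have all been applied in the TJ TITR German-Chinese machine translation system developed at Tongji University. Practice shows that they not only resolve semantic ambiguity in machine translation fairly well but also greatly improve the system's running efficiency.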

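The phrase-based statistical translation model is one of the most widely used models in machine translation. However, because decoding uses exact phrase matching, it suffers from severe data sparsity, and many phrases in the phrase table cannot be fully exploited. This paper therefore proposes an interactive translation method based on human-machine cooperation. For a phrase that cannot be found in the phrase table, fuzzy matching is first used to find similar phrases in the phrase table. A combined classifier then judges which similar phrases may improve the translation quality of the sentence. Finally, through human-machine interaction, phrases that may improve translation quality while preserving the meaning of the original sentence are selected. Experimental results on a spoken-language corpus show that the method effectively improves the output quality of the translation system.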
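
The first step, fuzzy lookup of a phrase that is missing from the phrase table, might look like the sketch below. The similarity measure (token-level `difflib` ratio), the threshold of 0.6, the function name `similar_phrases`, and the toy phrase table are all assumptions; the abstract does not specify how similarity is computed, and the classifier and human-interaction steps are not modeled here.

```python
from difflib import SequenceMatcher

def similar_phrases(phrase, phrase_table, threshold=0.6, top_k=3):
    """Return up to top_k phrase-table entries that resemble `phrase`.

    Similarity is the difflib ratio over tokens, an assumed stand-in for
    whatever fuzzy-matching measure the original system uses.
    """
    query = phrase.split()
    scored = []
    for candidate in phrase_table:
        ratio = SequenceMatcher(a=query, b=candidate.split(),
                                autojunk=False).ratio()
        if ratio >= threshold:
            scored.append((ratio, candidate))
    # Highest similarity first; alphabetical order breaks ties deterministically.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [candidate for _, candidate in scored[:top_k]]

# Toy phrase table (hypothetical entries, source phrase -> translation).
table = {
    "book a room": "订一个房间",
    "book a table": "订一张桌子",
    "a cup of tea": "一杯茶",
}
print(similar_phrases("book a flight", table))
```

In the method described above, the candidates returned by such a lookup would then be filtered by the classifier and finally confirmed by the user.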

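Liu Zhanyi, Li Sheng, Liu Ting, Wang Haifeng. Journal of Software (软件学报), 2012, 23(6): 1472-1485
Example-based machine translation (EBMT) uses preprocessed bilingual example sentences as its main translation resource and generates a translation by editing translation examples that match the sentence to be translated. In an EBMT system, translation example selection and translation selection have a large effect on system performance. This paper proposes using a statistical collocation model to strengthen both in order to improve translation quality. First, a statistical collocation model is trained on a monolingual corpus using monolingual statistical word alignment. The model is then used to improve EBMT performance in three ways: (1) it estimates the degree of match between the sentence to be translated and a translation example, strengthening translation example selection; (2) it introduces an estimate of the collocation strength between candidate translations and their context, improving translation selection; (3) it detects the collocates of replaced words in a translation example and corrects those collocates according to the new replacement word and its context, further improving EBMT translation quality. To validate the proposed method, English-Chinese translation quality was evaluated on a word-based EBMT system. Compared with the baseline system, the proposed method improves the BLEU score by 4.73 to 6.48 percentage points. The collocation-model-based translation selection method was further tested on a semi-structured EBMT system, where it improves the BLEU score by 1.82 percentage points. Human evaluation also shows that translations from the improved semi-structured EBMT system convey most of the information in the source text and are highly fluent.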

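Research on Core Technologies in a Multi-Strategy Chinese-Japanese Machine Translation System
Multi-strategy machine translation is one direction in which today's machine translation systems are developing. This paper describes the core techniques and algorithms used by the core translation subsystems of a multi-strategy Chinese-Japanese machine translation system: a Chinese analysis subsystem that uses lexical analysis, syntactic parsing, and semantic role labeling; a translation-memory-based machine translation subsystem that uses a dual-index technique; an instance-pattern-based machine translation subsystem that uses syntax tree fragments as templates; and a machine translation subsystem that combines valency patterns with segment analysis. Tests of the translation memory subsystem show that it is highly efficient; the instance-pattern subsystem achieves 99% accuracy in a closed test of 1,559 sentences and 85% accuracy in an open test of 1,500 sentences; the valency-pattern subsystem achieves 89% accuracy in a test of 3,059 sentences.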

This paper describes recent developments at NTT in the areas of speech recognition, speech synthesis, and interactive voice systems as they relate to telecommunications applications. Speaker-independent large-vocabulary speech recognition based on context-dependent phone models and LR parser, and high-quality text-to-speech (TTS) conversion using the waveform concatenation method, both realized as software, have enabled interactive voice systems for fast and easy prototyping of telephone-based applications. Practical applications are discussed with examples.

