Similar literature
 Found 20 similar documents (search time: 15 ms)
1.
This paper describes a new version of a speech-to-sign-language translation system, with new tools and features that make it easier to adapt to a new task or a new semantic domain. The system is made up of a speech recognizer (which decodes the spoken utterance into a word sequence), a natural language translator (which converts the word sequence into a sequence of signs belonging to the sign language), and a 3D avatar animation module (which plays back the signs). To increase adaptability, the paper presents improvements in all three main modules for automatically generating the task-dependent information from a parallel corpus: automatic generation of Spanish variants when building the vocabulary and language model for the speech recognizer, an acoustic adaptation module for the speech recognizer, data-oriented language and translation models for the machine translator, and a list of signs to design. The avatar animation module includes a new editor for rapidly designing the required signs. These developments were needed to reduce the effort of adapting a Spanish into Spanish Sign Language (LSE: Lengua de Signos Española) translation system to a new domain. The whole translation system achieves a SER (Sign Error Rate) lower than 10% and a BLEU higher than 90%, while the effort of adapting the system to a new domain has been reduced by more than 50%.
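The abstract reports a Sign Error Rate without defining it. A minimal sketch, assuming SER is computed the way word error rate usually is (edit distance between the reference and hypothesis sign sequences, divided by the reference length); the function name and the example glosses are illustrative, not taken from the paper:

```python
def sign_error_rate(reference, hypothesis):
    """Levenshtein distance between two sign sequences, divided by the
    number of signs in the reference (assumed definition of SER)."""
    if not reference:
        raise ValueError("reference must contain at least one sign")
    # prev[j] holds the edit distance between the reference prefix seen
    # so far and the first j signs of the hypothesis.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_sign in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_sign in enumerate(hypothesis, start=1):
            cost = 0 if ref_sign == hyp_sign else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(reference)


# One substituted sign out of four reference signs.
ser = sign_error_rate(["TU", "DNI", "RENOVAR", "QUERER"],
                      ["TU", "PASAPORTE", "RENOVAR", "QUERER"])
```

Under this definition a SER below 10% means fewer than one edit per ten reference signs on average.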

2.
This paper describes the development of LSESpeak, a spoken Spanish generator for Deaf people. The system integrates two main tools: a sign-language-to-speech translation system and an SMS (Short Message Service) to speech translation system. The first tool is made up of three modules: an advanced visual interface (where a deaf person can specify a sequence of signs), a language translator (which generates the sequence of words in Spanish), and an emotional text-to-speech (TTS) converter that generates spoken Spanish. The visual interface allows a sign sequence to be defined using several utilities. The emotional TTS converter is based on Hidden Semi-Markov Models (HSMMs), which allow voice gender, type of emotion, and emotional strength to be controlled. The second tool is made up of an SMS message editor, a language translator, and the same emotional TTS converter. Both translation tools use a phrase-based translation strategy in which the translation and target language models are trained from parallel corpora. In the experiments carried out to evaluate translation performance, the sign language to speech system achieved a BLEU of 96.45 and the SMS to speech system a BLEU of 44.36 in a specific domain: the renewal of the Identity Document and Driving License. In the evaluation of the emotional TTS, the notable results are the improvement in naturalness thanks to the morpho-syntactic features and the high flexibility provided by HSMMs when generating different emotional strengths.

3.
One of the aims of Assistive Technologies is to help people with disabilities to communicate with others and to provide means of access to information. As an aid to Deaf people, we present in this work a production-quality rule-based machine system for translating from Spanish to Spanish Sign Language (LSE) glosses, which is a necessary precursor to building a full machine translation system that eventually produces animation output. The system implements a transfer-based architecture from the syntactic functions of dependency analyses. A sketch of LSE is also presented. Several topics regarding translation to sign languages are addressed: the lexical gap, the bootstrapping of a bilingual lexicon, the generation of word order for topic-oriented languages, and the treatment of classifier predicates and classifier names. The system has been evaluated with an open-domain testbed, reporting a 0.30 BLEU (BiLingual Evaluation Understudy) and 42% TER (Translation Error Rate). These results show consistent improvements over a statistical machine translation baseline, and some improvements over the same system preserving the word order in the source sentence. Finally, the linguistic analysis of errors has identified some differences due to a certain degree of structural variation in LSE.

4.
This paper describes a preprocessing module for improving the performance of a Spanish into Spanish Sign Language (Lengua de Signos Española: LSE) translation system when dealing with sparse training data. The preprocessing module replaces Spanish words with associated tags. The list of Spanish words (the vocabulary) and associated tags used by the module is computed automatically by selecting, for every Spanish word, the sign that shows the highest probability of being its translation. This automatic tag extraction has been compared with a manual strategy and achieves almost the same improvement. In this analysis, several alternatives for dealing with non-relevant words have been studied. Non-relevant words are Spanish words not assigned to any sign. The preprocessing module has been incorporated into two well-known statistical translation architectures: a phrase-based system and a Statistical Finite State Transducer (SFST). The system has been developed for a specific application domain: the renewal of Identity Documents and Driver's Licenses. To evaluate the system, a parallel corpus made up of 4080 Spanish sentences and their LSE translations has been used. The evaluation results revealed a significant performance improvement when this preprocessing module is included. In the phrase-based system, the proposed module increased BLEU (Bilingual Evaluation Understudy) from 73.8% to 81.0% and the human evaluation score from 0.64 to 0.83. In the case of the SFST, BLEU increased from 70.6% to 78.4% and the human evaluation score from 0.65 to 0.82.
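The tag replacement step described above can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: the lexical probability table, the `min_prob` threshold, the `<NR>` placeholder, and the three ways of handling non-relevant words ("drop", "keep", "tag") are all hypothetical choices, since the abstract does not list the alternatives it studied.

```python
def build_tag_map(lexical_probs, min_prob=0.0):
    """For each Spanish word keep the sign with the highest translation
    probability. Words whose best sign falls below min_prob get no tag
    and are treated as non-relevant."""
    tag_map = {}
    for word, sign_probs in lexical_probs.items():
        if not sign_probs:
            continue
        sign, prob = max(sign_probs.items(), key=lambda item: item[1])
        if prob >= min_prob:
            tag_map[word] = sign
    return tag_map


def preprocess(sentence, tag_map, non_relevant="drop"):
    """Replace each word with its tag; handle untagged words as requested."""
    if non_relevant not in ("drop", "keep", "tag"):
        raise ValueError("non_relevant must be 'drop', 'keep' or 'tag'")
    output = []
    for word in sentence.lower().split():
        if word in tag_map:
            output.append(tag_map[word])
        elif non_relevant == "keep":
            output.append(word)
        elif non_relevant == "tag":
            output.append("<NR>")  # hypothetical placeholder tag
        # "drop": the word is simply omitted
    return output


# Made-up probabilities, for illustration only.
probs = {
    "renovar": {"RENOVAR": 0.9, "NUEVO": 0.1},
    "dni": {"DNI": 0.95},
    "el": {"DNI": 0.2, "RENOVAR": 0.1},
}
tags = build_tag_map(probs, min_prob=0.5)
tagged = preprocess("Renovar el DNI", tags)
```

Collapsing surface words onto a smaller tag vocabulary is what makes the approach attractive with sparse training data: the translation model sees fewer distinct source tokens, each with more examples.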

5.
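This article explores how to make the cartoon character in sign language news broadcasts sign according to the grammar rules of natural sign language rather than the grammar rules used by hearing people. It first collects the rules of modern Chinese natural sign language and formalizes them, then builds a formal rule base for converting standard Chinese into Chinese natural sign language, which makes it possible to generate automatically, from modern Chinese text, the sign action sequence of the corresponding natural sign language. Finally, the method is embedded in a sign language news broadcast system based on sign language synthesis and cartoon animation, so that the system's online output is natural sign language that matches the habits of Deaf people.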

6.
This article deals with a flexible natural language interface for accessing data stored in a relational database. This interface may prove of great value to the less sophisticated user. The FIDO system (Flexible Interface for Database Operations) is presented; it accepts queries issued in natural language (Italian) and translates them into relational algebra operations. FIDO is composed of a parser (not described in the paper), a two-level semantic network, which (among other things) expresses the correspondence between the natural language terms and the conceptual database objects, and a translator/optimizer, which translates the conceptual query into its logical equivalent (i.e. into a query expressed in terms of stored relations and their attributes). The article describes the main characteristics of the semantic network and addresses, in greater detail, the problem of query translation and optimization. The flexibility of FIDO is due to the complete independence of the semantic knowledge source from the logical schema of the database. In fact, the logical schema can be designed on the basis of considerations not related to the overall structure of FIDO (e.g. the presence of particular types of applications that have to be implemented in a particularly efficient way). In principle, the (relational) database could already exist before FIDO is adopted, since the data structures used by the translator/optimizer and described in this paper are able to describe the correspondence between the conceptual model of the domain and different logical schemas.

7.
System Combination for Machine Translation of Spoken and Written Language   Total citations: 1, self-citations: 0, citations by others: 1
This paper describes an approach for computing a consensus translation from the outputs of multiple machine translation (MT) systems. The consensus translation is computed by weighted majority voting on a confusion network, similarly to the well-established ROVER approach of Fiscus for combining speech recognition hypotheses. To create the confusion network, pairwise word alignments of the original MT hypotheses are learned using an enhanced statistical alignment algorithm that explicitly models word reordering. The context of a whole corpus of automatic translations rather than a single sentence is taken into account in order to achieve high alignment quality. The confusion network is rescored with a special language model, and the consensus translation is extracted as the best path. The proposed system combination approach was evaluated in the framework of the TC-STAR speech translation project. Up to six state-of-the-art statistical phrase-based translation systems from different project partners were combined in the experiments. Significant improvements in translation quality from Spanish to English and from English to Spanish in comparison with the best of the individual MT systems were achieved under official evaluation conditions.
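The voting step on its own can be sketched as below. This is a simplified illustration, not the paper's system: it assumes the hypotheses have already been aligned into equal-length slot sequences (with an empty string standing for an empty arc), and it leaves out both the statistical alignment with reordering and the language model rescoring. The system weights are made up.

```python
from collections import defaultdict

EPS = ""  # marks an empty arc in a confusion network slot


def consensus(aligned_hyps, weights):
    """Pick, slot by slot, the word with the largest summed system weight."""
    if len(aligned_hyps) != len(weights):
        raise ValueError("need exactly one weight per hypothesis")
    if len({len(hyp) for hyp in aligned_hyps}) > 1:
        raise ValueError("hypotheses must be aligned to the same length")
    output = []
    for slot in zip(*aligned_hyps):
        votes = defaultdict(float)
        for word, weight in zip(slot, weights):
            votes[word] += weight
        best = max(votes, key=votes.get)
        if best != EPS:  # an empty arc winning means "emit nothing here"
            output.append(best)
    return output


hyps = [
    ["the", "cat", "sat", EPS],
    ["a", "cat", "sat", "down"],
    ["the", "cat", "sits", "down"],
]
result = consensus(hyps, weights=[0.4, 0.3, 0.3])
```

Because each slot is decided independently, the output can combine words that no single system produced together, which is why the paper rescoring the network with a language model before extracting the best path matters in practice.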

8.
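Understanding and Synthesis of Chinese Sign Language for Machine Translation   Total citations: 4, self-citations: 0, citations by others: 4
Xu Lin, Gao Wen. Chinese Journal of Computers (计算机学报), 2000, 23(1): 60-65
Research on automatic translation between natural languages and visual languages has great practical significance and academic value; it is a new research field with promising prospects. From the perspective of machine translation, this paper examines the similarities and differences between Chinese and Chinese Sign Language, discusses the characteristics of the two languages in word order, sentence structure, phrase structure, and special word classes, and establishes a set of rules for machine translation from Chinese to Chinese Sign Language. On this basis, a system that translates Chinese into the visual language Chinese Sign Language is implemented using a rule-interpretation approach.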

9.
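Against the background of increasingly frequent bilateral exchanges between China and Thailand and the widespread use of Android apps, a Chinese-English-Thai talking electronic dictionary with mutual translation was designed and implemented for the Android platform. The application was developed in Android Studio using the Java language and an SQLite database. The local lexicon is created from a Thai corpus in a special way, which solves the problem of Thai text appearing garbled when handled by SQLite visual management tools. The key technique of the system is the use of SQL to look up word definitions in the prepared local lexicon. The system features dialogue translation and photo translation, and also provides Chinese-English-Thai trilingual lookup and translation as well as Thai pronunciation read by a human speaker. Tests show that the software offers a reasonable degree of convenience and practicality.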
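The lexicon lookup can be illustrated with a small sketch. The original application is written in Java on Android; Python's standard `sqlite3` module is used here only to show the SQL side. The table name `lexicon`, its three columns, and the sample rows are hypothetical, since the abstract does not give the schema.

```python
import sqlite3

LANGS = ("zh", "en", "th")  # hypothetical column names, one per language


def build_lexicon(entries):
    """Create an in-memory lexicon table from (zh, en, th) tuples."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE lexicon (zh TEXT, en TEXT, th TEXT)")
    conn.executemany(
        "INSERT INTO lexicon (zh, en, th) VALUES (?, ?, ?)", entries)
    conn.commit()
    return conn


def lookup(conn, word, source_lang):
    """Return the trilingual entry for a word, or None if it is absent."""
    if source_lang not in LANGS:
        raise ValueError("source_lang must be one of " + ", ".join(LANGS))
    # The column name comes from the fixed whitelist above; the search
    # term itself is always passed as a bound parameter.
    row = conn.execute(
        f"SELECT zh, en, th FROM lexicon WHERE {source_lang} = ?",
        (word,),
    ).fetchone()
    return dict(zip(LANGS, row)) if row else None


conn = build_lexicon([
    ("你好", "hello", "สวัสดี"),
    ("谢谢", "thank you", "ขอบคุณ"),
])
entry = lookup(conn, "hello", "en")
```

Inserting the Thai strings programmatically as Unicode text, rather than through a graphical database tool, is one plausible way to avoid the garbling the abstract mentions, though the abstract does not say which method the authors used.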

10.
11.
The feasibility of computer translation of scientific and medical documents is controversial. This report describes a minicomputer-based translation system (TRANSOFT) that employs word order rearrangement followed by word-for-word translation and resolution of ambiguities based on context. This translation system was applied to an entire medical textbook written in German and to short medical texts written in French, Italian, Spanish and Turkish. Results suggest the versatility of TRANSOFT for narrowly defined translation problems. As foreign language medical documents and medical records become increasingly available in computer readable form through word processing, computerized typesetting and hospital information systems, computer translation methods may provide a rapid and inexpensive means of obtaining draft translations.

12.
13.
We propose an alternative method of machine-aided translation: Structure-Based Machine Translation (SBMT). SBMT uses language structure matching techniques to reduce complicated grammar rules and provide efficient and feasible translation results. SBMT comprises the following four features: (1) source language input sentence analysis; (2) source language sentence transformation into target language structure; (3) dictionary lookup; and (4) semantic disambiguation or word sense disambiguation (WSD) for correct output selection. SBMT has been designed and a prototype system has been implemented that generates satisfactory translations.

14.
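Automatic word segmentation of Dai text is foundational work in Dai information processing and underlies later tasks such as developing Dai input methods, building Dai machine translation systems, and extracting information from Dai text. Constrained by the state of Dai corpus technology, natural language processing for Dai remains relatively weak. This paper first analyzes the characteristics of the Dai script and builds a Dai corpus on that basis. It then applies Chinese word segmentation methods to Dai and, taking the characteristics of Dai into account, designs a Dai word segmentation system based on syllable sequence labeling. In experiments, the segmentation system achieved an overall evaluation score of 95.58%.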
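Segmentation by syllable sequence labeling ends with a decoding step that turns per-syllable labels back into words. A minimal sketch of that step, assuming the common B/M/E/S tag set (begin, middle, end, single); the abstract does not state which tag set or which labeling model the authors used, and the romanized syllables below are placeholders rather than real Dai text:

```python
def decode_bmes(syllables, tags):
    """Group syllables into words according to their B/M/E/S labels."""
    if len(syllables) != len(tags):
        raise ValueError("need exactly one tag per syllable")
    words, current = [], []
    for syllable, tag in zip(syllables, tags):
        # A new word starts at B or S; flush any unfinished word first so
        # that a malformed tag sequence still yields a segmentation.
        if tag in ("B", "S") and current:
            words.append("".join(current))
            current = []
        current.append(syllable)
        if tag in ("E", "S"):
            words.append("".join(current))
            current = []
    if current:  # sequence ended in the middle of a word
        words.append("".join(current))
    return words


segmented = decode_bmes(["ka", "la", "mi", "no", "ta"],
                        ["B", "E", "S", "B", "E"])
```

The labels themselves would come from a trained sequence labeler; this sketch covers only the deterministic conversion from labels to word boundaries.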

15.
16.
As the cognitive processes of natural language understanding and generation become better understood, machine translation is becoming easier to perform. In this paper we present our work on machine translation from Arabic to English and French, and illustrate it with a fully operational system, which runs on PC compatibles with an Arabic/Latin interface. This system is an extension of an earlier system, whose task was the analysis of Arabic as a natural language. Thanks to the regularity of its phrase structures and word patterns, Arabic lends itself quite naturally to a Fillmore-like analysis. The meaning of a phrase is stored in a star-like data structure, where the verb occupies the center of the star and the various noun sentences occupy specific peripheral nodes of the star. The data structure is then translated into an internal representation in the target language, which is then mapped into the target text.

17.
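Computer-assisted Chinese learning is an important research area that has attracted growing interest within natural language processing. This paper proposes a reading-assistance system built on character-level analysis units, which provides learners of Chinese with instant translation and learning support. The system first applies a character-information-based method of Chinese lexical analysis to segment the text of Chinese web pages, and then discovers new words using a method based on the structural information of their component characters. For new words not covered by general dictionaries (for example, technical terms, proper nouns, and fixed phrases), the system mines idiomatic translations from the Web using a method based on semantic prediction and feedback learning. For common words, the system displays translations instantly from a Chinese-English (or Chinese-Japanese) dictionary, and users can also retrieve concrete usage examples of a word from the Web through the word-usage search module. The key techniques of the system are character-information-based Chinese lexical analysis, new word discovery based on component-character structural information, and acquisition of new-word translations based on semantic prediction and feedback learning. All of these modules follow the character-level analysis approach, which runs through the whole system. Experiments show that the system performs well in all respects.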

18.
In this paper we present a tool that uses comparable corpora to find appropriate translation equivalents for expressions that are considered by translators as difficult. For a phrase in the source language the tool identifies a range of possible expressions used in similar contexts in target language corpora and presents them to the translator as a list of suggestions. In the paper we discuss the method and present results of human evaluation of the performance of the tool, which highlight its usefulness when dictionary solutions are lacking.

19.
This paper presents an ongoing effort toward an online collaborative framework allowing Deaf individuals to author intelligible signs using a dedicated 3D animation authoring interface. The results presented focus mainly on the design of a dedicated user interface assisted by novel input devices. This design can benefit not only the Deaf but also linguists studying sign language, by providing them with a novel kind of study material. This material would consist of a symbolic representation of intelligible sign language animations together with a fine-grained log of the user's edit actions. Two user studies demonstrate how Leap Motion and Kinect-like devices can be used together for recording and authoring hand trajectories as well as facial animation.

20.
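With the growing demand for multilingual information on the Internet, cross-lingual word embeddings have become an important basic tool and have been applied successfully in natural language processing areas such as machine translation, information retrieval, and text sentiment analysis. Cross-lingual word embeddings are a natural extension of monolingual word embeddings: by mapping different languages into a shared low-dimensional vector space, the cross-lingual representation of words transfers knowledge between languages and so captures word meaning accurately in a multilingual setting. Research on cross-lingual word embedding models has been productive in recent years, and researchers have proposed many methods for generating such embeddings. Through a review of the existing literature on cross-lingual word embedding models, this paper surveys the recent development of models, methods, and techniques. According to how the embeddings are trained, the methods are divided into three classes: supervised, unsupervised, and semi-supervised learning. The principles and representative studies of each class are summarized and compared in detail. Finally, the paper outlines the evaluation and applications of cross-lingual word embeddings and analyzes the challenges they face and future directions.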
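Once two languages share one vector space, a word can be translated by retrieving its nearest neighbour among the target-language vectors, which is also a common way to evaluate such embeddings. A minimal sketch of that retrieval step, assuming the vectors have already been mapped into the shared space; the two-dimensional vectors are invented for illustration, and the survey's own evaluation protocols are not described in the abstract:

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def translate(word, source_vecs, target_vecs):
    """Return the target word whose vector is closest to the source word's."""
    query = source_vecs[word]
    return max(target_vecs, key=lambda cand: cosine(query, target_vecs[cand]))


# Toy vectors, assumed to live in one shared space already.
english = {"cat": [0.9, 0.1], "dog": [0.1, 0.9]}
spanish = {"gato": [0.85, 0.2], "perro": [0.15, 0.95]}
match = translate("cat", english, spanish)
```

The supervised, unsupervised, and semi-supervised methods the survey compares differ in how they obtain the shared space, not in this retrieval step.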
