Found 20 similar documents (search time: 15 ms)
1.
Nakamura S. Markov K. Nakaiwa H. Kikui G. Kawai H. Jitsuhiro T. Jin-Song Zhang Yamamoto H. Sumita E. Yamamoto S. 《IEEE transactions on audio, speech, and language processing》2006,14(2):365-376
In this paper, we describe the ATR multilingual speech-to-speech translation (S2ST) system, which is mainly focused on translation between English and Asian languages (Japanese and Chinese). There are three main modules of our S2ST system: large-vocabulary continuous speech recognition, machine text-to-text (T2T) translation, and text-to-speech synthesis. All of them are multilingual and are designed using state-of-the-art technologies developed at ATR. A corpus-based statistical machine learning framework forms the basis of our system design. We use a parallel multilingual database consisting of over 600 000 sentences that cover a broad range of travel-related conversations. Recent evaluation of the overall system showed that speech-to-speech translation quality is high, being at the level of a person having a Test of English for International Communication (TOEIC) score of 750 out of the perfect score of 990.
2.
John Dines Hui Liang Lakshmi Saheer Matthew Gibson William Byrne Keiichiro Oura Keiichi Tokuda Junichi Yamagishi Simon King Mirjam Wester Teemu Hirsimäki Reima Karhila Mikko Kurimo 《Computer Speech and Language》2013,27(2):420-437
In this paper we present results of unsupervised cross-lingual speaker adaptation applied to text-to-speech synthesis. The application of our research is the personalisation of speech-to-speech translation in which we employ an HMM statistical framework for both speech recognition and synthesis. This framework provides a logical mechanism to adapt synthesised speech output to the voice of the user by way of speech recognition. In this work we present results of several different unsupervised and cross-lingual adaptation approaches as well as an end-to-end speaker adaptive speech-to-speech translation system. Our experiments show that we can successfully apply speaker adaptation in both unsupervised and cross-lingual scenarios and our proposed algorithms seem to generalise well for several language pairs. We also discuss important future directions including the need for better evaluation metrics.
3.
Rohit Prasad Prem Natarajan David Stallard Shirin Saleem Shankar Ananthakrishnan Stavros Tsakalidis Chia-lin Kao Fred Choi Ralf Meermeier Mark Rawls Jacob Devlin Kriste Krstovski Aaron Challenner 《Computer Speech and Language》2013,27(2):475-491
In this paper we present a speech-to-speech (S2S) translation system called the BBN TransTalk that enables two-way communication between speakers of English and speakers who do not understand or speak English. The BBN TransTalk has been configured for several languages including Iraqi Arabic, Pashto, Dari, Farsi, Malay, Indonesian, and Levantine Arabic. We describe the key components of our system: automatic speech recognition (ASR), machine translation (MT), text-to-speech (TTS), dialog manager, and the user interface (UI). In addition, we present novel techniques for overcoming specific challenges in developing high-performing S2S systems. For ASR, we present techniques for dealing with lack of pronunciation and linguistic resources and effective modeling of ambiguity in pronunciations of words in these languages. For MT, we describe techniques for dealing with data sparsity as well as modeling context. We also present and compare different user confirmation techniques for detecting errors that can cause the dialog to drift or stall.
4.
5.
Pitrelli J.F. Bakis R. Eide E.M. Fernandez R. Hamza W. Picheny M.A. 《IEEE transactions on audio, speech, and language processing》2006,14(4):1099-1108
Expressive text-to-speech (TTS) synthesis should contribute to the pleasantness, intelligibility, and speed of speech-based human-machine interactions which use TTS. We describe a TTS engine which can be directed, via text markup, to use a variety of expressive styles, here, questioning, contrastive emphasis, and conveying good and bad news. Differences in these styles lead us to investigate two approaches for expressive TTS, a "corpus-driven" and a "prosodic-phonology" approach. Each speaker records 11 h (excluding silences) of "neutral" sentences. In the corpus-driven approach, the speaker also records 1-h corpora in each expressive style; these segments are tagged by style for use during search, and decision trees for determining f0 contours and timing are trained separately for each of the neutral and expressive corpora. In the prosodic-phonology approach, rules translating certain expressive markup elements to tones and break indices (ToBI) are manually determined, and the ToBI elements are used in single f0 and duration trees for all expressions. Tests show that listeners identify synthesis in particular styles ranging from 70% correctly for "conveying bad news" to 85% for "yes-no questions". Further improvements are demonstrated through the use of speaker-pooled f0 and duration models.
6.
JongHo Shin Panayiotis G. Georgiou Shrikanth Narayanan 《Computer Speech and Language》2013,27(2):554-571
The study provides an empirical analysis of long-term user behavioral changes and varying user strategies during cross-lingual interaction using the multimodal speech-to-speech (S2S) translation system of USC/SAIL. The goal is to inform user adaptive designs of such systems. A 4-week medical-scenario-based study provides the basis for our analysis. The data analyzed includes user interviews, post-session surveys, and the extensive system logs that were post-processed and annotated. The annotations measured the meaning transfer rates using human evaluations and a scale defined here called the concept matching score. First, qualitative data analysis investigates user strategies in dealing with errors, such as repeat, rephrase, change topic, start over, and the participants’ self-reported longitudinal adaptation to errors. Post-session surveys explore participant experience with the system and point to a trend of user-perceived increased performance over time. The log data analysis provides further insightful results. Users chose to allow some degradation (84% of original concepts) of their intended meaning to proceed through the system, even after they observed potential errors in the visual output from the speech recognizer. The rejected utterances, on average, had only 25% of the original concepts. This user-filtered outcome, after the complete channel transfer through the S2S system, is that 91% of the successful turns result in transfer of at least half the intended concepts while 90% of the user rejected turns would have conveyed less than half the intended meaning. The multimodal interface results in 24% relative improvement in the confirmation mode and in 31% relative improvement in the choice mode compared to the speech-only modality. Analysis also showed that users of the multimodal interface temporally change their strategies by accepting more system-produced choices. This user behavior can expedite communication seeking an operating balance between user strategies and system performance factors. Lastly, user utterance length is analyzed. Longer utterances in general imply more information delivered per utterance but potentially at the cost of increased processing degradation. The analysis demonstrates that users reduce their utterance length after unsuccessful turns and increase it after successful turns and that there is a learning effect that increases this behavior over the duration of the study.
7.
Sonification is a fairly new term to scientists who are unaware of its multiple use cases. Even if some general definitions of the concept of sonification are commonly accepted, heterogeneous techniques – significantly different as regards approaches, means and goals – are available. In this work we propose a reference system useful to interpret already-existing sonification instances and to plan new sonification tasks. This work aims to present a reference system for sonification using the inherent properties in the sonic output rather than the data itself. Validation has been conducted by automatically analyzing available experiments and examples, and placing them on the proposed sonification space, according to time-granularity and abstraction-level dimensions. This work can constitute the starting point for future research on computer-assisted sonification. It will be beneficial to a wide range of readers, in particular those from different disciplines looking at new ways to present and analyze data.
8.
A multi-agent system for the decentralized resource-constrained multi-project scheduling problem (total citations: 1; self-citations: 0; citations by others: 1)
Jörg Homberger 《International Transactions in Operational Research》2007,14(6):565-589
A restart evolution strategy (RES) for the resource‐constrained project scheduling problem (RCPSP), as well as its integration in a multi‐agent system (MAS) for solving the decentralized resource‐constrained multi‐project scheduling problem (DRCMPSP) will be presented. To evaluate the developed approach, problem instances of the RCPSP taken from the literature with up to 300 activities are used, as well as 80 generated instances of the DRCMPSP, with up to 20 projects and with up to 120 activities each. For 73 instances of the RCPSP, the RES found better solutions than the best ones found so far. In addition, the MAS is suitable for solving large multi‐project instances decentrally. The results for the DRCMPSP instances show that the presented decentralized MAS is competitive with a central solution approach.
9.
10.
With the advent of mobile devices and the convergence of wireless technologies and the Internet, both the content and the quality of research in this field are subject to regular change. A variety of state-of-the-art computing devices that are compatible with each other have been produced. These devices have the ability to interact with people. This is also known as pervasive computing. Particularly, as smartphones have recently become one of the most popular devices worldwide, various convenient applications are being released. Smartphones available today not only provide the ordinary internal processes such as dialing or receiving phone calls, sending text messages, and doing mobile banking, but also increasingly control various other devices that are part of our daily lives. In effect, this means that through smartphone applications, we can remotely control a variety of external devices such as televisions, projectors for presentations, computers, and even cars. The research in this paper is based on the evolving technological possibilities of using smartphone applications to control external devices. This paper presents the design and implementation of a remote lock system using wireless communication on a smartphone. In this context, remote lock system refers to a lock system that can be controlled remotely by a dedicated Android application. Every smartphone is equipped with Bluetooth which makes this technology possible. The application proposed in this paper uses the existing Bluetooth function on Android smartphones to open and manage locks. The users’ lock information can be stored and managed in real time in the database via a server that is built and managed by a server manager. Even if users forget the password of the lock, our proposed lock system can guide them to retrieve it easily, and a user manual is included to help users navigate the system. 
This system also provides a variety of management functions such as adding, deleting, modifying, and purchasing the user’s own locks.
11.
12.
In this paper we present EXTRA (EXample-based TRanslation Assistant), a translation memory (TM) system. EXTRA is able to propose
effective translation suggestions by relying on syntactic analysis of the text and on a rigorous, language-independent measure;
the search is performed efficiently in large amounts of bilingual texts thanks to its advanced retrieval techniques. EXTRA
does not use external knowledge requiring the intervention of users and is completely customizable and portable as it has
been implemented on top of a standard DataBase Management System. The paper provides a thorough evaluation of both the effectiveness
and the efficiency of our system. In particular, in order to quantify the benefits offered by EXTRA assisted translation over
manual translation, we introduce a simulator implementing specifically devised statistical, process-oriented, discrete-event
models. As far as we know, this is the first time statistical simulation experiments have been used to face the nontrivial
problem of evaluating TM systems, particularly for comparing the time that could be saved by performing assisted translation
versus “manual” translation and for optimally tuning the system behaviour with respect to differently skilled users. In our
experiments, we considered three scenarios, manual translation with one or two translators and assisted translation with one
translator. The time needed for one translator to do an assisted translation is significantly closer to that of a team of
two translators than to that of the single translator. The mean sentence translation time is by far the lowest for this scenario,
corresponding to the highest per translator productivity. We also estimate the total translation time when the number of query
sentences, the maximum number of suggestions to be read, and the probability of look up are varied: the best trade-off is
given by reading (and presenting) four or five suggestions at the most.
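The retrieval step of a translation memory, as described in the abstract above, can be sketched roughly as follows. This is only an illustration under stated assumptions: EXTRA's own similarity measure and retrieval techniques are not given in the abstract, so Python's `difflib.SequenceMatcher` on word tokens is swapped in as a stand-in, and the function name `suggest`, its parameters, and the tiny English-Italian memory are all hypothetical. The cap of five suggestions mirrors the "four or five suggestions" trade-off reported above.

```python
from difflib import SequenceMatcher


def suggest(query, memory, max_suggestions=5, min_score=0.4):
    """Return up to max_suggestions (score, source, target) tuples, best first.

    The score is difflib's word-level similarity ratio, used here as a
    stand-in for a real translation-memory measure (assumption).
    """
    query_tokens = query.lower().split()
    scored = []
    for source, target in memory:
        score = SequenceMatcher(None, query_tokens, source.lower().split()).ratio()
        if score >= min_score:
            scored.append((score, source, target))
    # Highest similarity first; the translator reads only the top few.
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:max_suggestions]


# Hypothetical bilingual memory, invented for illustration only.
memory = [
    ("open the file menu", "apri il menu file"),
    ("close the file menu", "chiudi il menu file"),
    ("save the document", "salva il documento"),
]
hits = suggest("open the edit menu", memory)
for score, source, target in hits:
    print(f"{score:.2f}  {source}  ->  {target}")
```

Capping the list keeps the reading cost per query bounded, which is the variable the abstract's simulation varies.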
13.
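A solution for Chinese character conversion between IBM mainframes and minicomputers (total citations: 1; self-citations: 0; citations by others: 1)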
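This paper describes the data conversion problem that arises when Chinese data is transmitted via CICS between an IBM ES/9000 mainframe (running the MVS/VSE operating system) and an RS/6000 minicomputer (running the AIX operating system). It analyzes why Chinese EBCDIC codes and Chinese ASCII codes cannot be converted correctly through CICS configuration alone, and gives two solutions: the first performs the conversion by combining a CICS program, a Java program, and CICS configuration; the second uses only a Java program and CICS configuration.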
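The general principle behind such a conversion can be sketched as below. This is not the paper's Java/CICS solution, and the abstract does not reproduce the paper's own analysis; the sketch only assumes the usual approach of decoding bytes with the source code page and re-encoding them for the target. Python's standard library ships single-byte EBCDIC code pages such as `cp037` but no codec for IBM's double-byte Chinese EBCDIC code pages, so the host side is shown with Latin text only and GBK stands in for the ASCII-side Chinese encoding. The helper name `transcode` is hypothetical.

```python
def transcode(data: bytes, src: str, dst: str) -> bytes:
    """Decode bytes from the source code page, then re-encode for the target."""
    return data.decode(src).encode(dst)


# Single-byte Latin text: cp037 is an EBCDIC code page shipped with Python.
ebcdic_latin = "CICS".encode("cp037")
ascii_latin = transcode(ebcdic_latin, "cp037", "ascii")

# Chinese text needs a double-byte encoding on the ASCII side (GBK here).
gbk_hanzi = "汉字".encode("gbk")

# A single-byte code page has no slot for Chinese characters at all,
# so a plain single-byte mapping cannot carry them.
try:
    "汉字".encode("cp037")
    single_byte_ok = True
except UnicodeEncodeError:
    single_byte_ok = False

print(ebcdic_latin.hex(), ascii_latin, len(gbk_hanzi), single_byte_ok)
```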
14.
Radhakrishnan Jayaraman Michael P. Deisenroth 《Computers & Industrial Engineering》1987,12(4):275-282
Industrial robots may be programmed using teach methods, off-line programming languages or by using interactive robot programming systems. This paper briefly explains each method, describes the advantages of developing interactive robot programming systems, and then describes an interactive robot programming system developed for the IBM 7545 robot. The approach used in the development process, the interactive execution and user options, and a demonstration of the operation of this interactive robot programming system are also presented.
15.
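Several improvements are made to a DC motor control system; they largely resolve cases in which abnormal conditions cause the motor to rotate abnormally or to keep running once it has started. The improvements are highly practical and can easily be ported to other MCU control systems.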
16.
17.
18.
《Future Generation Computer Systems》1986,2(2):117-119
NEC has been developing a Japanese-English bi-directional machine translation system called VENUS (Vehicle for Natural language Understanding & Synthesis) in order to reduce the increasing cost of the manual translation of vast amounts of in-house technical documents. In addition, a translation support subsystem has been developed on the basis of VENUS, and extended to have the requisite facilities to prepare translated documents, such as document entry, editing, translation, printing, management, etc. This paper briefly introduces the current status of the VENUS translation system, and the basic idea for the system development.
19.
《Future Generation Computer Systems》1986,2(2):95-100
Due to the rapid advancement of both computer technology and linguistic theory, machine translation systems are now coming into practical use. Fujitsu has two machine translation systems: ATLAS-I is a syntax-based machine translation system which translates English into Japanese, and ATLAS II is a semantic-based system which aims at high quality multilingual translation. In this paper, both the ATLAS-I and ATLAS II translation mechanisms are explained.
20.
The forthcoming ambient systems will contain a large number of sensors. Representing the data produced by these sensors in a format suitable for ambient intelligence applications would enable a large number of useful services. However, such formats tend to require processing power and communication bandwidth not available in many sensors utilizing ultra low-power microcontrollers and radio chip solutions. This paper presents a lightweight data representation, Entity Notation, to tackle this problem. Sensors with limited computation and communication capabilities can use Entity Notation to describe the data they produce. Entity Notation can be transformed into knowledge representations in a straightforward manner, and hence, the data produced by sensor nodes can be utilized with ease by any ambient intelligence system compatible with the common knowledge representations. This paper presents the design of Entity Notation, its implementations on embedded sensors and the evaluation of its performance.