Personalising speech-to-speech translation: Unsupervised cross-lingual speaker adaptation for HMM-based speech synthesis

Authors:	John Dines, Hui Liang, Lakshmi Saheer, Matthew Gibson, William Byrne, Keiichiro Oura, Keiichi Tokuda, Junichi Yamagishi, Simon King, Mirjam Wester, Teemu Hirsimäki, Reima Karhila, Mikko Kurimo

Affiliation:	1. Idiap Research Institute, Martigny, Switzerland; 2. Cambridge University Engineering Department, Trumpington Street, UK; 3. Department of Computer Science and Engineering, Nagoya Institute of Technology, Japan; 4. Centre for Speech Technology (CSTR), University of Edinburgh, UK; 5. Adaptive Informatics Research Centre, Aalto University, Finland

Abstract:	In this paper we present results of unsupervised cross-lingual speaker adaptation applied to text-to-speech synthesis. The application of our research is the personalisation of speech-to-speech translation, in which we employ an HMM statistical framework for both speech recognition and synthesis. This framework provides a logical mechanism to adapt synthesised speech output to the voice of the user by way of speech recognition. In this work we present results of several different unsupervised and cross-lingual adaptation approaches, as well as an end-to-end speaker-adaptive speech-to-speech translation system. Our experiments show that we can successfully apply speaker adaptation in both unsupervised and cross-lingual scenarios, and our proposed algorithms seem to generalise well for several language pairs. We also discuss important future directions, including the need for better evaluation metrics.

Keywords:

This article is indexed in ScienceDirect and other databases.
