
A Two-stage Prosodic Structure Generation Strategy for Mandarin Text-to-speech Systems
Cite this article: DONG Yuan, ZHOU Tao, DONG Cheng-Yu, WANG Hai-La. A Two-stage Prosodic Structure Generation Strategy for Mandarin Text-to-speech Systems[J]. Acta Automatica Sinica, 2010, 36(11): 1569-1574.
Authors: DONG Yuan  ZHOU Tao  DONG Cheng-Yu  WANG Hai-La
Affiliation: 1. Beijing University of Posts and Telecommunications, Beijing 100876, P.R. China; 2. France Telecom R&D (Beijing), Beijing 100190, P.R. China
Abstract: Prosodic structure generation is a key component in improving the intelligibility and naturalness of synthetic speech in a text-to-speech (TTS) system. This paper investigates the automatic segmentation of prosodic words and prosodic phrases, two fundamental layers in the hierarchical prosodic structure of Mandarin, and presents a two-stage prosodic structure generation strategy. At the front end, conditional random field (CRF) models are built for both prosodic word and prosodic phrase prediction, each with its own feature selection. At the back end, a transformation-based error-driven learning (TBL) modification module amends the initial prediction. Experimental results show that the approach combining CRF and TBL achieves an F-score of 94.66% on the automatic segmentation of prosodic words and prosodic phrases.

Keywords: Text-to-speech (TTS)  grapheme-to-phoneme conversion  prosodic structure generation  conditional random fields (CRF)  transformation-based error-driven learning (TBL)
Received: 2009-3-25
Revised: 2010-7-1
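The two-stage idea in the abstract (an initial boundary prediction that is then amended by learned transformation rules, scored by F-score) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the `B`/`N` label scheme, the `Rule` template based on part-of-speech tags, the toy data, and the hard-coded first-stage output that stands in for a CRF prediction.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Rule:
    """Hypothetical TBL-style rule: relabel a token when its POS tag and
    the next token's POS tag match the rule's context."""
    pos: str
    next_pos: str
    from_label: str
    to_label: str


def apply_rules(pos_tags: List[str], labels: List[str], rules: List[Rule]) -> List[str]:
    """Apply the rules in order to a copy of the initial labels."""
    out = list(labels)
    for rule in rules:
        for i in range(len(out) - 1):
            if (out[i] == rule.from_label
                    and pos_tags[i] == rule.pos
                    and pos_tags[i + 1] == rule.next_pos):
                out[i] = rule.to_label
    return out


def boundary_f_score(gold: List[str], pred: List[str]) -> float:
    """F-score over tokens labeled "B" (assumed to mean: a boundary follows)."""
    tp = sum(g == "B" and p == "B" for g, p in zip(gold, pred))
    if tp == 0:
        return 0.0
    precision = tp / pred.count("B")
    recall = tp / gold.count("B")
    return 2 * precision * recall / (precision + recall)


# Toy example (assumed data): POS tags, gold boundaries, and a stand-in
# for the first-stage prediction that contains one spurious boundary.
pos_tags = ["n", "u", "n", "v", "n"]
gold = ["N", "B", "B", "N", "B"]
initial = ["B", "B", "B", "N", "B"]

# One hand-written rule: no boundary between a noun and a following particle.
rules = [Rule(pos="n", next_pos="u", from_label="B", to_label="N")]
corrected = apply_rules(pos_tags, initial, rules)

f_initial = boundary_f_score(gold, initial)
f_corrected = boundary_f_score(gold, corrected)
```

In a real TBL setup the rules would be learned by repeatedly picking the transformation that removes the most errors on training data; here a single hand-written rule only shows how a second pass can correct a first-pass error and how that change shows up in the F-score.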