

Automatic conversion from lexical words to prosodic words for Mandarin text-to-speech system
Authors:Yanqiu Shao  Jiqing Han  Ting Liu  Yongzhen Zhao
Affiliation:(1) Institute of Computational Linguistics, School of Electronics Engineering and Computer Science, Peking University, Beijing, China; (2) School of Computer Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang, China
Abstract:In real speech, prosodic words (PWs), rather than lexical words (LWs), are the basic rhythmic units. The naturalness of a Text-to-Speech (TTS) system is directly influenced by PW segmentation. Most PWs are combinations of several LWs. In this paper, three Lexical Combination Models are proposed to combine LWs into PWs: a Directed Acyclic Graph Model, a Segmentation Model, and a Markov Model (MM). To handle long LWs that should be segmented into two or more PWs, a Lexical Split Model (LSM) is applied to them. Experimental results show that the MM yields relatively stable results across different training data. The Transformation-Based Error-Driven Learning (TBED) algorithm, chosen for its high performance on individual properties, is combined with the MM to improve the precision of PW segmentation. Experiments show that, among the three proposed models, the MM combined with TBED and LSM performs best, achieving a precision of 93.00% and a recall of 93.23%. A perception test indicates that speech synthesized with PWs as the lowest prosodic units sounds more natural and acceptable than speech synthesized with LWs. This work is supported by NSFC Project (60503071); 973 National Basic Research Program of China (2004CB318102); Postdoctoral Science Foundation of P. R. China (20070420275).
Keywords:Text-to-speech  Prosodic word  Lexical word  Prosodic structure
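
The precision and recall figures quoted in the abstract can be illustrated with a minimal sketch. It assumes that a predicted PW counts as correct only when its character span exactly matches a PW in the reference segmentation; the abstract does not state the exact scoring scheme, so this matching rule, the function names `spans` and `precision_recall`, and the toy strings below are all hypothetical.

```python
def spans(words):
    """Map a segmentation (list of strings) to a set of (start, end) character spans."""
    result, pos = set(), 0
    for w in words:
        result.add((pos, pos + len(w)))
        pos += len(w)
    return result


def precision_recall(predicted, reference):
    """Score a predicted segmentation against a reference one.

    Assumption: a predicted unit is correct only if its span exactly
    matches a reference span. Both segmentations must cover the same text.
    """
    if "".join(predicted) != "".join(reference):
        raise ValueError("segmentations must cover the same text")
    pred, ref = spans(predicted), spans(reference)
    correct = len(pred & ref)
    precision = correct / len(pred) if pred else 0.0
    recall = correct / len(ref) if ref else 0.0
    return precision, recall


# Toy example with placeholder strings (not real Mandarin data).
reference = ["ab", "cde", "fg"]
predicted = ["ab", "c", "de", "fg"]
p, r = precision_recall(predicted, reference)
```

In this toy case two of the four predicted units match the reference, so precision is 2/4 and recall is 2/3. Over-splitting lowers precision faster than recall, and under-splitting does the opposite.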
This article is indexed in SpringerLink and other databases.