
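Visual Speech Synthesis Based on Chinese Visual Triphones

Cite this article:	Zhao Hui, Tang Chao-jing. Visual speech synthesis based on Chinese visual triphones [J]. Journal of Electronics & Information Technology, 2009, 31(12): 3010-3014.

Authors:	Zhao Hui, Tang Chao-jing

Affiliation:	College of Electronic Science and Engineering, National University of Defense Technology, Changsha 410073

Funding:	Supported by a National Ministries and Commissions Fund (51329060101)

Abstract:	To synthesize realistic video sequences, this paper proposes a visual speech synthesis method based on Chinese visual triphones. The concept of a "visual triphone" is introduced on the basis of the pronunciation rules of Chinese and the correspondence between phonemes and visemes. On this basis, hidden Markov model (HMM) training and synthesis models are built; training uses joint audio-visual features, augmented with dynamic features. During synthesis, the visual triphone HMMs are concatenated into a sentence HMM, from which feature parameters are extracted to synthesize the visual speech. Subjective and objective evaluations indicate that the synthesized video is highly realistic and rated as satisfactory.
Keywords:	visual speech synthesis; visual triphone; hidden Markov model; joint features
Received:	2008-12-5
Revised:	2009-6-19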
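
The training step in the abstract combines visual and audio features and adds dynamic features. The abstract does not say how the features or their dynamics are defined, so the following is only a minimal sketch under assumptions: the two streams are already frame-aligned, the dynamic feature is a first-order central difference with replicated edges, and the names `delta` and `joint_features` are hypothetical.

```python
from typing import List

Frame = List[float]

def delta(frames: List[Frame]) -> List[Frame]:
    """First-order dynamic features: 0.5 * (x[t+1] - x[t-1]).

    Assumption: the first and last frames are replicated at the edges.
    """
    n = len(frames)
    out = []
    for t in range(n):
        prev = frames[max(t - 1, 0)]
        nxt = frames[min(t + 1, n - 1)]
        out.append([0.5 * (b - a) for a, b in zip(prev, nxt)])
    return out

def joint_features(visual: List[Frame], audio: List[Frame]) -> List[Frame]:
    """Concatenate per-frame visual and audio vectors, then append their deltas."""
    if len(visual) != len(audio):
        raise ValueError("visual and audio streams must be frame-aligned")
    static = [v + a for v, a in zip(visual, audio)]
    return [s + d for s, d in zip(static, delta(static))]

# Toy data: one visual and one audio coefficient per frame, three frames.
visual = [[0.0], [1.0], [4.0]]
audio = [[10.0], [10.0], [12.0]]
features = joint_features(visual, audio)
```

Each output frame holds the static visual and audio values followed by their deltas, so an HMM trained on these vectors sees both the mouth shape and how it is changing.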
Visual Speech Synthesis Algorithm Based on Chinese Visual Triphone

Zhao Hui, Tang Chao-jing. Visual Speech Synthesis Algorithm Based on Chinese Visual Triphone [J]. Journal of Electronics & Information Technology, 2009, 31(12): 3010-3014.

Authors:	Zhao Hui, Tang Chao-jing

Affiliation:	College of Electronic Science and Engineering, National University of Defense Technology, Changsha 410073, China

Abstract:	To synthesize realistic video sequences, a visual speech synthesis algorithm based on Chinese visual triphones is proposed. The concept of a ‘visual triphone’ is introduced according to the pronunciation rules of Chinese and the relationship between phonemes and visemes. Hidden Markov models (HMMs) are then built on visual triphones. In the training stage, combined features comprising visual and audio features are used. In the synthesis stage, a sentence HMM is constructed by concatenating triphone HMMs, and the feature parameters are extracted from it. Subjective and objective evaluations indicate that the synthesized video is realistic and satisfactory.

Keywords:	Visual speech synthesis; Visual triphone; Hidden Markov Model (HMM); Combined features
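
The synthesis stage described in the abstract concatenates triphone HMMs into a sentence HMM. As a rough illustration only, the sketch below expands a unit sequence into context labels and chains left-to-right models into one transition matrix. Several things here are assumptions, not the authors' method: the `left-center+right` label notation, the `sil` padding, the representation of each model as a list of self-loop probabilities, and the names `triphone_labels` and `sentence_hmm`. The paper's visual triphones are defined through a phoneme-to-viseme mapping that the abstract does not give.

```python
from typing import Dict, List

def triphone_labels(units: List[str]) -> List[str]:
    """Expand a unit sequence into left-center+right labels, padded with 'sil'."""
    padded = ["sil"] + units + ["sil"]
    return [
        f"{padded[i - 1]}-{padded[i]}+{padded[i + 1]}"
        for i in range(1, len(padded) - 1)
    ]

def sentence_hmm(labels: List[str],
                 models: Dict[str, List[float]]) -> List[List[float]]:
    """Chain left-to-right HMMs into one sentence-level transition matrix.

    models maps a label to the self-loop probability of each of its states
    (hypothetical representation). The exit probability of a state is
    1 - self-loop; the exit of a model's last state enters the first state
    of the next model. The extra last column is the sentence exit state.
    """
    loops = [p for label in labels for p in models[label]]
    n = len(loops)
    trans = [[0.0] * (n + 1) for _ in range(n)]
    for i, p in enumerate(loops):
        trans[i][i] = p            # stay in the current state
        trans[i][i + 1] = 1.0 - p  # advance to the next state or model
    return trans

# Toy example with made-up probabilities, two states per model.
labels = triphone_labels(["n", "i"])
models = {"sil-n+i": [0.5, 0.75], "n-i+sil": [0.5, 0.25]}
trans = sentence_hmm(labels, models)
```

In this toy case `trans[1][2]` is the transition that crosses the boundary between the two triphone models. A full system would also carry each state's output distribution over the feature vectors, from which the synthesis parameters are generated.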
This article is indexed in Wanfang Data and other databases.