Speech visualization system based on physiological tongue model

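Cite this article:	Jiang Chen, Yu Jun, Luo Changwei, Li Rui, Wang Zengfu. Speech visualization system based on physiological tongue model[J]. Journal of Image and Graphics, 2015, 20(9): 1237-1246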

Authors:	Jiang Chen, Yu Jun, Luo Changwei, Li Rui, Wang Zengfu

Affiliations:	Department of Automation, University of Science and Technology of China, Hefei 230027; Institute of Intelligent Machines, Chinese Academy of Sciences, Hefei 230031; National Engineering Laboratory for Speech and Language Information Processing, University of Science and Technology of China, Hefei 230027

Funding:	National Natural Science Foundation of China (61472393); Anhui Provincial Special Fund for Independent Innovation (13Z02008-5)

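Abstract:	Objective: Speech-synchronized animation of the tongue has not yet been widely studied. Against this background, a tongue animation synthesis method based on a physiological model is proposed. Method: First, a detailed physiological tongue model is built that produces realistic tongue deformation under muscle activation. Second, the model is used to synthesize a large number of tongue motion samples, from which a transformation model from muscle activations to tongue contours is learned. Third, motion parameters are estimated from captured dynamic 2D tongue contour data to obtain the physemes (muscle activation sequences and rigid-body displacement sequences) that correspond to each phoneme. Finally, the physemes are arranged in order and fed into the physiological tongue model for simulation to generate the corresponding tongue animation. Result: The system synthesizes realistic-sounding speech together with visually realistic tongue animation that is synchronized with the synthesized speech. Conclusion: The method can build a phoneme-physeme database from 2D tongue contour data for Mandarin Chinese or other languages, and use it to synthesize highly realistic 3D tongue animation for that language.
Keywords:	speech visualization; tongue model; facial animation; tongue animation; physics-based simulation
Received:	2015-03-11
Revised:	2015-05-19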
Speech visualization system based on physiological tongue model

Jiang Chen, Yu Jun, Luo Changwei, Li Rui and Wang Zengfu. Speech visualization system based on physiological tongue model[J]. Journal of Image and Graphics, 2015, 20(9): 1237-1246

Authors:	Jiang Chen, Yu Jun, Luo Changwei, Li Rui, Wang Zengfu

Affiliation:	Department of Automation, University of Science and Technology of China, Hefei 230027, China; Institute of Intelligent Machines, Chinese Academy of Sciences, Hefei 230031, China; National Laboratory of Speech and Language Information Processing, University of Science and Technology of China, Hefei 230027, China

Abstract:	Objective: Speech-synchronized tongue animation has received little research attention. To address this gap, this paper proposes a physiology-based tongue animation system. Method: First, a detailed physiology-based tongue model is built whose deformation is driven by muscle activations. Second, the model is used to generate a large set of tongue deformation samples from designed muscle activations, and a neural network that maps muscle activations to tongue deformation is trained on these samples. Third, the physemes (muscle activation and rigid movement sequences) are estimated with this network from 2D tongue deformation extracted from tongue X-ray data. Finally, speech-synchronized tongue animation is synthesized by feeding these physemes into the tongue model for simulation. Result: Experiments show that the system produces realistic-sounding speech and visually realistic tongue animation synchronized with it. Conclusion: The system can build a phoneme-physeme database from 2D tongue movement data collected for Mandarin Chinese or other languages, and can synthesize highly realistic tongue animation for that language.

Keywords:	speech visualization; tongue model; facial animation; tongue animation; physics-based simulation
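
The estimation pipeline summarized in the abstract (sample the tongue model, learn a map from muscle activations to contours, then recover activations and rigid motion from an observed 2D contour) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the simulator is a toy linear stand-in for the physiological model, affine least squares is swapped in for the neural network, and every name and size (`simulate_contour`, `estimate_physeme_frame`, `N_MUSCLES`, `N_POINTS`) is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
N_MUSCLES, N_POINTS = 4, 12  # hypothetical sizes, not taken from the paper

# Toy stand-in for the physiological simulator (an assumption): a fixed
# linear map from muscle activations to displacements of a rest contour.
REST = np.stack([np.linspace(0.0, 1.0, N_POINTS),
                 np.sin(np.linspace(0.0, np.pi, N_POINTS))], axis=1)
BASIS = rng.normal(scale=0.1, size=(N_MUSCLES, N_POINTS * 2))

def simulate_contour(activation):
    """Return an (N_POINTS, 2) contour for one activation vector."""
    return REST + (activation @ BASIS).reshape(N_POINTS, 2)

# Step 1: sample activations and record the simulated contours.
acts = rng.uniform(0.0, 1.0, size=(500, N_MUSCLES))
contours = np.stack([simulate_contour(a).ravel() for a in acts])

# Step 2: learn activation -> contour with affine least squares
# (used here in place of the neural network the abstract mentions).
design = np.hstack([acts, np.ones((len(acts), 1))])
weights, *_ = np.linalg.lstsq(design, contours, rcond=None)
W, b = weights[:-1], weights[-1]

def estimate_physeme_frame(observed):
    """Estimate (activation, rigid translation) for one observed contour."""
    # Model: observed ~= (a @ W + b).reshape(-1, 2) + t, linear in (a, t).
    # Flattened points are ordered x0, y0, x1, y1, ..., so the translation
    # columns alternate between the x and y unit vectors.
    shift = np.tile(np.eye(2), (N_POINTS, 1))
    A = np.hstack([W.T, shift])
    x, *_ = np.linalg.lstsq(A, observed.ravel() - b, rcond=None)
    return np.clip(x[:N_MUSCLES], 0.0, 1.0), x[N_MUSCLES:]

# Step 3: recover the parameters of one synthetic "observed" frame.
true_act = np.array([0.2, 0.7, 0.1, 0.5])
true_shift = np.array([0.03, -0.02])
observed = simulate_contour(true_act) + true_shift
est_act, est_shift = estimate_physeme_frame(observed)
```

Running the estimator frame by frame over a contour sequence would give one activation sequence and one displacement sequence per phoneme, which is the shape of data the abstract calls a physeme. With a nonlinear learned model, the closed-form solve in `estimate_physeme_frame` would have to be replaced by an iterative optimizer.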
