
Predicting Viseme Parameters from Speech Based on Neural Network (基于神经网络由语音预测视位参数)

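Cite this article:	WANG Zhi-ming, CAI Lian-hong. Predicting Viseme Parameters from Speech Based on Neural Network [J]. Mini-micro Systems (小型微型计算机系统), 2005, 26(6): 1083-1087.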

Authors:	WANG Zhi-ming (王志明), CAI Lian-hong (蔡莲红)

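Affiliations:	1. Department of Computer Science, University of Science and Technology Beijing, Beijing 100083; 2. Department of Computer Science, Tsinghua University, Beijing 100084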

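Funding:	Supported by the Specialized Research Fund for the Doctoral Program of Higher Education (20010003049) and the University of Science and Technology Beijing Foundation (2004509180)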

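Abstract:	Speech is produced by the joint action of multiple articulatory organs, so there is an inherent connection between articulator movement and speech. This paper studies several factors in predicting viseme parameters with a neural network: selecting the speech parameters, determining the time span of the input speech, and optimizing the network structure. Experiments show that linear prediction parameters plus short-time energy outperform the other speech parameters, that forward co-articulation has a greater influence than backward co-articulation, and that feedback improves the performance of the feed-forward neural network. Given that the experiments used arbitrary continuous speech, a mean square error of about 0.0114 is an attractive result.
Keywords:	feed-forward neural network; viseme; linear prediction coefficients; line spectral pair coefficients; real cepstrum coefficients; reflection coefficients; Mel cepstrum coefficients; mean square error
Article ID:	1000-1220(2005)06-1083-05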
Predicting Viseme Parameters from Speech Based on Neural Network

WANG Zhi-ming, CAI Lian-hong. Predicting Viseme Parameters from Speech Based on Neural Network [J]. Mini-micro Systems, 2005, 26(6): 1083-1087.

Authors:	WANG Zhi-ming, CAI Lian-hong

Affiliation:	WANG Zhi-ming (1), CAI Lian-hong (2)

Abstract:	Speech is produced by the cooperation of multiple speech organs, so there is an inherent relation between speech and the movement of those organs. To predict viseme parameters from speech with a neural network, this paper studies the selection of input speech parameters, the time span of the input speech, and the structure of the network. Experimental results show that LPC coefficients plus short-time energy outperform the other speech parameters, that forward co-articulation has a stronger effect than backward co-articulation, and that delayed feedback improves the performance of the feed-forward neural network. Considering that the experiments were based on unlimited-vocabulary continuous speech, the mean square error (MSE) of about 0.0114 is quite promising.

Keywords:	feed-forward neural network; viseme; linear predictive coding (LPC); line spectral frequency (LSF); real cepstrum (RCEP); reflection coefficient (RC); mel frequency cepstrum coefficient (MFCC); mean square error (MSE)
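
The abstract names three ingredients that a short sketch can make concrete: short-time energy as an input feature, an input window that spans several neighbouring speech frames, and mean square error as the evaluation measure. The Python below is only an illustration under assumptions, not the authors' implementation. The function names, the frame length, the hop size, and the `n_past` / `n_future` window sizes are all hypothetical, and the abstract does not say which side of the window corresponds to forward or backward co-articulation.

```python
import numpy as np


def short_time_energy(signal, frame_len, hop):
    """Mean squared amplitude of each frame (frame_len and hop are assumed values)."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    return np.array([
        np.mean(signal[i * hop: i * hop + frame_len] ** 2)
        for i in range(n_frames)
    ])


def stack_context(features, n_past, n_future):
    """Join each frame with n_past earlier and n_future later frames.

    features has shape (n_frames, dim). Edges are padded by repeating the
    first or last frame, which is an assumption made for this sketch.
    """
    padded = np.pad(features, ((n_past, n_future), (0, 0)), mode="edge")
    n_frames = features.shape[0]
    # Window k holds the features shifted by (k - n_past) frames.
    windows = [padded[k: k + n_frames] for k in range(n_past + n_future + 1)]
    return np.concatenate(windows, axis=1)


def mean_square_error(predicted, target):
    """Average squared difference between predicted and target parameters."""
    return float(np.mean((predicted - target) ** 2))
```

An asymmetric window (`n_past` different from `n_future`) is one way to let a network see more context on one side than the other, which is the kind of choice the abstract's comparison of forward and backward co-articulation would inform.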
This article is indexed in CNKI, VIP (维普), Wanfang Data (万方数据), and other databases.
