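A Study of Vocal Tract Length Normalization Based on a Single-Gaussian Model in Noisy Environments
Cite this article: ZHANG Wen-ming, ZHANG Xiang-dong, ZHANG Xing-gan, HOU Zhen. A Study of Vocal Tract Length Normalization Based on a Single-Gaussian Model in Noisy Environments [J]. Microprocessors (微处理机), 2006, 27(5): 102-105.
Authors: ZHANG Wen-ming (张文明)  ZHANG Xiang-dong (张向东)  ZHANG Xing-gan (张兴敢)  HOU Zhen (候震)
Affiliations: 1. Department of Electronics, Nanjing University, Nanjing 210093
2. 美国富迪科技(南京)有限公司, Nanjing 210000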
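Abstract: Vocal tract length normalization (VTLN) is one of the speaker adaptation methods used in speech recognition. This paper studies VTLN in noisy environments and reports a series of experiments. In the implementation, the warping factor is selected using single-Gaussian mixture models, an approach applied here in noisy environments for the first time, and it gives good results. The experiments use the AURORA speech database, and the models are evaluated on the car-noise test set that ships with it. The results show that VTLN improves recognition to varying degrees under every noise condition. Iterative training improves on single-pass VTLN, with the best result reached in the third iteration. In noisy environments, the warping factor selected with a one-Gaussian mixture model raises the average sentence recognition rate by nearly 1.68% compared with warping factors selected with other Gaussian mixture models. The gender-independent model with VTLN comes close to the gender-dependent model without VTLN, and with sufficient training data the gender-independent model with VTLN can perform better.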

Keywords: vocal tract length normalization  speech recognition  speaker adaptation
Article ID: 1002-2279(2006)05-0102-04
Revised: December 5, 2004

The Study of Vocal Tract Length Normalization based on Single Mixture in Noisy Environment
ZHANG Wen-ming, ZHANG Xiang-dong, ZHANG Xing-gan, HOU Zhen. The Study of Vocal Tract Length Normalization based on Single Mixture in Noisy Environment [J]. Microprocessors, 2006, 27(5): 102-105.
Authors: ZHANG Wen-ming  ZHANG Xiang-dong  ZHANG Xing-gan  HOU Zhen
Abstract: Vocal tract length normalization (VTLN) is one of the speaker adaptation methods used in speech recognition. In this paper, we study VTLN in noisy environments and carry out a series of experiments. In our implementation, we select the warping factor using single-mixture Gaussian models, applied for the first time in a noisy environment, and obtain good results. The experiments are based on the AURORA speech database, and the models are evaluated on the car-noise test set included in that database. The results show that recognition with VTLN is better than recognition without VTLN under every noise condition. Iterative training improves on single-pass VTLN, and the best result is obtained in the third iteration. In noisy environments, the average sentence recognition rate obtained with the warping factor selected by a single-mixture model is nearly 1.68 percent higher than that obtained with warping factors selected by models with other numbers of mixtures. The gender-independent performance with VTLN is close to the gender-dependent performance without VTLN. If the training data is sufficient, the gender-independent performance with VTLN can be better.
Keywords: Vocal tract length normalization  Speech recognition  Speaker adaptation
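
The abstract does not spell out how the warping factor is chosen. As a rough illustration only, the sketch below assumes the commonly used maximum-likelihood grid search: each candidate warping factor is applied to the speaker's data, the warped data is scored against a single Gaussian, and the best-scoring factor is kept. The grid (0.88 to 1.12 in steps of 0.02), the piecewise-linear warp, the 4 kHz band edge, the function names, and the one-dimensional toy "feature" (a formant-like frequency) are all assumptions made for this example and are not taken from the paper. A real system would instead recompute cepstral features with a warped filterbank and score them against the acoustic model.

```python
import math


def warp_frequency(f, alpha, f_max=4000.0, cut_ratio=0.875):
    """Piecewise-linear warp (assumed form): scale by alpha below the break
    point, then interpolate linearly so that f_max maps to itself."""
    f_break = cut_ratio * f_max
    if f <= f_break:
        return alpha * f
    slope = (f_max - alpha * f_break) / (f_max - f_break)
    return alpha * f_break + slope * (f - f_break)


def gaussian_loglik(values, mean, var):
    """Total log-likelihood of scalar observations under one Gaussian."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * var) + (v - mean) ** 2 / var)
        for v in values
    )


def select_warp_factor(freqs, mean, var, alphas):
    """Return the candidate alpha whose warped data scores highest."""
    def score(alpha):
        warped = [warp_frequency(f, alpha) for f in freqs]
        return gaussian_loglik(warped, mean, var)
    return max(alphas, key=score)


# Hypothetical candidate grid: 0.88, 0.90, ..., 1.12.
ALPHAS = [round(0.88 + 0.02 * k, 2) for k in range(13)]

# Toy speaker whose resonance sits about 10% below the model mean of 1000 Hz.
speaker_freqs = [900.0, 909.0, 918.0]
best_alpha = select_warp_factor(speaker_freqs, mean=1000.0, var=50.0 ** 2,
                                alphas=ALPHAS)
print(best_alpha)
```

In this toy setup the search settles on the factor that stretches the speaker's frequencies back toward the model mean. Iterative training, as mentioned in the abstract, would alternate this selection step with retraining the model on the normalized data; that loop is not shown here.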
This article is indexed in CNKI, VIP (维普), Wanfang Data (万方数据), and other databases.