Continuous Speech Recognition for Large Vocabulary Based on Triphone DBN Model
Cite this article: Lü Guoyun, Zhao Rongchun, Zhang Yanning, Fan Yangyu, Sahli Hichem. Continuous Speech Recognition for Large Vocabulary Based on Triphone DBN Model [J]. Journal of Data Acquisition & Processing, 2009, 24(1).
Authors: Lü Guoyun, Zhao Rongchun, Zhang Yanning, Fan Yangyu, Sahli Hichem
Affiliations: 1. School of Electronics and Information, Northwestern Polytechnical University, Xi'an 710072
2. School of Computer Science, Northwestern Polytechnical University, Xi'an 710072
3. Department of Electronics and Information Processing, Free University of Brussels, Brussels B-1050, Belgium
Funding: China Postdoctoral Science Foundation; National High Technology Research and Development Program of China (863 Program)
Abstract: To account for coarticulation in continuous speech, context-dependent triphone units are introduced into the word-phone structure dynamic Bayesian network (WP-DBN) model and the word-phone-state structure DBN (WPS-DBN) model. This yields two novel single-stream DBN models: the word-triphone structure DBN (WT-DBN) model and the word-triphone-state structure DBN (WTS-DBN) model. The WTS-DBN model is a triphone model whose recognition unit is the triphone, and it explicitly simulates a hidden Markov model (HMM) based on triphone state tying. Large-vocabulary speech recognition experiments show that, in a clean speech environment, the recognition rate of the WTS-DBN model is higher than those of the HMM, WT-DBN, WP-DBN, and WPS-DBN models by 20.53%, 40.77%, 42.72%, and 7.52%, respectively.

Keywords: speech recognition; dynamic Bayesian network; triphone; phone
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.