
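A Deep Neural Network Phoneme Recognition Method Based on RBM
Cite this article:CHEN Qi, ZHANG Wen-lin, NIU Tong, LI Bi-cheng. A Deep Neural Network Phoneme Recognition Method Based on RBM[J]. Journal of Information Engineering University (信息工程大学学报), 2013, 14(5): 569-574.
Authors:CHEN Qi (陈琦)  ZHANG Wen-lin (张文林)  NIU Tong (牛铜)  LI Bi-cheng (李弼程)
Affiliation:Information Engineering University, Zhengzhou 450001, Henan, China
Funding:National Natural Science Foundation of China (61175107)
Abstract:To improve phoneme recognition accuracy in continuous speech recognition, a deep belief network is used to extract phoneme posterior probabilities for phoneme recognition. First, the deep belief network is pre-trained layer by layer using the learning principle of the restricted Boltzmann machine. Then a softmax output layer is added to obtain a deep neural network that estimates phoneme-state posterior probabilities, and the network weights are fine-tuned with the back-propagation algorithm. Finally, the posterior probabilities are used as HMM emission probabilities, and a Viterbi decoder performs phoneme recognition. Experiments on the TIMIT corpus show that the system achieves a higher phoneme recognition rate than the GMM/HMM, MLP/HMM and TANDEM systems.

Keywords:restricted Boltzmann machine  deep belief network  neural network  phoneme recognition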
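
The layer-wise pre-training step mentioned in the abstract can be illustrated with a single contrastive-divergence (CD-1) update for one RBM. This is only a minimal sketch under stated assumptions: the abstract does not say which RBM variant, learning rule, or hyperparameters the authors used, so the binary-binary units, the CD-1 rule, and the names `cd1_update`, `hidden_probs`, and `visible_probs` are all hypothetical choices made for illustration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sample(probs, rng):
    # Draw binary states from independent Bernoulli probabilities.
    return [1.0 if rng.random() < p else 0.0 for p in probs]


def hidden_probs(v, W, c):
    # p(h_j = 1 | v) = sigmoid(c_j + sum_i v_i * W[i][j])
    return [sigmoid(c[j] + sum(v[i] * W[i][j] for i in range(len(v))))
            for j in range(len(c))]


def visible_probs(h, W, b):
    # p(v_i = 1 | h) = sigmoid(b_i + sum_j h_j * W[i][j])
    return [sigmoid(b[i] + sum(h[j] * W[i][j] for j in range(len(h))))
            for i in range(len(b))]


def cd1_update(v0, W, b, c, lr, rng):
    """One CD-1 step on a single binary vector v0 (assumed RBM variant).

    Updates W (visible x hidden), b (visible bias) and c (hidden bias)
    in place and returns the squared reconstruction error.
    """
    ph0 = hidden_probs(v0, W, c)      # positive phase
    h0 = sample(ph0, rng)             # sampled hidden states
    pv1 = visible_probs(h0, W, b)     # reconstruction (probabilities)
    ph1 = hidden_probs(pv1, W, c)     # negative phase
    for i in range(len(b)):
        for j in range(len(c)):
            W[i][j] += lr * (v0[i] * ph0[j] - pv1[i] * ph1[j])
    for i in range(len(b)):
        b[i] += lr * (v0[i] - pv1[i])
    for j in range(len(c)):
        c[j] += lr * (ph0[j] - ph1[j])
    return sum((v0[i] - pv1[i]) ** 2 for i in range(len(b)))


# Toy usage: 3 visible units, 2 hidden units, all parameters start at zero.
rng = random.Random(0)
W = [[0.0, 0.0] for _ in range(3)]
b = [0.0, 0.0, 0.0]
c = [0.0, 0.0]
err = cd1_update([1.0, 0.0, 1.0], W, b, c, lr=0.1, rng=rng)
```

In greedy stacking, the hidden probabilities of one trained RBM would serve as the input data for the next RBM; real-valued speech features would typically call for a different visible-unit model than the binary one assumed here.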

RBM-Based Phoneme Recognition by Deep Neural Network Based on RBM
CHEN Qi,ZHANG Wen-lin,NIU Tong,LI Bi-cheng.RBM-Based Phoneme Recognition by Deep Neural Network Based on RBM[J].Journal of Information Engineering University,2013,14(5):569-574.
Authors:CHEN Qi  ZHANG Wen-lin  NIU Tong  LI Bi-cheng
Affiliation:Information Engineering University, Zhengzhou 450001, China
Abstract:To improve phoneme recognition accuracy in automatic speech recognition, a phoneme recognition method is built on phoneme posteriors extracted by a deep belief network. First, the deep belief network is pre-trained greedily, layer by layer, with each layer trained as a restricted Boltzmann machine (RBM), and a deep neural network is created by adding a "softmax" output layer to it. Next, discriminative fine-tuning by back-propagation adjusts the weights so that the network better predicts the probability distribution over the states of monophone hidden Markov models (HMMs). Finally, the sequence of predicted probability distributions is fed into a standard Viterbi decoder. On the TIMIT dataset, the method outperforms the GMM/HMM, MLP/HMM and TANDEM methods.
Keywords:restricted Boltzmann machine (RBM)  deep belief networks  neural network  phoneme recognition
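
The final decoding step, in which per-frame posteriors are passed to a Viterbi decoder, can be sketched as follows. This is an illustrative sketch only, not the authors' decoder: the two-state model, the transition and initial probabilities, and the function name `viterbi` are assumptions, and the posteriors are used directly as emission scores, as the abstract describes.

```python
import math


def viterbi(log_post, log_trans, log_init):
    """Return the most likely state sequence.

    log_post[t][s]  : log posterior of state s at frame t (network output)
    log_trans[i][j] : log transition probability from state i to state j
    log_init[s]     : log probability of starting in state s
    """
    n_states = len(log_init)
    # delta[s] = best log score of any path ending in state s at this frame
    delta = [log_init[s] + log_post[0][s] for s in range(n_states)]
    backptr = []
    for t in range(1, len(log_post)):
        new_delta, ptr = [], []
        for j in range(n_states):
            best_i = max(range(n_states),
                         key=lambda i: delta[i] + log_trans[i][j])
            ptr.append(best_i)
            new_delta.append(delta[best_i] + log_trans[best_i][j]
                             + log_post[t][j])
        delta = new_delta
        backptr.append(ptr)
    # Trace back from the best final state.
    state = max(range(n_states), key=lambda s: delta[s])
    path = [state]
    for ptr in reversed(backptr):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path


def to_log(rows):
    return [[math.log(p) for p in row] for row in rows]


# Toy example (hypothetical numbers): 4 frames, 2 states.
posteriors = [[0.9, 0.1], [0.8, 0.2], [0.2, 0.8], [0.1, 0.9]]
transitions = [[0.7, 0.3], [0.3, 0.7]]
initial = [0.5, 0.5]

best_path = viterbi(to_log(posteriors), to_log(transitions),
                    [math.log(p) for p in initial])
print(best_path)  # [0, 0, 1, 1]
```

Working in the log domain avoids numerical underflow when the per-frame probabilities are multiplied over long utterances.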
This article is indexed in VIP (维普) and other databases.