Research on Acoustic Model Based on Deep Learning
Cite this article:SHEN Dongfeng,ZHANG Erhua.Research on Acoustic Model Based on Deep Learning[J].Computer and Digital Engineering (计算机与数字工程),2021,49(2):315-321.
Authors:SHEN Dongfeng (沈东风)  ZHANG Erhua (张二华)
Affiliation:School of Computer Science and Engineering,Nanjing University of Science and Technology,Nanjing 210094
Funding:Equipment Pre-research Field Fund of the 13th Five-Year Plan,Equipment Development Department of the Central Military Commission
Abstract:In recent years, deep learning has been widely applied to image processing, natural language processing, speech recognition, and other fields, where it improves performance far beyond earlier traditional methods. In this paper, the long short-term memory (LSTM) model, a type of recurrent neural network (RNN), is used to build the acoustic model for speech recognition. Adding reverse-time information to training yields the bi-directional long short-term memory (BLSTM) model. Speech is a complex time-varying signal, and BLSTM can selectively retain useful information and discard useless information while processing time-series data. Experiments show that the recognition rate of BLSTM is significantly higher than that of the traditional Gaussian mixture model-hidden Markov model (GMM-HMM).
Keywords:speech recognition  acoustic model  deep learning  BLSTM
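
The abstract describes BLSTM as an LSTM that also receives reverse-time information. The following is a minimal sketch of that idea, not the authors' implementation: a forward pass only, in plain Python, with random untrained weights. All names (`init_params`, `lstm_step`, `run_lstm`, `blstm`) and all dimensions are hypothetical, and the abstract does not specify the input features, layer sizes, or training procedure.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_params(input_dim, hidden_dim, rng):
    """Random weights for one LSTM direction (hypothetical layout).

    Each gate (i: input, f: forget, g: cell candidate, o: output) has a
    weight matrix over the concatenated vector [x_t, h_prev] and a bias.
    """
    def matrix():
        return [[rng.uniform(-0.1, 0.1) for _ in range(input_dim + hidden_dim)]
                for _ in range(hidden_dim)]
    return {gate: {"W": matrix(), "b": [0.0] * hidden_dim} for gate in "ifgo"}


def affine(layer, vec):
    """Compute W @ vec + b with plain lists."""
    return [sum(w * v for w, v in zip(row, vec)) + b
            for row, b in zip(layer["W"], layer["b"])]


def lstm_step(params, x, h_prev, c_prev):
    """One LSTM time step: the gates decide what to keep and what to drop."""
    z = x + h_prev  # list concatenation of input frame and previous output
    i = [sigmoid(a) for a in affine(params["i"], z)]    # how much new info to write
    f = [sigmoid(a) for a in affine(params["f"], z)]    # how much old state to keep
    g = [math.tanh(a) for a in affine(params["g"], z)]  # candidate cell content
    o = [sigmoid(a) for a in affine(params["o"], z)]    # how much state to expose
    c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, g)]
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h, c


def run_lstm(params, frames, hidden_dim):
    """Run one direction over a sequence of frames, returning every h_t."""
    h = [0.0] * hidden_dim
    c = [0.0] * hidden_dim
    outputs = []
    for x in frames:
        h, c = lstm_step(params, x, h, c)
        outputs.append(h)
    return outputs


def blstm(fwd_params, bwd_params, frames, hidden_dim):
    """Concatenate, per frame, a past-to-future pass and a future-to-past pass."""
    forward = run_lstm(fwd_params, frames, hidden_dim)
    # Process the reversed sequence, then flip back so index t lines up again.
    backward = run_lstm(bwd_params, frames[::-1], hidden_dim)[::-1]
    return [hf + hb for hf, hb in zip(forward, backward)]


# Toy input: 5 frames of 3-dimensional features (sizes are arbitrary assumptions).
rng = random.Random(0)
INPUT_DIM, HIDDEN_DIM, NUM_FRAMES = 3, 4, 5
fwd = init_params(INPUT_DIM, HIDDEN_DIM, rng)
bwd = init_params(INPUT_DIM, HIDDEN_DIM, rng)
frames = [[rng.uniform(-1, 1) for _ in range(INPUT_DIM)] for _ in range(NUM_FRAMES)]
features = blstm(fwd, bwd, frames, HIDDEN_DIM)
```

Each element of `features` has `2 * HIDDEN_DIM` values: the first half summarizes the frames up to and including the current one, and the second half summarizes the current and later frames. In an acoustic model, such per-frame vectors would typically feed a classifier over acoustic units. That last stage is an assumption here and is not shown.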
This article is indexed in VIP (维普), Wanfang Data (万方数据), and other databases.