Attention mechanism based TDNN-LSTM model and its application

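Cite this article:	JIN Hao, ZHU Wenbo, DUAN Zhikui, CHEN Jianwen, LI Aiyuan. Attention mechanism based TDNN-LSTM model and its application[J]. Technical Acoustics, 2021, 40(4): 508-514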

Authors:	JIN Hao, ZHU Wenbo, DUAN Zhikui, CHEN Jianwen, LI Aiyuan

Affiliation:	Foshan University, Foshan 528000, Guangdong, China

Funding:	Supported by the Guangdong Basic and Applied Basic Research Foundation, Guangdong-Foshan Joint Fund project (2019A1515110273)

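Abstract (translated from the Chinese original):	With large-scale data, deep-learning-based speech recognition is already quite mature. With small-sample resources, however, the correlations among feature information are limited, so the model's ability to capture contextual information is insufficient and the recognition rate is low. To address this problem, a time delay neural network (TDNN) with an embedded attention mechanism layer, combined with a long short-term memory...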
Keywords:	small sample; attention mechanism; time delay neural network; long short-term memory recurrent network
Received:	2020-11-08
Revised:	2021-01-23
Attention mechanism based TDNN-LSTM model and its application

JIN Hao, ZHU Wenbo, DUAN Zhikui, CHEN Jianwen, LI Aiyuan. Attention mechanism based TDNN-LSTM model and its application[J]. Technical Acoustics, 2021, 40(4): 508-514

Authors:	JIN Hao, ZHU Wenbo, DUAN Zhikui, CHEN Jianwen, LI Aiyuan

Affiliation:	Foshan University, Foshan 528000, Guangdong, China

Abstract:	Deep-learning-based speech recognition is mature when large-scale training data is available. With small-sample resources, however, the correlations among feature information are limited, so the model's ability to capture contextual information is insufficient and the recognition rate is low. To address this problem, this paper proposes a sequence-prediction acoustic model, named TLSTM-Attention, which combines a time delay neural network (TDNN) containing an embedded attention mechanism layer with a long short-term memory (LSTM) recurrent neural network. The model effectively fuses coarse- and fine-grained features that carry important information, which improves its ability to model contextual information. Speed perturbation is used to augment the data, and speaker channel information features are combined with the lattice-free maximum mutual information (LF-MMI) training criterion. A series of comparative experiments is conducted with different input features, model structures, and numbers of nodes. The experimental results show that the word error rate of the proposed model is 3.77 percentage points lower than that of the baseline model.
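
The abstract reports its result as a word error rate (WER). As a minimal sketch of how that metric is conventionally computed (the function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration, not the authors' scoring tool), WER is the word-level edit distance between the reference transcript and the recognizer output, divided by the number of reference words:

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return (substitutions + deletions + insertions) / number of reference words.

    Illustrative helper: tokens are obtained by splitting on whitespace,
    which is an assumption and not necessarily what the paper's toolkit does.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the first i-1 reference words
    # and the first j hypothesis words (standard Levenshtein recurrence).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion of a reference word
                curr[j - 1] + 1,                  # insertion of a hypothesis word
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)
```

A reduction stated in percentage points is the absolute difference between two such WER values expressed in percent, not a relative reduction.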

Keywords:	small sample; attention mechanism; time delay neural network (TDNN); long short-term memory (LSTM) recurrent network
