End-to-end Speech Recognition of Hybrid Connection Time and Attention Mechanism (混合连接时间/注意力机制端到端语音识别)

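Cite this article:	CHEN Cong (陈聪), HE Jie (贺杰), CHEN Jia (陈佳). End-to-end Speech Recognition of Hybrid Connection Time and Attention Mechanism [J]. Control Engineering of China (控制工程), 2021, 28(3): 585-591.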

Authors:	CHEN Cong (陈聪), HE Jie (贺杰), CHEN Jia (陈佳)

Affiliation:	School of Data Science and Software Engineering (大数据与软件工程学院), Wuzhou University (梧州学院), Wuzhou, Guangxi 543002, China

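Funding:	National Natural Science Foundation of China (61562074, 61961036); Guangxi Colleges and Universities Key Laboratory of Industry Software Technology (广西高校行业软件技术重点实验室).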

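Abstract (translated from the Chinese record, truncated in the source):	To improve the accuracy of conventional automatic speech recognition (ASR) systems, a design method is proposed for an end-to-end ASR system based on a hidden Markov model and a hybrid connectionist temporal classification (CTC)/attention mechanism. First, to address the difficulties of strong continuity and large vocabulary when recognizing speech as an observable time-varying sequence, the speech recognition process is modeled with a hidden Markov model, which parameterizes the speech recognition model. Second, the CTC objective function is used as an auxiliary task, in a multi...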
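Keywords (translated from the Chinese record):	hidden Markov; connectionist temporal classification; attention mechanism; end-to-end; speech recognition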
End-to-end Speech Recognition of Hybrid Connection Time and Attention Mechanism

CHEN Cong, HE Jie, CHEN Jia. End-to-end Speech Recognition of Hybrid Connection Time and Attention Mechanism [J]. Control Engineering of China, 2021, 28(3): 585-591.

Authors:	CHEN Cong, HE Jie, CHEN Jia

Affiliation:	School of Data Science and Software Engineering, Wuzhou University, Wuzhou 543002, China

Abstract:	To improve the accuracy of conventional automatic speech recognition (ASR) systems, an end-to-end ASR system design method based on a hidden Markov model (HMM) and a hybrid connectionist temporal classification (CTC)/attention mechanism is proposed. First, to address the difficulties of strong continuity and large vocabulary when recognizing speech as an observable time-varying sequence, the speech recognition process is modeled with an HMM, which parameterizes the speech recognition model. Second, with the CTC objective function as an auxiliary task, the encoder of the attention model is trained in a multi-objective learning framework, which reduces the degree of approximation of the sequence-level CTC objective and improves recognition accuracy. Finally, simulation experiments on a self-built speech recognition database verify the performance advantages of the proposed algorithm in recognition efficiency and accuracy.
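
The abstract does not state the training objective explicitly. As a rough illustration only, hybrid CTC/attention systems commonly interpolate the two losses with a weight. The sketch below assumes that form, computes the CTC term with the standard forward algorithm in plain Python, and combines it with a per-token cross-entropy standing in for the attention decoder loss. The function names, the blank index 0, and the weight value 0.3 are hypothetical choices, not taken from the paper, and the HMM component is not modeled here.

```python
import math

def ctc_neg_log_likelihood(probs, target, blank=0):
    """CTC loss via the forward algorithm.

    probs  : list of T frames, each a list of per-symbol probabilities
    target : list of label ids without blanks
    """
    # Extended label sequence: blank, l1, blank, l2, ..., blank
    ext = [blank]
    for label in target:
        ext += [label, blank]
    S, T = len(ext), len(probs)

    # Initialisation: a path may start on the first blank or the first label
    alpha = [0.0] * S
    alpha[0] = probs[0][ext[0]]
    if S > 1:
        alpha[1] = probs[0][ext[1]]

    for t in range(1, T):
        new = [0.0] * S
        for s in range(S):
            total = alpha[s]                      # stay on the same symbol
            if s >= 1:
                total += alpha[s - 1]             # advance by one position
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                total += alpha[s - 2]             # skip the blank between distinct labels
            new[s] = total * probs[t][ext[s]]
        alpha = new

    # Valid paths end on the last label or the trailing blank
    likelihood = alpha[-1] + (alpha[-2] if S > 1 else 0.0)
    return -math.log(likelihood)

def attention_neg_log_likelihood(step_probs, target):
    """Cross-entropy of decoder output distributions against the target labels."""
    return -sum(math.log(p[label]) for p, label in zip(step_probs, target))

def hybrid_loss(ctc_loss, att_loss, weight=0.3):
    """Assumed interpolation: weight * CTC + (1 - weight) * attention."""
    return weight * ctc_loss + (1.0 - weight) * att_loss

# Toy example: 2 frames, vocabulary {0: blank, 1: "a"}, target "a"
frames = [[0.5, 0.5], [0.5, 0.5]]
ctc = ctc_neg_log_likelihood(frames, [1])
att = attention_neg_log_likelihood([[0.2, 0.8]], [1])
loss = hybrid_loss(ctc, att)
```

In the toy example, three frame alignments ("aa", "a-", "-a") collapse to the target, so the CTC likelihood is 0.75. Setting `weight` to 0 or 1 recovers a pure attention or pure CTC objective under this assumed formulation.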

Keywords:	Hidden Markov; connection time classification; attention mechanism; end-to-end; speech recognition
This article is indexed in databases including VIP (维普) and Wanfang Data (万方数据).
