
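Hybrid Connectionist Temporal Classification/Attention End-to-End Speech Recognition
Cite this article: CHEN Cong, HE Jie, CHEN Jia. Hybrid Connectionist Temporal Classification/Attention End-to-End Speech Recognition[J]. Control Engineering of China, 2021(3).
Authors: CHEN Cong  HE Jie  CHEN Jia
Affiliation: School of Data Science and Software Engineering, Wuzhou University
Funding: National Natural Science Foundation of China (61562074, 61961036); funded project of the Guangxi Colleges and Universities Key Laboratory of Industry Software Technology.
Abstract: To improve the accuracy of conventional automatic speech recognition (ASR) systems, this paper proposes an end-to-end ASR design method that combines a hidden Markov model (HMM) with a hybrid connectionist temporal classification (CTC)/attention mechanism. First, to address the strong continuity and large vocabulary that make recognition of observable time-varying speech sequences difficult, the recognition process is modeled with an HMM, which parameterizes the speech recognition model. Second, the CTC objective function is used as an auxiliary task to train the encoder of the attention model within a multi-objective learning framework, which reduces the degree of approximation in the sequence-level CTC objective and improves recognition accuracy. Finally, simulation experiments on a self-built speech recognition corpus verify the advantages of the proposed algorithm in recognition efficiency and accuracy.

Keywords: hidden Markov model  connectionist temporal classification  attention mechanism  end-to-end  speech recognition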
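
The abstract says the recognition process is parameterized with a hidden Markov model but gives no formulas. As a minimal sketch of what such a parameterization looks like in general, the block below computes the likelihood of a discrete observation sequence with the standard forward algorithm. The function name, the toy two-state parameters, and the use of discrete observation symbols are all assumptions made for illustration; none of them come from the paper.

```python
from typing import Sequence


def forward_likelihood(
    initial: Sequence[float],
    transition: Sequence[Sequence[float]],
    emission: Sequence[Sequence[float]],
    observations: Sequence[int],
) -> float:
    """Return P(observations) under a discrete HMM (forward algorithm).

    initial[s]        : probability of starting in state s
    transition[p][s]  : probability of moving from state p to state s
    emission[s][o]    : probability of emitting symbol o in state s
    All names and shapes are hypothetical, chosen for this sketch.
    """
    if not observations:
        raise ValueError("observations must not be empty")
    n_states = len(initial)
    # alpha[s] = P(observations so far, current state = s)
    alpha = [initial[s] * emission[s][observations[0]] for s in range(n_states)]
    for obs in observations[1:]:
        alpha = [
            sum(alpha[p] * transition[p][s] for p in range(n_states))
            * emission[s][obs]
            for s in range(n_states)
        ]
    return sum(alpha)


# Toy two-state model (illustrative values only).
INITIAL = [0.6, 0.4]
TRANSITION = [[0.7, 0.3], [0.4, 0.6]]
EMISSION = [[0.5, 0.5], [0.1, 0.9]]

if __name__ == "__main__":
    print(forward_likelihood(INITIAL, TRANSITION, EMISSION, [0, 1, 1]))
```

In a real ASR system the emissions would typically be continuous acoustic features scored by a density model or a neural network rather than a lookup table, and the recursion would be carried out in log space to avoid underflow on long utterances.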

End-to-end Speech Recognition of Hybrid Connection Time and Attention Mechanism
CHEN Cong, HE Jie, CHEN Jia. End-to-end Speech Recognition of Hybrid Connection Time and Attention Mechanism[J]. Control Engineering of China, 2021(3).
Authors:CHEN Cong  HE Jie  CHEN Jia
Affiliation: (School of Data Science and Software Engineering, Wuzhou University, Wuzhou 543002, China)
Abstract: To improve the accuracy of conventional automatic speech recognition (ASR) systems, an end-to-end ASR system design method based on a hidden Markov model (HMM) and a hybrid connectionist temporal classification (CTC)/attention mechanism is proposed. Firstly, to address the strong continuity and large vocabulary that make recognition of observable time-varying speech sequences difficult, the speech recognition process is modeled with an HMM, which parameterizes the speech recognition model. Secondly, using the CTC objective function as an auxiliary task, the encoder of the attention model is trained in a multi-objective learning framework, which reduces the degree of approximation in the sequence-level CTC objective and improves recognition accuracy. Finally, simulation experiments on a self-built speech recognition corpus verify the performance advantages of the proposed algorithm in terms of recognition efficiency and accuracy.
Keywords: hidden Markov model  connectionist temporal classification  attention mechanism  end-to-end  speech recognition
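
The abstract describes training the attention model's encoder with the CTC objective as an auxiliary task in a multi-objective framework, but it does not give the combined objective. One common way to express such a setup, shown here only as a hedged sketch, is a weighted interpolation of the two losses. The function name `hybrid_loss` and the weight `ctc_weight` (including its default of 0.3) are hypothetical; the abstract states neither the form of the combination nor any weight value.

```python
def hybrid_loss(
    ctc_loss: float,
    attention_loss: float,
    ctc_weight: float = 0.3,
) -> float:
    """Interpolate a CTC loss and an attention-decoder loss.

    ctc_weight is a hypothetical tuning parameter in [0, 1]:
    0 trains with the attention loss only, 1 with the CTC loss only.
    The default value is an assumption, not a value from the paper.
    """
    if not 0.0 <= ctc_weight <= 1.0:
        raise ValueError("ctc_weight must lie in [0, 1]")
    return ctc_weight * ctc_loss + (1.0 - ctc_weight) * attention_loss


if __name__ == "__main__":
    # Per-utterance losses would come from the two output branches that
    # share one encoder; the numbers here are placeholders.
    print(hybrid_loss(ctc_loss=2.0, attention_loss=4.0, ctc_weight=0.5))
```

Because both terms are computed from the same encoder output, gradients from the CTC branch also shape the encoder that the attention decoder reads, which is the sense in which CTC acts as an auxiliary task.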
This article is indexed in VIP (维普) and other databases.