Voiceprint Recognition Method Based on ResNet-LSTM
Citation: LIU Yong, LIANG Hong-Tao, LIU Guo-Zhu, HU Qiang. Voiceprint Recognition Method Based on ResNet-LSTM[J]. Computer Systems & Applications, 2021, 30(6): 215-219.
Authors: LIU Yong, LIANG Hong-Tao, LIU Guo-Zhu, HU Qiang
Affiliation: College of Information Science and Technology, Qingdao University of Science and Technology, Qingdao 266061, China
Funding: National Natural Science Foundation of China (61973180)
Abstract: To address the complex implementation and low recognition rate of traditional voiceprint recognition methods, this study proposes a voiceprint recognition method based on ResNet-LSTM. A ResNet residual network first extracts the spatial features of the voiceprint, and an LSTM (long short-term memory) recurrent neural network then extracts its temporal features. Combining the two yields deep voiceprint features that contain both spatial and temporal information. Experimental results show that the proposed method reduces the equal error rate (EER) to 1.196%, which is 3.68% and 1.95% lower than that of the baseline methods d-vector and VGGNet, respectively, and that the recognition accuracy reaches 98.8%.
Keywords: voiceprint recognition; ResNet-LSTM; spatial features; temporal features
Received: 2020-09-25
Revised: 2020-10-21
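
The abstract reports its headline result as an equal error rate (EER). As a rough illustration of what that metric measures, and not the authors' evaluation code, the sketch below computes an EER from two lists of similarity scores: one for same-speaker (genuine) trials and one for different-speaker (impostor) trials. The function name `equal_error_rate`, the convention that higher scores mean "same speaker", and the sample scores are all assumptions made for this example.

```python
def equal_error_rate(genuine, impostor):
    """Return (eer, threshold) for similarity scores.

    Assumes higher scores mean "same speaker". A trial is accepted
    when its score is >= threshold.
    """
    if not genuine or not impostor:
        raise ValueError("both score lists must be non-empty")

    best = None  # (gap, eer, threshold)
    # Every observed score is a candidate decision threshold.
    for t in sorted(set(genuine) | set(impostor)):
        # False acceptance rate: impostor trials that are accepted.
        far = sum(s >= t for s in impostor) / len(impostor)
        # False rejection rate: genuine trials that are rejected.
        frr = sum(s < t for s in genuine) / len(genuine)
        gap = abs(far - frr)
        # Keep the threshold where the two error rates are closest.
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


# Hypothetical scores, for illustration only.
genuine_scores = [0.9, 0.8, 0.7, 0.4]
impostor_scores = [0.1, 0.2, 0.3, 0.6]
eer, threshold = equal_error_rate(genuine_scores, impostor_scores)
print(f"EER = {eer:.3f} at threshold {threshold}")  # EER = 0.250 at threshold 0.6
```

With only a handful of trials the two error rates rarely cross exactly, so this sketch reports their average at the closest threshold. Evaluations on real trial lists usually interpolate along the ROC or DET curve instead.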
This paper is indexed in Wanfang Data and other databases.