Speech enhancement method based on A-DResUnet

Funding:	Supported by the National Natural Science Foundation of China (11774088)

Li Jixiang, Ni Xusheng, Yan Shangqu, Zou Xiao, Qian Shengyou. Speech enhancement method based on A-DResUnet[J]. Journal of Electronic Measurement and Instrument, 2022, 36(10): 131-137.

Authors:	Li Jixiang, Ni Xusheng, Yan Shangqu, Zou Xiao, Qian Shengyou

Affiliation:	1. School of Physics and Electronics, Hunan Normal University

Abstract:	To extract feature information from spectrograms more accurately, this paper proposes a speech enhancement method based on A-DResUnet (attention-dilated ResUnet). The A-DResUnet model adds dilated convolution to the ResUnet model to improve its ability to capture the contextual information of speech, and adds the convolutional block attention module (CBAM) to the encoder to strengthen attention to the features of the noise spectrogram. The experimental results show three things. First, the model separates unknown noise better when the noise spectrogram is the output target than when the clean speech spectrogram is. Second, compared with the ResUnet model, the proposed A-DResUnet model reduces the loss of speech detail. Third, compared with speech enhancement methods based on DNN and GAN, PESQ improves on average by 22.81% and 33.11%, respectively, and STOI by 9.62% and 15.33%, respectively. The method therefore offers a more effective approach to speech enhancement in complex environments.

Keywords:	speech enhancement; spectrogram; model output target; dilated convolution; convolutional block attention module
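
Two ideas in the abstract can be illustrated with a minimal Python sketch: dilated convolution widens the context seen by each output without adding kernel weights, and a model that predicts the noise spectrogram needs a further step to recover speech. The sketch is one-dimensional and uses plain lists, whereas the paper works on two-dimensional spectrograms. The function names are hypothetical, and the subtraction-based recovery step is an assumption made here for illustration; the abstract does not state how the enhanced speech is obtained from the estimated noise.

```python
def receptive_field(kernel_size, dilation):
    """Number of input samples spanned by one output of a dilated kernel."""
    return (kernel_size - 1) * dilation + 1


def dilated_conv1d(signal, kernel, dilation=1):
    """'Valid' 1-D dilated convolution (cross-correlation form, as in
    deep-learning layers): kernel taps are spaced `dilation` samples apart."""
    span = receptive_field(len(kernel), dilation)
    out = []
    for start in range(len(signal) - span + 1):
        out.append(sum(w * signal[start + i * dilation]
                       for i, w in enumerate(kernel)))
    return out


def subtract_noise(noisy_mag, noise_mag):
    """ASSUMED recovery step (not taken from the paper): subtract the
    estimated noise magnitude from the noisy magnitude in each bin and
    floor the result at zero."""
    return [max(n - d, 0.0) for n, d in zip(noisy_mag, noise_mag)]


if __name__ == "__main__":
    x = [1, 2, 3, 4, 5, 6, 7]
    k = [1, 0, -1]
    # Same three weights, but dilation 2 spans 5 input samples instead of 3.
    print(receptive_field(3, 1), dilated_conv1d(x, k, dilation=1))
    print(receptive_field(3, 2), dilated_conv1d(x, k, dilation=2))
    # Toy magnitudes for three frequency bins.
    print(subtract_noise([1.0, 0.5, 0.2], [0.25, 0.5, 0.5]))
```

With a fixed kernel size, raising the dilation rate enlarges the receptive field at the cost of fewer valid output positions, which is the trade-off behind using dilation to capture wider speech context.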
