
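Collaborative speech enhancement method combining spectral mapping and masking estimation
Cite this article: Luo Qingyu, Zhang Tianqi, Fang Rong, Zhang Huizhi. Collaborative speech enhancement method combining spectral mapping and masking estimation[J]. Journal of Electronic Measurement and Instrument, 2023, 37(10): 14-23
Authors: Luo Qingyu  Zhang Tianqi  Fang Rong  Zhang Huizhi
Affiliation: 1. School of Communication and Information Engineering, Chongqing University of Posts and Telecommunications
Funding: Supported by the National Natural Science Foundation of China (61671095, 61771085), the Natural Science Foundation of Chongqing (cstc2021jcyj-msxmX0836), and the Scientific Research Project of the Chongqing Municipal Education Commission (KJ1600427, KJ1600429)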
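Abstract: To raise the performance upper bound of current masking-based and spectral-mapping-based speech enhancement methods and to improve their generalization in complex environments, a collaborative single-channel speech enhancement method is proposed within a joint complex-spectrum and complex-mask learning framework. The method adopts an encoder with a dual-branch decoder. In the encoder and decoder, an interactive cooperative learning unit (ICU) is designed to supervise the exchanged speech information flow and provide an effective latent feature space. In the middle layer, a multi-scale fusion Transformer is designed that, with few parameters, extracts detail at multiple scales along the spatial and channel dimensions and fuses the results for output, while modeling both sub-band and full-band speech information. Experiments on large and small datasets and under 115 noise environments show that the method, with only 0.57 M parameters, achieves better subjective and objective metrics than most advanced related methods, demonstrating good robustness and effectiveness.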

Keywords: speech enhancement  complex spectral mapping  complex masking  multi-scale fusion Transformer  lightweight network

Collaborative speech enhancement method combining spectral mapping and masking estimation
Luo Qingyu, Zhang Tianqi, Fang Rong, Zhang Huizhi. Collaborative speech enhancement method combining spectral mapping and masking estimation[J]. Journal of Electronic Measurement and Instrument, 2023, 37(10): 14-23
Authors: Luo Qingyu  Zhang Tianqi  Fang Rong  Zhang Huizhi
Affiliation: 1. School of Communication and Information Engineering, Chongqing University of Posts and Telecommunications
Abstract: To improve the performance upper bound and generalization ability of current speech enhancement methods based on masking and spectral mapping, a collaborative monaural speech enhancement method is proposed within a joint complex-spectrum and complex-mask learning framework. An interactive cooperative learning unit (ICU) is designed in the encoder and decoder to supervise the interactive speech information flow and provide an effective latent feature space. In the middle layer, a multi-scale fusion Transformer is designed to extract multi-scale details along the spatial and channel dimensions with a small number of parameters and fuse them for output, while also modeling sub-band and full-band speech information. Experiments on large and small datasets and in 115 noise environments show that the proposed method, with only 0.57 M parameters, obtains better subjective and objective metrics than most advanced related methods, demonstrating good robustness and effectiveness.
Keywords: speech enhancement   complex spectral mapping   complex masking   multi-scale fusion Transformer   lightweight network
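
The abstract names two estimation targets, complex masking and complex spectral mapping, without giving their formulas. The sketch below illustrates how the two kinds of output could act on a noisy STFT, using NumPy only. It is not the authors' network: oracle values stand in for what a trained model would predict, and the function names, the tiny spectrogram shape, and the weighted-average fusion in `fuse_branches` are all assumptions made for illustration.

```python
import numpy as np


def ideal_complex_mask(clean_spec, noisy_spec, eps=1e-12):
    """Oracle complex ratio mask M such that M * Y is approximately S.

    Computed as the complex division S / Y, written with the conjugate
    of Y so that a small eps can guard against division by zero.
    """
    return clean_spec * np.conj(noisy_spec) / (np.abs(noisy_spec) ** 2 + eps)


def apply_complex_mask(noisy_spec, mask):
    """Masking branch: element-wise complex multiplication with the noisy STFT."""
    return mask * noisy_spec


def fuse_branches(masked_spec, mapped_spec, weight=0.5):
    """Hypothetical fusion of the two branch outputs.

    A plain convex combination; this rule is an assumption for the
    sketch and is not taken from the paper.
    """
    return weight * masked_spec + (1.0 - weight) * mapped_spec


rng = np.random.default_rng(0)
shape = (5, 4)  # (frequency bins, frames), kept tiny for illustration

# Synthetic complex spectra standing in for real STFTs.
clean = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
noise = 0.5 * (rng.standard_normal(shape) + 1j * rng.standard_normal(shape))
noisy = clean + noise

# Masking branch: a model would predict `mask`; the oracle mask is used here.
mask = ideal_complex_mask(clean, noisy)
masked = apply_complex_mask(noisy, mask)

# Mapping branch: a model would predict the clean spectrum directly;
# the oracle clean spectrum is used as a stand-in.
mapped = clean.copy()

enhanced = fuse_branches(masked, mapped)
print("max reconstruction error:", np.max(np.abs(enhanced - clean)))
```

With oracle inputs both branches recover the clean spectrum, so the fused result does too. With learned estimates the two branches would generally make different errors, which is one plausible motivation for combining them.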