
Single-channel speech enhancement method based on double-layer dictionary learning

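Cite this article:	Sun Linhui, Wu Zihao, Xie Keli, Li Ping'an. Single-channel speech enhancement method based on double-layer dictionary learning[J]. Journal of Signal Processing (信号处理), 2020, 36(6): 1001-1012.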

Authors:	Sun Linhui (孙林慧), Wu Zihao (吴子皓), Xie Keli (谢可丽), Li Ping'an (李平安)

Affiliation:	College of Telecommunications & Information Engineering, Nanjing University of Posts and Telecommunications

Funding:	National Natural Science Foundation of China (61901227); Natural Science Research Project of Jiangsu Higher Education Institutions (19KJB510049)

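Abstract:	To improve speech enhancement in complex noise environments, this paper proposes a single-channel speech enhancement method based on double-layer dictionary learning. In the training stage, clean speech and noise are first used to train the initial feature sub-dictionaries; a double-layer joint dictionary is then trained with an optimization function that combines discriminative and anti-confusion constraints. The first-layer dictionary represents the distinguishable components of the speech signal and the noise, while the second-layer dictionary represents their easily confused components. In the test stage, the noisy speech is projected onto the double-layer joint dictionary to obtain a sparse coefficient matrix, from which the enhanced speech is reconstructed. The constraints in the objective function reduce the occurrence of "cross-projection" and the confusion of signals in the joint dictionary, which further improves enhancement. Experimental results show that, measured by global signal-to-noise ratio (SNR), perceptual evaluation of speech quality (PESQ), and log-spectral distance (LSD), the method outperforms speech enhancement methods based on sparsity-constrained non-negative matrix factorization and on improved Wiener filtering, and removes noise more effectively.
Keywords:	speech enhancement; sparse representation; joint dictionary; optimization function
Received:	2020-04-08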
A single-channel speech enhancement method based on double-layer dictionary

Affiliation:	College of Telecommunications & Information Engineering, Nanjing University of Posts and Telecommunications

Abstract:	A single-channel speech enhancement method based on jointly constrained double-layer dictionary learning is proposed to improve speech quality in complex noisy environments. First, the characteristic sub-dictionaries that describe clean speech and noise are trained. Then, a double-layer joint dictionary is trained with a new optimization function that combines discriminative constraints and anti-confusion constraints. The first-layer dictionary expresses the separable components of the speech signal and the noise, and the second-layer dictionary expresses their easily confused components. In the test stage, the noisy speech is projected onto the double-layer joint dictionary to obtain a sparse coefficient matrix, from which the enhanced speech is reconstructed. The constraints of the objective function reduce the occurrence of the "cross-projection" phenomenon and the confusion of the signals in the joint dictionary, which further improves the speech enhancement. Experimental results show that, in terms of global signal-to-noise ratio (SNR), Perceptual Evaluation of Speech Quality (PESQ), and log-spectral distance (LSD), the proposed method performs better and removes noise more effectively than speech enhancement methods based on sparsity-regularized non-negative matrix factorization and on improved Wiener filtering.

Keywords:	speech enhancement; sparse representation; joint dictionary; optimization function
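
The test stage described in the abstract (project the noisy speech onto a joint speech/noise dictionary, then reconstruct from the speech part) can be illustrated with a minimal single-layer sketch. This is not the paper's double-layer training or its constrained objective: the orthogonal matching pursuit solver, the function names `omp` and `enhance_frame`, the toy random dictionaries, and the sparsity level are all assumptions chosen only to show the projection-and-reconstruction idea.

```python
import numpy as np


def omp(D, y, n_nonzero, tol=1e-10):
    """Greedy orthogonal matching pursuit (assumed solver, not the paper's).

    Approximates y as a combination of at most n_nonzero columns of D.
    """
    coef = np.zeros(D.shape[1])
    residual = y.copy()
    support = []
    for _ in range(n_nonzero):
        if np.linalg.norm(residual) < tol:
            break
        # Pick the atom most correlated with the current residual.
        k = int(np.argmax(np.abs(D.T @ residual)))
        if k in support:
            break
        support.append(k)
        # Least-squares refit on all atoms selected so far.
        sol, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        coef = np.zeros(D.shape[1])
        coef[support] = sol
        residual = y - D[:, support] @ sol
    return coef


def enhance_frame(y, D_speech, D_noise, n_nonzero=4):
    """Project one noisy frame onto the joint dictionary [D_speech, D_noise]
    and rebuild the speech and noise estimates from their own atoms."""
    D = np.hstack([D_speech, D_noise])
    coef = omp(D, y, n_nonzero)
    n_s = D_speech.shape[1]
    speech_est = D_speech @ coef[:n_s]
    noise_est = D_noise @ coef[n_s:]
    return speech_est, noise_est


# Toy data (hypothetical): orthonormal atoms stand in for learned dictionaries.
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((16, 8)))
D_speech, D_noise = Q[:, :4], Q[:, 4:]

clean = D_speech @ np.array([1.0, 0.0, -0.5, 0.0])
noise = D_noise @ np.array([0.0, 0.3, 0.0, 0.0])
noisy = clean + noise

speech_est, noise_est = enhance_frame(noisy, D_speech, D_noise, n_nonzero=3)
print(np.linalg.norm(speech_est - clean))
```

In this toy setup the speech and noise atoms are mutually orthogonal, so nothing is cross-projected. With learned dictionaries the atoms are generally correlated, and part of the speech can be captured by noise atoms (and vice versa); that confusion is what the discriminative and anti-confusion constraints in the abstract are meant to reduce.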
