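Speech Enhancement Based on Nonnegative Matrix Factorization: An Overview
Cite this article: Bao Changchun, Bai Zhigang. Speech Enhancement Based on Nonnegative Matrix Factorization: An Overview[J]. Signal Processing, 2020, 36(6): 791-803.
Authors: Bao Changchun, Bai Zhigang
Affiliation: Speech and Audio Signal Processing Laboratory, Faculty of Information Technology, Beijing University of Technology
Funding: National Natural Science Foundation of China (61831019, 61471014)
Abstract: Speech enhancement plays a pivotal role in speech signal processing; its goal is to reduce the effect of background noise on speech signals. However, effectively separating the target speech from highly nonstationary noise remains a challenging problem. Speech enhancement algorithms based on nonnegative matrix factorization (NMF) use nonnegative speech and noise basis matrices to model the spectral subspaces of speech and noise, and are currently an advanced technique that is very effective at suppressing nonstationary noise. This paper first describes NMF theory in detail, including the NMF model, the definition of the cost function, and the commonly used multiplicative update rules. It then describes in detail the basic principle of NMF-based speech enhancement, including the specific procedures of the training and enhancement stages, and reports experiments; in addition, an NMF-based speech reconstruction experiment verifies the ability of the speech basis matrix to model speech spectra. Finally, the paper summarizes the shortcomings of traditional NMF-based algorithms, gives a brief overview of several existing NMF-based algorithms, including their innovations, advantages, and disadvantages, and compares and analyzes several representative methods. The paper presents the continuing development of NMF-based speech enhancement methods from a historical perspective.

Keywords: speech enhancement; nonnegative matrix factorization; nonstationary noise; sparseness; deep neural networks; semi-supervised methods
Received: 2019-11-06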

Speech Enhancement Based on Nonnegative Matrix Factorization: An Overview
Bao Changchun, Bai Zhigang. Speech Enhancement Based on Nonnegative Matrix Factorization: An Overview[J]. Signal Processing, 2020, 36(6): 791-803.
Authors: Bao Changchun, Bai Zhigang
Affiliation: Speech and Audio Signal Processing Lab, Faculty of Information Technology, Beijing University of Technology
Abstract: As an important application of speech signal processing, speech enhancement aims to reduce the influence of background noise on speech signals. However, effectively separating the target speech in extremely nonstationary noise environments is still a challenging problem. Speech enhancement based on nonnegative matrix factorization (NMF), which models the spectral subspaces of speech and noise using nonnegative basis matrices, is currently an advanced and effective technique for suppressing nonstationary noise. First, the theory of NMF is introduced in detail, including the NMF model, the definition of cost functions, and the commonly used multiplicative update rules. Then, the basic principle of NMF-based speech enhancement methods is reviewed in detail, including the specific processes of the training and enhancement stages, and experiments are carried out. In addition, an NMF-based speech reconstruction experiment is used to verify the ability of the speech basis matrix to model speech spectra. Finally, the shortcomings of traditional NMF-based algorithms are summarized, and several existing NMF-based algorithms are briefly reviewed, including their innovations, advantages, and disadvantages. Moreover, several typical methods are analyzed and compared. This paper shows the continuous development of NMF-based speech enhancement methods from a historical perspective.
Keywords: speech enhancement; nonnegative matrix factorization; nonstationary noise; sparseness; deep neural networks; semi-supervised methods
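
The abstract names three ingredients: the NMF model, a cost function with multiplicative update rules, and separate training and enhancement stages. The following is a minimal sketch of how these are commonly put together, not the authors' implementation. It assumes the generalized Kullback-Leibler divergence as the cost function, magnitude spectrograms as input, and a Wiener-like soft mask in the enhancement stage; the function names (`nmf_kl`, `kl_divergence`, `enhance`) and all parameters are hypothetical.

```python
import numpy as np

EPS = 1e-12  # guards against division by zero and log(0)


def kl_divergence(V, V_hat):
    """Generalized KL divergence D(V || V_hat), one common NMF cost function."""
    V = V + EPS
    V_hat = V_hat + EPS
    return float(np.sum(V * np.log(V / V_hat) - V + V_hat))


def nmf_kl(V, rank, n_iter=200, W=None, seed=0):
    """Approximate a nonnegative matrix V (freq x time) as W @ H.

    Uses the multiplicative update rules for the KL cost. If a basis
    matrix W is supplied, it is kept fixed and only the activations H
    are estimated (in that case `rank` is ignored).
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    update_W = W is None
    if update_W:
        W = rng.random((n_freq, rank)) + EPS
    else:
        W = W.copy()
    H = rng.random((W.shape[1], n_frames)) + EPS
    ones = np.ones_like(V)
    for _ in range(n_iter):
        # H <- H * (W^T (V / WH)) / (W^T 1)
        ratio = V / (W @ H + EPS)
        H *= (W.T @ ratio) / (W.T @ ones + EPS)
        if update_W:
            # W <- W * ((V / WH) H^T) / (1 H^T)
            ratio = V / (W @ H + EPS)
            W *= (ratio @ H.T) / (ones @ H.T + EPS)
    return W, H


def enhance(V_noisy, W_speech, W_noise, n_iter=200):
    """Enhancement stage: fix the trained bases, estimate activations,
    then apply a Wiener-like soft mask to the noisy magnitude spectrogram."""
    W = np.hstack([W_speech, W_noise])
    _, H = nmf_kl(V_noisy, W.shape[1], n_iter=n_iter, W=W)
    r = W_speech.shape[1]
    S_hat = W_speech @ H[:r]   # estimated speech magnitude
    N_hat = W_noise @ H[r:]    # estimated noise magnitude
    mask = S_hat / (S_hat + N_hat + EPS)
    return mask * V_noisy
```

In the training stage of this sketch, `nmf_kl` would be run once on clean-speech spectrograms and once on noise spectrograms to obtain `W_speech` and `W_noise`; `enhance` then reuses both bases on the noisy input. The enhanced magnitude would normally be combined with the noisy phase and inverted back to a waveform, which is omitted here.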
This article is indexed in VIP (维普) and other databases.