Audio enhancement using local SNR-based sparse binary mask estimation and spectral imputation

Affiliation:	1. Department of Electronics and Communication Engineering, National Institute of Technology, Patna, India; 2. Department of Computer Science, University of Crete, Greece; 3. Department of Electronics and Communication Engineering, National Institute of Technology, Sikkim, India; 4. Department of Electronics and Electrical Engineering, Indian Institute of Technology, Guwahati, India; 1. School of Computer Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang 150001, China; 2. School of Computer Science and Technology, Harbin Institute of Technology, Weihai, Shandong 264209, China; 3. School of Computer Science and Technology, Harbin University of Science and Technology, Harbin, Heilongjiang 150080, China

Abstract:	This paper proposes a method for enhancing speech and/or audio quality under noisy conditions. The proposed method first estimates the local signal-to-noise ratio (SNR) of the noisy input signal via sparse non-negative matrix factorization (SNMF). Next, a sparse binary mask (SBM) is proposed that separates the audio signal from the noise by measuring the sparsity of the pool of local SNRs from the adjacent frequency bands of the current and several previous frames. However, applying the binary mask leaves spectral gaps across frequency bands, and the resulting spectral discontinuity distorts the separated audio signal. Thus, a spectral imputation technique is used to fill in the spectrum of each frequency band that the SBM has removed. Spectral imputation is performed by online-learning NMF, using the spectra of the neighboring non-overlapped frequency bands and their local sparsity. The effectiveness of the proposed enhancement method is demonstrated on two different tasks, using speech and musical content, respectively. Objective measurements and subjective listening tests show that the proposed method outperforms conventional speech and audio enhancement methods, such as SNMF-based alternatives and deep recurrent neural networks for speech enhancement, and block thresholding and a commercially available software tool for audio enhancement.
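The masking step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and the abstract gives no formulas, so the following are all assumptions made for illustration: the speech and noise power estimates are taken as given (in the paper they come from SNMF), sparsity is measured with the Hoyer measure, a bin is kept only when its own local SNR reaches `snr_floor_db` and the sparsity of its pool stays at or below `sparsity_max`, and every function name, parameter name and default value is hypothetical. The imputation step is not shown.

```python
import numpy as np


def local_snr_db(speech_est, noise_est, eps=1e-12):
    """Local SNR in dB per time-frequency bin from estimated power spectra."""
    return 10.0 * np.log10((speech_est + eps) / (noise_est + eps))


def hoyer_sparsity(x, eps=1e-12):
    """Hoyer sparsity: about 0 for a flat vector, about 1 for a one-hot vector.

    Assumption: the abstract does not say which sparsity measure is used.
    """
    x = np.abs(np.ravel(x)).astype(float)
    n = x.size
    l2 = np.sqrt((x ** 2).sum())
    if n < 2 or l2 == 0.0:
        return 0.0
    return float((np.sqrt(n) - x.sum() / (l2 + eps)) / (np.sqrt(n) - 1.0))


def sparse_binary_mask(snr_db, band_radius=1, n_prev=2,
                       snr_floor_db=0.0, sparsity_max=0.5):
    """Binary mask from pooled local SNRs (hypothetical decision rule).

    For each bin, the pool holds the linear-scale local SNRs of the adjacent
    frequency bands in the current frame and up to n_prev previous frames.
    """
    n_freq, n_frames = snr_db.shape
    snr_lin = 10.0 ** (snr_db / 10.0)  # non-negative values for the pool
    mask = np.zeros((n_freq, n_frames), dtype=bool)
    for t in range(n_frames):
        t0 = max(0, t - n_prev)
        for f in range(n_freq):
            f0 = max(0, f - band_radius)
            f1 = min(n_freq, f + band_radius + 1)
            pool = snr_lin[f0:f1, t0:t + 1]
            # Assumed rule: keep a bin only if it is itself above the SNR
            # floor and the high SNR is shared by its neighborhood.
            mask[f, t] = (snr_db[f, t] >= snr_floor_db
                          and hoyer_sparsity(pool) <= sparsity_max)
    return mask


if __name__ == "__main__":
    # Toy data: signal energy in the three lowest bands, flat noise elsewhere.
    speech = np.full((6, 4), 0.01)
    speech[:3, :] = 100.0
    noise = np.ones((6, 4))
    mask = sparse_binary_mask(local_snr_db(speech, noise))
    noisy = speech + noise
    enhanced = noisy * mask  # masked-out bins are the gaps left for imputation
    print(mask.astype(int))
```

The zeroed bins in `enhanced` correspond to the spectral gaps that the abstract says are subsequently filled by imputation.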

Keywords:	Speech enhancement; Audio enhancement; Sparse binary mask; Local signal-to-noise ratio; Spectral imputation; Non-negative matrix factorization
This article is indexed in ScienceDirect and other databases.
