Audio enhancement using local SNR-based sparse binary mask estimation and spectral imputation
Affiliation:1. Department of Electronics and Communication Engineering, National Institute of Technology, Patna, India;2. Department of Computer Science, University of Crete, Greece;3. Department of Electronics and Communication Engineering, National Institute of Technology, Sikkim, India;4. Department of Electronics and Electrical Engineering, Indian Institute of Technology, Guwahati, India;1. School of Computer Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang 150001, China;2. School of Computer Science and Technology, Harbin Institute of Technology, Weihai, Shandong 264209, China;3. School of Computer Science and Technology, Harbin University of Science and Technology, Harbin, Heilongjiang 150080, China
Abstract: This paper proposes a method for enhancing speech and/or audio quality under noisy conditions. The proposed method first estimates the local signal-to-noise ratio (SNR) of the noisy input signal via sparse non-negative matrix factorization (SNMF). Next, a sparse binary mask (SBM) is proposed that separates the audio signal from the noise by measuring the sparsity of the pool of local SNRs from the adjacent frequency bands of the current and several previous frames. However, applying the binary mask leaves spectral gaps across frequency bands, and the resulting spectral discontinuity distorts the separated audio signal. Thus, a spectral imputation technique is used to fill the empty spectrum of each frequency band removed by the SBM. Spectral imputation is conducted by online-learning NMF using the spectra of the neighboring non-overlapped frequency bands and their local sparsity. The effectiveness of the proposed enhancement method is demonstrated on two different tasks using speech and musical content, respectively. Objective measurements and subjective listening tests show that the proposed method outperforms conventional speech and audio enhancement methods, such as SNMF-based alternatives and deep recurrent neural networks for speech enhancement, block thresholding, and a commercially available software tool for audio enhancement.
Keywords: Speech enhancement; Audio enhancement; Sparse binary mask; Local signal-to-noise ratio; Spectral imputation; Non-negative matrix factorization
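
As a rough illustration of the masking step described in the abstract, the sketch below computes a per-bin local SNR from estimated speech and noise magnitude spectra and then derives a binary mask from a pool of local SNRs over adjacent frequency bands and the current and previous frames. It is not the authors' algorithm: the abstract does not specify the sparsity measure or the decision rule, so a simple mean over the pool with a dB threshold is used here as a stand-in. The function names, `band_radius`, `n_prev_frames`, and `threshold_db` are all hypothetical, and the speech and noise magnitudes are assumed to come from some prior decomposition such as SNMF.

```python
import numpy as np

def local_snr_db(speech_mag, noise_mag, eps=1e-12):
    """Per-bin local SNR in dB from estimated speech and noise magnitudes."""
    return 10.0 * np.log10((speech_mag ** 2 + eps) / (noise_mag ** 2 + eps))

def pooled_binary_mask(snr_db, band_radius=1, n_prev_frames=2, threshold_db=0.0):
    """Binary mask from pooled local SNRs.

    snr_db has shape (n_bands, n_frames). For each bin, the pool covers the
    adjacent bands (+/- band_radius) over the current and n_prev_frames
    previous frames. ASSUMPTION: the pool is summarized by its mean and
    compared with a threshold; the paper's sparsity criterion is not used.
    """
    n_bands, n_frames = snr_db.shape
    mask = np.zeros((n_bands, n_frames), dtype=int)
    for k in range(n_bands):
        k_lo = max(0, k - band_radius)
        k_hi = min(n_bands, k + band_radius + 1)
        for t in range(n_frames):
            t_lo = max(0, t - n_prev_frames)
            pool = snr_db[k_lo:k_hi, t_lo:t + 1]
            mask[k, t] = int(pool.mean() > threshold_db)
    return mask

# Toy data: the two lower bands are speech-dominated, the two upper bands are noise-dominated.
speech = np.array([[1.0, 1.0, 1.0],
                   [1.0, 1.0, 1.0],
                   [0.01, 0.01, 0.01],
                   [0.01, 0.01, 0.01]])
noise = np.full((4, 3), 0.1)

snr = local_snr_db(speech, noise)
mask = pooled_binary_mask(snr)
print(mask)
```

Bins where the mask is zero correspond to the spectral gaps that the abstract's imputation step is meant to fill; that step is not sketched here.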
This article is indexed in ScienceDirect and other databases.