Similar literature
 Found 20 similar documents (search time: 31 ms)
1.
We present a new speech enhancement scheme for a single-microphone system to meet the demand for quality noise reduction algorithms capable of operating at a very low signal-to-noise ratio. A psychoacoustic model is incorporated into the generalized perceptual wavelet denoising method to reduce the residual noise and improve the intelligibility of speech. The proposed method is a generalized time-frequency subtraction algorithm, which advantageously exploits the wavelet multirate signal representation to preserve the critical transient information. Simultaneous masking and temporal masking of the human auditory system are modeled by the perceptual wavelet packet transform via the frequency and temporal localization of speech components. The wavelet coefficients are used to calculate the Bark spreading energy and temporal spreading energy, from which a time-frequency masking threshold is deduced to adaptively adjust the subtraction parameters of the proposed method. An unvoiced speech enhancement algorithm is also integrated into the system to improve the intelligibility of speech. Through rigorous objective and subjective evaluations, it is shown that the proposed speech enhancement system is capable of reducing noise with little speech degradation in adverse noise environments and the overall performance is superior to several competitive methods.

2.
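A speech enhancement algorithm for cochlear implants based on the Bark wavelet transform   Cited by: 1 (self-citations: 0, citations by others: 1)
A speech enhancement algorithm for cochlear implants based on the Bark wavelet transform is proposed. The algorithm first introduces the Bark wavelet transform, which is better matched to the human auditory system, into continuous interleaved sampling (CIS) speech processing for cochlear implants. Nonlinear spectral subtraction is then applied in each Bark channel, with the subtraction parameters controlled by the auditory masking threshold. Results show that the SNR improves by about 16 dB even at low input SNR, and that the synthesized speech has good clarity and intelligibility for cochlear implant users.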

3.
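Hong Xiaofen (洪晓芬). Computer Engineering and Design (《计算机工程与设计》), 2007, 28(22): 5453-5454, 5477
Speech enhancement is a powerful preprocessing technique for dealing with noise contamination. Speech processed by spectral subtraction retains so-called "musical noise". To address this, a speech enhancement method combining multi-band spectral subtraction with perceptual weighting is proposed. Multi-band spectral subtraction is applied to the noisy speech, and the result is then perceptually weighted according to the masking properties of human hearing, further reducing background noise. The method achieves a good trade-off between speech distortion and noise suppression, reduces audible distortion, effectively suppresses musical noise, and improves speech clarity.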
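
The multi-band subtraction step in this abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names, the band edges, the per-band over-subtraction factors `alphas`, and the spectral floor `beta` are all assumptions, and the perceptual weighting stage is omitted.

```python
def spectral_subtract(noisy_power, noise_power, alpha=2.0, beta=0.01):
    """Power spectral subtraction for one frame (lists of per-bin power).

    alpha is an assumed over-subtraction factor; beta is an assumed
    spectral floor that keeps bins from collapsing to zero.
    """
    enhanced = []
    for y, d in zip(noisy_power, noise_power):
        # Subtract the scaled noise estimate, but never fall below beta * y.
        enhanced.append(max(y - alpha * d, beta * y))
    return enhanced


def multiband_subtract(noisy_power, noise_power, band_edges, alphas, beta=0.01):
    """Apply spectral subtraction band by band with a separate alpha per band."""
    enhanced = []
    for (lo, hi), alpha in zip(band_edges, alphas):
        enhanced.extend(
            spectral_subtract(noisy_power[lo:hi], noise_power[lo:hi], alpha, beta)
        )
    return enhanced


frame = multiband_subtract(
    noisy_power=[10.0, 4.0, 1.0, 8.0],
    noise_power=[1.0, 1.0, 1.0, 1.0],
    band_edges=[(0, 2), (2, 4)],  # hypothetical two-band split
    alphas=[2.0, 4.0],            # hypothetical per-band factors
)
```

Bands with a larger `alpha` are suppressed more aggressively, while the floor limits the isolated spectral peaks that are heard as musical noise.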

4.
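Research on a speech enhancement method based on the perceptual wavelet transform   Cited by: 3 (self-citations: 1, citations by others: 2)
A perceptual wavelet constructed on the ERB scale matches the frequency perception characteristics of the human ear for speech. Its parameters are computed by a purely mathematical algorithm, and it allows perceptually near-perfect reconstruction of the signal. The noisy speech is first decomposed with the perceptual wavelet; the subband signals are then thresholded with an infinitely differentiable function similar to soft thresholding; finally, spectral subtraction is applied as a second enhancement stage. Experiments show that the algorithm substantially improves both SNR and PESQ scores, and that the speech has good clarity and intelligibility, especially at higher SNR.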

5.
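Existing speech enhancement algorithms for hearing aids leave substantial residual background noise in non-stationary noise and also introduce "musical noise", so the intelligibility and SNR of the enhanced speech are unsatisfactory. To address this, a binary-masking speech enhancement algorithm based on noise estimation is proposed, drawing on auditory perception theory, the characteristics of human hearing, and the working mechanism of the cochlea. The Minima-Controlled Recursive Averaging (MCRA) algorithm is used to obtain a noise estimate and a preliminary enhanced speech signal. Both are passed through a gammatone filterbank, which models the cochlea, to obtain their time-frequency representations. A binary mask for the noisy speech is then computed in the time-frequency domain using auditory masking properties, and the mask is applied to obtain the enhanced speech. Experimental results show that the algorithm largely removes the musical noise introduced by spectral subtraction and that, compared with MCRA-based spectral subtraction, the Speech Intelligibility Index (SII), Perceptual Evaluation of Speech Quality (PESQ) score, and signal-to-noise ratio (SNR) of the enhanced speech all improve.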
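
The masking step in this abstract can be illustrated with a small sketch. It assumes the MCRA and gammatone stages have already produced per-unit energies for the preliminary enhanced speech and the estimated noise; the function names and the local criterion `lc_db` are hypothetical, and the paper's exact masking rule may differ.

```python
import math


def binary_mask(speech_energy, noise_energy, lc_db=0.0):
    """Return a 0/1 mask per time-frequency unit.

    speech_energy and noise_energy are lists of rows (channel x frame).
    A unit is kept (1) when its local SNR in dB exceeds lc_db.
    """
    eps = 1e-12  # avoids division by zero and log of zero
    mask = []
    for s_row, n_row in zip(speech_energy, noise_energy):
        row = []
        for s, n in zip(s_row, n_row):
            local_snr_db = 10.0 * math.log10((s + eps) / (n + eps))
            row.append(1 if local_snr_db > lc_db else 0)
        mask.append(row)
    return mask


def apply_mask(noisy_tf, mask):
    """Zero out the time-frequency units that the mask marks as noise-dominated."""
    return [[x * m for x, m in zip(x_row, m_row)]
            for x_row, m_row in zip(noisy_tf, mask)]


mask = binary_mask([[4.0, 0.5], [1.0, 9.0]], [[1.0, 1.0], [2.0, 3.0]])
masked = apply_mask([[5.0, 1.5], [3.0, 12.0]], mask)
```

Raising `lc_db` removes more noise-dominated units at the cost of discarding some weak speech energy.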

6.
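Luo Ying (罗瀛), Zeng Qingning (曾庆宁), Long Chao (龙超). Journal of Computer Applications (《计算机应用》), 2019, 39(8): 2426-2430
To improve the denoising performance of dual-micro-array speech enhancement systems in multi-noise environments, an improved generalized sidelobe canceller (GSC) speech enhancement algorithm suited to dual micro-arrays is proposed. Based on the structure of the dual-micro microphone array, an improved coherence filtering algorithm based on noise cross-power spectrum estimation first removes the weakly correlated noise between widely spaced microphones. A generalized sidelobe cancelling algorithm then removes the strongly correlated noise between closely spaced microphones. Finally, subband spectral subtraction based on minima-controlled recursive averaging removes the residual noise in the individual frequency bands. Simulations show that, in multi-noise environments, the proposed algorithm achieves better perceptual speech quality scores than existing dual-micro-array speech enhancement algorithms and improves, to some extent, the system's suppression of complex noise.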

7.
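Speech enhancement is mainly used to improve the intelligibility and quality of noise-corrupted speech, and its main application is improving mobile communication quality in noisy environments. Traditional speech enhancement methods include spectral subtraction, Wiener filtering, and wavelet coefficient methods. Because traditional algorithms yield poor speech quality and musical noise in complex noise environments, a speech enhancement algorithm combining the wavelet packet transform with adaptive Wiener filtering is proposed. The role of wavelet packet multiresolution analysis in partitioning the signal spectrum is analyzed. The noisy signal is decomposed at multiple scales with wavelet packets, adaptive Wiener filtering is applied to the wavelet packet coefficients at each scale, and the filtered coefficients are used to reconstruct the enhanced speech signal. Simulation results show that, compared with traditional enhancement algorithms, the proposed algorithm more effectively improves the SNR of noisy speech in low-SNR, non-stationary noise while better preserving the spectral features of speech, improving the quality of the noisy speech.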
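
A Wiener-type gain applied to the coefficients of one subband can be sketched as below. This is only an illustration under stated assumptions: the wavelet packet decomposition and reconstruction are not shown, the noise power of the subband is taken as known, and the names `wiener_gain` and `filter_subband` are hypothetical. The abstract does not specify how the filter adapts, so a simple per-subband power estimate is used here.

```python
def wiener_gain(noisy_power, noise_power):
    """Wiener gain xi / (1 + xi), with xi estimated as max(P_y - P_d, 0) / P_d."""
    snr = max(noisy_power - noise_power, 0.0) / noise_power
    return snr / (1.0 + snr)


def filter_subband(coeffs, noise_power):
    """Scale all coefficients of one subband by a single Wiener gain.

    noise_power must be positive; it is assumed to come from a separate
    noise estimator that is outside the scope of this sketch.
    """
    mean_power = sum(c * c for c in coeffs) / len(coeffs)
    gain = wiener_gain(mean_power, noise_power)
    return [gain * c for c in coeffs]


strong = filter_subband([2.0, -2.0], noise_power=1.0)  # speech-dominated subband
weak = filter_subband([0.5, 0.5], noise_power=1.0)     # noise-dominated subband
```

Subbands whose power barely exceeds the noise estimate are attenuated toward zero, while speech-dominated subbands pass almost unchanged.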

8.
This paper presents a new approach to speech enhancement based on a modified least mean square multi-notch adaptive digital filter (MNADF). This approach differs from traditional speech enhancement methods since no a priori knowledge of the noise source statistics is required. Specifically, the proposed method is applied to the case where speech quality and intelligibility deteriorate in the presence of background noise. Speech coders and automatic speech recognition systems are designed to act on clean speech signals. Therefore, speech signals corrupted by noise must be enhanced before they are processed. The proposed method uses a primary input containing the corrupted speech signal and a reference input containing noise only. The new computationally efficient algorithm is based on tracking the significant frequencies of the noise and implementing the MNADF at those frequencies. To track the noise frequencies, a time-frequency analysis method such as the short time frequency transform is used. Different types of noise from the Noisex-92 database are used to degrade real speech signals. Objective measures, the study of speech spectrograms, global signal-to-noise ratio (SNR), and segmental SNR (segSNR), as well as subjective listening tests, consistently demonstrate superior enhancement performance of the proposed method over traditional speech enhancement methods such as spectral subtraction.
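
The two objective measures named in this abstract, global SNR and segmental SNR, can be computed as in the sketch below. The frame length and the clamping range of -10 dB to 35 dB are assumptions based on common practice, not values taken from the paper, and the function names are illustrative.

```python
import math

EPS = 1e-12  # guards against division by zero for perfectly matched frames


def snr_db(clean, enhanced):
    """Global SNR in dB between a clean reference and an enhanced signal."""
    signal = sum(c * c for c in clean)
    error = sum((c - e) ** 2 for c, e in zip(clean, enhanced))
    return 10.0 * math.log10((signal + EPS) / (error + EPS))


def seg_snr_db(clean, enhanced, frame_len, lo=-10.0, hi=35.0):
    """Mean of per-frame SNRs, each clamped to the range [lo, hi] dB."""
    values = []
    for start in range(0, len(clean) - frame_len + 1, frame_len):
        c = clean[start:start + frame_len]
        e = enhanced[start:start + frame_len]
        frame_snr = snr_db(c, e)
        values.append(min(max(frame_snr, lo), hi))
    return sum(values) / len(values)


clean = [1.0, 1.0, 1.0, 1.0]
enhanced = [0.9, 0.9, 1.0, 1.0]
global_snr = snr_db(clean, enhanced)
segmental_snr = seg_snr_db(clean, enhanced, frame_len=2)
```

Clamping keeps silent or perfectly reconstructed frames from dominating the average, which is why segSNR often tracks perceived quality better than global SNR.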

9.
In this paper, we propose a new speech enhancement system, which integrates a perceptual filterbank and minimum mean square error–short time spectral amplitude (MMSE–STSA) estimation, modified according to speech presence uncertainty. The perceptual filterbank was designed by adjusting the undecimated wavelet packet decomposition (UWPD) tree according to the critical bands of the psychoacoustic model of the human auditory system. The modified MMSE–STSA estimator was used to estimate speech in the undecimated wavelet packet domain. The perceptual filterbank provides a good auditory representation (sufficient frequency resolution), good perceptual quality of speech, and low computational load. The MMSE–STSA estimator is based on a priori SNR estimation. The a priori SNR, which is a key parameter of the MMSE–STSA estimator, was estimated using the "decision-directed method", which provides a trade-off between noise reduction and signal distortion when correctly tuned. The results of the proposed method were compared with those of other popular methods, namely Wiener estimation and MMSE–log spectral amplitude (MMSE–LSA) estimation in the frequency domain. To test the performance of the proposed speech enhancement system, three objective quality measures (SNR, segSNR, and Itakura–Saito distance (ISd)) were computed for various noise types and SNRs. Experimental results and objective quality measurements confirmed the performance of the proposed speech enhancement system. The proposed system provided sufficient noise reduction and good intelligibility and perceptual quality, without causing considerable signal distortion or musical background noise.
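
The decision-directed estimate of the a priori SNR mentioned in this abstract follows a well-known recursion, sketched here for a single frequency bin. Two simplifications are assumptions of this sketch and not part of the paper: the noise power is treated as constant and known, and a Wiener gain stands in for the MMSE–STSA gain (which requires Bessel functions) when forming the previous clean-speech estimate. The parameter names `alpha` and `xi_min` are illustrative.

```python
def decision_directed_snr(noisy_power, noise_power, alpha=0.98, xi_min=1e-3):
    """Track the a priori SNR over frames for one frequency bin.

    xi[n] = alpha * |S_hat[n-1]|^2 / noise + (1 - alpha) * max(gamma[n] - 1, 0)
    where gamma[n] is the a posteriori SNR of frame n.
    """
    xi_track = []
    prev_clean_power = 0.0  # no speech estimate before the first frame
    for y in noisy_power:
        gamma = y / noise_power                  # a posteriori SNR
        ml_term = max(gamma - 1.0, 0.0)          # instantaneous SNR estimate
        xi = alpha * prev_clean_power / noise_power + (1.0 - alpha) * ml_term
        xi = max(xi, xi_min)                     # lower bound on the estimate
        gain = xi / (1.0 + xi)                   # Wiener gain (stand-in)
        prev_clean_power = (gain ** 2) * y       # clean power fed to next frame
        xi_track.append(xi)
    return xi_track


xi = decision_directed_snr([3.0, 3.0], noise_power=1.0, alpha=0.5, xi_min=0.0)
```

A value of `alpha` close to 1 smooths the estimate strongly, which reduces musical noise but slows the response to speech onsets; this is the tuning trade-off the abstract refers to.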

10.
In this paper, we propose a speech enhancement method in which the front-end decomposition of the input speech is performed by temporal processing using a filterbank. The proposed method incorporates a perceptually motivated stationary wavelet packet filterbank (PM-SWPFB) and an improved spectral over-subtraction (I-SOS) algorithm for the enhancement of speech in various noise environments. The stationary wavelet packet transform (SWPT) is a shift-invariant transform. The PM-SWPFB is obtained by selecting the stationary wavelet packet tree so that it closely matches the non-linear resolution of the critical band structure of the psychoacoustic model. After the decomposition of the input speech, the I-SOS algorithm is applied separately in each subband to estimate the speech. The I-SOS uses a continuous noise estimation approach and estimates the noise power in each subband without the need for explicit speech silence detection. The subband noise power is estimated and updated by adaptively smoothing the noisy signal power. The smoothing parameter in each subband is controlled by a function of the estimated signal-to-noise ratio (SNR). The performance of the proposed speech enhancement method is tested on speech signals degraded by various real-world noises. Using objective speech quality measures (SNR, segmental SNR (SegSNR), and perceptual evaluation of speech quality (PESQ) score) and spectrograms with informal listening tests, we show that the proposed speech enhancement method outperforms spectral-subtractive-type algorithms and improves the quality and intelligibility of the enhanced speech.
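
The continuous noise estimation idea in this abstract, a recursive average whose smoothing parameter depends on the estimated SNR, can be sketched as follows. The sigmoid form and its constants `slope` and `center` are assumptions chosen for illustration; the abstract does not state which function the I-SOS algorithm actually uses.

```python
import math


def smoothing_factor(snr_estimate, slope=4.0, center=1.5):
    """Map an SNR estimate to a smoothing factor in (0, 1).

    High SNR (speech likely present) gives a factor near 1, which
    freezes the noise estimate; low SNR gives a factor near 0.
    """
    return 1.0 / (1.0 + math.exp(-slope * (snr_estimate - center)))


def update_noise(noise_power, noisy_power):
    """One recursive update of the subband noise power estimate."""
    snr_estimate = noisy_power / max(noise_power, 1e-12)
    a = smoothing_factor(snr_estimate)
    return a * noise_power + (1.0 - a) * noisy_power


during_speech = update_noise(noise_power=1.0, noisy_power=100.0)
during_pause = update_noise(noise_power=1.0, noisy_power=0.5)
```

Because the update runs in every frame, no explicit voice activity detector is needed: the estimate follows the input during pauses and holds its value during speech.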

11.
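Li Yansheng (李艳生), Liu Yuan (刘园), Zhang Yi (张毅). Journal of Computer Applications (《计算机应用》), 2019, 39(3): 894-898
To address the residual noise left by non-negative matrix factorization (NMF) speech enhancement in low signal-to-noise ratio (SNR), non-stationary environments, a single-channel speech enhancement algorithm based on perceptual masking and reconstructed NMF (PM-RNMF) is proposed. First, psychoacoustic masking properties are applied to the NMF speech enhancement algorithm. Second, different masking thresholds are used for different frequency bins to build an adaptive perceptual masking gain function, with the thresholds constraining the residual noise energy and the speech distortion energy. Finally, the perceptual gain is corrected using the speech presence probability (SPP) and the NMF algorithm is reconstructed, yielding a new objective function. Simulation results show that, in three non-stationary noise environments at different SNRs, the average Perceptual Evaluation of Speech Quality (PESQ) score of PM-RNMF is higher than that of the NMF, reconstructed NMF (RNMF), and perceptual masking deep neural network (PM-DNN) algorithms by 0.767, 0.474, and 0.162, respectively, and the average source-to-distortion ratio (SDR) is higher by 2.785, 1.197, and 0.948, respectively. The results also show that PM-RNMF achieves better noise reduction at both low and high frequencies.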

12.
Numerous efforts have focused on the problem of reducing the impact of noise on the performance of various speech systems such as speech coding, speech recognition and speaker recognition. These approaches consider alternative speech features, improved speech modeling, or alternative training for acoustic speech models. In this paper, we propose a new speech enhancement technique, which integrates a newly proposed wavelet transform, which we call the stationary bionic wavelet transform (SBWT), and the maximum a posteriori estimator of the magnitude-squared spectrum (MSS-MAP). The SBWT is introduced to solve the perfect reconstruction problem associated with the bionic wavelet transform. The MSS-MAP estimator was used to estimate speech in the SBWT domain. The results of the proposed technique were compared with those of other popular methods such as Wiener filtering and MSS-MAP estimation in the frequency domain. To test the performance of the proposed speech enhancement system, four objective quality measures [signal-to-noise ratio (SNR), segmental SNR, Itakura–Saito distance and perceptual evaluation of speech quality] were computed for various noise types, SNRs, and speech signals. Experimental results and objective quality measurements confirmed the performance of the proposed speech enhancement technique. It provided sufficient noise reduction and good intelligibility and perceptual quality, without causing considerable signal distortion or musical background noise.

13.
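An MMSE speech enhancement algorithm based on the auditory masking effect   Cited by: 2 (self-citations: 2, citations by others: 0)
To overcome the large speech distortion that the MMSE speech enhancement algorithm produces at low SNR, an MMSE speech enhancement algorithm incorporating the auditory masking effect of the human ear is proposed. The algorithm uses the masking threshold to adjust the gain in the MMSE algorithm so that the enhanced speech has less residual noise and less speech distortion. SNR analysis of the speech before and after enhancement in computer simulations, together with subjective listening, shows that the improved algorithm not only raises the SNR of the speech signal but also reduces speech distortion and improves intelligibility.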

14.
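To address the poor noise reduction, excessive musical noise, and low intelligibility of basic spectral subtraction at low SNR, an improved spectral subtraction algorithm is proposed. The algorithm first computes the cepstral distance of the speech signal to detect noise segments and speech segments, and replaces the statistical mean noise estimate used in basic spectral subtraction with a dynamically computed noise value. It sets the subtraction coefficient dynamically according to the ratio of the cepstral distances of the current frame and the noise frames, rather than keeping it fixed as in the traditional algorithm. Three methods are also used to suppress musical noise. Simulations show that, at low SNR, the improved algorithm reduces noise effectively and improves SNR and intelligibility, achieving the goal of speech enhancement.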

15.
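To address the reduced recognition accuracy and degraded communication quality of speech systems under strong external noise, a speech enhancement method based on adaptive noise estimation is proposed. Endpoint detection divides the speech signal into speech and non-speech segments, and the noise magnitude spectrum is estimated adaptively in each case. The assumptions in spectral subtraction that do not hold in general are examined, and the underlying formula is improved accordingly. Experimental results show that, compared with traditional spectral subtraction, the method suppresses musical noise better while maintaining high clarity and intelligibility, improving speech recognition accuracy and communication quality in strong noise.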

16.
A speech pre-processing algorithm is presented that improves the speech intelligibility in noise for the near-end listener. The algorithm improves intelligibility by optimally redistributing the speech energy over time and frequency according to a perceptual distortion measure, which is based on a spectro-temporal auditory model. Since this auditory model takes into account short-time information, transients will receive more amplification than stationary vowels, which has been shown to be beneficial for intelligibility of speech in noise. The proposed method is compared to unprocessed speech and two reference methods using an intelligibility listening test. Results show that the proposed method leads to significant intelligibility gains while still preserving quality. Although one of the methods used as a reference obtained higher intelligibility gains, this happened at the cost of decreased quality. Matlab code is provided.

17.
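Speech enhancement with auditory modeling based on spectral subtraction   Cited by: 1 (self-citations: 0, citations by others: 1)
A speech enhancement algorithm suited to low-SNR conditions is proposed. It builds on traditional spectral subtraction, with subtraction parameters that are adaptive and derived from the masking effect of human hearing. Objective and subjective tests show that, compared with traditional spectral subtraction, the algorithm suppresses residual noise and background noise better, especially for low-SNR speech signals.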
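
One common way to make a subtraction parameter follow the masking threshold is to interpolate between a maximum and a minimum value, as sketched below. The linear mapping, the function name `adapt_factor`, and the numeric ranges are assumptions for illustration; the abstract does not give the actual adaptation rule.

```python
def adapt_factor(threshold, t_min, t_max, f_min, f_max):
    """Map a masking threshold to a subtraction parameter.

    A low threshold means residual noise would be audible, so the largest
    factor (f_max) is used; a high threshold means noise is masked, so the
    smallest factor (f_min) is enough and speech distortion is reduced.
    """
    if t_max == t_min:
        return f_max  # degenerate frame: fall back to the strongest setting
    weight = (t_max - threshold) / (t_max - t_min)
    return f_min + weight * (f_max - f_min)


# Hypothetical per-bin masking thresholds (dB) for one frame.
thresholds = [0.0, 5.0, 10.0]
alphas = [adapt_factor(t, 0.0, 10.0, f_min=1.0, f_max=6.0) for t in thresholds]
```

The resulting per-bin factors could then replace a fixed over-subtraction factor in a standard spectral subtraction rule.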

18.
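An improved Kalman filtering speech enhancement algorithm based on spectral subtraction and the auditory masking effect is proposed. Estimating the AR parameters with spectral subtraction reduces the complexity and computational cost of the Kalman algorithm, making it easier to implement. While the Kalman filter removes noise, a perceptual post-filter designed from auditory masking properties keeps the estimation error of the Kalman filter below the masking threshold of the human ear, giving a good trade-off between noise removal and speech distortion. Simulation results show that the proposed method outperforms traditional Kalman filtering enhancement, effectively reduces speech distortion, and better matches human auditory characteristics; the speech has better clarity and intelligibility, especially at low SNR.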

19.
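Speech intelligibility enhancement is a perceptual enhancement technique for reproducing clear speech in noisy environments. Many studies enhance intelligibility through speaking style conversion (SSC), which relies solely on the Lombard effect and therefore performs poorly under strong noise. SSC also models the conversion of the fundamental frequency (F0) with a simple linear transform and maps only a few dimensions of the Mel-cepstral coefficients (MCEPs). Because F0 and MCEPs are two important speech features, modeling them adequately is essential. This paper therefore uses the continuous wavelet transform (CWT) to decompose F0 into 10 dimensions describing speech at different time scales, enabling effective F0 conversion, and represents the MCEPs with 20 dimensions for MCEP conversion. In addition, an iMetricGAN network is used to optimize speech intelligibility metrics in strong noise. Experimental results show that the proposed non-parallel speaking style conversion method based on CycleGAN with CWT and iMetricGAN (NS-CiC) significantly improves speech intelligibility in strong noise in both objective and subjective evaluations.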

20.
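An improved speech endpoint detection technique based on spectral entropy   Cited by: 1 (self-citations: 2, citations by others: 1)
A speech endpoint detection algorithm based on time-frequency spectral subtraction enhancement and spectral entropy is proposed. The algorithm enhances the noisy speech by removing broadband additive noise in the frequency domain with spectral subtraction and then removing the residual noise left by spectral subtraction in the time domain. Endpoint detection is then performed on the enhanced speech using spectral entropy features. Experimental results show that the algorithm is fast and effective and is robust to noise, making it particularly suitable for endpoint detection at low SNR.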
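
The spectral entropy feature used for endpoint detection can be sketched as follows. The sketch assumes per-frame power spectra are already available (the enhancement stage is not shown), and the decision threshold is a hypothetical value; it also assumes broadband noise, whose flat spectrum gives higher entropy than the peaked spectrum of speech.

```python
import math


def spectral_entropy(power_spectrum):
    """Shannon entropy (in nats) of a power spectrum normalized to sum to 1."""
    total = sum(power_spectrum)
    if total <= 0.0:
        return 0.0
    entropy = 0.0
    for p in power_spectrum:
        q = p / total
        if q > 0.0:
            entropy -= q * math.log(q)
    return entropy


def detect_speech(frames, threshold):
    """Mark a frame as speech when its spectral entropy is below the threshold."""
    return [spectral_entropy(frame) < threshold for frame in frames]


frames = [
    [1.0, 1.0, 1.0, 1.0],    # flat spectrum, noise-like
    [100.0, 1.0, 1.0, 1.0],  # one dominant peak, speech-like
]
decisions = detect_speech(frames, threshold=1.0)  # hypothetical threshold
```

Because the spectrum is normalized before the entropy is computed, the feature depends on spectral shape rather than level, which is one reason it holds up at low SNR.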
