Similar literature
 20 similar documents found
1.
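For speech signals corrupted by additive noise, a speech endpoint detection method is proposed that combines the signal subspace approach with information complexity. The method first removes the additive noise with the signal subspace method, then performs endpoint detection on the enhanced speech using information complexity. Simulation experiments show that, compared with traditional speech endpoint detection methods, the method improves endpoint detection accuracy, and that it remains accurate particularly under low signal-to-noise ratio conditions.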

2.
This paper presents a new time-domain noise reduction approach based on the Singular Value Decomposition (SVD). In the proposed approach, the noisy signal is first arranged in a Hankel matrix. SVD is then applied to the Hankel matrix to divide the data into a signal subspace and a noise subspace. Since the singular vectors form bases that span the matrix, reducing the effect of noise on the singular vectors and using them to reproduce the matrix considerably enhances the information embedded in the matrix. The noise-reduced singular vectors from the signal subspace are used to reconstruct the data matrix, from which the time-series signal is finally obtained. Results on different synthetic noisy signals indicate more effective noise reduction than the other time-series methods.
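
The Hankel-matrix pipeline described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the paper's method: it assumes NumPy, it replaces the paper's noise reduction of the singular vectors with plain rank truncation, and the function names `hankel_matrix`, `antidiagonal_average`, and `svd_denoise`, as well as the `rows` and `rank` parameters, are hypothetical.

```python
import numpy as np

def hankel_matrix(x, rows):
    """Arrange a 1-D signal in a Hankel matrix with H[i, j] = x[i + j]."""
    x = np.asarray(x, dtype=float)
    cols = len(x) - rows + 1
    idx = np.arange(rows)[:, None] + np.arange(cols)[None, :]
    return x[idx]

def antidiagonal_average(H):
    """Recover a time series by averaging each anti-diagonal of H."""
    rows, cols = H.shape
    n = rows + cols - 1
    sums = np.zeros(n)
    counts = np.zeros(n)
    for i in range(rows):
        sums[i:i + cols] += H[i]
        counts[i:i + cols] += 1
    return sums / counts

def svd_denoise(x, rows, rank):
    """Keep the `rank` largest singular components as the signal subspace.

    Assumption: plain truncation stands in for the paper's filtering of
    the singular vectors, which the abstract does not specify.
    """
    H = hankel_matrix(x, rows)
    U, s, Vt = np.linalg.svd(H, full_matrices=False)
    H_signal = (U[:, :rank] * s[:rank]) @ Vt[:rank]
    return antidiagonal_average(H_signal)
```

A single real sinusoid gives a Hankel matrix of rank 2, so `svd_denoise(noisy, rows=50, rank=2)` is a reasonable first experiment on a noisy tone; choosing `rank` for real data is a separate problem that this sketch does not address.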

3.
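Improved subspace speech enhancement algorithm   Total citations: 1, self-citations: 0, citations by others: 1
The single-channel subspace speech enhancement algorithm performs well when the additive noise is white. When the additive noise is colored, the generalized singular value decomposition (GSVD) algorithm is usually used instead. To reduce the musical noise that remains at low signal-to-noise ratios, a GSVD algorithm based on perceptual suppression is proposed that exploits the auditory masking effect of the human ear. Experimental results show that the algorithm clearly improves speech quality, intelligibility, and recognition rate, and that it clearly outperforms other speech enhancement algorithms, especially when the additive noise is colored.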

4.
A signal subspace scheme based on masking properties is proposed for the enhancement of speech degraded by additive noise. Since the masking properties are related to the critical frequency bands, which are derived from the characteristics of the human cochlea, incorporating the masking threshold into a subspace technique requires a transformation between the frequency and eigen domains. We present and apply an invertible transformation between these two domains. We use masking properties of the human auditory system to define the audible noise quantity in the eigendomain, and we derive the eigen-decomposition of the estimated speech autocorrelation matrix under the assumption of white noise. An audible noise reduction scheme is then developed based on a signal subspace technique, and the implementation of the proposed scheme is outlined. We further extend the scheme to the colored noise case. Simulation results show the superiority of the proposed scheme over other existing subspace methods in terms of segmental signal-to-noise ratio (SNR), perceptual evaluation of speech quality (PESQ), modified Bark spectral distortion (MBSD), spectrograms, and informal listening tests.

5.
We present a novel subspace modeling and selection approach for noisy speech recognition. In subspace modeling, we develop a factor analysis (FA) representation of noisy speech, which is a generalization of a signal subspace (SS) representation. Using FA, noisy speech is represented by the extracted common factors, the factor loading matrix, and the specific factors. The observation space of noisy speech is accordingly partitioned into a principal subspace, containing speech and noise, and a minor subspace, containing residual speech and residual noise. We minimize the energies of speech distortion in both the principal and the minor subspace so as to estimate clean speech with residual information. Importantly, we explore optimal subspace selection by solving hypothesis test problems. We test the equivalence of eigenvalues in the minor subspace to select the subspace dimension. In keeping with the FA model, we also examine the hypothesis of uncorrelated specific factors/residual speech. The subspace can be partitioned according to a consistent confidence towards rejecting the null hypothesis. Optimal solutions are realized through likelihood ratio tests, whose test statistics approximately follow chi-square distributions. In experiments on the Aurora2 database, the FA model significantly outperforms the SS model for speech enhancement and recognition. Subspace selection via testing the correlation of residual speech achieves higher recognition accuracies than selection via testing the equivalence of eigenvalues in the minor subspace.

6.
In this paper, a new signal subspace-based approach for enhancing a speech signal degraded by environmental noise is presented. The Perceptual Karhunen–Loève Transform (PKLT) method is improved by including the Variance of the Reconstruction Error (VRE) criterion in order to optimize the subspace decomposition model. Incorporating the VRE into the PKLT (the PKLT-VRE hybrid method) yields a good tradeoff between noise reduction and speech distortion, thanks to the combination of a perceptual criterion and the optimal determination of the noisy subspace dimension. In adverse conditions, experimental tests using objective quality measures show that the proposed method provides higher noise reduction and lower signal distortion than the existing speech enhancement techniques.

7.
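To address the musical noise that remains after speech enhancement under complex background noise, a speech enhancement method combining the subspace approach with Wiener filtering is proposed. The noisy speech is transformed with the KL transform and the eigenvalues of the clean speech are estimated. A Wiener filter is then constructed from the signal-to-noise ratio formula in the subspace domain, and the estimated eigenvalues are passed through this filter to obtain new clean-speech eigenvalues, from which the clean speech is restored by the inverse KL transform. Simulation results show that, under both white noise and train noise, the signal-to-noise ratio improves clearly over the traditional subspace method and the musical noise produced by enhancement is effectively suppressed.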
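
The KL-transform-plus-Wiener-gain idea in this abstract can be illustrated with a minimal sketch. Everything specific here is an assumption, not the paper's formulation: the noise is taken to be white with a known variance `noise_var`, clean eigenvalues are estimated by subtracting that variance, the gain is the textbook Wiener form, and the name `klt_wiener_enhance` is hypothetical. NumPy is assumed.

```python
import numpy as np

def klt_wiener_enhance(frames, noise_var):
    """Enhance noisy frames with a Wiener gain applied in the KLT domain.

    frames:    array of shape (num_frames, dim), one noisy vector per row
    noise_var: assumed white-noise variance (must be positive)
    """
    if noise_var <= 0:
        raise ValueError("noise_var must be positive")
    frames = np.asarray(frames, dtype=float)
    # Sample covariance of the noisy speech.
    cov_noisy = frames.T @ frames / len(frames)
    # The KLT basis is the set of eigenvectors of the covariance.
    eigvals, basis = np.linalg.eigh(cov_noisy)
    # Estimated clean-speech eigenvalues under the white-noise assumption.
    clean_eig = np.maximum(eigvals - noise_var, 0.0)
    # Wiener gain per eigen-direction: speech power / (speech + noise power).
    gains = clean_eig / (clean_eig + noise_var)
    coeffs = frames @ basis                  # forward KLT
    enhanced = (coeffs * gains) @ basis.T    # gain, then inverse KLT
    return enhanced, gains
```

Directions whose eigenvalue is close to the noise variance receive a gain near zero, which is what removes most of the noise; how the paper actually forms its subspace-domain SNR is not stated in the abstract.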

8.
A new signal subspace-based approach is proposed for the enhancement of speech corrupted by a high level of noise. Conventional subspace-based methods use the minimum mean square error criterion to optimize the Karhunen-Loève Transform (KLT). In non-stationary noisy environments, selecting the optimal order of the KLT-based speech enhancement model is a critical issue, because estimation of the relevant subspace dimensions depends on environmental conditions that may change unpredictably. A drastic KLT-based dimension reduction may therefore lose relevant components of speech; conversely, a reconstruction using a higher order of the KLT model will be ineffective at removing the noise. The method presented in this paper uses a Variance of Reconstruction Error (VRE) criterion to select the KLT model order optimally. A prominent feature of this subspace method is that it incorporates Minima Controlled Recursive Averaging (MCRA) to estimate the noise Power Spectral Density (PSD) used in the gain function. Three variants of the VRE combined with MCRA methods are implemented and compared, namely VRE-MCRA, VRE-MCRA2, and VRE-IMCRA. Objective measures show that the VRE-based approaches achieve lower signal distortion and higher noise reduction than existing enhancement methods.

9.
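Frequency-domain-constrained subspace speech enhancement uses a fixed Lagrange multiplier when constructing the enhancement matrix, so musical noise remains while speech distortion is being reduced and intelligibility improved. To address this, an algorithm with a variable Lagrange multiplier is proposed. It exploits the masking of noise by stronger frequency components in auditory perception: through a transformation between the frequency domain of the masking threshold and the subspace eigenvalues, a variable controls the subspace Lagrange multiplier used to compute the diagonal matrix of the gain function. Comparative experiments and listening tests show that speech enhanced by the proposed algorithm has both a substantially higher signal-to-noise ratio and clearly better subjective perceived quality.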

10.
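Based on robust H∞ filter theory and a conjugate-gradient adaptive parameter estimation method, a speech enhancement algorithm that suppresses complex noise is proposed. When this method adaptively extracts speech parameters from the noisy signal, the statistical properties of the noise source need not be known in advance; the noise signal is only required to have finite energy. Because the method is based on the H∞ filter, it guarantees that the degradation of the performance index caused by external disturbances and additive noise is minimized. Simulation results show that the algorithm is computationally fast, robust, clearly effective at enhancing speech, easy to implement, and able to suppress complex background noise.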

11.
All discrete Fourier transform (DFT) domain-based speech enhancement gain functions rely on knowledge of the noise power spectral density (PSD). Since the noise PSD is unknown in advance, it must be estimated from the noisy speech signal. An overestimation of the noise PSD will lead to a loss in speech quality, while an underestimation will lead to an unnecessarily high level of residual noise. We present a novel approach for noise tracking, which updates the noise PSD for each DFT coefficient in the presence of both speech and noise. This method is based on the eigenvalue decomposition of correlation matrices that are constructed from time series of noisy DFT coefficients. The presented method is well capable of tracking gradually changing noise types. In comparison to state-of-the-art noise tracking algorithms, the proposed method reduces the estimation error between the estimated and the true noise PSD. In combination with an enhancement system, the proposed method improves the segmental SNR by several decibels for gradually changing noise types. Listening experiments show that the proposed system is preferred over the state-of-the-art noise tracking algorithm.

12.
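Subspace speech enhancement algorithm combined with the auditory masking effect   Total citations: 1, self-citations: 0, citations by others: 1
In the classical subspace speech enhancement algorithm, bias in the estimated speech eigenvalues causes speech distortion and musical noise. To address this problem, a speech enhancement algorithm combined with the auditory masking effect is proposed. The algorithm uses the masking threshold to adaptively adjust the suppression coefficient applied to the noise eigenvalues and, exploiting the ability of Wiener filtering to suppress musical noise, corrects the eigenvalues in parallel, finally restoring the clean speech. Experimental results show that, under both white and colored noise, the algorithm improves the signal-to-noise ratio and reduces speech distortion and musical noise compared with the classical subspace speech enhancement algorithm.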

13.
A gain factor adapted by both the intra-frame masking properties of the human auditory system and the inter-frame SNR variation is proposed to enhance a speech signal corrupted by additive noise. In this article we employ an averaging factor, varying over time and frequency, to improve the estimate of the a priori SNR. In turn, this SNR estimate is used to adapt a gain factor for speech enhancement. This gain factor reduces the spectral variation over successive frames, so the effect of musical residual noise is mitigated. In addition, the simultaneous masking property of the human ear is employed to adapt the gain factor. Imperceptible residual noise with energy below the noise masking threshold is retained, resulting in a reduction of speech distortion. Experimental results show that the proposed scheme can efficiently reduce the effect of musical residual noise.

14.
Estimating the noise power spectral density (PSD) from the corrupted speech signal is an essential component of speech enhancement algorithms. In this paper, a novel noise PSD estimation algorithm based on minimum mean-square error (MMSE) is proposed. The noise PSD estimate is obtained by recursively smoothing the MMSE estimate of the current noise spectral power. For the noise spectral power estimation, a spectral weighting function is derived, which depends on the a priori signal-to-noise ratio (SNR). Since the speech spectral power is highly important for the a priori SNR estimate, this paper proposes an MMSE spectral power estimator incorporating speech presence uncertainty (SPU) to improve the a priori SNR estimate. Moreover, a bias correction factor is derived for the speech spectral power estimation bias. The estimated speech spectral power is then used in the “decision-directed” (DD) estimator of the a priori SNR to achieve fast noise tracking. Compared with three state-of-the-art approaches, i.e., minimum statistics (MS), an MMSE-based approach, and a speech presence probability (SPP)-based approach, experimental results show that the proposed algorithm exhibits better noise tracking capability under various nonstationary noise environments and SNR conditions. When employed in a speech enhancement system, improved performance in terms of segmental SNR improvement (SSNR+) and perceptual evaluation of speech quality (PESQ) can be observed.
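
The “decision-directed” estimator mentioned in this abstract can be sketched for a single frequency bin as below. This is the standard textbook form under stated assumptions, not the paper's algorithm: the noise power is treated as known and constant, the previous-frame speech power comes from a simple Wiener gain rather than the paper's SPU-based MMSE estimator, and the function name and the default `alpha=0.98` are illustrative choices.

```python
def decision_directed_snr(noisy_power, noise_power, alpha=0.98):
    """Track the a priori SNR of one frequency bin across frames.

    noisy_power: sequence of |Y|^2 values, one per frame
    noise_power: assumed known, constant noise power for this bin
    alpha:       weight on the previous-frame speech estimate
    """
    prev_speech_power = 0.0
    xi_track = []
    for y_pow in noisy_power:
        gamma = y_pow / noise_power  # a posteriori SNR
        # Blend the previous-frame estimate with the current ML estimate.
        xi = (alpha * prev_speech_power / noise_power
              + (1.0 - alpha) * max(gamma - 1.0, 0.0))
        gain = xi / (1.0 + xi)       # Wiener gain (assumed for this sketch)
        prev_speech_power = (gain ** 2) * y_pow
        xi_track.append(xi)
    return xi_track
```

A value of `alpha` close to 1 smooths the estimate strongly across frames, which reduces musical noise at the cost of slower reaction to speech onsets.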

15.
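The subspace method is used for speech enhancement by trading off speech distortion against residual noise: speech distortion is minimized while the residual noise is kept at a preset level. The traditional subspace method is effective in stationary noise but much less so in non-stationary environments, so voice activity detection (VAD) is used to update the noise covariance in a timely manner. Experiments show that the VAD-based subspace method achieves good speech enhancement.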
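
The idea of refreshing the noise covariance whenever the VAD reports a non-speech frame might look like the sketch below. The abstract does not describe its VAD or its update rule, so both are assumptions here: a simple energy threshold stands in for the VAD, the covariance is updated by exponential smoothing, and `update_noise_covariance`, `energy_threshold`, and `smoothing` are hypothetical names. NumPy is assumed.

```python
import numpy as np

def update_noise_covariance(frames, init_cov, energy_threshold, smoothing=0.9):
    """Update a noise covariance estimate on frames flagged as non-speech.

    frames:           array of shape (num_frames, dim)
    init_cov:         initial (dim, dim) noise covariance estimate
    energy_threshold: mean-square energy above which a frame counts as speech
    smoothing:        weight kept on the previous covariance estimate
    """
    cov = np.array(init_cov, dtype=float)
    speech_flags = []
    for frame in np.asarray(frames, dtype=float):
        # Energy-based VAD (an assumption standing in for the paper's VAD).
        is_speech = bool(np.mean(frame ** 2) > energy_threshold)
        speech_flags.append(is_speech)
        if not is_speech:
            # Recursive update using the outer product of the noise-only frame.
            cov = smoothing * cov + (1.0 - smoothing) * np.outer(frame, frame)
    return cov, speech_flags
```

The returned covariance could then feed a subspace enhancer for the following speech frames, so the noise model follows slow changes in the environment.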

16.
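Noise-robust speech recognition based on compensation of speech enhancement distortion   Total citations: 1, self-citations: 0, citations by others: 1
This paper proposes a noise-robust speech recognition algorithm based on compensating for the distortion introduced by speech enhancement. In the front end, speech enhancement effectively suppresses background noise. The spectral distortion and residual noise that enhancement introduces are detrimental to recognition, and their effect is compensated either by parallel model combination in the recognition stage or by cepstral mean normalization in the feature extraction stage. Experimental results show that the algorithm significantly improves the recognition accuracy of a speech recognition system in noisy environments over a very wide range of signal-to-noise ratios, with an especially pronounced effect at low signal-to-noise ratios; for example, for white noise at -5 dB, the algorithm reduces the error rate by 67.4% relative to the baseline recognizer.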
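
Cepstral mean normalization, one of the two compensation steps named in this abstract, is simple enough to show directly. The sketch below is a generic per-utterance version in plain Python; the function name and the list-of-lists feature layout are assumptions, and the paper's exact variant is not described in the abstract.

```python
def cepstral_mean_normalize(frames):
    """Subtract the per-coefficient mean over all frames of one utterance.

    frames: list of feature vectors (lists of floats), one per frame
    """
    num_frames = len(frames)
    dim = len(frames[0])
    # Mean of each cepstral coefficient across the utterance.
    means = [sum(frame[k] for frame in frames) / num_frames for k in range(dim)]
    return [[frame[k] - means[k] for k in range(dim)] for frame in frames]
```

Because a fixed linear channel or a constant spectral tilt adds a constant offset in the cepstral domain, removing the mean cancels that offset.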

17.
We consider the enhancement of speech corrupted by additive white Gaussian noise. In a Bayesian inference framework, maximum a posteriori (MAP) estimation of the signal is performed, along the lines developed by Lim & Oppenheim (1978). The speech enhancement problem is treated as a signal estimation problem whose aim is to obtain a MAP estimate of the clean speech signal given the noisy observations. The novelty of our approach over previously reported work is that we relate the variance of the additive noise and the gain of the autoregressive (AR) process to hyperparameters in a hierarchical Bayesian framework. These hyperparameters are computed from the noisy speech data to maximize the denominator in Bayes' formula, also known as the evidence. The resulting Bayesian scheme is capable of performing speech enhancement from the noisy data without the need for silence detection. Experimental results are presented for stationary and slowly varying additive white Gaussian noise. The Bayesian scheme is also compared to the Lim and Oppenheim system and to the spectral subtraction method.

18.
Speech recognizers achieve high recognition accuracy in quiet acoustic environments, but their performance degrades drastically when they are deployed in real environments, where the speech is degraded by additive ambient noise. This paper advocates a two-phase approach for robust speech recognition in such environments. First, a front-end subband speech enhancement with adaptive noise estimation (ANE) is used to filter the noisy speech. The whole noisy speech spectrum is partitioned into eighteen dissimilar subbands based on the Bark scale, and the noise power in each subband is estimated by the ANE approach, which does not require speech pause detection. Second, the filtered speech spectrum is processed by a non-parametric frequency-domain algorithm based on human perception, while the back end builds a robust classifier to recognize the utterance. A suite of experiments is conducted to evaluate the performance of the speech recognizer in a variety of real environments, with and without the front-end speech enhancement stage. Recognition accuracy is evaluated at the word level and over a wide range of signal-to-noise ratios for real-world noises. Experimental evaluations show that the proposed algorithm attains good recognition performance when the signal-to-noise ratio is lower than 5 dB.

19.
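Speech enhancement method based on the wavelet transform and Kalman filtering   Total citations: 1, self-citations: 0, citations by others: 1
For speech signals corrupted by additive noise, an effective speech enhancement method is proposed that applies Kalman filtering based on the wavelet transform. Problems encountered in practical processing are analyzed, including the dyadic wavelet transform, filter parameter estimation, and Kalman filter divergence. The enhancement is evaluated using the signal-to-noise ratio. Simulation experiments show that the method is effective when the additive noise is either white Gaussian noise or colored noise.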

20.
Hamid   Digital Signal Processing, 2008, 18(5): 728-738
This paper proposes a technique for reducing noise in a signal's time series using a time–frequency distribution. The technique is based on the SVD of the matrix associated with the time–frequency representation of the signal. In this approach, the time–frequency representation of the signal is first divided into a signal subspace and a noise subspace, using the singular values of the time–frequency matrix as the criterion for the division. Since the singular vectors form bases that span the matrix, reducing the effect of noise on the singular vectors and using them to reproduce the matrix enhances the information embedded in the time–frequency representation of the signal. The proposed approach uses the Savitzky–Golay low-pass filter to attenuate noise in the singular vectors. Results on both synthetic signals and newborn EEGs indicate the superiority of the proposed technique over the existing one in reducing noise in signals.
