1.

Jin Liyan, Chen Li, Fan Taiting, Gao Jing 《计算机应用》 (Journal of Computer Applications) 2015,35(8):2336-2340

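To address the signal distortion and low signal-to-noise ratio (SNR) that arise when Wiener filtering is used to denoise non-stationary speech, a speech denoising algorithm, SSA-WF, is proposed that combines singular spectrum analysis (SSA) with Wiener filtering (WF). SSA first performs a preliminary denoising of the nonlinear, non-stationary speech signal, raising the SNR of the noisy speech so that it is as stationary as possible. The result is then fed to the Wiener filter, which removes the remaining high-frequency noise and yields the clean denoised speech. Simulations under background noise of different intensities show that SSA-WF outperforms traditional speech denoising algorithms in terms of SNR and root mean square error (RMSE). It removes background noise effectively, reduces distortion of the useful signal, and is suitable for denoising nonlinear, non-stationary speech.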

2.

Speech enhancement based on stationary bionic wavelet transform and maximum a posterior estimator of magnitude-squared spectrum

Talbi Mourad 《International Journal of Speech Technology》2017,20(1):75-88

Numerous efforts have focused on the problem of reducing the impact of noise on the performance of various speech systems such as speech coding, speech recognition and speaker recognition. These approaches consider alternative speech features, improved speech modeling, or alternative training for acoustic speech models. In this paper, we propose a new speech enhancement technique, which integrates a newly proposed wavelet transform, which we call the stationary bionic wavelet transform (SBWT), and the maximum a posterior estimator of magnitude-squared spectrum (MSS-MAP). The SBWT is introduced in order to solve the problem of perfect reconstruction associated with the bionic wavelet transform. The MSS-MAP estimation was used for estimation of speech in the SBWT domain. The experiments were conducted for various noise types and different speech signals. The results of the proposed technique were compared with those of other popular methods such as Wiener filtering and MSS-MAP estimation in the frequency domain. To test the performance of the proposed speech enhancement system, four objective quality measurement tests [signal to noise ratio (SNR), segmental SNR, Itakura–Saito distance and perceptual evaluation of speech quality] were conducted for various noise types and SNRs. Experimental results and objective quality measurement test results proved the performance of the proposed speech enhancement technique. It provided sufficient noise reduction and good intelligibility and perceptual quality, without causing considerable signal distortion and musical background noise.

3.

Research on a multi-channel speech noise reduction algorithm for hearing aids (多通道助听器语音降噪算法研究)

Xi Ji, Liang Ruiyu, Wang Guowei, Qiu Xiaomei, Ma Anjun 《计算机工程与应用》 (Computer Engineering and Applications) 2014,50(11):237-240

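Wiener filtering is one of the algorithms commonly used to improve speech understanding for hearing-impaired listeners in noisy environments. To address the large noise-spectrum estimation error of the traditional Wiener filter, a hearing-aid speech noise reduction algorithm based on an improved multi-channel Wiener filter is proposed. Taking into account the characteristics of human hearing and of loudness compensation in hearing aids, the algorithm first decomposes the speech signal into multiple sub-band signals with a Gammatone decomposition. Within each sub-band, speech is then enhanced by a Wiener filter based on a priori SNR estimation. Finally, the sub-band signals are synthesized to obtain the enhanced speech. In addition, to improve noise-spectrum estimation in the Wiener filter, a voice activity detection algorithm based on envelope estimation is proposed and used to improve Wiener filter performance. Experimental results show that, compared with traditional Wiener filtering, the method suppresses residual noise more effectively and improves speech intelligibility, and therefore has high practical value.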

4.

Speech enhancement using Teager energy operated ERB-like perceptual wavelet packet decomposition

Anirban Bhowmick Mahesh Chandra Astik Biswas 《International Journal of Speech Technology》2017,20(4):813-827

In the recent past, wavelet packet (WP) based speech enhancement techniques have been gaining popularity due to their inherent nature of noise minimization. WP based techniques appeared to be more robust and efficient than short-time Fourier transform based methods. In the present work, a speech enhancement method using Teager energy operated equivalent rectangular bandwidth (ERB)-like WP decomposition has been proposed. A twenty-four sub-band perceptual wavelet packet decomposition (PWPD) structure is implemented according to the auditory ERB scale. The ERB scale based decomposition structure is used because the central frequency of the ERB scale distribution is similar to the frequency response of the human cochlea. The Teager energy operator is applied to estimate the threshold value for the PWPD coefficients. Lastly, Wiener filtering is applied to remove the low frequency noise before the final reconstruction stage. The proposed method has been evaluated on a database of Hindi sentences corrupted with six noise conditions. The proposed method's performance is analysed with respect to several speech quality parameters and output signal to noise ratio levels. Performance indicates that the proposed technique outperforms some traditional speech enhancement algorithms at all SNR levels.

5.

Car Speech Enhancement Using a Microphone Array

Jen-Tzung Chien, Po-Yin Lai 《International Journal of Speech Technology》2005,8(1):79-91

This paper proposes a speech enhancement approach to suppress the interference of car noise. A linear microphone array is adopted for far-talking speech acquisition and delay-and-sum beamforming noise reduction. We present an effective time delay estimator using the coherence function between the reference microphone and the beamformed speech. To further enhance the beamformed speech, we exploit an improved Wiener filter where the resulting noise correlation in the microphone array is relatively small so that the performance of optimal Wiener filtering could be achieved. Also, due to the serious degradation in low frequency car speech, we develop a spectral weighting function to compensate the low frequency filtering. These two processing units serve as the post filters to attain the desirable enhancement performance. In the experiments on microphone array speech in the presence of real and simulated car noises, we find that the proposed algorithm performs well. Performance is measured in terms of the signal-to-noise ratio and the word error rate. The combined delay-and-sum beamformer and two post filters obtain the best results compared to other methods.
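The delay-and-sum step mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the per-microphone delays are already known as whole numbers of samples. The abstract instead estimates them from a coherence function, which is not reproduced here. The function name `delay_and_sum` and the toy signals are hypothetical.

```python
def delay_and_sum(channels, delays):
    """Align each channel by its integer sample delay, then average.

    channels: list of equal-length lists of samples, one per microphone.
    delays:   arrival delay of each channel in samples
              (assumed known and non-negative).
    """
    n = len(channels[0])
    out = [0.0] * n
    for samples, d in zip(channels, delays):
        for i in range(n):
            j = i + d  # sample of this channel that lines up with output index i
            if j < n:
                out[i] += samples[j]
    return [v / len(channels) for v in out]


# Toy example: the same pulse reaches microphone 2 one sample later.
mic1 = [0.0, 1.0, 0.0, 0.0]
mic2 = [0.0, 0.0, 1.0, 0.0]
aligned = delay_and_sum([mic1, mic2], [0, 1])    # pulse adds coherently
unaligned = delay_and_sum([mic1, mic2], [0, 0])  # pulse is smeared
```

With the correct delays the pulse keeps its full amplitude, whereas uncorrelated noise on the two channels would be averaged down. That difference is the source of the beamformer's noise reduction.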

6.

Speech enhancement based on undecimated wavelet packet-perceptual filterbanks and MMSE–STSA estimation in various noise environments

Hacı Ergun 《Digital Signal Processing》2008,18(5):797-812

In this paper, we propose a new speech enhancement system, which integrates a perceptual filterbank and minimum mean square error–short time spectral amplitude (MMSE–STSA) estimation, modified according to speech presence uncertainty. The perceptual filterbank was designed by adjusting the undecimated wavelet packet decomposition (UWPD) tree according to the critical bands of the psycho-acoustic model of the human auditory system. The MMSE–STSA estimation (modified according to speech presence uncertainty) was used for estimation of speech in the undecimated wavelet packet domain. The perceptual filterbank provides a good auditory representation (sufficient frequency resolution), good perceptual quality of speech and low computational load. The MMSE–STSA estimator is based on a priori SNR estimation. A priori SNR estimation, which is a key parameter in the MMSE–STSA estimator, was performed by using the "decision directed method." The "decision directed method" provides a trade-off between noise reduction and signal distortion when correctly tuned. The experiments were conducted for various noise types. The results of the proposed method were compared with those of other popular methods, Wiener estimation and MMSE–log spectral amplitude (MMSE–LSA) estimation in the frequency domain. To test the performance of the proposed speech enhancement system, three objective quality measurement tests (SNR, segSNR and Itakura–Saito distance (ISd)) were conducted for various noise types and SNRs. Experimental results and objective quality measurement test results proved the performance of the proposed speech enhancement system. The proposed speech enhancement system provided sufficient noise reduction and good intelligibility and perceptual quality, without causing considerable signal distortion and musical background noise.
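As a rough illustration of the decision-directed a priori SNR estimate named in the abstract above, the sketch below tracks one frequency bin over successive frames. Several things are assumptions rather than details from the paper: the noise power is taken as known and constant, the smoothing weight `alpha = 0.98` is a commonly quoted value, and a plain Wiener gain stands in for the MMSE–STSA gain, which needs Bessel functions. The function name is hypothetical.

```python
def decision_directed_gains(noisy_power, noise_power, alpha=0.98):
    """Per-frame gains for one frequency bin using the decision-directed rule.

    noisy_power: |Y|^2 for successive frames of one bin.
    noise_power: noise power estimate for that bin (assumed known, constant).
    alpha:       smoothing weight between past estimate and current frame.
    """
    gains = []
    prev_clean_power = 0.0  # |G * Y|^2 from the previous frame
    for p in noisy_power:
        gamma = p / noise_power  # a posteriori SNR
        # A priori SNR: blend of the previous clean-speech estimate and the
        # half-wave rectified instantaneous SNR of the current frame.
        xi = (alpha * prev_clean_power / noise_power
              + (1.0 - alpha) * max(gamma - 1.0, 0.0))
        g = xi / (1.0 + xi)  # Wiener gain used here as a simple stand-in
        gains.append(g)
        prev_clean_power = (g * g) * p
    return gains


noise_only = decision_directed_gains([1.0, 1.0, 1.0], noise_power=1.0)
speech_onset = decision_directed_gains([100.0] * 20, noise_power=1.0)
```

For frames at the noise floor the gain stays at zero. After a speech onset the gain rises over a few frames rather than jumping immediately, and `alpha` controls that smoothing.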

7.

Mask speech recognition based on convolutional neural networks (基于卷积神经网络的面罩语音识别)

Wang Xia, Du Guiming, Wang Guangyan, Zhang Yan 《传感器与微系统》 (Transducer and Microsystem Technologies) 2017,36(10)

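To address the low recognition rate of noisy mask speech, a speech enhancement algorithm is applied to suppress noise in the mask speech and raise its SNR. For the enhancement stage, an improved Wiener filtering method is proposed: a spectral-entropy method detects speech and non-speech frames in order to update the noise power spectrum, and a parameter is introduced to control the gain function. Mel-frequency cepstral coefficients (MFCC) of the mask speech signal are extracted as feature parameters. A convolutional neural network (CNN) is used for training and recognition, with local response normalization (LRN) applied after each pooling layer for optimization. Experimental results show that the recognition system substantially improves the recognition rate of noisy mask speech.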

8.

Single-Channel Speech Separation Using Soft Mask Filtering (cited by: 2; self-citations: 0; citations by others: 2)

Radfar M.H. Dansereau R.M. 《IEEE transactions on audio, speech, and language processing》2007,15(8):2299-2310

We present an approach for separating two speech signals when only one single recording of their linear mixture is available. For this purpose, we derive a filter, which we call the soft mask filter, using minimum mean square error (MMSE) estimation of the log spectral vectors of sources given the mixture's log spectral vectors. The soft mask filter's parameters are estimated using the mean and variance of the underlying sources which are modeled using the Gaussian composite source modeling (CSM) approach. It is also shown that the binary mask filter which has been empirically and extensively used in single-channel speech separation techniques is, in fact, a simplified form of the soft mask filter. The soft mask filtering technique is compared with the binary mask and Wiener filtering approaches when the input consists of male+male, female+female, and male+female mixtures. The experimental results in terms of signal-to-noise ratio (SNR) and segmental SNR show that soft mask filtering outperforms binary mask and Wiener filtering.

9.

A gain factor adapted by masking property and SNR variation for speech enhancement in colored-noise corruptions

Ching-Ta Lu Kun-Fu Tseng 《Computer Speech and Language》2010,24(4):632-647

A gain factor adapted by both the intra-frame masking properties of the human auditory system and the inter-frame SNR variation is proposed to enhance a speech signal corrupted by additive noise. In this article we employ an averaging factor, varying with time–frequency, to improve the estimate of the a priori SNR. In turn, this SNR estimate is utilized to adapt a gain factor for speech enhancement. This gain factor reduces the spectral variation over successive frames, so the effect of musical residual noise is mitigated. In addition, the simultaneous masking property of the human ear is also employed to adapt the gain factor. Imperceptible residual noise with energy below the noise masking threshold is retained, resulting in a reduction of speech distortion. Experimental results show that the proposed scheme can efficiently reduce the effect of musical residual noise.

10.

Real-time iterative Wiener filtering of noisy speech (含噪语音实时迭代维纳滤波) (cited by: 1; self-citations: 1; citations by others: 0)

Wang Jingfang 《计算机工程与应用》 (Computer Engineering and Applications) 2011,47(19):132-135

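Traditional denoising methods lose much or all of their ability to extract sound signals under strong background noise, and they adapt poorly to different noise environments. To address this, a sound-signal feature extraction method based on iterative Wiener filtering is proposed. An iterative update mechanism for the speech and noise spectra and the power-spectrum SNR is given, together with a concrete implementation scheme. Simulations show that the algorithm denoises effectively, significantly improves the performance of a speech recognition system, and is robust across different noise environments and SNR conditions. The algorithm has a low computational cost, is simple to implement, and is suitable for embedded speech recognition systems.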

11.

Speech enhancement with an adaptive Wiener filter

Marwa A. Abd El-Fattah Moawad I. Dessouky Alaa M. Abbas Salaheldin M. Diab El-Sayed M. El-Rabaie Waleed Al-Nuaimy Saleh A. Alshebeili Fathi E. Abd El-samie 《International Journal of Speech Technology》2014,17(1):53-64

This paper proposes an adaptive Wiener filtering method for speech enhancement. This method depends on the adaptation of the filter transfer function from sample to sample based on the speech signal statistics: the local mean and the local variance. It is implemented in the time domain rather than in the frequency domain to accommodate the time-varying nature of speech signals. The proposed method is compared to the traditional frequency-domain Wiener filtering, spectral subtraction and wavelet denoising methods using different speech quality metrics. The simulation results reveal the superiority of the proposed Wiener filtering method in the case of Additive White Gaussian Noise (AWGN) as well as colored noise.
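The abstract above does not give the filter equations, so the following is only a sketch of the classic local-statistics form of a time-domain Wiener filter, which may differ from the authors' exact method. Each output sample is the local mean plus a gain times the deviation from that mean, where the gain is the estimated signal variance divided by the local variance. The noise variance is assumed known, and the window half-width is an arbitrary illustrative choice.

```python
def local_wiener(x, noise_var, half_width=2):
    """Sample-by-sample Wiener-type filter driven by local mean and variance.

    x:          noisy samples.
    noise_var:  noise variance (assumed known).
    half_width: half the window length used for the local statistics.
    """
    n = len(x)
    out = []
    for i in range(n):
        window = x[max(0, i - half_width):min(n, i + half_width + 1)]
        mean = sum(window) / len(window)
        var = sum((v - mean) ** 2 for v in window) / len(window)
        # Signal variance is what remains after removing the noise variance.
        signal_var = max(var - noise_var, 0.0)
        gain = signal_var / var if var > 0.0 else 0.0
        out.append(mean + gain * (x[i] - mean))
    return out


flat = local_wiener([3.0] * 5, noise_var=0.1)
smoothed = local_wiener([0.0, 1.0, 0.0, 1.0, 0.0], noise_var=10.0, half_width=1)
passthrough = local_wiener([0.0, 1.0, 0.0, 1.0, 0.0], noise_var=0.0, half_width=1)
```

Where the local variance is no larger than the noise variance, the output falls back to the local mean, which gives strong smoothing. Where the local variance is much larger, the gain approaches one and the sample passes through nearly unchanged.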

12.

Computational auditory models in predicting noise reduction performance for wideband telephony applications

Nazanin Pourmand Vijay Parsa Angela Weaver 《International Journal of Speech Technology》2013,16(4):363-379

The performance of several noise reduction algorithms intended for wideband telephony was evaluated both subjectively and objectively. The chosen algorithms were based on statistical modeling, spectral subtraction, Wiener filtering, or subspace modelling principles. A customized wideband noise reduction database containing speech samples corrupted by three types of background noises at three SNR levels, along with their enhanced versions was created. The overall quality of the speech samples in the database was subsequently rated by a group of listeners with normal hearing capabilities. Comprehensive statistical analyses were performed to assess the reliability of the subjective data, and to assess the performance of noise reduction algorithms across varied noisy conditions. The subjective quality ratings were then used to investigate the performance of several auditory model-based objective quality metrics. Key results from these investigations include: (a) there was a high degree of inter- and intra-subject reliability in the subjective ratings, (b) noise reduction algorithms enhance speech quality for only a subset of the noise conditions, and (c) auditory model-based metrics perform similarly in predicting speech quality ratings, when speech quality scores pertaining to a particular noise condition were averaged.

13.

Wavelet packet transform image denoising based on modified Wiener filtering (基于修正维纳滤波的小波包变换图像去噪)

Li Yunhong, Yi Xin 《计算机工程与应用》 (Computer Engineering and Applications) 2012,48(21):182-185

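Image denoising is a very important step in image processing. To improve the quality of degraded images, the principle of Wiener filtering is analyzed in light of the wavelet threshold denoising algorithm proposed by Donoho, and a wavelet packet transform image denoising method based on modified Wiener filtering is proposed. The noisy image is processed with the modified Wiener filter, and the noise standard deviation computed from the processed image is used as the wavelet packet threshold. The Wiener-filtered image is then decomposed with wavelet packets, so that both its low-frequency and high-frequency parts are decomposed, and the wavelet packet tree coefficients are soft-thresholded with the computed threshold. The inverse wavelet packet transform yields the denoised image. Results show that, at a noise variance of 0.01, the PSNR of the image denoised by this algorithm is 8.8 dB higher than the PSNR obtained with wavelet packet adaptive-threshold denoising. The algorithm not only removes additive white Gaussian noise effectively but also preserves edge information well, greatly improving the visual quality of the image.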

14.

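Endpoint detection of speech in strong noise based on adaptive cepstral distance (基于自适应倒谱距离的强噪声语音端点检测) (cited by: 4; self-citations: 0; citations by others: 4)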

Zhao Xinyan, Wang Lianhong, Peng Linzhe 《计算机科学》 (Computer Science) 2015,42(9):83-85, 117

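In the presence of noise interference, the detection accuracy of traditional speech endpoint detection methods drops markedly. To distinguish speech from non-speech effectively under strong background noise, the cepstral-distance endpoint detection method is studied and a speech endpoint detection method for strong noise based on adaptive cepstral distance is proposed. The method introduces a cepstral distance multiplier and a threshold increment coefficient, uses a different cepstral distance multiplier for each SNR, and detects speech endpoints with an adaptive decision threshold. MATLAB simulations show that, under different background noises and SNRs, the method achieves a high endpoint detection accuracy, clearly outperforms traditional endpoint detection methods, and is suitable for endpoint detection under strong background noise.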

15.

Wiener filtering speech enhancement in the DCT domain (DCT域维纳滤波语音增强)

Ning Kuangfeng, Wang Jingfang 《计算机工程与应用》 (Computer Engineering and Applications) 2015,51(8):226-230

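To address the practical difficulty of extracting sound signals under non-stationary noise and strong background noise, a Wiener filtering method in the DCT domain is proposed. The steps for voiced/unvoiced segmentation in the DCT domain are listed, an iterative update mechanism for the DCT-domain spectral SNR is given together with a concrete implementation scheme, and a two-dimensional Wiener filter in the DCT domain is designed. Simulations show that the algorithm denoises effectively, improves intelligibility, and is robust across different noise environments and SNR conditions. The algorithm has a low computational cost and is simple to implement.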

16.

Speech enhancement algorithm based on spectral subtraction and variable step-size LMS (基于谱减法和变步长LMS语音增强算法)

Xu Wenchao, Wang Guangyan, Geng Yanxiang, Bai Fang, Fei Teng 《计算机工程与应用》 (Computer Engineering and Applications) 2015,51(1):213-217

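Spectral subtraction is currently one of the effective techniques for enhancing speech signal quality, and its noise reduction is pronounced at low SNR. The LMS adaptive filtering algorithm, by contrast, converges slowly, and its step size must be chosen as a compromise between convergence speed and misadjustment. A joint denoising scheme is proposed in which spectral subtraction is applied first and a variable step-size LMS adaptive filter second. The step size is adjusted through the squared-error term and follows a fixed-then-variable rule, which both speeds up convergence and reduces the steady-state error. Simulations in MATLAB show that applying basic spectral subtraction followed by the variable step-size LMS adaptive filter removes background noise effectively, raises the SNR and PESQ scores considerably, reduces distortion of the original speech signal, and improves signal quality.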
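The variable step-size idea in the abstract above can be sketched as follows. The abstract does not give the update rule, so this uses a commonly cited squared-error rule, mu(n+1) = a * mu(n) + g * e(n)^2, clipped to a fixed range and held constant for the first `hold` samples to mimic a fixed-then-variable schedule. All parameter values and the function name are illustrative assumptions. The spectral subtraction stage is omitted.

```python
import random


def vss_lms(x, d, taps=2, mu_init=0.05, hold=50,
            a=0.97, g=0.01, mu_min=0.01, mu_max=0.1):
    """Variable step-size LMS: returns (weights, error sequence).

    x: reference input, d: desired signal. All parameter values are
    illustrative assumptions, not taken from the paper.
    """
    w = [0.0] * taps
    mu = mu_init
    errors = []
    for n in range(len(x)):
        # Most recent `taps` inputs, newest first, zero-padded at the start.
        u = [x[n - k] if n - k >= 0 else 0.0 for k in range(taps)]
        y = sum(wk * uk for wk, uk in zip(w, u))
        e = d[n] - y
        w = [wk + mu * e * uk for wk, uk in zip(w, u)]
        if n >= hold:
            # Step grows with the squared error and is clipped to a safe range.
            mu = min(max(a * mu + g * e * e, mu_min), mu_max)
        errors.append(e)
    return w, errors


# Toy check: identify a known 2-tap filter from noiseless data.
rng = random.Random(0)
x = [rng.uniform(-1.0, 1.0) for _ in range(5000)]
h = [0.5, -0.3]
d = [h[0] * x[n] + (h[1] * x[n - 1] if n > 0 else 0.0) for n in range(len(x))]
w, errors = vss_lms(x, d)
```

A large error pushes the step toward `mu_max` for fast convergence, and a small error lets it decay toward `mu_min` for a low steady-state error. This is the trade-off the abstract describes.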

17.

Speech transmission with COFDM based on different discrete transforms

Naglaa F. Soliman Samia M. Abd-Alhalem Sahar A. El_Rahman Mohammed M. Fouad Fathi E. Abd El-Samie 《International Journal of Speech Technology》2016,19(3):565-576

Recently, multimedia and cellular technologies have spread dramatically. Therefore, the demand for digital information has increased. Speech compression is one of the most effective forms of communication. This paper presents three approaches for the transmission of compressed speech signals over a convolutional Coded Orthogonal Frequency Division Multiplexing (COFDM) system with a chaotic interleaving technique. The speech signal is compressed using the Set Partitioning In Hierarchical Trees (SPIHT) algorithm, which is an improved version of EZW and which is characterized by a simple and effective method for further compression. For mitigation of the fading due to multipath wireless channels, this paper proposes a COFDM system based on the fractional Fourier transform (FrFT), a COFDM system based on the discrete cosine transform (DCT), and a COFDM system based on the discrete wavelet transform (DWT). The FrFT has the ability of solving the frequency offset problem, which causes the received frequency-domain sub-carriers to be shifted, and therefore, the orthogonality between subcarriers deteriorates even with equalization. The DCT has an advantage of increased computational speed as only real calculations are required. The DWT is spectrally efficient since it does not utilize a cyclic prefix (CP). These systems have been designed under the assumption that corruptive background noises are absent. Therefore, denoising techniques, namely wavelet denoising and Wiener filtering methods, are suggested at the receiver to achieve enhancement in the speech quality. The simulation experiments show that the proposed COFDM–DWT with Wiener filtering at the receiver has a better trade-off between BER, spectral efficiency and signal distortion. Hence, the BER performance is improved with small bandwidth occupancy. Moreover, due to the denoising stage, the speech quality is improved to achieve good intelligibility.

18.

Estimators of The Magnitude-Squared Spectrum and Methods for Incorporating SNR Uncertainty 总被引：1，自引：0，他引：1

Lu Y Loizou PC 《IEEE transactions on audio, speech, and language processing》2011,19(5):1123-1137

Statistical estimators of the magnitude-squared spectrum are derived based on the assumption that the magnitude-squared spectrum of the noisy speech signal can be computed as the sum of the (clean) signal and noise magnitude-squared spectra. Maximum a posterior (MAP) and minimum mean square error (MMSE) estimators are derived based on a Gaussian statistical model. The gain function of the MAP estimator was found to be identical to the gain function used in the ideal binary mask (IdBM) that is widely used in computational auditory scene analysis (CASA). As such, it was binary and assumed the value of 1 if the local SNR exceeded 0 dB, and assumed the value of 0 otherwise. By modeling the local instantaneous SNR as an F-distributed random variable, soft masking methods were derived incorporating SNR uncertainty. The soft masking method, in particular, which weighted the noisy magnitude-squared spectrum by the a priori probability that the local SNR exceeds 0 dB was shown to be identical to the Wiener gain function. Results indicated that the proposed estimators yielded significantly better speech quality than the conventional MMSE spectral power estimators, in terms of yielding lower residual noise and lower speech distortion.
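The two gain functions contrasted in the abstract above are easy to write down. The sketch below shows the binary rule (1 when the local SNR exceeds 0 dB, otherwise 0) next to the standard Wiener gain, xi / (1 + xi), for a linear-scale SNR xi. The function names are hypothetical, and the F-distribution derivation from the paper is not reproduced.

```python
def db_to_linear(snr_db):
    """Convert an SNR in dB to a linear power ratio."""
    return 10.0 ** (snr_db / 10.0)


def binary_mask_gain(snr_linear):
    """1 when the local SNR exceeds 0 dB (linear SNR > 1), else 0."""
    return 1.0 if snr_linear > 1.0 else 0.0


def wiener_gain(snr_linear):
    """Standard Wiener gain xi / (1 + xi) for a linear-scale SNR xi."""
    return snr_linear / (1.0 + snr_linear)


for snr_db in (-10.0, -3.0, 0.0, 3.0, 10.0):
    xi = db_to_linear(snr_db)
    print(snr_db, binary_mask_gain(xi), round(wiener_gain(xi), 3))
```

The Wiener gain passes through 0.5 at 0 dB and varies smoothly on either side, whereas the binary gain switches abruptly at the same point. This makes the Wiener gain a soft version of the mask.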

19.

Study on the applicability of speech quality evaluation algorithms to mask speech (面罩语音质量评价算法适用性研究)

Wang Xia, Ma Junhui, Wang Guangyan, Zhang Yan 《计算机工程与应用》 (Computer Engineering and Applications) 2017,53(19):114-117

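The performance of quality evaluation algorithms designed for coded speech is well established, but these algorithms are not necessarily applicable to mask speech. The applicability of speech quality evaluation algorithms to air speech and mask speech in different noise environments is discussed. Noisy and enhanced speech at multiple SNRs is evaluated with subjective opinion scores and three objective measures: segmental SNR, modified Bark spectral distortion (MBSD), and perceptual evaluation of speech quality (PESQ). The applicability of each objective method is judged by its consistency with the subjective evaluation. The enhancement algorithms are Wiener filtering and log-spectral-amplitude minimum mean square error estimation (LSA-MMSE), and the noises are pink noise and sea-wave noise. Simulation results show that the applicability of a speech quality evaluation algorithm depends on the speech type, SNR, background noise, and enhancement algorithm. In pink noise, PESQ is unsuitable for evaluating air speech enhanced by Wiener filtering, and MBSD is suitable only for evaluating mask speech enhanced by LSA-MMSE. In sea-wave noise, PESQ is suitable for evaluating mask speech, whereas MBSD is not.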

20.

On the Importance of the Pearson Correlation Coefficient in Noise Reduction

Benesty J. Jingdong Chen Yiteng Huang 《IEEE transactions on audio, speech, and language processing》2008,16(4):757-765

Noise reduction, which aims at estimating a clean speech from noisy observations, has attracted a considerable amount of research and engineering attention over the past few decades. In the single-channel scenario, an estimate of the clean speech can be obtained by passing the noisy signal picked up by the microphone through a linear filter/transformation. The core issue, then, is how to find an optimal filter/transformation such that, after the filtering process, the signal-to-noise ratio (SNR) is improved but the desired speech signal is not noticeably distorted. Most of the existing optimal filters (such as the Wiener filter and subspace transformation) are formulated from the mean-square error (MSE) criterion. However, with the MSE formulation, many desired properties of the optimal noise-reduction filters such as the SNR behavior cannot be seen. In this paper, we present a new criterion based on the Pearson correlation coefficient (PCC). We show that in the context of noise reduction the squared PCC (SPCC) has many appealing properties and can be used as an optimization cost function to derive many optimal and suboptimal noise-reduction filters. The clear advantage of using the SPCC over the MSE is that the noise-reduction performance (in terms of the SNR improvement and speech distortion) of the resulting optimal filters can be easily analyzed. This shows that, as far as noise reduction is concerned, the SPCC-based cost function serves as a more natural criterion to optimize as compared to the MSE.
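For readers unfamiliar with the quantity at the centre of the abstract above, here is a minimal sketch of the squared Pearson correlation coefficient between two finite sample sequences. It uses sample means and sums in place of the expectations in the paper's formulation, and the function name `squared_pcc` is hypothetical. It shows only how the SPCC is computed, not how the filters are derived from it.

```python
def squared_pcc(a, b):
    """Squared Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    # The 1/n factors cancel between numerator and denominator.
    return (cov * cov) / (var_a * var_b)


scaled = squared_pcc([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
flipped = squared_pcc([1.0, 2.0, 3.0], [3.0, 2.0, 1.0])
orthogonal = squared_pcc([1.0, -1.0, 1.0, -1.0], [1.0, 1.0, -1.0, -1.0])
```

The SPCC is 1 for any nonzero scaling or sign flip of a signal and 0 for uncorrelated signals. It is therefore insensitive to overall gain, unlike the mean-square error.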