1.

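A Two-Stage Mel-Domain Filter-Bank Wiener Filtering Method for Speech Recognition (total citations: 2, self-citations: 0, citations by others: 2)

Liu Bo (刘波), Li Jinyu (李锦宇), Dai Lirong (戴礼荣), Wang Renhua (王仁华) 《信号处理》 (Signal Processing) 2004,20(2):133-137

In October 2002, the European Telecommunications Standards Institute (ETSI) released a robust front-end standard for distributed speech recognition. The features defined by this standard are far more robust than MFCC features. To make the robust front end feasible on devices with limited computing resources, we propose a new method that improves efficiency, building on the two-stage Wiener filtering algorithm at the core of the ETSI standard. We first construct the Wiener filter on the Mel-domain filter-bank magnitudes and then smooth the Wiener filter coefficients. Finally, the Wiener filter is applied directly to the Mel-domain filter-bank magnitudes. Experiments show that the new method greatly reduces the computational load while retaining the excellent performance of the ETSI two-stage Wiener filtering algorithm.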
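
The three steps named in this abstract (build a Wiener gain on the Mel filter-bank magnitudes, smooth the gains, apply them to the same magnitudes) can be sketched roughly as below. This is an illustration only, not the authors' implementation or the ETSI algorithm: the gain form H = SNR / (1 + SNR) with a power-subtraction SNR estimate, the gain floor, the three-point smoother, and every function name are assumptions made for the example.

```python
def mel_wiener_gains(noisy_mag, noise_mag, floor=0.1):
    """Per-channel Wiener gain H = SNR / (1 + SNR) on Mel filter-bank magnitudes.

    Assumes every noise magnitude is strictly positive.
    """
    gains = []
    for y, n in zip(noisy_mag, noise_mag):
        # SNR estimated by power subtraction, clamped at zero (assumed estimator).
        snr = max(y * y - n * n, 0.0) / (n * n)
        # The floor limits the maximum attenuation (hypothetical parameter).
        gains.append(max(snr / (1.0 + snr), floor))
    return gains


def smooth_gains(gains):
    """Three-point moving average across neighbouring Mel channels (assumed smoother)."""
    out = []
    for k in range(len(gains)):
        lo, hi = max(k - 1, 0), min(k + 1, len(gains) - 1)
        window = gains[lo:hi + 1]
        out.append(sum(window) / len(window))
    return out


def apply_gains(noisy_mag, gains):
    """Apply the smoothed gains directly to the Mel filter-bank magnitudes."""
    return [g * y for g, y in zip(gains, noisy_mag)]


noisy = [2.0, 1.0, 4.0]   # made-up magnitudes for three Mel channels
noise = [1.0, 1.0, 1.0]   # made-up noise magnitude estimates
enhanced = apply_gains(noisy, smooth_gains(mel_wiener_gains(noisy, noise)))
```

Working on a few dozen Mel channels instead of hundreds of FFT bins is what would make such a scheme cheaper, which matches the efficiency argument of the abstract.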

2.

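Application of Mel-Domain Filtering to Speech Enhancement and Optimization of the Algorithm

Chen Zhuo (陈卓), He Qiang (何强) 《电声技术》 (Audio Engineering) 2006,(3):49-51,61

This paper discusses applying a Mel-domain filter bank to speech enhancement so as to reduce background noise effectively while preserving speech intelligibility. In addition, to make a robust front end feasible on devices with limited computing resources, an optimized algorithm that improves efficiency is proposed on the basis of the core Wiener algorithm of the ETSI standard. Numerical simulation results confirm that the algorithm has a low computational load, strong noise reduction, and high speech clarity.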
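
For readers unfamiliar with a Mel-domain filter bank, the sketch below builds a bank of triangular filters and applies it to a magnitude spectrum. It is a generic illustration, not taken from the paper: the Mel formula 2595 * log10(1 + f / 700) is one common convention, and the function names, filter count, and bin layout are assumptions.

```python
import math


def hz_to_mel(f):
    """Convert Hz to Mel using one common convention (assumed here)."""
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_filterbank(num_filters, num_bins, sample_rate):
    """Triangular filters over num_bins spectrum bins spanning 0..sample_rate/2."""
    top = hz_to_mel(sample_rate / 2.0)
    # num_filters + 2 edge frequencies, equally spaced on the Mel scale.
    edges = [mel_to_hz(top * i / (num_filters + 1)) for i in range(num_filters + 2)]
    bin_hz = [k * (sample_rate / 2.0) / (num_bins - 1) for k in range(num_bins)]
    bank = []
    for m in range(1, num_filters + 1):
        left, centre, right = edges[m - 1], edges[m], edges[m + 1]
        row = []
        for f in bin_hz:
            if left < f <= centre:
                row.append((f - left) / (centre - left))    # rising slope
            elif centre < f < right:
                row.append((right - f) / (right - centre))  # falling slope
            else:
                row.append(0.0)
        bank.append(row)
    return bank


def filterbank_magnitudes(spectrum_mag, bank):
    """Weighted sum of spectrum magnitudes in each Mel channel."""
    return [sum(w * s for w, s in zip(row, spectrum_mag)) for row in bank]
```

A noise-reduction gain can then be computed per Mel channel rather than per FFT bin, which is where the reduction in computation would come from.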

3.

Two‐Microphone Generalized Sidelobe Canceller with Post‐Filter Based Speech Enhancement in Composite Noise

Jinsoo Park, Wooil Kim, David K. Han, Hanseok Ko 《ETRI Journal》2016,38(2):366-375

This paper describes an algorithm to suppress composite noise in a two‐microphone speech enhancement system for robust hands‐free speech communication. The proposed algorithm has four stages. The first stage estimates the power spectral density of the residual stationary noise, which is based on the detection of nonstationary signal‐dominant time‐frequency bins (TFBs) at the generalized sidelobe canceller output. Second, speech‐dominant TFBs are identified among the previously detected nonstationary signal‐dominant TFBs, and power spectral densities of speech and residual nonstationary noise are estimated. In the final stage, the bin‐wise output signal‐to‐noise ratio is obtained with these power estimates and a Wiener post‐filter is constructed to attenuate the residual noise. Compared to the conventional beamforming and post‐filter algorithms, the proposed speech enhancement algorithm shows significant performance improvement in terms of perceptual evaluation of speech quality.
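
The final stage described here, turning per-bin power estimates into a Wiener post-filter, can be sketched as follows. This is a minimal illustration under assumptions: the residual noise power is taken as the sum of the stationary and nonstationary estimates, the gain is the textbook form SNR / (1 + SNR), and the gain floor and function name are hypothetical. The detection stages that produce the power estimates are not shown.

```python
def wiener_postfilter(speech_psd, stationary_noise_psd, nonstationary_noise_psd,
                      min_gain=0.05):
    """Bin-wise Wiener gain from speech and residual-noise power estimates."""
    gains = []
    for s, n_st, n_ns in zip(speech_psd, stationary_noise_psd, nonstationary_noise_psd):
        noise = n_st + n_ns           # assumed: residual noise powers add
        if noise <= 0.0:
            gain = 1.0                # no noise estimated: pass the bin through
        else:
            snr = s / noise           # bin-wise output SNR
            gain = snr / (1.0 + snr)  # Wiener gain
        gains.append(max(gain, min_gain))  # hypothetical floor on attenuation
    return gains
```

Each gain would multiply the corresponding time-frequency bin of the beamformer output.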

4.

Online Blind Channel Normalization Using BPF‐Based Modulation Frequency Filtering

Yun‐Kyung Lee, Ho‐Young Jung, Jeon Gue Park 《ETRI Journal》2016,38(6):1190-1196

We propose a new bandpass filter (BPF)‐based online channel normalization method to dynamically suppress channel distortion when the speech and channel noise components are unknown. In this method, an adaptive modulation frequency filter is used to perform channel normalization, whereas conventional modulation filtering methods apply the same filter form to each utterance. In this paper, we only normalize the two mel frequency cepstral coefficients (C0 and C1) with large dynamic ranges; the computational complexity is thus decreased, and channel normalization accuracy is improved. Additionally, to update the filter weights dynamically, we normalize the learning rates using the dimensional power of each frame. Our speech recognition experiments using the proposed BPF‐based blind channel normalization method show that this approach effectively removes channel distortion and results in only a minor decline in accuracy when online channel normalization processing is used instead of batch processing.

5.

Performance Evaluation of Silence-Feature Normalization Model using Cepstrum Features of Noise Signals

SangYeob Oh, Kyungyong Chung 《Wireless Personal Communications》2018,98(4):3287-3297

Speech enhancement algorithms play an important role in speech signal processing. Over the past several decades, many algorithms have been studied for speech enhancement. A speech enhancement algorithm uses a noise removal method and a statistical model filter to analyze the speech signal in the frequency domain. Spectral subtraction and Wiener filters have been used as representative algorithms. These algorithms enhance speech well, but their performance deteriorates under specific noise types or in low signal-to-noise ratio (SNR) environments. In addition, when the noise is estimated incorrectly, noise remains in the voice signal, so that the spectrum of the voice signal is distorted or frames belonging to the voice signal cannot be recovered, and speech recognition performance deteriorates. This deterioration arises from the mismatch between the recognition conditions and the training model. We use a silence-feature normalization model to improve the recognition rate under such mismatched noisy environments. Conventional silence-feature normalization has the problem that the energy of the silent part increases, which blurs the boundaries that delimit the voice and degrades recognition performance. In this study, we use the cepstrum feature of the noise signals in the silence-feature normalization model, setting a reference value for voiced and unvoiced classification, to improve silence-feature normalization for signals with a low SNR. The recognition rates improve compared with other methods.

6.

Adaptive Channel Normalization Based on Infomax Algorithm for Robust Speech Recognition

Ho‐Young Jung 《ETRI Journal》2007,29(3):300-304

This paper proposes a new data‐driven method for high‐pass approaches, which suppress slow‐varying noise components. Conventional high‐pass approaches are based on the idea of decorrelating the feature vector sequence, and aim for adaptability to various conditions. The proposed method is based on temporal local decorrelation using the information‐maximization theory for each utterance. This is performed on an utterance‐by‐utterance basis, which provides an adaptive channel normalization filter for each condition. The performance of the proposed method is evaluated by isolated‐word recognition experiments with channel distortion. Experimental results show that the proposed method yields outstanding improvement for channel‐distorted speech recognition.
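
To make the idea of a high-pass approach concrete, the sketch below filters the time trajectory of one feature coefficient with a fixed first-order high-pass filter. It illustrates the conventional, non-adaptive kind of filter that the abstract contrasts with, not the proposed Infomax-based method; the pole value and function name are assumptions.

```python
def highpass_trajectory(values, pole=0.97):
    """First-order high-pass filter y[t] = x[t] - x[t-1] + pole * y[t-1].

    values: one feature coefficient observed over successive frames.
    A constant offset, such as a fixed channel bias in the log or cepstral
    domain, is removed because only frame-to-frame differences enter the filter.
    """
    if not values:
        return []
    out = []
    prev_x = values[0]  # start from the first sample so that y[0] is zero
    prev_y = 0.0
    for x in values:
        y = x - prev_x + pole * prev_y
        out.append(y)
        prev_x, prev_y = x, y
    return out
```

An adaptive method would replace the fixed `pole` and difference taps with filter weights estimated for each utterance.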

7.

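A Voice Activity Detection Algorithm for Complex Background Noise Environments in 3G Systems (total citations: 2, self-citations: 0, citations by others: 2)

Chen Dong (陈东), Zhao Shenghui (赵胜辉), Kuang Jingming (匡镜明) 《通信学报》 (Journal on Communications) 2001,22(4):45-50

This paper discusses a new voice activity detection algorithm for complex background noise environments in 3G adaptive multi-rate systems. Based on spectral estimation theory and periodic-signal detection, the algorithm uses an IIR filter bank to split the input narrowband speech signal into nine frequency bands and then estimates the speech and background-noise levels in each band. Combined with pitch and tone detection, it is sufficiently robust in distinguishing speech from most background noises found in mobile environments. Finally, the algorithm is evaluated by simulation on the 3G platform recommended by the European Telecommunications Standards Institute, and its robustness is analyzed from both subjective and objective perspectives.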

8.

Speech Enhancement Using Phase‐Dependent A Priori SNR Estimator in Log‐Mel Spectral Domain

Yun‐Kyung Lee, Jeon Gue Park, Yun Keun Lee, Oh‐Wook Kwon 《ETRI Journal》2014,36(5):721-729

We propose a novel phase‐based method for single‐channel speech enhancement to extract and enhance the desired signals in noisy environments by utilizing the phase information. In the method, a phase‐dependent a priori signal‐to‐noise ratio (SNR) is estimated in the log‐mel spectral domain to utilize both the magnitude and phase information of input speech signals. The phase‐dependent estimator is incorporated into the conventional magnitude‐based decision‐directed approach that recursively computes the a priori SNR from noisy speech. Additionally, we reduce the performance degradation owing to the one‐frame delay of the estimated phase‐dependent a priori SNR by using a minimum mean square error (MMSE)‐based and maximum a posteriori (MAP)‐based estimator. In our speech enhancement experiments, the proposed phase‐dependent a priori SNR estimator is shown to improve the output SNR by 2.6 dB for both the MMSE‐based and MAP‐based estimator cases as compared to a conventional magnitude‐based estimator.
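
The conventional magnitude-based decision-directed approach mentioned here can be sketched for a single frequency bin as below. This shows only the baseline recursion in its commonly described form, not the proposed phase-dependent estimator; the Wiener gain used to form the previous clean-speech estimate, the smoothing factor, and the function name are assumptions.

```python
def decision_directed(noisy_power, noise_power, alpha=0.98):
    """Decision-directed a priori SNR track for one frequency bin.

    noisy_power: |Y|^2 for successive frames; noise_power: noise PSD estimate (> 0).
    """
    xi_track = []
    prev_clean_power = 0.0  # clean-speech power estimate of the previous frame
    for y2 in noisy_power:
        gamma = y2 / noise_power  # a posteriori SNR of the current frame
        # Weighted mix of the previous frame's estimate and the current observation.
        xi = (alpha * prev_clean_power / noise_power
              + (1.0 - alpha) * max(gamma - 1.0, 0.0))
        gain = xi / (1.0 + xi)    # Wiener gain (assumed suppression rule)
        prev_clean_power = gain * gain * y2
        xi_track.append(xi)
    return xi_track
```

Because `xi` leans heavily on the previous frame when `alpha` is close to 1, the estimate lags the signal by about one frame, which is the delay the abstract refers to.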

9.

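An Improved Speech Enhancement Algorithm for a Beamformer with Post-Filtering (total citations: 1, self-citations: 0, citations by others: 1)

Yan Zhaoli (阎兆立), Du Limin (杜利民) 《电子与信息学报》 (Journal of Electronics & Information Technology) 2006,28(12):2269-2272

This paper proposes an improved speech enhancement algorithm for a beamformer with post-filtering. The algorithm mainly addresses the estimation of the desired-signal power spectrum for the Wiener filter: it combines auto-power spectral subtraction and cross-power spectral subtraction to compute as many power spectrum estimates as possible, so that their average is closer to the true value, and it also corrects for the changes in the cross-power spectrum caused by movement of the sound source. In the experiments the signal-to-noise ratio improves by more than 5 dB, and hidden Markov model (HMM)-based small-vocabulary phrase recognition in a car environment reaches 84%. The signal-to-noise ratio, average spectral distance, and speech recognition rate show that the algorithm effectively removes the low-frequency noise that the original algorithm tends to leave behind and reduces speech distortion.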

10.

Wideband Extended Range-Doppler Imaging and Waveform Design in the Presence of Clutter and Noise

B. Yazici, Gang Xie 《IEEE Transactions on Information Theory / Professional Technical Group on Information Theory》2006,52(10):4563-4580

This paper presents a group-theoretic approach to address the wideband extended range-Doppler target imaging and design of clutter rejecting waveforms. An exact imaging method based on the inverse Fourier transform of the affine group is presented. A Wiener filter is designed in the affine group Fourier transform domain to minimize wideband clutter range-Doppler reflectivity. The Wiener filter is then used to form an operator to precondition transmitted waveforms to reject clutter. Alternatively, the imaging and clutter rejection methods are equivalently re-expressed to perform clutter suppression upon reception. These methods are coupled with noise suppression upon reception. Numerical simulations are performed to demonstrate the performance of the proposed approach. Our study shows that the framework introduced in this paper can address the joint design of receive and transmit processing, design of clutter rejecting waveforms, suppression of noise, and reduction of computational complexity in receive processing.

11.

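A Study of Noise-Robust Speech Processing Combining the Wavelet Transform and Wiener Filtering

Li Nan (李楠) 《电声技术》 (Audio Engineering) 2007,31(5):46-48

When speech signals and random noise are wavelet-transformed at different scales, the relationship between their wavelet coefficients and the scale behaves differently, and voiced and unvoiced sounds each have their own characteristics as well. A Wiener-filtering speech enhancement method based on the wavelet transform is presented. Wiener filtering treats the wavelet coefficients of voiced and unvoiced signals differently, which suppresses noise, reduces the loss of information in speech segments, and raises the signal-to-noise ratio. Simulation results indicate that this is an effective speech enhancement method.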

12.

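A Soft-Decision-Based Stereo Echo Cancellation Algorithm

Yang Feiran (杨飞然), Wu Ming (吴鸣), Yang Jun (杨军) 《电声技术》 (Audio Engineering) 2014,38(10):50-52

A new frequency-domain algorithm based on Wiener filtering is proposed for stereo echo cancellation. The algorithm requires no preprocessing of the stereo signals and therefore preserves near-end speech quality as far as possible. It is also robust and converges and tracks quickly, which gives it practical value. The soft-decision method from speech enhancement is introduced to improve performance further. The new algorithm achieves better echo suppression while preserving near-end speech quality, and simulation experiments confirm its good performance.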

13.

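A Study of Speech Denoising Based on the Dyadic Wavelet Transform and Wiener Filtering (total citations: 3, self-citations: 0, citations by others: 3)

Hou Zhengfeng (侯正风) 《信号处理》 (Signal Processing) 2002,18(3):257-260

Combining wavelet transform theory with Wiener filtering, this paper proposes a speech denoising algorithm that not only improves the signal-to-noise ratio but also effectively suppresses the musical noise produced by conventional Wiener filtering. The experimental results given at the end of the paper show that the algorithm denoises speech corrupted by white noise well.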

14.

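A Noise-Robust Front-End Algorithm for Mandarin Speech Recognition and Its Performance Analysis

Lin Jianzhen (林建臻), Sun Jiasong (孙甲松), Wang Zuoying (王作英) 《电声技术》 (Audio Engineering) 2004,(3):45-48,52

This paper discusses the noise-robust front-end feature extraction algorithm for distributed speech recognition systems proposed by the European Telecommunications Standards Institute (ETSI), which integrates several noise-robustness techniques. Taking the characteristics of Mandarin speech into account, the algorithm is implemented within an overall Mandarin speech recognition framework, and experiments and analysis are carried out. Recognition results in typical noise environments show that robustness improves considerably relative to the baseline MFCC feature extraction algorithm.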

15.

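Endpoint Detection of Noisy Speech Based on Cepstral Features (total citations: 44, self-citations: 0, citations by others: 44)

Hu Guangrui (胡光锐), Wei Xiaodong (韦晓东) 《电子学报》 (Acta Electronica Sinica) 2000,28(10):95-97

One cause of recognition errors in speech recognition systems is inaccurate endpoint detection. At high signal-to-noise ratios, locating speech endpoints correctly is not difficult. Most practical speech recognition systems, however, must operate at low signal-to-noise ratios, where conventional endpoint detection methods such as energy-based detection do not work effectively in noise. This paper uses cepstral features to detect speech endpoints and proposes two algorithms for endpoint detection of noisy speech. The first uses cepstral distance instead of short-time energy as the decision threshold; the second improves hidden Markov model (HMM)-based speech detection so that it adapts to changes in the noise. Experimental results show that the method detects the endpoints of noisy speech with high accuracy.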
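
The first algorithm (cepstral distance in place of short-time energy) can be illustrated with the rough sketch below. It is not the paper's procedure: the distance formula is one commonly used form, the noise reference is assumed to be the mean cepstrum of a few leading frames taken to contain noise only, and the threshold and function names are hypothetical. Computing the cepstral vectors themselves is not shown.

```python
import math


def cepstral_distance(c, ref):
    """Cepstral distance between two cepstral vectors (a commonly used form)."""
    head = (c[0] - ref[0]) ** 2
    tail = 2.0 * sum((a - b) ** 2 for a, b in zip(c[1:], ref[1:]))
    return (10.0 / math.log(10.0)) * math.sqrt(head + tail)


def detect_speech_frames(frames_cep, num_noise_frames, threshold):
    """Mark a frame as speech when its distance to the noise cepstrum exceeds threshold.

    frames_cep: list of cepstral vectors, one per frame.
    The first num_noise_frames frames are assumed to contain noise only.
    """
    dim = len(frames_cep[0])
    ref = [sum(f[i] for f in frames_cep[:num_noise_frames]) / num_noise_frames
           for i in range(dim)]
    return [cepstral_distance(f, ref) > threshold for f in frames_cep]
```

The endpoints are then the first and last frames flagged as speech, possibly after discarding isolated detections.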

16.

Filtering of Filter‐Bank Energies for Robust Speech Recognition

Ho‐Young Jung 《ETRI Journal》2004,26(3):273-276

We propose a novel feature processing technique which can provide a cepstral liftering effect in the log‐spectral domain. Cepstral liftering aims at the equalization of variance of cepstral coefficients for the distance‐based speech recognizer, and as a result, provides robustness to additive noise and speaker variability. However, in the popular hidden Markov model based framework, cepstral liftering has no effect on recognition performance. We derive a filtering method in the log‐spectral domain corresponding to the cepstral liftering. The proposed method performs a high‐pass filtering based on the decorrelation of filter‐bank energies. We show that in noisy speech recognition, the proposed method reduces the error rate by 52.7% relative to the conventional feature.
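
As background, cepstral liftering itself is simply a per-coefficient weighting of the cepstrum. The sketch below uses the sinusoidal lifter w[n] = 1 + (L / 2) * sin(pi * n / L), one widely used choice that boosts higher-order coefficients; it is not the log-spectral filter derived in the paper, and the lifter length of 22 and the function names are assumptions.

```python
import math


def lifter_weights(num_ceps, lifter=22):
    """Sinusoidal lifter weights w[n] = 1 + (L / 2) * sin(pi * n / L)."""
    return [1.0 + (lifter / 2.0) * math.sin(math.pi * n / lifter)
            for n in range(num_ceps)]


def apply_lifter(cepstrum, lifter=22):
    """Scale each cepstral coefficient by its lifter weight."""
    weights = lifter_weights(len(cepstrum), lifter)
    return [w * c for w, c in zip(weights, cepstrum)]
```

The weight is 1 for the zeroth coefficient and grows towards the middle of the lifter length, so coefficients that typically have small variance are scaled up relative to the low-order ones.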

17.

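A Speech Enhancement Algorithm Based on the Two-Step Noise Reduction Technique and a Gaussian Statistical Model

Ou Shifeng (欧世峰), Wang Xianyun (王显云), Gao Ying (高颖), Zhao Xiaohui (赵晓晖) 《信号处理》 (Signal Processing) 2011,27(8):1171-1178

To address the estimation of the a priori signal-to-noise ratio (SNR) in speech enhancement, this paper proposes a new frequency-domain a priori SNR estimation algorithm that combines the two-step noise reduction technique with a Gaussian statistical model of the speech and noise components. Starting from the output of the decision-directed method, the algorithm uses minimum mean square error estimation theory to compute the spectral energy of the clean speech component of the current frame directly, and from it obtains the a priori SNR estimate of the noisy speech. While retaining the advantages of the two-step noise reduction algorithm, it needs no prior condition on the gain factor of the speech enhancement system, and it removes background noise effectively while suppressing the generation of musical noise in the output speech as far as possible. Simulation results under several noise backgrounds show that, compared with the classical decision-directed method and the recent two-step noise reduction algorithm, a speech enhancement system based on the proposed a priori SNR estimation scheme enhances speech better by both subjective and objective measures.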
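
For orientation, the second step of the two-step noise reduction baseline, as it is commonly described, refines a decision-directed a priori SNR by applying the first-step gain to the current noisy frame. The sketch below shows only that baseline refinement for one frequency bin, not the MMSE-based estimator proposed in the paper; the Wiener gain form and the function name are assumptions.

```python
def tsnr_refine(xi_dd, noisy_power, noise_power):
    """Refine a decision-directed a priori SNR for one bin of the current frame.

    xi_dd: a priori SNR from the decision-directed first step.
    noisy_power: |Y|^2 of the current frame; noise_power: noise PSD estimate (> 0).
    Returns the refined a priori SNR and the corresponding gain.
    """
    gain_dd = xi_dd / (1.0 + xi_dd)  # first-step Wiener gain
    # Second step: SNR of the first-step output, using the current frame only.
    xi_refined = (gain_dd ** 2) * noisy_power / noise_power
    return xi_refined, xi_refined / (1.0 + xi_refined)
```

Because the refined value depends on the current frame rather than the previous one, it avoids the one-frame lag of the decision-directed estimate.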

18.

Noise‐Robust Speaker Recognition Using Subband Likelihoods and Reliable‐Feature Selection

Sungtak Kim, Mikyong Ji, Hoirin Kim 《ETRI Journal》2008,30(1):89-100

We consider the feature recombination technique in a multiband approach to speaker identification and verification. To overcome the ineffectiveness of conventional feature recombination in broadband noisy environments, we propose a new subband feature recombination which uses subband likelihoods and a subband reliable‐feature selection technique with an adaptive noise model. In the decision step of speaker recognition, a few very low unreliable feature likelihood scores can cause a speaker recognition system to make an incorrect decision. To overcome this problem, reliable‐feature selection adjusts the likelihood scores of an unreliable feature by comparison with those of an adaptive noise model, which is estimated by the maximum a posteriori adaptation technique using noise features directly obtained from noisy test speech. To evaluate the effectiveness of the proposed methods in noisy environments, we use the TIMIT database and the NTIMIT database, which is the corresponding telephone version of TIMIT database. The proposed subband feature recombination with subband reliable‐feature selection achieves better performance than the conventional feature recombination system with reliable‐feature selection.

19.

Interference Suppression Using Principal Subspace Modification in Multichannel Wiener Filter and Its Application to Speech Recognition

Gibak Kim 《ETRI Journal》2010,32(6):921-931

It has been shown that the principal subspace‐based multichannel Wiener filter (MWF) provides better performance than the conventional MWF for suppressing interference in the case of a single target source. It can efficiently estimate the target speech component in the principal subspace which estimates the acoustic transfer function up to a scaling factor. However, as the input signal‐to‐interference ratio (SIR) becomes lower, larger errors are incurred in the estimation of the acoustic transfer function by the principal subspace method, degrading the performance in interference suppression. In order to alleviate this problem, a principal subspace modification method was proposed in previous work. The principal subspace modification reduces the estimation error of the acoustic transfer function vector at low SIRs. In this work, a frequency‐band dependent interpolation technique is further employed for the principal subspace modification. The speech recognition test is also conducted using the Sphinx‐4 system and demonstrates the practical usefulness of the proposed method as a front processing for the speech recognizer in a distant‐talking and interferer‐present environment.

20.

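A Speech Denoising Algorithm Based on Noise Shaping

Pu Xiaoxiang (浦小祥), Dong Enqing (董恩清) 《通信技术》 (Communications Technology) 2008,41(12)

A speech denoising algorithm based on noise shaping is proposed for non-stationary environmental noise. Using the minimum perceptual mean square error as its criterion and building on Wiener filtering, the algorithm modifies the Wiener filter equation with an auditory perceptual weighting function to shape the noise spectrum so that its distribution follows the speech spectrum. It also introduces a frequency compensation factor to overcome the uneven effect of the non-stationary noise spectrum on speech, and it uses a fast noise estimation algorithm to estimate the non-stationary noise. Experiments show that the algorithm suppresses background noise more effectively and improves the quality of the denoised speech.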