1.

An iterative longest matching segment approach to speech enhancement with additive noise and channel distortion

《Computer Speech and Language》2014,28(6):1269-1286

This paper presents a new approach to speech enhancement from single-channel measurements involving both noise and channel distortion (i.e., convolutional noise), and demonstrates its applications for robust speech recognition and for improving noisy speech quality. The approach is based on finding longest matching segments (LMS) from a corpus of clean, wideband speech. The approach adds three novel developments to our previous LMS research. First, we address the problem of channel distortion as well as additive noise. Second, we present an improved method for modeling noise for speech estimation. Third, we present an iterative algorithm which updates the noise and channel estimates of the corpus data model. In experiments using speech recognition as a test with the Aurora 4 database, the use of our enhancement approach as a preprocessor for feature extraction significantly improved the performance of a baseline recognition system. In another comparison against conventional enhancement algorithms, both the PESQ and the segmental SNR ratings of the LMS algorithm were superior to the other methods for noisy speech enhancement.
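The abstract above reports segmental SNR as one of its quality measures. As a rough illustration of what that metric computes, here is a minimal sketch in plain Python. The function name, the frame length of 256 samples, and the clamping range of -10 dB to 35 dB are assumptions chosen for illustration (clamping of this kind is a common convention); the paper's actual evaluation settings are not given in the abstract.

```python
import math


def segmental_snr(clean, enhanced, frame_len=256, lo_db=-10.0, hi_db=35.0):
    """Average per-frame SNR in dB between a clean reference and a processed signal.

    frame_len, lo_db and hi_db are illustrative defaults, not values from the paper.
    """
    n = min(len(clean), len(enhanced))
    frame_snrs = []
    for start in range(0, n - frame_len + 1, frame_len):
        ref = clean[start:start + frame_len]
        est = enhanced[start:start + frame_len]
        signal_energy = sum(c * c for c in ref)
        error_energy = sum((c - e) ** 2 for c, e in zip(ref, est))
        if signal_energy == 0.0:
            continue  # skip frames with no reference energy
        if error_energy == 0.0:
            snr_db = hi_db  # perfect reconstruction: use the upper clamp
        else:
            snr_db = 10.0 * math.log10(signal_energy / error_energy)
        # Clamp each frame so a few extreme frames do not dominate the mean.
        frame_snrs.append(max(lo_db, min(hi_db, snr_db)))
    if not frame_snrs:
        raise ValueError("no usable frames")
    return sum(frame_snrs) / len(frame_snrs)
```

Because the score is averaged over short frames rather than over the whole utterance, low-energy speech segments count as much as loud ones, which is why it is often reported alongside a perceptual measure such as PESQ.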

2.

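Research on a noise-robust speaker recognition system based on GFCC and RLS

茅正冲 王正创 黄芳 《计算机工程与应用》2015,51(10):215-218

To improve the performance of noise-robust speaker recognition systems, this paper proposes using an RLS adaptive filter as a denoising preprocessor to raise the signal-to-noise ratio of the speech signal. The denoised speech is then passed through a Gammatone filter bank to extract GFCC feature parameters, which are used in the speaker recognition system. Simulation experiments are carried out with a Gaussian mixture model recognition system. The results show that applying this method to a noise-robust speaker recognition system markedly improves both recognition rate and robustness.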

3.

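Noise-robust speech recognition based on RBF neural networks (total citations: 1; self-citations: 0; citations by others: 1)

白静 张雪英 侯雪梅 《计算机工程与应用》2007,43(22):28-30

To address the poor performance of current speech recognition systems in noisy environments, a noise-robust speech recognition system based on RBF neural networks is implemented. It exploits the network's best-approximation property and fast training, and is trained with a clustering algorithm and with a fully supervised algorithm. In the clustering algorithm, the hidden layer is trained by K-means clustering and the output layer is learned by linear least squares. In the fully supervised algorithm, all parameters are adjusted by gradient descent; as a supervised learning algorithm, it can select well-performing parameters. Experiments show that the fully supervised algorithm achieves a higher recognition rate than the clustering algorithm at different signal-to-noise ratios.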

4.

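DFRFT adaptive filtering speech enhancement algorithm based on subband decomposition (total citations: 1; self-citations: 0; citations by others: 1)

杨桂芹 徐红莉 宁寰宇 《测控技术》2014,33(3):42-44

An improved speech enhancement method is proposed. The noisy speech signal is processed by subband decomposition and then filtered in the discrete fractional Fourier transform (DFRFT) domain with the least mean square (LMS) adaptive algorithm. The filtered subband signals undergo the inverse DFRFT, and a synthesis filter bank finally combines them into the enhanced speech signal. Simulation results show that the algorithm markedly improves convergence speed and reduces computation time, and that it achieves good speech enhancement in both subjective and objective evaluations.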

5.

Auditory driven subband speech enhancement for automatic recognition of noisy speech

Navneet Upadhyay Hamurabi Gamboa Rosales 《International Journal of Speech Technology》2016,19(4):869-880

Speech recognizers achieve high recognition accuracy under quiet acoustic environments, but their performance degrades drastically when they are deployed in real environments, where the speech is degraded by additive ambient noise. This paper advocates a two-phase approach for robust speech recognition in such environments. First, a front-end subband speech enhancement with adaptive noise estimation (ANE) approach is used to filter the noisy speech. The whole noisy speech spectrum is partitioned into eighteen dissimilar subbands based on the Bark scale, and the noise power in each subband is estimated by the ANE approach, which does not require speech pause detection. Second, the filtered speech spectrum is processed by a non-parametric frequency-domain algorithm based on human perception, along with the back end building a robust classifier to recognize the utterance. A suite of experiments is conducted to evaluate the performance of the speech recognizer in a variety of real environments, with and without the front-end speech enhancement stage. Recognition accuracy is evaluated at the word level and over a wide range of signal-to-noise ratios for real-world noises. Experimental evaluations show that the proposed algorithm attains good recognition performance when the signal-to-noise ratio is lower than 5 dB.

6.

Multi-environment model adaptation based on vector Taylor series for robust speech recognition

Yong Lü Lin Zhou 《Pattern Recognition》2010,43(9):3093-3099

In this paper, we propose a multi-environment model adaptation method based on vector Taylor series (VTS) for robust speech recognition. In the training phase, the clean speech is contaminated with noise at different signal-to-noise ratio (SNR) levels to produce several types of noisy training speech and each type is used to obtain a noisy hidden Markov model (HMM) set. In the recognition phase, the HMM set which best matches the testing environment is selected, and further adjusted to reduce the environmental mismatch by the VTS-based model adaptation method. In the proposed method, the VTS approximation based on noisy training speech is given and the testing noise parameters are estimated from the noisy testing speech using the expectation-maximization (EM) algorithm. The experimental results indicate that the proposed multi-environment model adaptation method can significantly improve the performance of speech recognizers and outperforms the traditional model adaptation method and the linear regression-based multi-environment method.

7.

A Study of Variable-Parameter Gaussian Mixture Hidden Markov Modeling for Noisy Speech Recognition

Cui X. Gong Y. 《IEEE Transactions on Audio, Speech, and Language Processing》2007,15(4):1366-1376

To improve recognition performance in noisy environments, multicondition training is usually applied, in which speech signals corrupted by a variety of noises are used in acoustic model training. Published hidden Markov modeling of speech uses multiple Gaussian distributions to cover the spread of the speech distribution caused by noise, which distracts from the modeling of the speech event itself and possibly sacrifices performance on clean speech. In this paper, we propose a novel approach which extends the conventional Gaussian mixture hidden Markov model (GMHMM) by modeling state emission parameters (mean and variance) as a polynomial function of a continuous environment-dependent variable. At recognition time, a set of HMMs specific to the given value of the environment variable is instantiated and used for recognition. The maximum-likelihood (ML) estimation of the polynomial functions of the proposed variable-parameter GMHMM is given within the expectation-maximization (EM) framework. Experiments on the Aurora 2 database show significant improvements of the variable-parameter Gaussian mixture HMMs compared to the conventional GMHMMs.

8.

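Application of an improved feature extraction method in speech recognition

陈树 于海波 《传感器与微系统》2018,(5):154-157

To address the drop in speech recognition rate of Mel-frequency cepstral coefficient (MFCC) parameters in noisy environments, an improved feature extraction method based on cochlear cepstral coefficients (CFCC) is proposed. CFCC feature parameters with auditory characteristics are extracted. An improved linear discriminant analysis (LDA) algorithm is then applied to linearly transform the extracted features, yielding more discriminative feature parameters and the diagonalized covariance matrix required by the hidden Markov model (HMM). Finally, mean and variance normalization is applied to obtain the final feature parameters. Experimental results show that the proposed method effectively improves the recognition rate and robustness of speech recognition systems in noisy environments.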

9.

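Hierarchical speech recognition model for multi-noise environments

曹晶晶 许洁萍 邵聖淇 《计算机应用》2018,38(6):1790-1794

For speech recognition in multi-noise environments, a hierarchical speech recognition model that treats environmental noise as context for speech recognition is proposed. The model consists of two layers: a noisy-speech classification model and acoustic models for specific noise environments. The noisy-speech classification model reduces the mismatch between training and test data, removes the restriction on noise stationarity imposed by feature-space approaches, and overcomes the low recognition accuracy of conventional multi-condition training in some noise environments. Acoustic modeling with a deep neural network (DNN) further strengthens the acoustic model's ability to discriminate noise, improving the noise robustness of model-space speech recognition. In the experiments, the proposed model is compared with a baseline model obtained by multi-condition training. The proposed hierarchical model lowers the word error rate (WER) by a relative 20.3% compared with the baseline, indicating that it helps strengthen the noise robustness of speech recognition.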

10.

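Speech feature extraction based on Gammatone filters and subband power normalization

龙乐凯 周萍 杨海燕 《测控技术》2017,36(5):21-24

To remedy the inadequate recognition performance of traditional speech feature parameters in complex environments, a speech feature extraction method based on Gammatone filters and subband power normalization is proposed. Building on the power-normalized cepstral coefficients (PNCC) feature algorithm, the method introduces a smoothed amplitude envelope and a normalized Gammatone filter bank at the front end, suppresses real-world background noise through subband power normalization, and finally applies feature warping and channel compensation at the back end. Experiments use a Gaussian mixture model-universal background model (GMM-UBM) classifier to compare the algorithm with other feature parameters. The results show that, compared with other feature parameters in a variety of noise environments, the proposed method exhibits good noise robustness and still recognizes well at low signal-to-noise ratios.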

11.

Real-time iterative Wiener filtering of noisy speech (total citations: 1; self-citations: 1; citations by others: 0)

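王景芳 《计算机工程与应用》2011,47(19):132-135

Traditional denoising methods lose much or all of their ability to extract the sound signal under strong background noise and adapt poorly to different noise environments. To address this, an iterative Wiener filtering method for sound signal feature extraction is proposed. An iterative update mechanism for the speech and noise spectra and the power-spectrum signal-to-noise ratio is given, together with a concrete implementation scheme. Simulation experiments show that the algorithm denoises effectively, markedly improves speech recognition system performance, and is robust across different noise environments and signal-to-noise ratios. The algorithm has low computational cost, is simple to implement, and is suitable for embedded speech recognition systems.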
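The abstract above does not spell out its update rule, so the following is only a generic sketch of iterative Wiener gain estimation for one frame, not the paper's scheme. It assumes per-bin power spectra for the noisy frame and for the noise are already available. The function name, the iteration count, and the gain floor are hypothetical choices for illustration.

```python
def iterative_wiener_gain(noisy_psd, noise_psd, iterations=3, floor=1e-3):
    """Per-bin Wiener gains, refined by re-estimating the speech power spectrum.

    Generic textbook-style sketch; iterations and floor are illustrative values.
    """
    # Initial speech estimate by power spectral subtraction, clipped at zero.
    speech_psd = [max(y - n, 0.0) for y, n in zip(noisy_psd, noise_psd)]
    gains = [1.0] * len(noisy_psd)
    for _ in range(iterations):
        # Wiener gain S / (S + N), floored so no bin is zeroed out completely.
        gains = [
            max(s / (s + n), floor) if (s + n) > 0.0 else floor
            for s, n in zip(speech_psd, noise_psd)
        ]
        # Re-estimate the speech power from the filtered noisy spectrum.
        speech_psd = [g * g * y for g, y in zip(gains, noisy_psd)]
    return gains
```

Multiplying the noisy magnitude spectrum by these gains and inverting the transform yields the enhanced frame. The gain floor is a common guard against musical-noise artifacts in bins where the speech estimate collapses to zero.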

12.

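Research on an FPGA-based signal extraction method for a dual-axis inclinometer

杨澜 赵祥模 惠飞 史昕 张建阳 《计算机应用与软件》2012,29(4):89-93

To extract the useful signal of a dual-axis inclinometer in real time against the complex noise background of a vehicle, weak-signal feature extraction methods are analyzed. To address the slower convergence of the LMS algorithm when processing correlated signals, a variable step-size LMS algorithm with low sensitivity to noise is proposed. On an FPGA platform, a scalable filter structure is designed and implemented that realizes a multi-order LMS filter by reusing a first-order filter unit. The experiments first verify the algorithm's superiority using convergence-speed metrics. Next, spectral analysis of the filtered inclinometer signal shows that the algorithm suppresses both the sensor's own noise and the in-vehicle environmental noise well. Finally, implementations on five different hardware platforms are compared, showing that the FPGA implementation has clear advantages in execution time, power consumption, and hardware utilization.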
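For orientation, the baseline that the abstract above improves on is the standard fixed step-size LMS filter. The sketch below shows that baseline only. The paper's variable step-size rule is not given in the abstract and is not reproduced here. The function name, tap count, and step size `mu` are assumptions for illustration.

```python
def lms_filter(x, d, num_taps=4, mu=0.05):
    """Standard fixed-step LMS: adapt FIR weights so the filtered x tracks d.

    Returns the final weights and the per-sample error sequence.
    num_taps and mu are illustrative values, not taken from the paper.
    """
    weights = [0.0] * num_taps
    errors = []
    for n in range(len(x)):
        # Most recent num_taps input samples, newest first, zero-padded at the start.
        taps = [x[n - k] if n - k >= 0 else 0.0 for k in range(num_taps)]
        y = sum(w * u for w, u in zip(weights, taps))
        e = d[n] - y
        # Stochastic gradient step on the instantaneous squared error.
        weights = [w + mu * e * u for w, u in zip(weights, taps)]
        errors.append(e)
    return weights, errors
```

With a fixed `mu` there is a trade-off: a large step converges quickly but leaves more residual error, while a small step does the opposite. Variable step-size schemes adjust `mu` over time to ease that trade-off.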

13.

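Application of warped filters to robust feature extraction for speaker recognition

邓蕾 高勇 《计算机系统应用》2017,26(12):227-232

To address the sharp drop in speaker recognition performance in noisy environments, a robust feature extraction method for speaker recognition is proposed. Warped filter banks (WFBS) are used to simulate the auditory characteristics of the human ear, and cube-root compression, relative spectral (RASTA) filtering, and cepstral mean and variance normalization (CMVN) are introduced into the extraction of robust features. Simulations under a Gaussian mixture model (GMM) show that the feature parameters extracted by this method outperform MFCC and CFCC feature parameters in both robustness and recognition performance.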
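Of the steps named in the abstract above, cepstral mean and variance normalization (CMVN) is the simplest to illustrate. The sketch below applies per-utterance CMVN to a list of feature frames. The function name and the small `eps` guard are assumptions, and the paper may normalize over a different window.

```python
import math


def cmvn(frames, eps=1e-10):
    """Normalize each feature dimension to zero mean and unit variance over an utterance.

    frames is a list of equal-length feature vectors; eps avoids division by zero.
    """
    num_frames = len(frames)
    dim = len(frames[0])
    means = [sum(f[j] for f in frames) / num_frames for j in range(dim)]
    stds = [
        math.sqrt(sum((f[j] - means[j]) ** 2 for f in frames) / num_frames)
        for j in range(dim)
    ]
    return [
        [(f[j] - means[j]) / (stds[j] + eps) for j in range(dim)]
        for f in frames
    ]
```

Subtracting the mean removes a stationary channel offset in the cepstral domain, and scaling by the standard deviation reduces the mismatch in dynamic range that additive noise introduces.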

14.

Identification of quadratic systems using higher order cumulants and neural networks: Application to model the delay of video-packets transmission

J. Antari S. Chabaa R. Iqdour A. Zeroual S. Safi 《Applied Soft Computing》2011,11(1):1-10

This work concerns the development of two approaches for the identification of diagonal parameters of quadratic systems from the output observation only. The systems considered are excited by an unobservable, independent identically distributed (i.i.d.), stationary, zero-mean, non-Gaussian process and corrupted by additive Gaussian noise. The proposed approaches exploit higher order cumulants (HOC) (fourth order cumulants) and extend the algorithms developed for the linear 1D case, which uses a non-Gaussian input signal. For testing and validation purposes, these approaches are compared to recursive least squares (RLS), least mean squares (LMS) and neural network identification algorithms using a non-linear model in a noisy environment. To demonstrate the applicability of the theoretical methods to real processes, we applied the developed approaches to search for models able to describe the delay of video-packet transmission over IP networks from a video server. The simulation results show the correctness and the efficiency of the developed approaches.

15.

Stereo hidden Markov modeling for noise robust speech recognition

Xiaodong Cui Mohamed Afify Yuqing Gao Bowen Zhou 《Computer Speech and Language》2013,27(2):407-419

This paper investigates a noise robust technique for automatic speech recognition which exploits hidden Markov modeling of stereo speech features from clean and noisy channels. The HMM trained this way, referred to as stereo HMM, has in each state a Gaussian mixture model (GMM) with a joint distribution of both clean and noisy speech features. Given the noisy speech input, the stereo HMM gives rise to a two-pass compensation and decoding process where MMSE denoising based on N-best hypotheses is first performed, followed by decoding the denoised speech in a reduced search space on a lattice. Compared to the feature space GMM-based denoising approaches, the stereo HMM is advantageous as it has finer-grained noise compensation and makes use of information from the whole noisy feature sequence for the prediction of each individual clean feature. Experiments on large vocabulary spontaneous speech from speech-to-speech translation applications show that the proposed technique outperforms its feature space counterpart in noisy conditions while still maintaining decent performance in clean conditions.

16.

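Fuzzy weighted median filter

鲁瑞华 张为群 《计算机科学》2006,33(6):186-187

This paper introduces a fuzzy weighted median filter, which is determined by a fuzzy Boolean function and filter weights. The fuzzy Boolean function is approximated by an S-shaped function, and the filter weights are likewise approximated by the S-shaped functions used in fuzzy theory. The fuzzy weighted median filter is determined by only four parameters. The proposed filter can be derived with the least mean square algorithm under the mean-square-error criterion. Image restoration experiments show that the method removes impulse noise and smooths Gaussian noise while effectively preserving edges and image detail. The fuzzy weighted median filter clearly outperforms the weighted median filter and also outperforms the Wiener filter.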

17.

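Application of noise-robust speech recognition and speech enhancement algorithms (total citations: 1; self-citations: 0; citations by others: 1)

汤玲 戴斌 《计算机仿真》2006,23(9):80-82,143

Improving the robustness of speech recognition systems is an important research topic in speech recognition technology. Recognition performance often degrades because the data in the training environment do not match the data in the recognition environment. So that speech recognition systems perform satisfactorily in noisy environments, this paper proposes a robust speech feature extraction method based on the auditory characteristics of the human ear. Before MFCC feature extraction, masking-property processing is applied to the noisy speech features, and the features are also processed with a speech enhancement method, yielding the final robust speech features. Analysis of the results of four different experiments shows that using this method for noise-robust analysis improves the system's noise robustness, and that this feature processing adapts well to different noises at different signal-to-noise ratios.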

18.

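Speech enhancement by FRFT filtering

王景芳 许慧燕 《计算机工程与应用》2012,48(12):129-134,167

Traditional denoising methods lose much or all of their ability to extract the sound signal under strong background noise and adapt poorly to different noise environments. To address this, a dynamic FRFT filtering method for speech enhancement is proposed. An update mechanism for the optimal FRFT concentration measure (聚散度) under different speech noise environments is given, together with a concrete implementation scheme. Simulation experiments pairing the TIMIT standard speech corpus with the Noisex-92 noise database show that the algorithm denoises effectively, markedly improves speech recognition system performance, and is robust across different noise environments and signal-to-noise ratios. The algorithm has low computational cost and is simple to implement.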

19.

Stochastic analysis of the Least Mean Kurtosis algorithm for Gaussian inputs

《Digital Signal Processing》2016

The Least Mean Kurtosis (LMK) algorithm was initially proposed as an adaptive algorithm that is robust to the observation noise distribution. Good performance of this algorithm has been shown for non-Gaussian additive measurement noise. However, the complexity of the algorithm makes it difficult to develop a reasonably complete theoretical stochastic model of its behavior. The purpose of this paper is to contribute to the development of such a model. We study the stochastic behavior of the LMK algorithm for Gaussian inputs and for additive noises with even probability density functions. Deterministic recursions are derived for the adaptive weight error covariance matrix in a novel manner, leading to a recursive model for the excess mean square error (EMSE) behavior that is shown to be accurate for Gaussian, uniform and binary noise distributions. The analysis results are then used to compare the performance of LMK with the least mean squares (LMS) and least mean fourth (LMF) algorithms under different circumstances.

20.

Temporal modulation normalization for robust speech feature extraction and recognition

Xugang Lu Shigeki Matsuda Masashi Unoki Satoshi Nakamura 《Multimedia Tools and Applications》2011,52(1):187-199

Speech signals are produced by articulatory movements with a certain modulation structure constrained by regular phonetic sequences. This modulation structure encodes most of the speech intelligibility information that can be used to discriminate speech from noise. In this study, we propose a noise reduction algorithm based on this speech modulation property. The proposed algorithm involves two steps: temporal modulation contrast normalization, and smoothing that preserves modulation events. The purpose of this processing is to bring the modulation contrast of the clean and noisy speech to the same level, and to smooth out the modulation artifacts caused by noise interference. Since the proposed method can be used independently for noise reduction, it can also be combined with traditional noise reduction methods to further reduce the effect of noise. We tested the proposed method as a front end for robust speech recognition on the AURORA-2J data corpus. Two advanced noise reduction methods, the ETSI advanced front-end (AFE) method and the particle filtering (PF) with minimum mean square error (MMSE) estimation method, are used for comparison and combination. Experimental results show that, as an independent front-end processor, the proposed method outperforms the advanced methods, and that, in combined front ends, it consistently improves performance over using each method independently.