Found 19 similar documents (search time: 140 ms)
1.
2.
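Facial speech animation is an active topic in virtual reality, and extracting speech feature parameters is the prerequisite for, and the key to, speech-synchronized animation. To obtain more robust speech feature parameters, this work builds on wavelet transform theory, borrows the MFCC extraction procedure, and applies a feature differencing algorithm that captures the dynamic characteristics of speech. It proposes a speech feature extraction method based on the discrete wavelet transform (DWTMFCC) and combines the resulting features with prosodic parameters that reflect the emotional characteristics of speech. Speaker recognition experiments with a VQ model based on the LBG algorithm show that the combined feature parameters achieve a higher recognition rate.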
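The abstract above mentions a feature differencing step for dynamic features but does not give its formula. As a minimal sketch, the block below uses the regression-style delta that is common in speech front ends; whether the paper uses this exact form is an assumption, and the function name `delta` and the window size `n=2` are hypothetical choices.

```python
def delta(frames, n=2):
    """Regression-based delta features.

    frames: list of feature vectors (one list of floats per frame).
    n: half-width of the regression window (assumed value, not from the paper).
    Edge frames are handled by clamping indices to the first/last frame.
    """
    if not frames:
        return []
    denom = 2 * sum(k * k for k in range(1, n + 1))
    last = len(frames) - 1
    out = []
    for t in range(len(frames)):
        row = []
        for d in range(len(frames[0])):
            acc = 0.0
            for k in range(1, n + 1):
                ahead = frames[min(t + k, last)][d]   # clamp at the end
                behind = frames[max(t - k, 0)][d]     # clamp at the start
                acc += k * (ahead - behind)
            row.append(acc / denom)
        out.append(row)
    return out
```

The delta vectors are typically appended to the static coefficients frame by frame; a feature that rises by one unit per frame yields a delta of 1.0 away from the edges.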
3.
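A robust isolated speech recognition method based on continuous HMMs  Total citations: 5, self-citations: 1, citations by others: 4
To improve the robustness of speech recognition systems based on continuous hidden Markov models under environmental noise, this paper proposes Mel cepstral parameters of the relative autocorrelation sequence (RAS_MFCC + ΔRAS_MFCC), which effectively suppress additive stationary noise and convolutional channel noise. Denoising is performed at the feature-parameter level and markedly improves the noise robustness of the system.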
4.
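Speech feature parameter extraction based on invariant-set multiwavelets  Total citations: 1, self-citations: 0, citations by others: 1
Building on the theory of invariant-set multiwavelets and borrowing the extraction algorithm for Mel-frequency cepstral coefficients (MFCC), this work replaces the Fourier transform and Mel filtering with a multiwavelet transform to construct a new speech feature parameter, MWBC. Mandarin digit recognition experiments show that MWBC outperforms MFCC in both recognition performance and noise resistance, offering a new way to improve the noise robustness of speech recognition systems.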
5.
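A robust pitch period detection method based on the wavelet transform is proposed. First, two feature parameters, the average energy band distribution and the short-time zero-crossing rate, are combined to make a voiced/unvoiced decision on the speech signal. The pitch period is then extracted from the voiced segments using a spatial-domain correlation function. Experiments show that, compared with traditional wavelet transform and autocorrelation algorithms, the method is more robust and detects pitch more accurately.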
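The voiced/unvoiced decision described above can be illustrated with a simplified sketch. It is not the paper's method: full-band short-time energy stands in for the average energy band distribution, and the thresholds `energy_min` and `zcr_max` are hypothetical values that would need tuning on real data.

```python
import math


def short_time_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def is_voiced(frame, energy_min=0.01, zcr_max=0.25):
    """Voiced frames tend to have high energy and a low zero-crossing rate.

    Both thresholds are illustrative assumptions, not values from the paper.
    """
    if not frame:
        return False
    return (short_time_energy(frame) >= energy_min
            and zero_crossing_rate(frame) <= zcr_max)


# Example frames: a 100 Hz tone at 8 kHz (voiced-like) and a
# sign-alternating sequence (noise-like, very high zero-crossing rate).
tone = [math.sin(2 * math.pi * 100 * n / 8000) for n in range(240)]
hiss = [0.5 * (-1) ** n for n in range(240)]
```

With these defaults, `is_voiced(tone)` is true and `is_voiced(hiss)` is false; only frames judged voiced would be passed on to pitch extraction.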
6.
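Robust feature extraction based on the wavelet transform and speaker recognition  Total citations: 4, self-citations: 0, citations by others: 4
Robustness is one of the main difficulties that speaker recognition systems face in practice: a system with a high recognition rate on clean speech degrades significantly on noisy speech. One way to address this is to find robust feature parameters. Using the multiresolution analysis of the wavelet transform, this paper proposes a wavelet-based robust feature extraction algorithm to improve speaker recognition performance in noisy environments. Recognition experiments on SUDA2002-D2, a speech corpus of 40 speakers, under additive white Gaussian noise show that the proposed algorithm effectively improves recognition performance in noise.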
7.
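The multiresolution property of the wavelet transform is applied to the front-end processing of Mel-frequency cepstral coefficients (MFCC), giving a new speech feature parameter, the wavelet MFCC. Its distinguishing feature is that the wavelet transform, hierarchical FFT, and frequency synthesis replace the FFT stage of the original MFCC, doubling the spectral resolution. Experiments show that, in noisy environments and with larger vocabularies, wavelet MFCC features outperform MFCC features in both noise resistance and recognition rate.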
8.
9.
10.
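A speech denoising algorithm based on noise shaping is proposed for non-stationary environmental noise. Using the minimum perceptual mean square error as its criterion and building on Wiener filtering, the algorithm modifies the Wiener filter equation with an auditory perceptual weighting function to shape the noise spectrum, so that the distribution of the noise spectrum follows that of the speech spectrum. A frequency compensation factor is introduced to overcome the uneven effect of the non-stationary noise spectrum on speech, and a fast noise estimation algorithm is used to estimate the non-stationary noise. Experiments show that the algorithm suppresses background noise more effectively and improves the quality of the denoised speech.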
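For orientation, the plain Wiener gain that the algorithm above starts from can be sketched as follows. This is only the textbook baseline computed per frequency bin from power spectra; the perceptual weighting function, the frequency compensation factor, and the fast noise estimator from the abstract are not reproduced, and the function name and `floor` parameter are hypothetical.

```python
def wiener_gain(noisy_power, noise_power, floor=0.0):
    """Per-bin Wiener gain G = max(Py - Pn, 0) / Py.

    noisy_power: power spectrum of the noisy frame, one value per bin.
    noise_power: estimated noise power spectrum, same length.
    floor: lower bound on the gain (illustrative assumption).
    """
    gains = []
    for py, pn in zip(noisy_power, noise_power):
        if py <= 0.0:
            gains.append(floor)  # empty bin: nothing to keep
            continue
        gain = max(py - pn, 0.0) / py
        gains.append(max(gain, floor))
    return gains
```

Each gain multiplies the corresponding bin of the noisy spectrum; bins where the noise estimate exceeds the observed power are attenuated down to the floor.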
11.
Datao You Jiqing Han Guibin Zheng Tieran Zheng Jie Li 《Circuits, Systems, and Signal Processing》2014,33(7):2267-2291
Traditionally, most voice activity detection (VAD) methods are based on speech features such as spectrum, temporal energy, and periodicity. The robustness of these features plays a critical role in VAD performance. However, because these features are generated directly from the observed signal, their robustness degrades significantly in non-stationary noise environments, especially at low signal-to-noise ratio (SNR). This paper proposes a robust feature for VAD based on sparse representation with an optimized learned dictionary. A speech dictionary and a noise dictionary are first learned from a speech corpus and a noise corpus, respectively. An optimization algorithm is then designed to reduce the mutual coherence between the two learned dictionaries. The proposed feature is generated from the optimized dictionary-based sparse representation, and a VAD method is derived from it. The proposed method is evaluated over seven types of noise and four SNR levels. Experimental results show that the optimized dictionary is important for enhancing the robustness of the proposed method, and that the method performs well under non-stationary noise, especially at low SNR.
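The quantity being reduced above, the mutual coherence between two dictionaries, can be computed directly from its usual definition as the largest absolute inner product between unit-norm atoms. The sketch below shows only this measurement; the paper's optimization algorithm is not reproduced, and the function names are hypothetical.

```python
import math


def _unit(vec):
    """Scale a non-zero vector to unit Euclidean norm."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def mutual_coherence(dict_a, dict_b):
    """Largest |<a, b>| over unit-normalised atoms a in dict_a, b in dict_b.

    Each dictionary is a list of atoms; each atom is a list of floats of
    the same length. A value near 0 means the dictionaries are nearly
    orthogonal; a value of 1 means they share an atom up to sign and scale.
    """
    atoms_a = [_unit(a) for a in dict_a]
    atoms_b = [_unit(b) for b in dict_b]
    return max(
        abs(sum(x * y for x, y in zip(a, b)))
        for a in atoms_a
        for b in atoms_b
    )
```

Lower coherence between the speech and noise dictionaries means a noise frame is less likely to be represented by speech atoms, which is the motivation the abstract gives for the optimization step.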
12.
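A study of speech endpoint detection based on short-time energy  Total citations: 14, self-citations: 1, citations by others: 13
This work studies speech endpoint detection in noisy environments using short-time energy as the feature. In addition to short-time full-band energy, the proposed algorithm uses short-time high-frequency energy as an auxiliary feature, together with optimal edge detection filtering and a double-threshold, three-state transition decision mechanism. This ensures accurate endpoint detection in noise and robustness to changes in the absolute amplitude of the signal. Experimental results show that, compared with the traditional energy threshold method and the VAD algorithm used in G.729, the proposed algorithm performs better in noisy environments and is a simple, efficient, and robust speech endpoint detection algorithm.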
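A double-threshold decision with three states can be sketched as a small state machine over per-frame energies. This is an illustrative reading of the mechanism named above, not the paper's exact rules: the state names, the transition conditions, and the thresholds `low` and `high` are assumptions, and the edge detection filter and high-frequency energy feature are omitted.

```python
def detect_endpoints(energies, low, high):
    """Return speech segments as (start, end) frame indices, end exclusive.

    A segment is opened tentatively when energy reaches `low`, confirmed
    when it reaches `high`, and closed when it falls back below `low`.
    Tentative segments that never reach `high` are discarded.
    """
    SILENCE, CANDIDATE, SPEECH = 0, 1, 2
    state, start, segments = SILENCE, 0, []
    for i, energy in enumerate(energies):
        if state == SILENCE:
            if energy >= high:
                state, start = SPEECH, i
            elif energy >= low:
                state, start = CANDIDATE, i
        elif state == CANDIDATE:
            if energy >= high:
                state = SPEECH
            elif energy < low:
                state = SILENCE
        else:  # SPEECH
            if energy < low:
                segments.append((start, i))
                state = SILENCE
    if state == SPEECH:
        segments.append((start, len(energies)))
    return segments
```

For example, `detect_endpoints([0.0, 0.2, 0.9, 1.0, 0.3, 0.05, 0.2, 0.05, 0.0], low=0.1, high=0.5)` returns `[(1, 5)]`: the brief rise at frame 6 never reaches the high threshold and is dropped.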
13.
To address the decline in the recognition rate of speech recognition systems in noisy environments, this paper puts forward an improved perceptually non-uniform spectral compression feature extraction algorithm. The method compresses the speech signal effectively and brings the training and recognition environments closer together, which improves the recognition rate in noisy environments. Experiments on an intelligent wheelchair platform show that the algorithm effectively enhances the robustness of speech recognition and maintains the recognition rate in noisy environments.
14.
15.
16.
17.
18.
Ho‐Young Jung 《ETRI Journal》2004,26(3):273-276
We propose a novel feature processing technique which can provide a cepstral liftering effect in the log‐spectral domain. Cepstral liftering aims to equalize the variance of cepstral coefficients for distance‐based speech recognizers and, as a result, provides robustness to additive noise and speaker variability. However, in the popular hidden Markov model based framework, cepstral liftering has no effect on recognition performance. We derive a filtering method in the log‐spectral domain that corresponds to cepstral liftering. The proposed method performs high‐pass filtering based on the decorrelation of filter‐bank energies. We show that in noisy speech recognition, the proposed method reduces the error rate by 52.7% relative to the conventional feature.
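For readers unfamiliar with the baseline, conventional cepstral liftering can be sketched as a per-coefficient weighting. The block below uses the sinusoidal lifter commonly found in speech front ends; it illustrates the cepstral-domain operation only, not the log‐spectral filter proposed in the paper, and the function name and the default `L=22` are assumptions.

```python
import math


def lifter(cepstrum, L=22):
    """Apply a sinusoidal lifter: c_n * (1 + (L / 2) * sin(pi * n / L)).

    cepstrum: cepstral coefficients c_0, c_1, ... for one frame.
    L: liftering parameter (22 is a common default, assumed here).
    Higher-order coefficients, which usually have small variance,
    are scaled up so that all coefficients have comparable ranges.
    """
    return [
        c * (1.0 + (L / 2.0) * math.sin(math.pi * n / L))
        for n, c in enumerate(cepstrum)
    ]
```

With `L=22`, coefficient 0 is left unchanged and coefficient 11 receives the largest weight, 12.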