1.
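One cause of recognition errors in speech recognition systems is inaccurate endpoint detection. At high signal-to-noise ratios (SNR), locating speech endpoints correctly is not difficult. Most practical speech recognition systems, however, must operate at low SNR, where conventional endpoint detection methods, such as energy-based ones, do not work effectively. This paper uses cepstral features to detect speech endpoints and proposes two algorithms for endpoint detection in noisy speech. The first replaces short-time energy with cepstral distance as the quantity compared against the decision threshold; the second improves hidden Markov model (HMM) based speech detection so that it adapts to changes in the noise. Experimental results show that the method detects the endpoints of noisy speech with high accuracy.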
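The first algorithm's idea of thresholding a cepstral distance instead of short-time energy can be illustrated roughly as follows. This is a minimal sketch, not the paper's implementation: the function names, the cepstral order of 12, the naive DFT, and the use of a mean noise cepstrum taken from known noise-only frames are all assumptions made for illustration.

```python
import cmath
import math

def real_cepstrum(frame, floor=1e-12):
    """Real cepstrum of one frame via a naive O(N^2) DFT (fine for a sketch)."""
    n = len(frame)
    log_mag = []
    for k in range(n):
        s = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        # The floor avoids log(0) on empty spectral bins.
        log_mag.append(math.log(max(abs(s), floor)))
    # Inverse DFT of the real, even log-magnitude spectrum (cosine terms only).
    return [
        sum(log_mag[k] * math.cos(2 * math.pi * k * q / n) for k in range(n)) / n
        for q in range(n)
    ]

def cepstral_distance(c_a, c_b, order=12):
    """Truncated cepstral distance in dB over coefficients 0..order."""
    d0 = (c_a[0] - c_b[0]) ** 2
    dk = sum((c_a[i] - c_b[i]) ** 2 for i in range(1, order + 1))
    return (10.0 / math.log(10.0)) * math.sqrt(d0 + 2.0 * dk)

def detect_speech(frames, noise_frames, threshold, order=12):
    """Flag frames whose distance from the mean noise cepstrum exceeds threshold."""
    noise_ceps = [real_cepstrum(f) for f in noise_frames]
    n = len(noise_ceps[0])
    mean_noise = [sum(c[q] for c in noise_ceps) / len(noise_ceps) for q in range(n)]
    return [
        cepstral_distance(real_cepstrum(f), mean_noise, order) > threshold
        for f in frames
    ]
```

Frames must be longer than `order` samples. In practice the threshold and the noise cepstrum would be adapted over time; here both are fixed for brevity.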
2.
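The short-time energy and subband spectral entropy used in the traditional energy-entropy feature are easily affected by the noise environment, so endpoint detection performance degrades at low SNR. To address this, an improved energy-entropy speech endpoint detection algorithm based on noise estimation is proposed. First, the noise in the speech is estimated and the speech presence probability is computed from the estimate. The estimated noise energy is then used to correct the short-time energy, the speech presence probability is used as a weighting coefficient to optimize the subband spectral entropy, and the two are combined into an improved energy entropy. Finally, a dynamic threshold based on the noise estimate and a real-time endpoint detection strategy are given. Experimental results show that, in a variety of noise environments at SNRs of 5 dB and 0 dB, the proposed algorithm improves detection accuracy by 7% on average compared with the traditional energy-entropy algorithm and the improved subband energy-to-spectrum ratio algorithm.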
3.
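To address the poor noise robustness of the short-time TEO (Teager energy operator) energy algorithm, a new endpoint detection algorithm for strong noise is proposed. Building on short-time TEO energy endpoint detection, the algorithm adds a Mel cepstral distance decision stage and adopts a complementary two-stage decision mechanism of coarse detection followed by fine detection. The highly noise-robust Mel cepstral distance is first used for coarse endpoint detection; the short-time TEO energy, which reflects both the time-domain characteristics of the speech signal and its formant characteristics, is then used for fine detection. Experiments show that, at relatively low SNR, the improved algorithm raises detection accuracy over the traditional double-threshold method and short-time TEO energy without increasing computational complexity.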
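For reference, the discrete Teager energy operator underlying the short-time TEO energy is psi[n] = x[n]^2 - x[n-1]*x[n+1]. The sketch below computes it and a per-frame average. The framing scheme (mean absolute TEO output per frame) and the function names are assumptions for illustration, and the Mel cepstral distance stage is not shown.

```python
import math

def teager_energy(x):
    """Discrete Teager energy operator: psi[n] = x[n]^2 - x[n-1] * x[n+1]."""
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]

def short_time_teo_energy(x, frame_len, hop):
    """Mean absolute TEO output per frame (one possible short-time variant)."""
    psi = teager_energy(x)
    return [
        sum(abs(v) for v in psi[start:start + frame_len]) / frame_len
        for start in range(0, len(psi) - frame_len + 1, hop)
    ]

# For a pure tone A*cos(w*n), the operator returns the constant A^2 * sin(w)^2,
# so it tracks both amplitude and frequency.
tone = [2.0 * math.cos(0.3 * n) for n in range(50)]
frames = short_time_teo_energy(tone, frame_len=16, hop=8)
```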
4.
5.
6.
7.
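Traditional speech endpoint detection methods locate the start and end points of speech by exploiting the difference between speech and noise in a single feature parameter, but individual parameters behave inconsistently across different noise environments at low SNR and are not robust. This paper therefore proposes a speech endpoint detection method that fuses four parameters: uniform subband spectral variance, energy-to-entropy ratio, Mel cepstral distance, and likelihood ratio. The method adaptively adjusts the threshold of each parameter, selects the voting decision mechanism by monitoring the energy-to-entropy ratio of noise segments in real time, and then decides the speech endpoints. Experimental results show that, at low SNR, the method achieves higher detection accuracy and robustness than commonly used endpoint detection methods, and it offers a useful reference for subsequent speech signal processing.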
8.
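To improve the accuracy of speech endpoint detection at low SNR, a speech endpoint detection method based on empirical mode decomposition (EMD) and power spectral entropy is proposed. The noisy speech signal is decomposed by EMD into a series of intrinsic mode functions (IMFs). Power spectral entropy is chosen as the endpoint detection feature, and the power spectral entropy of IMFs of specific orders is computed to detect the speech endpoints. EMD effectively removes the influence of white noise, and simulation results show that, at low SNR, the method combining EMD and power spectral entropy detects speech endpoints effectively.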
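The power spectral entropy feature can be sketched as below: normalize a frame's power spectrum into a probability distribution and take its Shannon entropy. This is an illustrative sketch only. The EMD step is not implemented, the input frame would in the abstract's setting be a frame of a selected IMF, and the function names and the naive DFT are assumptions.

```python
import cmath
import math

def power_spectrum(frame):
    """One-sided power spectrum via a naive DFT (bins 0..N/2)."""
    n = len(frame)
    return [
        abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))) ** 2
        for k in range(n // 2 + 1)
    ]

def power_spectral_entropy(frame):
    """Shannon entropy (in nats) of the normalized power spectrum."""
    power = power_spectrum(frame)
    total = sum(power)
    if total <= 0.0:
        return 0.0  # silent frame: define the entropy as zero
    probs = [p / total for p in power]
    return -sum(p * math.log(p) for p in probs if p > 0.0)

# A flat spectrum (impulse) gives the maximum entropy; a pure tone gives nearly zero.
impulse = [1.0] + [0.0] * 31
tone = [math.cos(2 * math.pi * 4 * t / 32) for t in range(32)]
```

Speech frames, with energy concentrated around harmonics and formants, tend to have lower spectral entropy than broadband noise frames, which is what makes the feature usable for endpoint decisions.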
9.
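Noisy speech endpoint detection based on LPC Mel cepstral features (total citations: 2, self-citations: 0, citations by others: 2)
A complex noise environment is one of the reasons the performance of speech recognition systems degrades in practical applications. Noisy-speech endpoint detection in recognition preprocessing is a key technique whose performance determines, to some extent, the recognition rate. The author proposes a noisy-speech endpoint detection method based on LPC Mel cepstral features: the features are extracted and analyzed separately for the high- and low-frequency bands of the speech signal, decisions are made on the Mel cepstral distance, and adaptive noise estimation is used. Experimental results show that the method is computationally efficient and performs well at low SNR.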
10.
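Under strong background noise, the traditional cepstral distance method has difficulty determining the start and end points of speech segments. To address this, a new speech endpoint detection method is proposed that combines spectral subtraction based on multitaper spectral estimation with an improved cepstral distance. First, a multitaper spectral estimate of each frame of the noisy signal gives a smoothed power spectrum, and the average power spectrum of the leading non-speech segment is extracted. Spectral subtraction is then applied to denoise the noisy speech in preparation for endpoint detection. Next, the threshold of the traditional cepstral distance method is improved to obtain an adaptive threshold, which is combined with the cepstral distance method to detect endpoints. Simulation results show that, compared with the traditional cepstral distance endpoint detection algorithm, the proposed method improves endpoint detection accuracy for low-SNR speech and is robust.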
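The noise-estimation and spectral-subtraction steps can be sketched as follows, assuming per-frame power spectra are already available (the multitaper estimate itself is not implemented here). The over-subtraction factor `alpha`, the spectral floor `beta`, and the function names are illustrative assumptions, not values from the paper.

```python
def estimate_noise(power_frames, n_leading):
    """Average the power spectra of the leading non-speech frames, bin by bin."""
    n_bins = len(power_frames[0])
    return [
        sum(frame[k] for frame in power_frames[:n_leading]) / n_leading
        for k in range(n_bins)
    ]

def spectral_subtract(power_frame, noise_power, alpha=1.0, beta=0.01):
    """Power spectral subtraction with a floor to avoid negative power."""
    return [
        max(p - alpha * n, beta * n)
        for p, n in zip(power_frame, noise_power)
    ]

# Two leading noise-only frames, then one frame containing speech in bin 0.
frames = [[1.0, 2.0], [3.0, 2.0], [10.0, 1.0]]
noise = estimate_noise(frames, n_leading=2)
cleaned = spectral_subtract(frames[2], noise)
```

The floor `beta * n` keeps a small residual in bins where the subtraction would go negative, a common way to limit musical noise.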
11.
12.
Shota Morita, Masashi Unoki, Xugang Lu, Masato Akagi 《Journal of Signal Processing Systems》2016,82(2):163-173
Voice activity detection (VAD) detects speech and non-speech periods in observed speech signals and is an important front-end technique for many speech technology applications. Many VAD methods have been proposed, but most have been applied under clean or noisy conditions. Only a few have been proposed for reverberant conditions, particularly noisy reverberant conditions. Designing an accurate and robust VAD method for noisy reverberant conditions therefore requires understanding the ill effects of noise and reverberation on speech. These effects can be described by the modulation transfer function (MTF) under noisy and reverberant conditions. This study therefore uses the MTF concept to reduce the ill effects of noise and reverberation on speech and proposes a robust VAD method. In this method, noise reduction and dereverberation are first applied to the temporal power envelope of the speech signal to restore it. Power thresholding on the restored temporal power envelope then serves as the VAD decision. A method of estimating the signal-to-noise ratio (SNR) is also proposed so that the SNR can be estimated accurately in the noise reduction stage. Experiments under both artificial and realistic noisy reverberant conditions evaluated the proposed VAD method against conventional VAD methods. The results show that the proposed method significantly outperformed the conventional methods under both kinds of conditions.
13.
14.
Beritelli F., Casale S., Cavallaero A. 《Selected Areas in Communications, IEEE Journal on》1998,16(9):1818-1829
Discontinuous transmission based on speech/pause detection is a valid way to improve the spectral efficiency of new-generation wireless communication systems. In this context, robust voice activity detection (VAD) algorithms are required, as traditional solutions have a high misclassification rate in the presence of the background noise typical of mobile environments. This paper presents a voice detection algorithm that is robust to noisy environments, thanks to a new methodology adopted for the matching process. More specifically, the proposed VAD is based on a pattern recognition approach in which the matching phase is performed by a set of six fuzzy rules, trained by means of a new hybrid learning tool. A series of objective tests performed on a large speech database, varying the signal-to-noise ratio (SNR), the type of background noise, and the input signal level, showed that, compared with the VAD standardized by ITU-T in Recommendation G.729 Annex B, the fuzzy VAD reduces, on average, the activity factor by about 25% and the clipping introduced by about 43%. Informal listening tests also confirm an improvement in the perceived speech quality.
15.
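To improve the performance of sound source localization algorithms in complex environments, a new time delay estimation (TDE) method is proposed: a statistical-model method based on the transfer function ratio (ATFR-SM). The method uses a statistical model to remove the influence of noise on the transfer function (ATF), and it smooths and "whitens" the power spectral density (PSD) when computing the transfer function to remove the influence of reverberation. Voice activity detection (VAD) is also introduced to discard noise segments that are useless for estimating the transfer function, which improves the accuracy of time delay estimation. In addition, the proposed TDE method is combined with a linear localization method to form a complete sound source localization method. Experimental results show that, in complex environments, the TDE method has a lower percentage of anomalous points (PAP) and a lower root mean square error (RMSE), clearly outperforming the traditional reference algorithms, and that the localization method achieves higher localization accuracy.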
16.
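Research on a speech endpoint detection algorithm based on short-time energy (total citations: 14, self-citations: 1, citations by others: 13)
This paper studies speech endpoint detection in noisy environments using short-time energy as the feature. In addition to short-time full-band energy, the proposed algorithm uses short-time high-frequency energy as an auxiliary feature, together with optimal edge-detection filtering and a double-threshold, three-state transition decision mechanism. This ensures accurate endpoint detection in noise and robustness to changes in absolute signal amplitude. Experimental results show that, compared with the traditional energy threshold method and the VAD algorithm used in G.729, the proposed algorithm performs better in noisy environments and is a simple, efficient, and robust speech endpoint detection algorithm.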
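A double-threshold, three-state decision over short-time energy can be sketched roughly as below. This is a simplified illustration under stated assumptions: the state names, the transition rules, and the fixed thresholds are hypothetical, and the high-frequency energy feature and edge-detection filter from the abstract are not included.

```python
SILENCE, TRANSITION, SPEECH = 0, 1, 2

def short_time_energy(x, frame_len, hop):
    """Sum of squared samples in each frame."""
    return [
        sum(v * v for v in x[s:s + frame_len])
        for s in range(0, len(x) - frame_len + 1, hop)
    ]

def double_threshold_segments(energy, low, high):
    """Return (start, end) frame-index pairs; end is exclusive."""
    segments = []
    state = SILENCE
    start = 0
    for i, e in enumerate(energy):
        if state == SILENCE:
            if e > high:
                state, start = SPEECH, i
            elif e > low:
                # Candidate onset: remember it, but wait for confirmation.
                state, start = TRANSITION, i
        elif state == TRANSITION:
            if e > high:
                state = SPEECH      # onset confirmed at the remembered start
            elif e <= low:
                state = SILENCE     # false alarm, discard the candidate
        else:  # SPEECH
            if e <= low:
                segments.append((start, i))
                state = SILENCE
    if state == SPEECH:
        segments.append((start, len(energy)))
    return segments

energy = [0.1, 0.1, 0.5, 2.0, 3.0, 0.6, 0.1, 0.5, 0.1, 0.1]
segments = double_threshold_segments(energy, low=0.3, high=1.0)
```

The low threshold extends a segment back to its weak onset, while the high threshold prevents brief noise bumps (frame 7 in the example) from being reported as speech.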
17.
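Existing dual-channel voice activity detection (VAD) algorithms rely on fixed thresholds and have difficulty distinguishing speech from noise accurately across varied noise environments; applied to mobile phone noise reduction systems, they cause speech distortion or poor noise removal. To address these problems, this paper proposes a neural-network-based VAD algorithm that uses subband energy difference and normalized cross-channel correlation as features and uses a neural network to classify speech and noise. Building on this, the neural network VAD is combined with a VAD based on the cross-channel signal power ratio to form a new speech and noise activity detection algorithm for mobile phone noise reduction systems that detects speech and noise separately. Noise suppression is then driven by these detections, which reduces the performance loss caused by VAD misclassification. Experimental results show that the approach outperforms existing noise reduction algorithms in suppressing background noise and reducing speech distortion, and that it also suppresses directional speech interference well.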
18.
19.
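To address the estimation of the a priori SNR in speech enhancement, this paper proposes a new frequency-domain a priori SNR estimation algorithm that combines the two-step noise reduction technique with a Gaussian statistical model of the speech and noise components. Starting from the output of the decision-directed method, the algorithm applies minimum mean square error estimation theory to compute directly the spectral energy of the clean speech component in the current frame and thereby obtain the a priori SNR estimate of the noisy speech. While retaining the advantages of the two-step noise reduction algorithm, it requires no prior conditions on the gain factor of the speech enhancement system, and it removes background noise effectively while suppressing musical noise in the output speech as far as possible. Simulation results under a variety of noise backgrounds show that, compared with the classical decision-directed method and the more recent two-step noise reduction algorithm, a speech enhancement system based on the proposed a priori SNR estimation scheme delivers better enhancement under both subjective and objective evaluation criteria.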
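For context, the classical decision-directed estimator that the proposed algorithm starts from can be sketched per frequency bin as below. This shows only the baseline, not the paper's new estimator. The smoothing factor 0.98 is a commonly used value and, like the function names and the Wiener gain used for illustration, is an assumption here.

```python
def decision_directed_snr(noisy_power, noise_power, prev_clean_power, alpha=0.98):
    """Decision-directed a priori SNR estimate for one frequency bin.

    noisy_power      -- |Y|^2 of the current frame
    noise_power      -- estimated noise power in this bin
    prev_clean_power -- |S_hat|^2 estimated for the previous frame
    """
    posterior_snr = noisy_power / noise_power
    return (alpha * prev_clean_power / noise_power
            + (1.0 - alpha) * max(posterior_snr - 1.0, 0.0))

def wiener_gain(prior_snr):
    """Spectral gain applied to the noisy bin, given the a priori SNR."""
    return prior_snr / (1.0 + prior_snr)

# One bin: noisy power 6, noise power 2, previous clean-speech estimate 4.
xi = decision_directed_snr(6.0, 2.0, 4.0)
gain = wiener_gain(xi)
```

Because `alpha` is close to 1, the estimate leans heavily on the previous frame, which smooths the gain over time and reduces musical noise at the cost of a one-frame lag; that lag is what two-step noise reduction approaches aim to correct.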
20.
A. Revathi, R. Chinnadurai, Y. Venkataramani 《International Journal of Electronics》2013,100(12):1171-1179
This paper discusses a new noise reduction method that exploits the combined effects of wavelet decomposition, independent component analysis (ICA), and spectral analysis on noisy speech. The input noisy speech is wavelet-decomposed into two signals. Wavelet entropy is computed from a modified probability density function of the signal derived from the approximation coefficients of the wavelet decomposition. The starting frame is detected by comparing entropy values. Of the two signals obtained from the wavelet decomposition, one is speech combined with noise and the other is noise alone. These two signals are analysed in the ICA domain to generate enhanced speech. The zero-crossing rate is computed and used to discriminate between speech and noise. Spectral analysis is then performed on the noise preceding the starting frame and on the noisy speech. Eliminating the noise frequencies from the noisy speech yields noise-reduced speech. Subjective analysis and experimental results show the considerable noise reduction capability of the proposed algorithm.
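The zero-crossing rate mentioned in the abstract can be computed per frame as in the sketch below. The normalization by the number of adjacent sample pairs and the function name are assumptions for illustration; the abstract does not say how the rate is thresholded.

```python
def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)

alternating = zero_crossing_rate([1.0, -1.0, 1.0, -1.0])  # sign flips on every pair
steady = zero_crossing_rate([1.0, 2.0, 3.0, 4.0])         # no sign changes
```

Broadband noise and unvoiced sounds tend to produce a high rate, while voiced speech, dominated by low-frequency energy, tends to produce a low one.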