Similar literature
1.
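Analysis of Robust VAD Algorithms for Voice Services   Total citations: 4 (self-citations: 0, citations by others: 4)
The purpose of voice activity detection (VAD) is to determine whether speech is present during voice communication: detected silence is suppressed so that it occupies little or no channel bandwidth, and only detected speech is compressed, encoded, and transmitted. Robust speech recognition systems, digital mobile communications, and real-time voice transmission over the Internet all require VAD under adverse acoustic conditions to save bandwidth and suppress noise, so VAD is an important problem in speech processing. The paper presents several recent VAD algorithms (EZCR-VAD, STAT-VAD, and E-VAD) that detect speech robustly at low signal-to-noise ratios.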

2.
李菁菁  黄孝建  李敬 《通信技术》2010,43(4):164-166
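During acquisition and transmission, speech signals vary widely in amplitude because of differences between sources, channel attenuation, noise interference, and the near-far effect. Automatic gain control (AGC) can optimize the signal level and improve communication quality. Conventional AGC, however, does not distinguish speech from non-speech signals, so it amplifies non-speech signals and causes interference. The paper analyzes in depth the AGC algorithm in Speex, which is based on voice activity detection (VAD), and shows through simulation that the algorithm achieves automatic level control.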
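The idea of gating gain control with a speech decision can be illustrated with a minimal sketch. It is not the Speex implementation: the energy-threshold VAD, the function names, and every parameter value below are assumptions chosen for illustration.

```python
import math


def frame_rms(frame):
    """Root-mean-square level of one frame of samples."""
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def vad_gated_agc(frames, target_rms=0.1, vad_threshold=0.01,
                  smoothing=0.9, max_gain=20.0):
    """Scale each frame by a gain that is adapted only on speech frames.

    The VAD here is a crude RMS threshold (an assumption, not the Speex
    detector); all default values are illustrative.
    """
    gain = 1.0
    out = []
    for frame in frames:
        level = frame_rms(frame)
        if level > vad_threshold:
            # Speech frame: move the gain smoothly toward the target level.
            desired = min(target_rms / level, max_gain)
            gain = smoothing * gain + (1.0 - smoothing) * desired
        # Non-speech frame: the gain is held, so it does not ramp up
        # during silence the way an ungated AGC would.
        out.append([s * gain for s in frame])
    return out
```

With an ungated AGC, a long pause would drive the gain toward `max_gain` and amplify the background noise; holding the gain on non-speech frames avoids that.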

3.
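Existing dual-channel voice activity detection (VAD) algorithms rely on fixed thresholds and struggle to detect speech and noise accurately across varied noise environments; in mobile phone noise-reduction systems this leads to speech distortion or poor noise removal. To address this, the paper proposes a neural-network-based VAD algorithm that uses the sub-band energy difference and the normalized cross-channel correlation as features and classifies speech and noise with a neural network. On this basis, the neural-network VAD is combined with a VAD based on the cross-channel signal power ratio to form a new speech and noise activity detection algorithm for mobile phone noise-reduction systems. It detects speech and noise separately and drives the noise suppression accordingly, reducing the performance loss that VAD misjudgments cause in the noise-reduction system. Experimental results show that the method outperforms existing noise-reduction algorithms in suppressing background noise and reducing speech distortion, and that it also suppresses directional speech interference well.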

4.
曾光  侯嘉 《通信技术》2011,(11):41-43
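To eliminate the acoustic echo produced during hands-free calls on Android, an acoustic echo cancellation algorithm is added to the asterisk module of the Android open-source software, using a silence detection (VAD) mechanism. By continuously comparing incoming and outgoing voice data, the algorithm decides whether a signal is acoustic echo and, if so, replaces it with white noise. Test results show that in a typical call environment more than 90% of the echo in normal voice calls is eliminated, achieving half-duplex communication; the approach is suitable for developing embedded Android terminal devices.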

5.
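To improve the accuracy of voice activity detection (VAD) at low signal-to-noise ratios, a VAD algorithm based on sub-band long-term signal variability features is proposed. The speech signal is transformed to the frequency domain and split into several non-overlapping sub-bands; a long-term signal variability feature is extracted from each sub-band signal; speech and non-speech models are then built online with GMMs, and the VAD decision is made from the likelihood ratio of the models. Experimental results show that the algorithm significantly improves VAD accuracy at low signal-to-noise ratios and remains robust across a variety of noise environments and signal-to-noise ratios. Experiments with a speech recognition system show that the algorithm effectively improves the recognition rate in noisy environments.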
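The final decision step, a likelihood-ratio test between a speech model and a non-speech model, can be sketched in simplified form. The sketch assumes a single scalar feature and one fixed Gaussian per class instead of the online-trained GMMs the abstract describes; the model parameters and the threshold are hypothetical.

```python
import math


def gaussian_log_pdf(x, mean, var):
    """Log density of a one-dimensional Gaussian."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def llr_vad(feature, speech=(5.0, 4.0), noise=(0.0, 1.0), threshold=0.0):
    """Return True when the log-likelihood ratio favors the speech model.

    `speech` and `noise` are (mean, variance) pairs; the values are
    placeholders, not parameters from the paper.
    """
    llr = gaussian_log_pdf(feature, *speech) - gaussian_log_pdf(feature, *noise)
    return llr > threshold
```

A full GMM would replace each `gaussian_log_pdf` call with the log of a weighted sum of component densities, and an online variant would update those components as frames arrive.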

6.
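To improve the accuracy of voice activity detection (VAD) at low signal-to-noise ratios, a VAD algorithm based on sub-band long-term signal variability features is proposed. The speech signal is transformed to the frequency domain and split into several non-overlapping sub-bands; a long-term signal variability feature is extracted from each sub-band signal; speech and non-speech models are then built online with GMMs, and the VAD decision is made from the likelihood ratio of the models. Experimental results show that the algorithm significantly improves VAD accuracy at low signal-to-noise ratios and remains robust across a variety of noise environments and signal-to-noise ratios. Experiments with a speech recognition system show that the algorithm effectively improves the recognition rate in noisy environments.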

7.
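Discontinuous Transmission in Voice Communication   Total citations: 6 (self-citations: 1, citations by others: 5)
Discontinuous transmission (DTX) is used mainly in wireless communication: by transmitting few or no speech parameters during silent periods, it reduces channel traffic and co-channel interference, improves speech quality, and increases system capacity. The paper analyzes, from a system perspective, the techniques that DTX involves, such as voice activity detection and comfort noise generation; the technique applies generally to the class of CELP-based speech codecs.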
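How a VAD decision drives discontinuous transmission can be sketched as a per-frame scheduler. This is a simplified illustration rather than any codec's actual DTX handler: the function name and the `sid_interval` refresh period are assumptions, and "SID" stands for a silence descriptor frame from which the receiver synthesizes comfort noise.

```python
def dtx_schedule(vad_flags, sid_interval=8):
    """Map per-frame VAD decisions to what a DTX transmitter sends.

    Returns a list of "SPEECH", "SID" (silence descriptor) or "NONE"
    (nothing transmitted). `sid_interval` is a hypothetical refresh
    period, in frames, for the silence descriptor.
    """
    decisions = []
    silent_run = 0  # number of consecutive non-speech frames seen so far
    for is_speech in vad_flags:
        if is_speech:
            silent_run = 0
            decisions.append("SPEECH")
        else:
            # Send a descriptor when silence starts, then periodically.
            if silent_run % sid_interval == 0:
                decisions.append("SID")
            else:
                decisions.append("NONE")
            silent_run += 1
    return decisions
```

For example, `dtx_schedule([True, False, False, False, True, False], sid_interval=2)` yields `["SPEECH", "SID", "NONE", "SID", "SPEECH", "SID"]`: only the speech frames and occasional descriptors occupy the channel.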

8.
王君 《移动通信》2015,(Z1):153-156
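Network simulation is used to analyze the performance of a private communication network carrying different services, and to determine the bandwidth and resources that a voice command system and the audio/video codec modes actually require for transmission over an IP network. The network topology of the private network is first constructed and mapped into the OPNET simulation environment, where network parameters are configured and statistics are collected. Voice and data services are then simulated on the network. Analysis of the results shows that the network performs better for voice than for data in both real-time behavior and stability, and that the network should be designed and configured according to the characteristics of each service.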

9.
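The paper proposes a voice activity detection algorithm based on a probability-density parallel distance. Exploiting the different characteristics of the Mel-domain sub-band energy probability densities of speech and noise signals, the algorithm introduces a definition of parallel distance to construct a decision function and performs voice activity detection by evaluating that function. Experimental results show that the algorithm outperforms the G.729B VAD algorithm at various signal-to-noise ratios.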

10.
夏丙寅  鲍长春 《信号处理》2013,29(10):1336-1345
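To improve the ability of conventional noise estimation methods to track abrupt changes in noise level, the paper proposes a noise estimation acceleration method built on minima-controlled recursive averaging (MCRA). The method first detects abrupt changes in the power spectrum; after a change is detected, it sets a hangover segment of adaptive length and, within that segment, performs voice activity detection (VAD) using the log-likelihood ratio, spectral entropy, and the average magnitude difference function. It then decides whether to force an update of the noise estimate using auxiliary parameters such as the ratio of the noise estimate to the power-spectrum minimum. ITU-T G.160 test results show that introducing the acceleration algorithm does not affect the performance of the speech enhancement algorithm when the noise level is stationary, but significantly reduces the convergence time when the noise level changes abruptly and largely suppresses musical noise during the convergence segment of the noise estimate.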

11.
Voice activity detection (VAD) is used to detect speech and non-speech periods in observed speech signals. It is an important front-end technique for many speech technology applications. Many VAD methods have been proposed. However, most of them have been applied under clean or noisy conditions; only a few have been proposed for reverberant conditions, particularly noisy reverberant conditions. We therefore need to understand the ill effects of noise and reverberation on speech to design an accurate and robust VAD method for noisy reverberant conditions. These ill effects can be described by the modulation transfer function (MTF) under noisy and reverberant conditions. Our study therefore uses the MTF concept to reduce the ill effects of noise and reverberation on speech and proposes a robust VAD method. In this method, noise reduction and dereverberation are first applied to the temporal power envelope of the speech signal to restore it. Power thresholding on the restored temporal power envelope then provides the VAD decision. A method of estimating the signal-to-noise ratio (SNR) is also proposed so that the SNR can be estimated accurately in the noise reduction stage. Experiments under both artificial and realistic noisy reverberant conditions were carried out to evaluate the proposed VAD method and compare it with conventional VAD methods. The results revealed that the proposed method significantly outperformed the conventional methods under artificial and realistic noisy reverberant conditions.

12.
Discontinuous transmission based on speech/pause detection represents a valid solution to improve the spectral efficiency of new generation wireless communication systems. In this context, robust voice activity detection (VAD) algorithms are required, as traditional solutions present a high misclassification rate in the presence of the background noise typical of mobile environments. This paper presents a voice detection algorithm which is robust to noisy environments, thanks to a new methodology adopted for the matching process. More specifically, the proposed VAD is based on a pattern recognition approach in which the matching phase is performed by a set of six fuzzy rules, trained by means of a new hybrid learning tool. A series of objective tests performed on a large speech database, varying the signal-to-noise ratio (SNR), the types of background noise, and the input signal level, showed that, compared with the VAD standardized by ITU-T in Recommendation G.729 Annex B, the fuzzy VAD achieves on average a reduction of about 25% in the activity factor and of about 43% in the clipping introduced. Informal listening tests also confirm an improvement in the perceived speech quality.

13.
In this paper, the first real-time implementation and perceptual evaluation of a singular value decomposition (SVD)-based optimal filtering technique for noise reduction in a dual-microphone behind-the-ear (BTE) hearing aid is presented. This evaluation was carried out for speech-weighted noise and multitalker babble, in single and multiple jammer sound source scenarios. Two basic microphone configurations in the hearing aid were used. The SVD-based optimal filtering technique was compared against an adaptive beamformer, which is known to give significant improvements in speech intelligibility in noisy environments. The optimal filtering technique works without assumptions about the speaker position, unlike the two-stage adaptive beamformer. However, this strategy needs a robust voice activity detector (VAD). A method to improve the performance of the VAD was presented and evaluated physically. By connecting the VAD to the output of the noise reduction algorithms, a good discrimination between the speech-and-noise periods and the noise-only periods of the signals was obtained. The perceptual experiments demonstrated that the SVD-based optimal filtering technique could perform as well as the adaptive beamformer in a single noise source scenario, i.e., the ideal scenario for the latter technique, and could outperform the adaptive beamformer in multiple noise source scenarios.

14.
Traditionally, most voice activity detection (VAD) methods are based on speech features such as spectrum, temporal energy, and periodicity. The robustness of these features plays a critical role in the performance of VAD. However, since these features are always generated directly from the observed signal, their robustness degrades significantly in non-stationary noise environments, especially at low signal-to-noise ratios (SNR). This paper proposes a robust feature for VAD based on sparse representation with an optimized learned dictionary. To do so, a speech dictionary and a noise dictionary are first learned from a speech corpus and a noise corpus, respectively. Then an optimization algorithm is designed to reduce the mutual coherence between the two learned dictionaries. After that, the proposed feature is generated from the optimized dictionary-based sparse representation, and a VAD method is derived from the proposed feature. The proposed method is evaluated over seven types of noise and four SNR levels. Experimental results show that the optimized dictionary is important for enhancing the robustness of the proposed method, and that the proposed method performs well under non-stationary noise, especially at low SNR.

15.
柳燕  鲍长春 《信号处理》2006,22(1):57-60
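The paper proposes a new voice activity detection algorithm based on competitive neural networks, implemented mainly with a self-organizing feature map combined with learning vector quantization, and compares it with other neural-network algorithms. The algorithm is robust against a variety of background noises, and simulation results show that it outperforms the algorithm recommended in ITU-T G.729B.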

16.
The performance of traditional voice activity detection (VAD) algorithms declines sharply in low signal-to-noise ratio (SNR) environments. In this paper, a feature-weighted likelihood method is proposed for noise-robust VAD. The method increases the contribution of dynamic features to the likelihood score, which in turn improves the noise robustness of VAD. A divergence-based dimension reduction method is also proposed to save computation: it removes the feature dimensions with smaller divergence values at the cost of a slight loss in performance. Experimental results on the Aurora Ⅱ database show that the proposed method remarkably improves detection performance in noisy environments when a model trained on clean data is used to detect speech endpoints. Applying the weighted likelihood to the dimension-reduced features gives performance comparable to, or even better than, that of the original full-dimensional features.

17.
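To effectively suppress the severe interference of non-stationary background noise in speech processing systems, a voice activity detection algorithm based on long- and short-term energy means is proposed. The algorithm rests on two reasonable assumptions. First, a sparse decomposition over a set of latent speech components not only retains as much of the speech information in the noisy speech as possible but also removes, to some extent, interference from non-speech noise. Second, when the signal is reconstructed from that sparse decomposition, the time-domain energy of speech segments in the reconstructed signal is higher than that of non-speech segments. On the basis of these two assumptions, the time-domain energy of the reconstructed signal is used as the audio feature. The mean short-term energy over a specified number of neighboring frames centered on the current frame serves as the score of the current frame, and the mean long-term energy over the current frame and a specified number of preceding frames serves as the decision threshold; voice activity is then decided by comparing the short-term and long-term energy means of the current frame. Experimental results show that the algorithm effectively distinguishes speech from non-speech segments at low signal-to-noise ratios (under both stationary and non-stationary noise) and outperforms the likelihood-ratio algorithm based on a single Gaussian distribution.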
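The decision rule in this abstract, a centered short-term energy mean compared against a trailing long-term energy mean, can be sketched as follows. The sparse decomposition and reconstruction stage is omitted, so the sketch operates on whatever frame energies it is given; the function names, window sizes, and non-overlapping framing are assumptions.

```python
def frame_energies(signal, frame_len):
    """Sum-of-squares energy of consecutive non-overlapping frames."""
    return [sum(s * s for s in signal[i:i + frame_len])
            for i in range(0, len(signal) - frame_len + 1, frame_len)]


def long_short_energy_vad(energies, short_half=1, long_len=20):
    """Flag frame t as speech when its short-term mean exceeds its long-term mean.

    short-term mean: frames t - short_half .. t + short_half (centered on t)
    long-term mean:  frame t and the long_len - 1 frames before it
    Both windows are truncated at the ends of the sequence.
    """
    flags = []
    for t in range(len(energies)):
        lo = max(0, t - short_half)
        hi = min(len(energies), t + short_half + 1)
        short_mean = sum(energies[lo:hi]) / (hi - lo)
        start = max(0, t - long_len + 1)
        long_mean = sum(energies[start:t + 1]) / (t + 1 - start)
        flags.append(short_mean > long_mean)
    return flags
```

Because the threshold is a running mean of recent energy rather than a fixed constant, it rises and falls with the background level, which is what lets the rule follow non-stationary noise.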

18.
Research on Multi-Feature Speech Endpoint Detection   Total citations: 1 (self-citations: 0, citations by others: 1)
何彬  柳平  王琦  程行甫  韩林呈 《通信技术》2010,43(11):139-141
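Conventional endpoint detection techniques, such as those based on energy or zero-crossing rate, degrade sharply in noisy environments with low signal-to-noise ratios. To address this, a new detection method is proposed that draws on the characteristics of Chinese speech pronunciation and combines several speech features: Mel-frequency cepstral coefficients (MFCC), energy, zero-crossing rate, and frequency-band variance. This endpoint detection method, which applies a fuzzy decision over the fused features with a two-pass search, effectively reduces the truncation of unvoiced sounds and trailing sounds, improves endpoint detection accuracy, and adapts to some extent to the noise environment. Experimental results show that the method remains accurate even at low signal-to-noise ratios.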
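Two of the classical features named here, short-time energy and zero-crossing rate, are simple to compute per frame. The sketch below shows only these two features; the MFCC and band-variance features, the fuzzy decision, and the two-pass search are not reproduced, and the function names are illustrative.

```python
def short_time_energy(frame):
    """Sum of squared samples in one frame."""
    return sum(s * s for s in frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ.

    Zero is treated as non-negative; the frame needs at least two samples.
    """
    crossings = sum(1 for a, b in zip(frame, frame[1:])
                    if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)
```

Unvoiced sounds tend to show low energy but a high zero-crossing rate, which is why detectors that rely on energy alone tend to truncate them.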
