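采用子带长时信号变化特征的稳健语音活动检测 (Robust Voice Activity Detection Using Sub-band Long-term Signal Variability Features)
Cite this article: Cai Tie, Tang Fei, Long Zhijun. 采用子带长时信号变化特征的稳健语音活动检测[J]. 电视技术 (TV Engineering), 2014, 38(19).
Authors: Cai Tie, Tang Fei, Long Zhijun
Affiliations: 1. Shenzhen Key Laboratory of Visual Media Processing and Transmission, Shenzhen, Guangdong 518172; Shenzhen Institute of Information Technology, Shenzhen, Guangdong 518172
2. Shenzhen Institute of Information Technology, Shenzhen, Guangdong 518172
3. ZTE Corporation, Shenzhen, Guangdong 518057
Funding: Natural Science Foundation of Guangdong Province (S2011010003890, S2013010012669); Shenzhen Science and Technology Plan Project (JC201105190829A)
Abstract: To improve the accuracy of voice activity detection (VAD) at low signal-to-noise ratios (SNR), a VAD algorithm based on sub-band long-term signal variability features is proposed. The speech signal is transformed to the frequency domain and decomposed into several non-overlapping sub-bands, and long-term signal variability features are extracted from each sub-band signal. Speech and non-speech models are then built online with Gaussian mixture models (GMMs), and the VAD decision is made from the likelihood ratio of the two models. Experimental results show that the algorithm significantly improves VAD accuracy at low SNR and remains robust across a variety of noise environments and SNR conditions. Experiments in a speech recognition system show that the algorithm effectively improves the recognition rate in noisy environments.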
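To illustrate the feature-extraction step, here is a minimal Python sketch. It assumes the commonly used entropy-based definition of long-term signal variability: the entropy of each frequency bin's normalised power over the most recent frames, followed by the variance of those entropies across bins, computed here separately inside each sub-band. The abstract does not give the paper's exact formula, so the function name `subband_ltsv`, the number of sub-bands, and the window length are illustrative assumptions rather than the authors' settings.

```python
import numpy as np

def subband_ltsv(power_spec, n_subbands=4, window=30):
    """Hypothetical sub-band long-term signal variability feature.

    power_spec: array of shape (n_frames, n_bins) holding a non-negative
                short-time power spectrum (frequency-domain signal).
    Returns an array of shape (n_frames, n_subbands).
    """
    power_spec = np.asarray(power_spec, dtype=float)
    n_frames, n_bins = power_spec.shape
    eps = 1e-12  # avoids log(0) and division by zero

    # Contiguous, non-overlapping groups of frequency bins.
    bands = np.array_split(np.arange(n_bins), n_subbands)

    feats = np.zeros((n_frames, n_subbands))
    for t in range(n_frames):
        # Long-term context: up to `window` most recent frames.
        seg = power_spec[max(0, t - window + 1): t + 1] + eps
        # Normalise each bin over time so it behaves like a distribution.
        p = seg / seg.sum(axis=0, keepdims=True)
        # Entropy over time, one value per frequency bin.
        entropy = -(p * np.log(p)).sum(axis=0)
        # Variability of the entropies inside each sub-band.
        for b, idx in enumerate(bands):
            feats[t, b] = entropy[idx].var()
    return feats

# Usage: a flat spectrum with one burst in the lowest sub-band.
spec = np.ones((50, 64))
spec[25, 0] = 100.0
features = subband_ltsv(spec)
```

Under this definition a stationary spectrum yields the same entropy in every bin, so the feature stays near zero, while a non-stationary event such as the burst above raises the feature only in the sub-band that contains it.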

Keywords: speech signal processing; voice activity detection; long-term signal variability; sub-band; speech recognition
Received: 2014-04-22
Revised: 2014-05-27

Robust Voice Activity Detection Using Sub-band Long-term Signal Variability Features
Cai Tie, Tang Fei, Long Zhijun. Robust Voice Activity Detection Using Sub-band Long-term Signal Variability Features[J]. TV Engineering, 2014, 38(19).
Authors: Cai Tie, Tang Fei, Long Zhijun
Affiliation: Shenzhen Key Laboratory of Visual Media Processing and Transmission; Shenzhen Institute of Information Technology; ZTE Corporation
Abstract: To improve the accuracy of voice activity detection (VAD) in low signal-to-noise ratio (SNR) environments, an algorithm based on sub-band long-term signal variability features is proposed. Each frame of the speech signal is first transformed to the frequency domain and divided into several non-overlapping sub-bands, and the long-term signal variability features of each sub-band are then calculated. Speech and non-speech models are trained online, on an utterance-by-utterance basis, using Gaussian mixture models (GMMs). In the final step, a likelihood ratio detector makes the VAD decision. Experimental results show that the algorithm significantly increases the accuracy of voice activity detection at low SNR and is robust across a variety of noise environments and SNR conditions. In addition, a speech recognition system that uses the proposed VAD algorithm achieves a higher recognition rate in noisy environments.
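The likelihood-ratio decision can be sketched as follows. This example assumes one scalar feature per frame, fits a two-component Gaussian mixture to a single utterance with a few EM iterations, and treats the component with the larger mean as speech. Each class is therefore a single Gaussian rather than a full GMM, which simplifies what the abstract describes. The names `fit_two_gaussians` and `vad_decision`, the quartile initialisation, and the zero threshold are all hypothetical choices.

```python
import numpy as np

def fit_two_gaussians(x, n_iter=50):
    """EM for a two-component 1-D Gaussian mixture (illustrative only)."""
    x = np.asarray(x, dtype=float)
    # Start the two means at the lower and upper quartiles of the data.
    means = np.array([np.percentile(x, 25), np.percentile(x, 75)])
    variances = np.full(2, x.var() + 1e-6)
    weights = np.array([0.5, 0.5])
    for _ in range(n_iter):
        # E-step: posterior responsibility of each component per frame.
        log_p = (np.log(weights)
                 - 0.5 * np.log(2 * np.pi * variances)
                 - 0.5 * (x[:, None] - means) ** 2 / variances)
        log_p -= log_p.max(axis=1, keepdims=True)
        resp = np.exp(log_p)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means, and variances.
        nk = resp.sum(axis=0) + 1e-12
        weights = nk / nk.sum()
        means = (resp * x[:, None]).sum(axis=0) / nk
        variances = (resp * (x[:, None] - means) ** 2).sum(axis=0) / nk + 1e-6
    return weights, means, variances

def vad_decision(x, threshold=0.0):
    """Return a boolean array: True where a frame is labelled speech."""
    x = np.asarray(x, dtype=float)
    _, means, variances = fit_two_gaussians(x)
    log_pdf = (-0.5 * np.log(2 * np.pi * variances)
               - 0.5 * (x[:, None] - means) ** 2 / variances)
    # Assumption: speech frames have larger feature values than noise.
    speech = int(np.argmax(means))
    llr = log_pdf[:, speech] - log_pdf[:, 1 - speech]
    return llr > threshold

# Usage: 100 low-valued (noise-like) frames followed by 100 high-valued ones.
frames = np.concatenate([np.linspace(-0.2, 0.2, 100),
                         np.linspace(4.0, 6.0, 100)])
is_speech = vad_decision(frames)
```

Because the models are re-estimated from each utterance, no pre-trained noise model is required, which matches the online, utterance-by-utterance training mentioned in the abstract. The threshold trades missed speech against false alarms.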
Keywords: speech signal processing; voice activity detection; long-term signal variability; sub-band; speech recognition
This article is indexed in Wanfang Data (万方数据) and other databases.