
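Robust Voice Activity Detection Algorithm with an Adaptively Reserved Likelihood Ratio Based on Sub-band Dual Features
Cite this article: HE Weijun, HE Qianhua, WU Junfeng, YANG Jichen. Adaptively Reserved Likelihood Ratio-based Robust Voice Activity Detection with Sub-band Double Features[J]. Journal of Electronics & Information Technology, 2016, 38(11): 2879-2886. doi: 10.11999/JEIT160157
Authors: HE Weijun  HE Qianhua  WU Junfeng  YANG Jichen
Funding: National Natural Science Foundation of China (61571192); Guangdong Provincial Public Welfare Project (2015A010103003); Fundamental Research Funds for the Central Universities, South China University of Technology (2015ZM143)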
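Abstract: To further improve the accuracy of Voice Activity Detection (VAD) at low signal-to-noise ratios, this paper proposes a robust VAD algorithm with an adaptively reserved likelihood ratio based on sub-band dual features. The algorithm uses two features, the sub-band normalized maximum autocorrelation function and the sub-band normalized average zero-crossing rate, to set the reserved weights of the frequency-component likelihood ratios. It also uses the VAD decisions made over a fixed preceding duration, together with the corresponding sub-band feature parameters, to adaptively estimate the reserved threshold of the likelihood ratio. Experimental results show that, compared with the original reserved likelihood ratio algorithm, VAD accuracy improves by 1.2%, 7.2%, and 8.1% under stationary white noise at 10 dB, 0 dB, and -10 dB, respectively, and by 1.6% and 3.4% under non-stationary Babble noise at 10 dB and 0 dB, respectively. When the algorithm is used in a 2.4 kbps low-bit-rate vocoder system, the Perceptual Evaluation of Speech Quality (PESQ) score of the synthesized speech improves over the original vocoder system by 0.098~0.153 under white noise and by 0.157~0.186 under Babble noise.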

Keywords: Voice activity detection   Likelihood ratio   Low signal-to-noise ratio   Sub-band zero-crossing rate
Received: 2016-02-04
Revised: 2016-06-27

Adaptively Reserved Likelihood Ratio-based Robust Voice Activity Detection with Sub-band Double Features
HE Weijun, HE Qianhua, WU Junfeng, YANG Jichen. Adaptively Reserved Likelihood Ratio-based Robust Voice Activity Detection with Sub-band Double Features[J]. Journal of Electronics & Information Technology, 2016, 38(11): 2879-2886. doi: 10.11999/JEIT160157
Authors: HE Weijun  HE Qianhua  WU Junfeng  YANG Jichen
Abstract: To improve the accuracy of Voice Activity Detection (VAD) in low Signal-to-Noise Ratio (SNR) environments, this paper presents an adaptively reserved likelihood ratio VAD method based on sub-band double features. The method uses the sub-band autocorrelation function and the sub-band zero-crossing rate to set the reserved weights. The reserved threshold is estimated adaptively from past VAD results and their sub-band feature parameters. Experiments show that, compared with the original reserved likelihood ratio algorithm, the VAD accuracy improves by 1.2%, 7.2%, and 8.1% in stationary white noise at 10 dB, 0 dB, and -10 dB, respectively, and by 1.6% and 3.4% in non-stationary Babble noise at 10 dB and 0 dB, respectively. The method is also applied to a 2.4 kbps low-bit-rate vocoder, where the Perceptual Evaluation of Speech Quality (PESQ) score improves by 0.098~0.153 in white noise and by 0.157~0.186 in Babble noise.
Keywords: Voice Activity Detection (VAD)  Likelihood ratio  Low signal-to-noise ratio  Sub-band zero-crossing rate
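
The two features named in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: it uses common textbook definitions of the zero-crossing rate and the normalized autocorrelation, assumes the signal has already been split into sub-bands upstream, and treats each input as one frame of one sub-band. The function names, the lag range, and the normalization are assumptions made for illustration only.

```python
import math


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ (0.0 to 1.0).

    Textbook definition; the paper's exact normalization may differ.
    """
    if len(frame) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def normalized_max_autocorrelation(frame, min_lag, max_lag):
    """Largest normalized autocorrelation over lags in [min_lag, max_lag].

    Each lag's correlation is divided by the geometric mean of the
    energies of the two overlapping segments, so a strictly periodic
    frame scores close to 1. The lag range is a hypothetical parameter.
    """
    n = len(frame)
    best = 0.0
    for lag in range(min_lag, min(max_lag, n - 1) + 1):
        head = frame[: n - lag]
        tail = frame[lag:]
        num = sum(x * y for x, y in zip(head, tail))
        den = math.sqrt(sum(x * x for x in head) * sum(y * y for y in tail))
        if den > 0.0:
            best = max(best, num / den)
    return best


if __name__ == "__main__":
    # A periodic (voiced-like) frame: sine with a 20-sample period.
    voiced = [math.sin(2 * math.pi * n / 20 + 0.3) for n in range(200)]
    print(zero_crossing_rate(voiced))
    print(normalized_max_autocorrelation(voiced, 10, 40))
```

In this sketch a periodic frame yields a low zero-crossing rate and an autocorrelation peak near 1, which is the kind of contrast with noise that such features are generally used to capture. How the paper maps these values to likelihood-ratio weights and thresholds is not specified in the abstract and is not modeled here.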