SMFCC: a novel feature extraction method for speech signal
Cite this article: WANG Haibin, YU Zhengtao, MAO Cunli, GUO Jianyi. SMFCC: a novel feature extraction method for speech signal[J]. Journal of Computer Applications (计算机应用), 2016, 36(6): 1735-1740.
Authors: WANG Haibin (汪海彬), YU Zhengtao (余正涛), MAO Cunli (毛存礼), GUO Jianyi (郭剑毅)
Affiliation: 1. Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, Yunnan 650500, China; 2. Intelligent Information Processing Key Laboratory, Kunming University of Science and Technology, Kunming, Yunnan 650500, China
Funding: National Natural Science Foundation of China (61262041, 61472168); Key Project of the Natural Science Foundation of Yunnan Province (2013FA030).
Abstract: To address the problems of effective speech feature extraction and the influence of noise in speaker recognition systems, a novel speech feature extraction method, Mel Frequency Cepstral Coefficients based on S-transform (SMFCC), is proposed. Building on traditional Mel Frequency Cepstral Coefficients (MFCC), the method exploits the two-dimensional Time-Frequency (TF) multiresolution property of the S-transform and the effective denoising of the two-dimensional TF matrix by Singular Value Decomposition (SVD), and combines them with related statistical analysis methods to obtain the final speech features. The proposed features were compared experimentally with existing features on the TIMIT corpus. The Equal Error Rate (EER) and Minimum Detection Cost Function (MinDCF) of SMFCC were both lower than those of Linear Prediction Cepstral Coefficients (LPCC), MFCC, and their combination LMFCC; in particular, the EER and MinDCF08 of SMFCC were 3.6% and 17.9% lower, respectively, than those of MFCC. The experimental results show that the proposed method can effectively remove noise from the speech signal and improve local resolution.
Keywords: S-transform; Singular Value Decomposition (SVD); Mel Frequency Cepstral Coefficients based on S-transform (SMFCC); Gaussian Mixture Model-Universal Background Model (GMM-UBM); speaker recognition
Received: 2015-11-02
Revised: 2016-01-18
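
The two building blocks named in the abstract, the S-transform time-frequency matrix and SVD-based denoising of that matrix, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `s_transform` and `svd_denoise`, the frequency-domain form of the discrete S-transform, and the rule of keeping only the `rank` largest singular values are assumptions made for illustration. The Mel filtering and the statistical analysis steps mentioned in the abstract are not reproduced.

```python
import numpy as np


def s_transform(x):
    """Discrete S-transform (Stockwell transform) of a real 1-D signal.

    Returns a complex matrix of shape (N // 2 + 1, N): rows are frequency
    bins 0..N/2, columns are time samples. Frequency-domain formulation,
    assumed here for illustration.
    """
    x = np.asarray(x, dtype=float)
    n_samples = x.size
    spectrum = np.fft.fft(x)
    # Signed frequency indices in FFT order: 0, 1, ..., -2, -1
    m = np.fft.fftfreq(n_samples) * n_samples
    n_freqs = n_samples // 2 + 1
    tf = np.empty((n_freqs, n_samples), dtype=complex)
    tf[0] = np.mean(x)  # zero-frequency row is the signal mean
    for n in range(1, n_freqs):
        # Gaussian window whose width scales with the analysed frequency n
        gauss = np.exp(-2.0 * np.pi**2 * m**2 / n**2)
        # Shift the spectrum by n bins, apply the window, go back to time
        tf[n] = np.fft.ifft(np.roll(spectrum, -n) * gauss)
    return tf


def svd_denoise(matrix, rank):
    """Low-rank approximation: keep only the `rank` largest singular values.

    The truncation rule is an assumption; the paper may select singular
    values differently.
    """
    u, s, vh = np.linalg.svd(matrix, full_matrices=False)
    s = s.copy()
    s[rank:] = 0.0
    return (u * s) @ vh


if __name__ == "__main__":
    # Hypothetical test signal: a sinusoid plus white noise
    rng = np.random.default_rng(0)
    n = 256
    t = np.arange(n)
    noisy_signal = np.cos(2 * np.pi * 20 * t / n) + 0.3 * rng.standard_normal(n)

    tf_matrix = s_transform(noisy_signal)
    tf_denoised = svd_denoise(tf_matrix, rank=3)
    print(tf_matrix.shape, tf_denoised.shape)
```

The choice of `rank` trades noise suppression against loss of signal detail; in practice it would have to be tuned on the data, and nothing in the abstract indicates which value the authors used.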