Speaker Recognition Based on Vocal Mechanism and Human Ear Perceptual Characteristic


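Cite this article:	DU Xiao-qing, YU Feng-qin. Speaker Recognition Based on Vocal Mechanism and Human Ear Perceptual Characteristic[J]. Computer Engineering, 2013(11): 197-199, 204.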

Authors:	DU Xiao-qing, YU Feng-qin

Affiliation:	School of Internet of Things Engineering, Jiangnan University, Wuxi, Jiangsu 214122, China

Funding:	Supported by the National Natural Science Foundation of China (61075008)

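Abstract:	Algorithms that fuse Mel-frequency cepstral coefficients (MFCC) with linear prediction cepstral coefficients (LPCC) capture only the static characteristics of speech, and LPCC describes the local low-frequency characteristics of speech poorly. To address this, the paper proposes fusing Hilbert-Huang transform (HHT) cepstral coefficients with relative spectral perceptual linear prediction cepstral coefficients (RASTA-PLPCC), yielding a speaker recognition algorithm that reflects both the vocal mechanism and the perceptual characteristics of the human ear. The HHT cepstral coefficients reflect the vocal mechanism, capture the dynamic characteristics of speech, and describe the local low-frequency characteristics of the signal better, which remedies the shortcomings of LPCC. PLPCC reflects the perceptual characteristics of the human ear and outperforms MFCC in recognition. The two features are fused using three fusion algorithms, and the fused features are fed to a Gaussian mixture model for speaker recognition. Simulation results show that the proposed fusion algorithm raises the recognition rate by 8.0% over the existing MFCC and LPCC fusion algorithm.
Keywords:	speaker recognition; vocal mechanism; human ear perceptual characteristics; Hilbert-Huang transform (HHT) cepstral coefficients; perceptual linear prediction cepstral coefficients; Relative Spectra (RASTA) filtering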
Speaker Recognition Based on Vocal Mechanism and Human Ear Perceptual Characteristic

DU Xiao-qing, YU Feng-qin. Speaker Recognition Based on Vocal Mechanism and Human Ear Perceptual Characteristic[J]. Computer Engineering, 2013(11): 197-199, 204.

Authors:	DU Xiao-qing, YU Feng-qin

Affiliation:	(School of Internet of Things Engineering, Jiangnan University, Wuxi 214122, China)

Abstract:	The fusion of Mel Frequency Cepstral Coefficients (MFCC) and Linear Prediction Cepstrum Coefficients (LPCC) reflects only the static characteristics of speech, and LPCC cannot describe the local low-frequency characteristics of speech well. The fusion of the Hilbert-Huang Transform (HHT) cepstrum coefficient and the Relative Spectra-Perceptual Linear Prediction Cepstrum Coefficient (RASTA-PLPCC) is therefore proposed, giving a new speaker recognition algorithm that reflects both the vocal mechanism and the perceptual characteristics of the human ear. The HHT cepstrum coefficient reflects the human vocal mechanism and the dynamic characteristics of speech, and it describes the local low-frequency characteristics of speech better than LPCC does. PLPCC reflects the perceptual characteristics of the human ear, and its identification performance is better than that of MFCC. The two features are combined using three fusion algorithms, and the fused feature is fed into a Gaussian mixture model for speaker recognition. Simulation results show that, compared with the fusion of LPCC and MFCC, the proposed fusion algorithm increases the recognition rate by 8.0%.

Keywords:	speaker recognition; vocal mechanism; human ear perceptual characteristic; Hilbert-Huang Transform (HHT) cepstrum coefficient; perception linear prediction cepstrum coefficient; Relative Spectra filtering
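
The pipeline outlined in the abstract (fuse two per-frame feature streams, then score the fused frames with one Gaussian mixture model per speaker) can be illustrated with a minimal sketch. Several things here are assumptions rather than the authors' method: the abstract does not name its three fusion algorithms, so plain frame-wise concatenation is used as a stand-in; the HHT cepstral and RASTA-PLPCC extractors are not implemented, and `synth` only generates synthetic Gaussian vectors in their place; `fuse`, `DiagGMM`, `identify`, and all parameter values are hypothetical names and settings chosen for illustration.

```python
import math
import random


def fuse(feats_a, feats_b):
    """Concatenate two per-frame feature streams (assumed fusion scheme)."""
    return [list(a) + list(b) for a, b in zip(feats_a, feats_b)]


def _log_gauss(x, mean, var):
    # Log density of a diagonal-covariance Gaussian.
    return -0.5 * sum(math.log(2 * math.pi * v) + (xi - m) ** 2 / v
                      for xi, m, v in zip(x, mean, var))


def _logsumexp(vals):
    top = max(vals)
    return top + math.log(sum(math.exp(v - top) for v in vals))


class DiagGMM:
    """Diagonal-covariance Gaussian mixture trained with plain EM."""

    def __init__(self, n_components=2, n_iter=20, var_floor=1e-3, seed=0):
        self.k, self.n_iter = n_components, n_iter
        self.var_floor, self.seed = var_floor, seed

    def _frame_logps(self, x):
        # Weighted log density of one frame under each component.
        return [math.log(w) + _log_gauss(x, m, v)
                for w, m, v in zip(self.weights, self.means, self.vars)]

    def fit(self, frames):
        n, dim = len(frames), len(frames[0])
        rng = random.Random(self.seed)
        # Initialise means from random frames, variances from the global spread.
        self.means = [list(f) for f in rng.sample(frames, self.k)]
        mu = [sum(f[d] for f in frames) / n for d in range(dim)]
        gvar = [max(sum((f[d] - mu[d]) ** 2 for f in frames) / n, self.var_floor)
                for d in range(dim)]
        self.vars = [list(gvar) for _ in range(self.k)]
        self.weights = [1.0 / self.k] * self.k
        for _ in range(self.n_iter):
            # E-step: responsibility of each component for each frame.
            resp = []
            for f in frames:
                logps = self._frame_logps(f)
                norm = _logsumexp(logps)
                resp.append([math.exp(lp - norm) for lp in logps])
            # M-step: re-estimate weights, means and floored variances.
            for j in range(self.k):
                nj = sum(r[j] for r in resp) + 1e-10
                self.weights[j] = nj / n
                self.means[j] = [sum(r[j] * f[d] for r, f in zip(resp, frames)) / nj
                                 for d in range(dim)]
                self.vars[j] = [
                    max(sum(r[j] * (f[d] - self.means[j][d]) ** 2
                            for r, f in zip(resp, frames)) / nj, self.var_floor)
                    for d in range(dim)]
        return self

    def avg_log_likelihood(self, frames):
        return sum(_logsumexp(self._frame_logps(f)) for f in frames) / len(frames)


def identify(models, frames):
    """Return the enrolled speaker whose GMM scores the frames highest."""
    return max(models, key=lambda name: models[name].avg_log_likelihood(frames))


def synth(rng, n_frames, dim, centre):
    """Synthetic stand-in for real cepstral features (unit-variance Gaussian)."""
    return [[rng.gauss(centre, 1.0) for _ in range(dim)] for _ in range(n_frames)]


if __name__ == "__main__":
    rng = random.Random(1)
    # Two hypothetical speakers; each has a 2-dim and a 3-dim feature stream.
    enrol = {name: fuse(synth(rng, 60, 2, c), synth(rng, 60, 3, c))
             for name, c in (("spk_a", 0.0), ("spk_b", 3.0))}
    models = {name: DiagGMM().fit(frames) for name, frames in enrol.items()}
    probe = fuse(synth(rng, 30, 2, 3.0), synth(rng, 30, 3, 3.0))
    print(identify(models, probe))
```

Concatenation keeps the two feature streams intact and leaves any weighting between them to the mixture model. A real system would replace `synth` with actual feature extraction and would likely use more mixture components and more enrolment frames than this toy setup.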
This article has been indexed by VIP and other databases.
