
多分形谱簇研究及其在说话人识别中的应用
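Cite this article: 周宇欢, 张雄伟, 付强, 徐鑫, 王金明. 多分形谱簇研究及其在说话人识别中的应用[J]. 信号处理, 2011, 27(12): 1914-1919.
Authors: 周宇欢, 张雄伟, 付强, 徐鑫, 王金明
Affiliation: 解放军理工大学指挥自动化学院 (Institute of Command Automation, PLAUST)
Funding: Supported by the Natural Science Foundation of Jiangsu Province (2009)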
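Abstract (translated from the Chinese): Speech is a complex nonlinear signal, which makes it difficult to further improve the performance of traditional speaker recognition techniques developed from linear system theory. This paper proposes a multifractal spectrum cluster analysis method for analyzing the nonlinear characteristics of speech signals and applies it to speaker recognition on short utterances (2 seconds). Simulation experiments on the Cantor set show that different scaling ranges reflect the growth patterns of a system at different stages. The fractal characteristics of the system can therefore be characterized level by level with a set of continuously varying multifractal spectra, which is the multifractal spectrum cluster analysis method. Then, based on the fractal characteristics of speech signals, an extraction method for a speech Multifractal Spectrum Cluster Feature (MSCF) is proposed. Finally, several nonlinear features are combined with short-term spectral features for speaker recognition. Experiments on 50 speakers from the TIMIT database show that the nonlinear features and the short-term spectral features are strongly complementary; in particular, combining MSCF with MFCC and LPC features lowers the system's error rate to 0.8%.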

Keywords: speaker recognition; multifractal spectrum cluster; scaling range; Gaussian mixture model
Received: 2011-01-25

Research on Multifractal Spectrum Cluster and Its Application in Speaker Recognition
ZHOU Yu-huan, ZHANG Xiong-wei, FU Qiang, XU Xin, WANG Jin-ming. Research on Multifractal Spectrum Cluster and Its Application in Speaker Recognition[J]. Signal Processing, 2011, 27(12): 1914-1919.
Authors:ZHOU Yu-huan  ZHANG Xiong-wei  FU Qiang  XU Xin  WANG Jin-ming
Affiliation:Institute of Command Automation, PLAUST
Abstract: Speech is a complicated nonlinear signal, so the performance of traditional speaker recognition techniques based on linear theory is difficult to improve further. A multifractal spectrum cluster analysis method is therefore proposed and applied to analyzing the nonlinear characteristics of speech signals in short-utterance speaker recognition. Extensive experiments on Cantor sets show that sub-scaling ranges, which are neglected by the traditional multifractal method, actually reflect the growth pattern at different growth stages. To take full account of the fractal characteristics contained in different scaling ranges, the multifractal spectrum cluster analysis method is proposed to describe the multi-level fractal characteristics accurately and comprehensively. Then, according to the characteristics of the speech signal, an extraction method for a speaker multifractal spectrum cluster feature (MSCF) is proposed; the feature can be combined effectively with short-term spectral features at the feature level. Finally, combinations of several nonlinear features with short-term spectral features are applied to speaker recognition. Experimental results on the TIMIT database show that the nonlinear and short-term spectral features are highly complementary and that combining them clearly reduces the error rate of the speaker recognition system; in particular, the combination of MSCF, MFCC, and LPC reduces the error rate to 0.8% in short-utterance speaker recognition.
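
The idea of computing a multifractal spectrum over several scaling ranges can be illustrated on a weighted Cantor measure. The sketch below is not the authors' MSCF extraction procedure, which the abstract does not specify. It assumes a standard box-counting (partition function) estimate of the mass exponent tau(q), a Legendre transform to obtain the spectrum f(alpha), and hypothetical names (`cantor_measure`, `tau`, `spectrum`, `ranges`) and parameters (p = 0.3, depth 8) chosen only for illustration.

```python
import numpy as np

def cantor_measure(p, depth):
    """Mass of each of the 3**depth boxes of a weighted Cantor measure."""
    mass = np.array([1.0])
    weights = np.array([p, 0.0, 1.0 - p])  # left third, removed middle, right third
    for _ in range(depth):
        mass = np.outer(mass, weights).ravel()
    return mass

def log_partition(mass, level, qs):
    """log Z(q, eps) at box size eps = 3**-level, for each q in qs."""
    coarse = mass.reshape(3 ** level, -1).sum(axis=1)  # merge neighbouring boxes
    coarse = coarse[coarse > 0]                        # empty boxes do not contribute
    return np.array([np.log(np.sum(coarse ** q)) for q in qs])

def tau(mass, levels, qs):
    """Mass exponent tau(q): slope of log Z against log eps over the given levels."""
    log_eps = -np.asarray(levels) * np.log(3.0)
    log_z = np.array([log_partition(mass, k, qs) for k in levels])
    slope, _ = np.polyfit(log_eps, log_z, 1)  # one linear fit per q (per column)
    return slope

def spectrum(qs, tau_q):
    """Legendre transform: alpha = d tau / dq, f(alpha) = q * alpha - tau."""
    alpha = np.gradient(tau_q, qs)
    return alpha, qs * alpha - tau_q

qs = np.linspace(-5, 5, 41)
mass = cantor_measure(0.3, 8)

# One spectrum per scaling range; the collection plays the role of a "cluster".
ranges = {"coarse": range(1, 5), "fine": range(5, 9), "full": range(1, 9)}
cluster = {name: spectrum(qs, tau(mass, levels, qs)) for name, levels in ranges.items()}
```

For this exactly self-similar measure every scaling range yields the same spectrum, so the three entries of `cluster` coincide. For signals whose scaling differs between ranges, such as the growth stages described in the abstract, the per-range spectra would be expected to differ.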
This article is indexed in Wanfang Data (万方数据) and other databases.