Factor Analysis and Space Assembling in Speaker Recognition
Cite this article: GUO Wu, LI Yi-Jie, DAI Li-Rong, WANG Ren-Hua. Factor Analysis and Space Assembling in Speaker Recognition[J]. Acta Automatica Sinica, 2009, 35(9): 1193-1198.
Authors: GUO Wu, LI Yi-Jie, DAI Li-Rong, WANG Ren-Hua
Affiliation: 1. Ministry of Education-Microsoft Key Laboratory of Multimedia Computing and Communication, University of Science and Technology of China, Hefei 230027
Funding: National Natural Science Foundation of China (60970161); Research Fund of the Ministry of Education-Microsoft Key Laboratory of Multimedia Computing and Communication (07122803)
Received: 2008-6-11
Revised: 2009-1-3

Abstract: Joint factor analysis models the speaker and session (channel) variability in Gaussian mixture models and is widely used in text-independent speaker recognition. Two issues arise when the eigenvoice and eigenchannel loading matrices are estimated jointly. First, the speaker residual (diagonal) matrix has no effect; second, the number of channel factors cannot be made very large. In this paper, the eigenvoice loading matrix and the speaker residual (diagonal) matrix are trained serially, and different eigenchannel matrices are assembled (concatenated) into one large channel loading matrix. The proposed algorithm improves recognition performance: on the five parts of the NIST speaker recognition evaluation (SRE) 2008 core test corpus, the equal error rates (EERs) are 3.3%, 5.1%, 5.0%, 5.3%, and 5.0%.
Keywords: Speaker recognition; joint factor analysis; eigenvoice; speaker verification; expectation maximization
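
Illustrative sketch (not taken from the paper): the abstract does not spell out the model equations or how the eigenchannel matrices are assembled. The toy Python below assumes the commonly used joint factor analysis decomposition of a speaker- and session-dependent supervector, M = m + V y + U x + D z, with V the eigenvoice loading matrix, U the eigenchannel loading matrix, and D the diagonal speaker residual matrix. It further assumes that "assembling" means column-wise concatenation of eigenchannel matrices estimated separately. All names, dimensions, and values (U_a, U_b, hconcat, jfa_supervector) are hypothetical.

```python
# Toy sketch under stated assumptions; not the authors' training code.

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]

def hconcat(*matrices):
    """Concatenate matrices column-wise; all must share the same row count."""
    n_rows = len(matrices[0])
    if any(len(mat) != n_rows for mat in matrices):
        raise ValueError("all matrices must have the same number of rows")
    return [sum((mat[i] for mat in matrices), []) for i in range(n_rows)]

def jfa_supervector(m, V, y, U, x, d, z):
    """Compose M = m + V y + U x + D z, with diagonal D stored as vector d."""
    Vy = matvec(V, y)
    Ux = matvec(U, x)
    return [m_i + vy_i + ux_i + d_i * z_i
            for m_i, vy_i, ux_i, d_i, z_i in zip(m, Vy, Ux, d, z)]

# Supervector dimension 3 (real systems use tens of thousands of dimensions).
m = [1.0, 2.0, 3.0]                            # speaker-independent mean supervector
V = [[1.0], [0.0], [1.0]]                      # 3x1 eigenvoice loading matrix
U_a = [[1.0], [0.0], [0.0]]                    # 3x1 eigenchannel matrix, hypothetical subset A
U_b = [[0.0, 1.0], [1.0, 0.0], [0.0, 0.0]]     # 3x2 eigenchannel matrix, hypothetical subset B
U = hconcat(U_a, U_b)                          # assembled 3x3 channel loading matrix

y = [2.0]                                      # speaker factors
x = [1.0, 0.5, -1.0]                           # channel factors, one per column of U
d = [0.5, 0.5, 0.5]                            # diagonal of the residual matrix D
z = [0.0, 1.0, 0.0]                            # residual factors

M = jfa_supervector(m, V, y, U, x, d, z)
print(U)
print(M)
```

The point of the sketch is only that concatenation raises the number of channel factors (the length of x) to the sum of the column counts of the assembled matrices, without requiring one joint estimate of a single large U.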
This article is indexed in CNKI and other databases.