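Research on a Joint Adaptation Algorithm for Phone-Decoding-Based Language Recognition Systems
Cite this article: DENG Yan, ZHANG Wei-Qiang, LIU Jia. Research on a Joint Adaptation Algorithm for Phone-Decoding-Based Language Recognition Systems. Acta Automatica Sinica, 2012, 38(4): 652-658. doi: 10.3724/SP.J.1004.2012.00652
Authors: DENG Yan  ZHANG Wei-Qiang  LIU Jia
Affiliation: 1. Tsinghua National Laboratory for Information Science and Technology (Preparatory), Department of Electronic Engineering, Tsinghua University, Beijing 100084
Funding: Supported by the National Natural Science Foundation of China (60931160443, 61005019)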
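Abstract: In language recognition under real-world conditions, differences in non-language factors such as channel type and conversation content cause a mismatch between test and training conditions, which degrades recognition performance. This paper takes the phone recognizer followed by vector space model (PRVSM) as the language recognition system and introduces a joint adaptation algorithm to address the mismatch between test and training conditions. Three adaptation methods are studied, each applied at a different stage of the system: 1) acoustic model adaptation based on constrained maximum likelihood linear regression (CMLLR); 2) phonotactic feature vector adaptation based on universal N-grams; 3) support vector machine (SVM) adaptation in the VSM. With these adaptation techniques combined, the performance of the PRVSM system improves considerably: on the NIST LRE 2009 evaluation set, the equal error rate (EER) of PRVSM systems built on different phone recognizers falls by 18%~23%, 12%~20%, and 5%~9% in relative terms for the 30 s, 10 s, and 3 s test segments, respectively.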

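Keywords: Language recognition   phone recognizer followed by vector space model   joint adaptation   constrained maximum likelihood linear regression   support vector machine adaptation
Received: 2011-01-17
Revised: 2011-07-07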
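
The phonotactic feature vector that the abstract's second adaptation stage operates on can be illustrated with a minimal sketch. It assumes a 1-best phone sequence and plain relative-frequency N-gram features; the function name `ngram_features` and the toy phone labels are hypothetical, and the sketch does not implement the universal N-gram adaptation itself, whose details the abstract does not give.

```python
from collections import Counter


def ngram_features(phones, n_max=2):
    """Map a decoded phone sequence to relative N-gram frequencies.

    Hypothetical helper: each N-gram of order 1..n_max becomes one
    dimension whose value is its share of all N-grams of that order.
    """
    feats = {}
    for n in range(1, n_max + 1):
        # Slide a window of width n over the phone sequence.
        grams = [tuple(phones[i:i + n]) for i in range(len(phones) - n + 1)]
        total = len(grams)
        for gram, count in Counter(grams).items():
            feats[gram] = count / total
    return feats


# Toy phone labels, not output from any real phone recognizer.
features = ngram_features(["a", "b", "a", "b"])
```

In a PRVSM-style pipeline, vectors of this kind would be the input to the SVM back end; practical systems may derive expected counts from phone lattices rather than from a single best path.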

Research on Joint Adaptation for Phonotactic Language Recognition
DENG Yan, ZHANG Wei-Qiang, LIU Jia. Research on Joint Adaptation for Phonotactic Language Recognition. ACTA AUTOMATICA SINICA, 2012, 38(4): 652-658. doi: 10.3724/SP.J.1004.2012.00652
Authors:DENG Yan  ZHANG Wei-Qiang  LIU Jia
Affiliation:1. Tsinghua National Laboratory for Information Science and Technology, Department of Electronic Engineering, Tsinghua University, Beijing 100084
Abstract: In real-world language recognition, a variety of non-language factors (e.g., channel and content) induce a mismatch between training and test utterances, which degrades recognition accuracy. This paper introduces joint adaptation to deal with the mismatch problem in the phone recognizer followed by vector space model (PRVSM) system. We investigate three adaptation methods at different stages of the system: 1) acoustic model adaptation using constrained maximum likelihood linear regression (CMLLR); 2) phonotactic feature adaptation using universal N-grams; 3) adapt-SVM for the vector space model (VSM). Joint adaptation combines these methods and yields significant improvements. Experiments on the NIST LRE 2009 evaluation corpus show relative EER reductions of 18%~23%, 12%~20%, and 5%~9% for the 30 s, 10 s, and 3 s test conditions, respectively.
Keywords:Language recognition  phone recognizer followed by vector space model (PRVSM)  joint adaptation  constrained maximum likelihood linear regression (CMLLR)  adapt-support vector machines (SVM)
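
The abstract reports its results as relative reductions in equal error rate (EER). As a minimal sketch of how such figures can be computed, the code below estimates EER by sweeping a threshold over the observed scores and then takes the relative reduction between two EER values. The function names and the toy scores are hypothetical and are not taken from the paper or from the NIST LRE 2009 data.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping thresholds over the observed scores."""
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    best_gap, eer = float("inf"), None
    for t in thresholds:
        # Miss: a target trial scored below the threshold.
        p_miss = sum(s < t for s in target_scores) / len(target_scores)
        # False alarm: a non-target trial scored at or above the threshold.
        p_fa = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(p_miss - p_fa)
        if gap < best_gap:
            # Keep the operating point where the two error rates are closest.
            best_gap, eer = gap, (p_miss + p_fa) / 2
    return eer


def relative_reduction(baseline, adapted):
    """Relative decrease of `adapted` with respect to `baseline`."""
    return (baseline - adapted) / baseline


# Made-up detection scores for illustration only.
eer = equal_error_rate([0.9, 0.8, 0.7, 0.3], [0.6, 0.4, 0.2, 0.1])
```

Under this definition, a system whose EER falls from 5% to 4% shows a 20% relative reduction, which is how ranges such as 18%~23% should be read.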
This article is indexed in CNKI and other databases.