
An English Mispronunciation Detection Method Based on GMM-UBM and GLDS-SVM

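Cite this article:	LI Hong-Yan, HUANG Shen, WANG Shi-Jin, LIANG Jia-En, XU Bo. An English mispronunciation detection method based on GMM-UBM and GLDS-SVM[J]. Acta Automatica Sinica, 2010, 36(2): 332-336.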

Authors:	LI Hong-Yan, HUANG Shen, WANG Shi-Jin, LIANG Jia-En, XU Bo

Affiliation:	1. Digital Content Technology Research Center, Institute of Automation, Chinese Academy of Sciences, Beijing 100190

Funding:	Supported by the National High Technology Research and Development Program of China (863 Program) (2006AA010103)

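Abstract:	Methods from language recognition and speaker recognition are applied to an English mispronunciation detection system, and a mispronunciation detection method based on the generalized linear discriminant sequence based SVM (GLDS-SVM) is proposed. The main contributions are: 1) a feature normalization scheme based on state concatenation, which strengthens the SVM's ability to model pronunciation features; 2) a model training strategy based on multi-model fusion, which makes fuller use of the training data and partly resolves the imbalance between positive and negative samples caused by the scarcity of real mispronunciation data; 3) fusion of GLDS-SVM with the universal background model based GMM (GMM-UBM) method to further improve mispronunciation detection performance. The fused GLDS-SVM and GMM-UBM system achieves equal error rates (EER) of 9.92% and 16.35% on the simulated test set and the real test set, respectively. In addition, GLDS-SVM has clear advantages over the traditional radial basis function (RBF) kernel method in both model size and computation speed.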
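Keywords:	computer-assisted language learning; automatic mispronunciation detection; support vector machine; feature normalization; multi-model fusion strategy
Received:	2009-3-19
Revised:	2009-10-21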
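
The abstract names GLDS-SVM but does not spell out the sequence mapping. As a rough illustration of the general GLDS idea as it is commonly described (expand every frame into polynomial monomials, average the expansions over the segment, then score with a linear model), a minimal sketch follows. This is not the authors' implementation: the function names, the degree-2 expansion, and the toy frames are assumptions made for illustration, and the kernel normalization and state-concatenation steps are omitted.

```python
from itertools import combinations_with_replacement
from math import prod


def monomials(frame, degree):
    """Return all monomials of the frame's components up to `degree`.

    The constant term (degree 0) comes first, then degree 1, and so on.
    """
    terms = []
    for d in range(degree + 1):
        for idx in combinations_with_replacement(range(len(frame)), d):
            # prod of an empty selection is 1, which gives the constant term
            terms.append(prod(frame[i] for i in idx))
    return terms


def glds_expand(frames, degree=2):
    """Map a variable-length frame sequence to one fixed-length vector
    by averaging the polynomial expansion over all frames."""
    expanded = [monomials(f, degree) for f in frames]
    n = len(expanded)
    return [sum(column) / n for column in zip(*expanded)]


def linear_score(weights, vector):
    """Score the averaged expansion with a linear model (inner product)."""
    return sum(w * v for w, v in zip(weights, vector))


# Toy segment with two 2-dimensional frames (illustrative values only)
segment = [[1.0, 2.0], [3.0, 4.0]]
vector = glds_expand(segment, degree=2)
print(vector)
```

Because a segment is reduced to a single vector and scoring is one inner product, a model of this kind stores only a weight vector rather than a set of support vectors, which is consistent with the size and speed advantage over an RBF kernel that the abstract reports.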
Automatic Mispronunciation Detection for English Learners by GMM-UBM and GLDS-SVM Methods

LI Hong-Yan, HUANG Shen, WANG Shi-Jin, LIANG Jia-En, XU Bo. Automatic Mispronunciation Detection for English Learners by GMM-UBM and GLDS-SVM Methods[J]. Acta Automatica Sinica, 2010, 36(2): 332-336.

Authors:	LI Hong-Yan, HUANG Shen, WANG Shi-Jin, LIANG Jia-En, XU Bo

Affiliation:	1. Digital Content Technology Research Center, Institute of Automation, Chinese Academy of Sciences, Beijing 100190; 2. National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing 100190

Abstract:	This paper proposes an efficient mispronunciation detection method based on the generalized linear discriminant sequence based SVM (GLDS-SVM). First, to better describe pronunciation characteristics, we introduce an improved SVM feature normalization scheme based on state concatenation. Then, we propose a novel multi-model strategy for model training that makes fuller use of the samples and alleviates the data imbalance caused by the lack of an actual mispronunciation corpus. Finally, we combine GLDS-SVM with the universal background model based GMM (GMM-UBM) to further improve performance. The fused system achieves equal error rates (EER) of 9.92% and 16.35% on the simulation set and the real set, respectively. In addition, GLDS-SVM offers higher computation speed and a smaller model size than the traditional radial basis function (RBF) kernel method.

Keywords:	computer assisted language learning (CALL); automatic mispronunciation detection; support vector machine (SVM); feature normalization; multi-model fusion strategy
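
The abstract reports results as equal error rate (EER), the operating point at which the miss rate and the false alarm rate coincide. A minimal sketch of how such a figure can be computed from detector scores is given below. The function name, the convention that a higher score means "more likely mispronounced", and the toy scores are assumptions for illustration; none of them come from the paper.

```python
def equal_error_rate(mispronounced_scores, correct_scores):
    """Return (eer, threshold) for a detector where a higher score
    means the segment is more likely mispronounced.

    Every observed score is tried as a threshold; the one where the miss
    rate and the false alarm rate are closest is taken as the EER point.
    """
    thresholds = sorted(set(mispronounced_scores) | set(correct_scores))
    best = None
    for t in thresholds:
        # Miss: a real mispronunciation scored below the threshold
        miss = sum(s < t for s in mispronounced_scores) / len(mispronounced_scores)
        # False alarm: a correct pronunciation scored at or above the threshold
        false_alarm = sum(s >= t for s in correct_scores) / len(correct_scores)
        gap = abs(false_alarm - miss)
        if best is None or gap < best[0]:
            best = (gap, (false_alarm + miss) / 2, t)
    return best[1], best[2]


# Toy scores (illustrative values only)
eer, threshold = equal_error_rate([0.9, 0.8, 0.4, 0.7], [0.1, 0.2, 0.5, 0.3])
print(eer, threshold)
```

With a finite score set the two error rates may never be exactly equal, so this sketch reports their average at the closest threshold; interpolating between neighboring thresholds is a common refinement.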