Automatic Segmentation of Speech Unit Boundaries Based on HMM Models

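Cite this article:	WANG Li-juan, CAO Zhi-gang. Automatic Segmentation of Speech Unit Boundaries Based on HMM Models[J]. Journal of Data Acquisition & Processing, 2005, 20(4): 381-384.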

Authors:	WANG Li-juan, CAO Zhi-gang

Affiliation:	Department of Electronic Engineering, Tsinghua University, Beijing, 100084 (both authors)

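Abstract:	Forced alignment based on hidden Markov models (HMMs) is used to segment speech unit boundaries for text-to-speech (TTS) systems. To improve segmentation accuracy, this paper optimizes the feature selection, model parameters, and model clustering of the HMMs. Experiments show that 12-dimensional static Mel-frequency cepstral coefficients (MFCCs) are the best speech feature; that each HMM state is best modeled with a single Gaussian; and that, for speaker-dependent HMMs, about 3,000 tied-state models generated by classification and regression tree (CART) clustering is optimal. In phoneme boundary segmentation experiments on an English speech corpus, segmentation accuracy rose from 77.3% before model optimization to 85.4%.
Keywords:	speech unit boundary; automatic segmentation; hidden Markov model; text-to-speech system
Article ID:	1004-9037(2005)04-0381-04
Received:	2005-03-21
Revised:	2005-05-25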
Automatic Phonetic Segmentation Using HMM Model

WANG Li-juan,CAO Zhi-gang.Automatic Phonetic Segmentation Using HMM Model[J].Journal of Data Acquisition & Processing,2005,20(4):381-384.

Authors:	WANG Li-juan, CAO Zhi-gang

Affiliation:	Department of Electronic Engineering, Tsinghua University, Beijing, 100084, China

Abstract:	Hidden Markov models (HMMs), widely used in automatic speech recognition systems, can segment text-to-speech (TTS) units when run in forced-alignment mode. To improve segmentation performance, the optimal acoustic feature selection and the training conditions of the HMMs are discussed. Experimental results show that the static 12-D Mel-frequency cepstral coefficient (MFCC) feature is the optimal acoustic feature; the optimal number of Gaussian mixture components per state is 1; and the optimal number of tied states after model clustering by the classification and regression tree (CART) is about 3,000 for speaker-dependent triphone HMMs. With optimized parameters, the segmentation accuracy on the English test corpus increases from 77.3% to 85.4%.

Keywords:	acoustic unit boundary; automatic segmentation; HMM; text-to-speech system
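
The forced-alignment idea named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a known left-to-right state sequence derived from the transcription, one diagonal-covariance Gaussian per state (matching the single-Gaussian finding), and uniform transition probabilities. The function names `log_gauss` and `forced_align` and the toy 1-D features are hypothetical and chosen only for illustration.

```python
import math


def log_gauss(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def forced_align(frames, states):
    """Viterbi alignment of frames to a fixed left-to-right state sequence.

    frames: list of feature vectors (for example 12-D MFCCs), one per frame.
    states: list of (mean, var) pairs, one single Gaussian per state, in the
            order dictated by the known transcription.
    Returns the index of the first frame assigned to each state.
    Assumption: transition probabilities are uniform and therefore ignored.
    """
    T, S = len(frames), len(states)
    if T < S:
        raise ValueError("need at least one frame per state")
    NEG = float("-inf")
    score = [[NEG] * S for _ in range(T)]
    came_from_prev = [[False] * S for _ in range(T)]
    score[0][0] = log_gauss(frames[0], *states[0])
    for t in range(1, T):
        for s in range(min(t + 1, S)):
            stay = score[t - 1][s]                       # remain in state s
            move = score[t - 1][s - 1] if s > 0 else NEG  # enter from s - 1
            best = max(stay, move)
            if best == NEG:
                continue
            came_from_prev[t][s] = move > stay
            score[t][s] = best + log_gauss(frames[t], *states[s])
    # Backtrace from the last state at the last frame to recover boundaries.
    starts = [0] * S
    s = S - 1
    for t in range(T - 1, 0, -1):
        if came_from_prev[t][s]:
            starts[s] = t
            s -= 1
    return starts


# Toy usage: three states with well-separated means on 1-D features.
toy_frames = [[0.0], [0.1], [-0.1], [5.0], [5.2], [4.9], [10.0], [10.1]]
toy_states = [([0.0], [1.0]), ([5.0], [1.0]), ([10.0], [1.0])]
boundaries = forced_align(toy_frames, toy_states)
```

Multiplying each returned frame index by the frame shift gives boundary times. The frame shift is not stated in this record, so any value used would be an assumption.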
This article is indexed in databases including CNKI, VIP, and Wanfang Data.
