Phone Recognition Based on Speaking Rate Adaptation and Phonological Attribute Posterior Probabilities

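Cite this article:	XU You-liang, ZHANG Lian-hai, ZHANG Wen-lin, LI Yong-bin. Phone Recognition Based on Speaking Rate Adaptation and Phonological Attribute Posterior Probabilities[J]. Signal Processing, 2012, 28(2): 295-300.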

Authors:	XU You-liang, ZHANG Lian-hai, ZHANG Wen-lin, LI Yong-bin

Affiliation:	Institute of Information Engineering, Information Engineering University, Zhengzhou 450002

Funding:	National Natural Science Foundation of China (61175017)

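Abstract:	Automatic speech recognition based on speech event detection is a current research focus. To address the poor model adaptability caused by variation in speakers' speaking rates, a speaking rate adaptation algorithm is proposed. Working utterance by utterance, the algorithm normalizes each utterance with a continuously varying frame length and frame shift so that the adjusted rate matches the average rate of the corpus, which reduces the influence of speaking rate on model training. In addition, the speaking rate of the test set is obtained by computing the angle between posterior probability vectors of phonological attributes, which lightens the system load compared with rate detection methods that rely on trained models. The rate adaptation algorithm is applied to phonological attribute extraction, a nonlinear transformation is applied to the phonological attribute features, and hidden Markov models are used for modeling. Experiments show that after rate adaptation the average number of frames per phone is more constant and its dynamic range is smaller, raising the phone recognition rate by 1.3%.
Keywords:	speaking rate adaptation; phonological attribute detection; hidden Markov model; automatic speech recognition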
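
The per-utterance adjustment of frame length and frame shift described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the scaling rule, so everything here is an assumption: the linear rule `corpus_rate / utt_rate`, the 25 ms and 10 ms base values, and the function names `adapted_frame_params` and `frame_signal` are hypothetical. The sketch also uses one scale per utterance, whereas the abstract mentions continuously varying values.

```python
import numpy as np

def adapted_frame_params(utt_rate, corpus_rate, base_len_ms=25.0, base_shift_ms=10.0):
    """Scale frame length and shift by corpus_rate / utt_rate.

    Assumed rule (not from the paper): a faster utterance gets shorter
    frames and a shorter shift, so the frames-per-phone count stays near
    the corpus average. Rates are in phones per second.
    """
    if utt_rate <= 0 or corpus_rate <= 0:
        raise ValueError("rates must be positive")
    scale = corpus_rate / utt_rate
    return base_len_ms * scale, base_shift_ms * scale

def frame_signal(signal, sample_rate, frame_len_ms, frame_shift_ms):
    """Split a 1-D signal into overlapping frames; trailing samples are dropped."""
    signal = np.asarray(signal)
    frame_len = max(1, int(round(sample_rate * frame_len_ms / 1000.0)))
    frame_shift = max(1, int(round(sample_rate * frame_shift_ms / 1000.0)))
    if len(signal) < frame_len:
        return np.empty((0, frame_len))
    n_frames = 1 + (len(signal) - frame_len) // frame_shift
    # Row i holds samples [i * frame_shift, i * frame_shift + frame_len).
    idx = np.arange(frame_len)[None, :] + frame_shift * np.arange(n_frames)[:, None]
    return signal[idx]
```

Under this assumed rule, an utterance spoken at 15 phones per second against a corpus average of 12 would be framed with a 20 ms window and an 8 ms shift, yielding more frames per second than the 25 ms / 10 ms default.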
A Speaking Rate Adaptation Technique and Phonological Attribute Posterior for Phone Recognition

XU You-liang, ZHANG Lian-hai, ZHANG Wen-lin, LI Yong-bin. A Speaking Rate Adaptation Technique and Phonological Attribute Posterior for Phone Recognition[J]. Signal Processing, 2012, 28(2): 295-300.

Authors:	XU You-liang, ZHANG Lian-hai, ZHANG Wen-lin, LI Yong-bin

Affiliation:	Institute of Information Engineering, Information Engineering University, Zhengzhou 450002

Abstract:	Event detection-based methods have become an active research topic in automatic speech recognition (ASR). Differences in speaking rate can impair the adaptability of acoustic models. To address this, a novel adaptation algorithm is proposed that adjusts the frame length and frame shift in the front end of the system on a per-utterance basis, so that after adaptation the speaking rate is consistent with the average rate of the speech corpus, reducing its effect on model training. In addition, the method computes the angle between posterior probability vectors of phonological attributes to obtain the speaking rate of the test set, which eases the system burden compared with rate detection based on trained models. The algorithm is applied as pre-processing before the phonological attribute detection stage; the detected attributes then undergo a nonlinear transformation and serve as observations for a hidden Markov model-based phone recognition system. After adaptation, the average number of frames per phone in an utterance becomes more constant and its dynamic range decreases, and the phone recognition rate increases by about 1.3%.

Keywords:	speaking rate adaptation; phonological attribute detection; hidden Markov models; automatic speech recognition
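
The abstract states that the test-set speaking rate is obtained from the angle between posterior probability vectors, without giving the procedure. The following is a minimal sketch of one possible reading, not the authors' method: it measures the angle between the posterior vectors of consecutive frames and counts large angles as attribute changes per second. The function names, the 10 ms frame shift, and the 0.5 rad threshold are all hypothetical.

```python
import numpy as np

def adjacent_angles(posteriors):
    """Angle in radians between posterior vectors of consecutive frames."""
    p = np.asarray(posteriors, dtype=float)
    norms = np.linalg.norm(p, axis=1)
    dots = np.sum(p[:-1] * p[1:], axis=1)
    # Guard against division by zero for all-zero vectors.
    denom = np.maximum(norms[:-1] * norms[1:], 1e-12)
    return np.arccos(np.clip(dots / denom, -1.0, 1.0))

def estimate_rate(posteriors, frame_shift_ms=10.0, angle_threshold=0.5):
    """Count frame transitions whose angle exceeds the threshold, per second.

    Assumption: a large angle between neighbouring posterior vectors marks
    a change of phonological attribute, so more changes per second means
    faster speech. This is a proxy, not a calibrated phone rate.
    """
    p = np.asarray(posteriors, dtype=float)
    if p.ndim != 2 or len(p) < 2:
        return 0.0
    duration_s = len(p) * frame_shift_ms / 1000.0
    changes = int(np.sum(adjacent_angles(p) > angle_threshold))
    return changes / duration_s
```

Because this estimate needs only the attribute detector's outputs, it requires no separately trained rate model, which is the advantage the abstract claims for the angle-based approach.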
This article is indexed in CNKI, Wanfang Data, and other databases.