
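Study on the Identification of the Isolated Word in Mandarin Speech Recognition with Tone Information (加入调型信息的汉语孤立词识别研究)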

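Cite this article:	WANG Peng (王鹏), HU Yu (胡郁), DAI Lirong (戴礼荣), LIU Qingfeng (刘庆峰). 加入调型信息的汉语孤立词识别研究[J]. 中文信息学报 (Journal of Chinese Information Processing), 2010, 24(4): 85-91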

Authors:	WANG Peng (王鹏), HU Yu (胡郁), DAI Lirong (戴礼荣), LIU Qingfeng (刘庆峰)

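Affiliation:	iFly Speech Lab (科大讯飞语音实验室), Department of Electronic Engineering and Information Science, University of Science and Technology of China, Hefei, Anhui 230027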

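Abstract:	Mandarin is a tonal language, so tone information plays a key role in Mandarin speech recognition. How to exploit tone information effectively within the existing hidden Markov model (HMM) framework remains an open question. Current Mandarin speech recognition systems use tone information in two main ways. The first is the Embedded Tone Model, in which tone feature vectors and acoustic feature vectors are combined into a single stream to train the model. The second is the Explicit Tone Model, in which tone information is modeled separately and the resulting model is used to optimize the existing decoding network. This paper unifies the two approaches: it first obtains N-best candidates with an Embedded Tone Model built on two streams rather than one, then uses an Explicit Tone Model with left-dependent tone modeling to re-score the N-best candidates and produce the recognition result, which improves performance. Compared with a conventional model without tone information, the proposed method improves recognition accuracy by more than 3.0% absolute on average and by 5.36% absolute on the third test set.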
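Keywords:	computer application; Chinese information processing; Mandarin speech recognition; tone information; tone modeling; two-stream modeling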
Study on the Identification of the Isolated Word in Mandarin Speech Recognition with Tone Information

WANG Peng,HU Yu,DAI Lirong,LIU Qingfeng. Study on the Identification of the Isolated Word in Mandarin Speech Recognition with Tone Information[J]. Journal of Chinese Information Processing, 2010, 24(4): 85-91

Authors:	WANG Peng HU Yu DAI Lirong LIU Qingfeng

Affiliation:	iFly Speech Lab, Department of Electronic Engineering and Information Science, University of Science and Technology of China, Hefei, Anhui 230027, China

Abstract:	Mandarin is a tonal language, and tone information plays a key role in Mandarin speech recognition. Within the HMM (Hidden Markov Model) framework, how to use tone information effectively is an important open research issue. State-of-the-art Mandarin speech recognition systems apply tone information in two ways. One is the Embedded Tone Model, in which tone-related features are appended to the spectral features to form augmented acoustic feature vectors for training the HMMs. The other is the Explicit Tone Model, in which tone modeling is separated from syllable modeling and the tone model is applied to optimize the existing decoding network. This paper presents a way to combine the two methods for isolated-word recognition in Mandarin. First, N-best candidates are obtained with an Embedded Tone Model based on a two-stream model rather than the conventional single-stream model. Then an Explicit Tone Model based on left-dependent tone modeling is built to re-score the N-best candidates. Compared with a traditional model without tone information, the proposed method achieves over 5.0% absolute improvement on average across all test sets and up to 5.36% absolute improvement on the NoiseCar test set.
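The two-pass idea in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the weights, the toy scores, and the lookup-table tone model are all assumptions made for illustration. The first function shows the usual weighted combination of per-stream log-likelihoods in a two-stream model; the second re-scores an N-best list by adding an explicit tone score in which each tone is conditioned on the tone to its left.

```python
def two_stream_loglik(spectral_loglik, tone_loglik,
                      spectral_weight=1.0, tone_weight=1.0):
    """Combine per-stream log-likelihoods with stream weights.

    The weights are hypothetical tuning parameters, not values from the paper.
    """
    return spectral_weight * spectral_loglik + tone_weight * tone_loglik


def rescore_nbest(nbest, tone_model, tone_weight=1.0):
    """Re-rank N-best candidates with an explicit, left-dependent tone score.

    nbest: list of (word, tones, first_pass_score); scores are log-likelihoods.
    tone_model: callable (position, left_tone, tone) -> log-likelihood of the
        pitch contour observed at that syllable (a stand-in for a trained model).
    """
    rescored = []
    for word, tones, first_pass_score in nbest:
        left_tone = None  # the first syllable has no left context
        explicit_score = 0.0
        for position, tone in enumerate(tones):
            explicit_score += tone_model(position, left_tone, tone)
            left_tone = tone
        rescored.append((first_pass_score + tone_weight * explicit_score, word))
    # Highest combined score first
    return sorted(rescored, reverse=True)


# Toy data (invented): two candidates that differ mainly in tone.
TOY_TONE_SCORES = {
    (0, None, 3): -1.0, (1, 3, 1): -1.0,   # tone sequence 3-1
    (0, None, 4): -3.0, (1, 4, 3): -2.5,   # tone sequence 4-3
}


def toy_tone_model(position, left_tone, tone):
    # Unseen contexts get a low default score.
    return TOY_TONE_SCORES.get((position, left_tone, tone), -5.0)


nbest = [
    ("bei4jing3", (4, 3), -100.0),  # first-pass favourite
    ("bei3jing1", (3, 1), -101.0),
]
ranked = rescore_nbest(nbest, toy_tone_model)
best_word = ranked[0][1]
```

In this toy case the first pass slightly prefers `bei4jing3`, but the explicit tone score favours the 3-1 tone sequence, so `bei3jing1` moves to the top after re-scoring. A real system would replace the lookup table with a tone model trained on pitch features.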

Keywords:	computer application; Chinese information processing; Mandarin speech recognition; tone information; tone model; two-stream model
This article is indexed in Wanfang Data (万方数据) and other databases.