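Isolated-Word Mandarin Speech Recognition with Tone Information (加入调型信息的汉语孤立词识别研究)
Cite this article: WANG Peng, HU Yu, DAI Lirong, LIU Qingfeng. 加入调型信息的汉语孤立词识别研究[J]. 中文信息学报 (Journal of Chinese Information Processing), 2010, 24(4): 85-91
Authors: WANG Peng (王鹏), HU Yu (胡郁), DAI Lirong (戴礼荣), LIU Qingfeng (刘庆峰)
Affiliation: iFly Speech Lab (科大讯飞语音实验室), Department of Electronic Engineering and Information Science, University of Science and Technology of China, Hefei, Anhui 230027
Abstract: Mandarin is a tonal language, so tone information plays a key role in Mandarin speech recognition. How to exploit tone information effectively within the existing hidden Markov model (HMM) framework remains an open question. Current Mandarin speech recognition systems use tone information in one of two main ways. The first is the Embedded Tone Model, in which tone feature vectors and acoustic feature vectors are combined into a single stream to train the model. The second is the Explicit Tone Model, in which tone information is modeled separately and the resulting model is then used to optimize the existing decoding network. This paper unifies the two approaches: it first obtains N-best candidates from an Embedded Tone Model built with two streams rather than a single stream, then applies an Explicit Tone Model with left-dependent tone modeling to rescore the N-best candidates and produce the recognition result, which improves performance. Compared with a conventional model without tone information, the proposed method improves recognition accuracy by more than 3.0% absolute on average, and by 5.36% absolute on the third test set.

Keywords: computer application; Chinese information processing; Mandarin speech recognition; tone information; tone modeling; two-stream modeling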

Study on the Identification of the Isolated Word in Mandarin Speech Recognition with Tone Information
WANG Peng, HU Yu, DAI Lirong, LIU Qingfeng. Study on the Identification of the Isolated Word in Mandarin Speech Recognition with Tone Information[J]. Journal of Chinese Information Processing, 2010, 24(4): 85-91
Authors:WANG Peng  HU Yu  DAI Lirong  LIU Qingfeng
Affiliation: iFly Speech Lab., Department of Electronic Engineering and Information Science, University of Science and Technology of China, Hefei, Anhui 230027, China
Abstract: Mandarin is a tonal language, and tone information plays a key role in Mandarin speech recognition. Within the HMM (Hidden Markov Model) framework, how to use tone information effectively is an important and open research issue. State-of-the-art Mandarin speech recognition systems apply tone information in two ways. One is the Embedded Tone Model, in which tone-related features are appended to the spectral features to form an augmented acoustic feature vector used to train the HMM. The other is the Explicit Tone Model, in which tone modeling is separated from syllable modeling and the tone model is applied to optimize the existing decoding network. This paper presents a way to combine these two methods for isolated-word recognition in Mandarin. First, the N-best candidates are obtained with an Embedded Tone Model based on a two-stream model rather than the conventional single-stream model. Then an Explicit Tone Model based on left-dependent tone modeling is used to re-score the N-best candidates. Compared with a traditional model without tone information, the proposed method achieves over 3.0% absolute improvement on average across all test sets and up to 5.36% absolute improvement on the NoiseCar test set.
Keywords: computer application; Chinese information processing; Mandarin speech recognition; tone information; tone model; two-stream model
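
The second pass described in the abstract, re-scoring N-best candidates with a left-dependent tone model, can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration and does not come from the paper: each syllable's tone evidence is reduced to a single pitch-slope value, each left-dependent tone model is a one-dimensional Gaussian keyed by (left tone, tone), the word-initial context is labeled "sil", and the two scores are combined by a weighted sum with a hypothetical `tone_weight`. The first-pass scores stand in for log-scores from the two-stream Embedded Tone Model.

```python
import math


def gaussian_log_pdf(x, mean, var):
    """Log-density of a 1-D Gaussian (assumed form of each toy tone model)."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def tone_log_score(tones, slopes, tone_models, floor=-20.0):
    """Sum left-dependent tone log-likelihoods over one hypothesis.

    tones:  tone label of each syllable in the hypothesis
    slopes: one pitch-slope value per syllable (a stand-in tone feature)
    tone_models: maps (left_tone, tone) -> (mean, variance)
    """
    left = "sil"  # assumed context label for the word-initial syllable
    total = 0.0
    for tone, slope in zip(tones, slopes):
        params = tone_models.get((left, tone))
        # Unseen (left_tone, tone) pairs fall back to a fixed floor score.
        total += gaussian_log_pdf(slope, *params) if params else floor
        left = tone
    return total


def rescore_nbest(nbest, tone_models, tone_weight=1.0):
    """Add the weighted tone score to each first-pass score; sort best first."""
    rescored = [
        (word, first_pass + tone_weight * tone_log_score(tones, slopes, tone_models))
        for word, first_pass, tones, slopes in nbest
    ]
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)


# Toy models: tone 1 is roughly level (slope near 0), tone 4 is falling.
tone_models = {
    ("sil", 1): (0.0, 1.0),
    ("sil", 4): (-2.0, 1.0),
    (1, 4): (-2.0, 1.0),
    (4, 1): (0.0, 1.0),
}

# Each entry: (word, first-pass log-score, tone sequence, observed pitch slopes).
nbest = [
    ("hyp_a", -100.0, [1, 4], [-1.9, 0.1]),
    ("hyp_b", -100.5, [4, 1], [-1.9, 0.1]),
]

ranked = rescore_nbest(nbest, tone_models)
print(ranked[0][0])
```

In this toy example the first pass slightly prefers `hyp_a`, but the observed pitch slopes (falling, then level) fit the tone sequence of `hyp_b` better, so `hyp_b` moves to the top after re-scoring.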
This article is indexed in Wanfang Data (万方数据) and other databases.