
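A Study of Continuous Speech Recognition Methods for Languages Lacking Corpus Resources
Cite this article:I·Dawa,SAGISAKA Yoshinori,NAKAMURA Satoshi.A Study of Continuous Speech Recognition Methods for Languages Lacking Corpus Resources[J].Acta Automatica Sinica,2010,36(4):550-557.
Authors:I·Dawa  SAGISAKA Yoshinori  NAKAMURA Satoshi
Affiliation:1.National Institute of Information and Communications Technology (an Independent Administrative Institution of Japan), Kyoto 619-0288, Japan
Funding:Supported by the multilingual advanced speech and text processing research project of the National Institute of Information and Communications Technology (an Independent Administrative Institution of Japan)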
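Abstract:Because minority languages have characteristics of their own, existing continuous speech recognition methods cannot simply be applied to them unchanged. Taking Mongolian as an example, this paper discusses the construction of acoustic and language models and implements a Mongolian speech recognition system on the continuous speech recognizer of the Advanced Telecommunications Research Institute International (ATR), Japan. The paper focuses on language modeling: based on the agglutinative nature of Mongolian, it proposes building a multi-class N-gram model by clustering similar words. Experimental results show that, with the proposed language model, recognition accuracy improves by 5.5% over the conventional word N-gram approach.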

Keywords:Mongolian  agglutinative language  similar-word classification  continuous speech recognition  multi-class language model
Received:2009-2-6
Revised:2009-5-4

Investigation of ASR Systems for Resource-deficient Languages
I·Dawa,SAGISAKA Yoshinori,NAKAMURA Satoshi.Investigation of ASR Systems for Resource-deficient Languages[J].Acta Automatica Sinica,2010,36(4):550-557.
Authors:I·Dawa  SAGISAKA Yoshinori  NAKAMURA Satoshi
Affiliation:1.National Institute of Information and Communications Technology (NICT), Kyoto 619-0288, Japan;2.Global Information and Telecommunication Institute (GITI), Waseda University, Tokyo 169-855, Japan;3.Advanced Telecommunications Research Institute International (ATR), Kyoto 619-0288, Japan
Abstract:Minority languages in China have characteristics of their own, so the conventional automatic speech recognition (ASR) methods developed for major languages such as Chinese, English, and Japanese cannot be adopted directly. Taking Mongolian, a resource-deficient language, as an example, we build acoustic and language models for use with the ATRASR system. We focus on language modeling and exploit the agglutinative nature of Mongolian by training a multi-class N-gram language model based on similar-word clustering. With the proposed language model, recognition accuracy improved by 5.5% over a conventional word N-gram model.
Keywords:Mongolian language  agglutinative language  similar word clustering  continuous speech recognition  multi-class N-gram model
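
The abstract does not spell out the model's equations. As a rough illustration of why grouping similar words can help when training text is scarce, the sketch below assumes the standard class-based bigram factorization P(w_i | w_{i-1}) ≈ P(c_i | c_{i-1}) · P(w_i | c_i), with one hard class per word. This is a simplification and not the paper's multi-class model or its clustering procedure; the function names, the class labels, and the toy English corpus are all hypothetical.

```python
from collections import Counter


def train_class_bigram(sentences, word2class):
    """Collect maximum-likelihood counts for a class-based bigram model.

    sentences: list of token lists; word2class: dict mapping word -> class label.
    (Hypothetical helper, not taken from the paper.)
    """
    class_bigrams = Counter()  # (previous class, current class) counts
    class_context = Counter()  # how often a class appears as the history
    word_counts = Counter()    # unigram counts of words
    class_counts = Counter()   # unigram counts of classes
    for tokens in sentences:
        classes = [word2class[w] for w in tokens]
        for w, c in zip(tokens, classes):
            word_counts[w] += 1
            class_counts[c] += 1
        for prev, cur in zip(classes, classes[1:]):
            class_bigrams[(prev, cur)] += 1
            class_context[prev] += 1
    return class_bigrams, class_context, word_counts, class_counts


def bigram_prob(prev_word, word, word2class, model):
    """Return P(class | previous class) * P(word | class), without smoothing."""
    class_bigrams, class_context, word_counts, class_counts = model
    c_prev, c_cur = word2class[prev_word], word2class[word]
    if class_context[c_prev] == 0 or class_counts[c_cur] == 0:
        return 0.0
    p_class = class_bigrams[(c_prev, c_cur)] / class_context[c_prev]
    p_word = word_counts[word] / class_counts[c_cur]
    return p_class * p_word


# Toy data (assumed for illustration only): two noun-like and two verb-like words.
word2class = {"cat": "N", "dog": "N", "runs": "V", "sleeps": "V"}
sentences = [["cat", "runs"], ["dog", "runs"], ["cat", "sleeps"]]
model = train_class_bigram(sentences, word2class)

# "dog sleeps" never occurs in the toy corpus, so a plain word bigram would
# assign it zero probability; the class model gives 1.0 * (1/3).
p_unseen = bigram_prob("dog", "sleeps", word2class, model)
```

In this toy setting the unseen pair still receives probability mass because "dog" shares a class with "cat" and "sleeps" with "runs". A practical system would add smoothing and learn the classes from data rather than fixing them by hand.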
This article is indexed in CNKI and other databases.