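Research on Low-resource Mongolian Speech Recognition
Cite this article: ZHANG Ai-ying, NI Chong-jia. Research on Low-resource Mongolian Speech Recognition [J]. Computer Science, 2017, 44(10): 318-322.
Authors: ZHANG Ai-ying, NI Chong-jia
Affiliation: Institute of System Science and Information Processing, Shandong University of Finance and Economics, Jinan 250014 (both authors)
Funding: This work was supported by the National Natural Science Foundation of China (61305027), the Natural Science Foundation of Shandong Province (ZR2011FQ024), and the Shandong Province Higher Education Science and Technology Program (J17KB160).
Abstract: As speech recognition technology has advanced, speech recognition for low-resource languages has attracted growing attention. Taking Mongolian as the target language, this paper studies how information from other languages can improve recognition performance when resources are scarce (for example, when only 10 hours of transcribed speech are available). Cross-lingual transfer learning based on a multilingual deep neural network, together with features extracted by a multilingual deep bottleneck neural network, yields more discriminative acoustic models. Large amounts of web page data, collected through search engines and targeted web crawling, provide text data for strengthening the language model. Fusing multiple different recognition results further improves recognition accuracy. Compared with the baseline system, the fusion of multiple systems reduces the recognition error rate by 12% absolute.

Keywords: Low-resource; multilingual deep neural network; Web language model
Received: 2016/9/29
Revised: 2017/1/17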
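
The abstract reports its gain as an absolute reduction in error rate. As a minimal sketch of what that means, the Python below computes word error rate (WER) from a word-level edit distance and contrasts an absolute reduction with a relative one. The function names and the example rates (0.60 and 0.48) are hypothetical, chosen only for illustration; they are not results from the paper.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


def absolute_reduction(baseline_wer: float, system_wer: float) -> float:
    """Difference in WER, expressed in the same units as the inputs."""
    return baseline_wer - system_wer


def relative_reduction(baseline_wer: float, system_wer: float) -> float:
    """Reduction as a fraction of the baseline WER."""
    return (baseline_wer - system_wer) / baseline_wer


if __name__ == "__main__":
    # Hypothetical rates, not taken from the paper.
    baseline, fused = 0.60, 0.48
    print(absolute_reduction(baseline, fused))
    print(relative_reduction(baseline, fused))
```

With these hypothetical rates, the same improvement is 12 percentage points in absolute terms but 20% in relative terms, which is why the abstract's wording matters when comparing systems.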

Research on Low-resource Mongolian Speech Recognition
ZHANG Ai-ying and NI Chong-jia. Research on Low-resource Mongolian Speech Recognition [J]. Computer Science, 2017, 44(10): 318-322.
Authors: ZHANG Ai-ying and NI Chong-jia
Affiliation: Institute of System Science and Information Processing, Shandong University of Finance and Economics, Jinan 250014, China (both authors)
Abstract: With the development of speech recognition technology, research on low-resource speech recognition has gained extensive attention. Taking Mongolian as the target language, we studied how to use multilingual information to improve speech recognition performance under low-resource conditions, for example when only 10 hours of transcribed speech data are available for acoustic modeling. More discriminative acoustic models can be obtained by using cross-lingual transfer from a multilingual deep neural network and multilingual deep bottleneck features. A large number of web pages can be collected with a web search engine and a web crawler, providing a large amount of text data for improving the language model. Fusing the recognition results of several different recognizers improves recognition further. Compared with the baseline system, the fused result achieves an absolute word error rate (WER) reduction of nearly 12%.
Keywords: Low-resource; multilingual deep neural network; Web-based language model
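
The abstract mentions fusing results from different recognizers but does not say how the fusion is done. One simple possibility, assumed here purely for illustration, is slot-wise majority voting over hypotheses that have already been aligned to equal length, loosely in the spirit of ROVER-style voting. The `<eps>` placeholder, the function name, and the toy word sequences are all hypothetical.

```python
from collections import Counter

NULL = "<eps>"  # hypothetical marker for "no word in this slot"


def fuse_aligned(hypotheses: list[list[str]]) -> list[str]:
    """Majority vote per slot over pre-aligned hypotheses.

    Assumes every hypothesis has the same number of slots, with NULL
    filling the gaps. Ties go to the word proposed by the earliest
    system in the list.
    """
    if not hypotheses:
        raise ValueError("need at least one hypothesis")
    length = len(hypotheses[0])
    if any(len(h) != length for h in hypotheses):
        raise ValueError("hypotheses must be aligned to equal length")

    fused = []
    for slot in zip(*hypotheses):
        counts = Counter(slot)
        # Counter keeps first-seen order, so max() returns the earliest
        # system's word when counts are tied.
        best = max(counts, key=counts.get)
        if best != NULL:
            fused.append(best)
    return fused


if __name__ == "__main__":
    systems = [
        ["a", "b", NULL, "d"],
        ["a", "x", "c", "d"],
        ["a", "b", "c", "e"],
    ]
    print(fuse_aligned(systems))  # → ['a', 'b', 'c', 'd']
```

In this toy case each system makes a different error, so the vote recovers a sequence that none of the three produced on its own. A real fusion scheme would also need the alignment step and typically confidence scores, both omitted here.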