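Chinese Speech Recognition Based on Pinyin Constraint and Joint Learning
Cite this article: 梁仁凤,余正涛,高盛祥,黄于欣,郭军军,许树理.基于拼音约束联合学习的汉语语音识别[J].中文信息学报,2022,36(10):167-172.
Authors: LIANG Renfeng (梁仁凤)  YU Zhengtao (余正涛)  GAO Shengxiang (高盛祥)  HUANG Yuxin (黄于欣)  GUO Junjun (郭军军)  XU Shuli (许树理)
Affiliation: 1. Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, Yunnan 650500, China;
2. Yunnan Key Laboratory of Artificial Intelligence, Kunming University of Science and Technology, Kunming, Yunnan 650500, China
Funding: National Natural Science Foundation of China (61732005, U21B2027, 61972186); Yunnan High-Tech Industry Development Project (201606); Yunnan Major Science and Technology Special Program (202103AA080015, 202002AD080001-5); Yunnan Basic Research Program (202001AS070014); Yunnan Reserve Talents for Academic and Technical Leaders (202105AC160018)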
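Abstract: Current speech recognition models already perform well on phonographic writing systems such as English and French. Chinese, however, is a typical logographic writing system: Chinese characters have no direct correspondence with their pronunciation. Pinyin, the notation used to mark the pronunciation of Chinese characters, is inherently interconvertible with them. Using Pinyin as a constraint during decoding in Chinese speech recognition can therefore introduce an inductive bias that is closer to the speech signal. Building on a multi-task learning framework, this paper proposes a Chinese speech recognition method based on Pinyin-constrained joint learning. End-to-end Chinese character speech recognition is the primary task and Pinyin speech recognition is the auxiliary task; the two tasks share an encoder, and both the character and the Pinyin recognition results serve as supervision signals, which strengthens the encoder's ability to represent Chinese speech. Experimental results show that the proposed method outperforms the baseline model, reducing the word error rate by 2.24%.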

Keywords: end-to-end  Chinese speech recognition  joint learning  Pinyin
Received: 2021-02-21

Chinese Speech Recognition Based on Pinyin Constraint and Joint Learning
LIANG Renfeng, YU Zhengtao, GAO Shengxiang, HUANG Yuxin, GUO Junjun, XU Shuli. Chinese Speech Recognition Based on Pinyin Constraint and Joint Learning[J]. Journal of Chinese Information Processing, 2022, 36(10): 167-172.
Authors:LIANG Renfeng  YU Zhengtao  GAO Shengxiang  HUANG Yuxin  GUO Junjun  XU Shuli
Affiliation:1.Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, Yunnan 650500, China;2.Yunnan Key Laboratory of Artificial Intelligence, Kunming University of Science and Technology, Kunming, Yunnan 650500, China
Abstract: Automatic speech recognition (ASR) performs well for languages with phonetic writing systems, such as English and French, whereas Chinese is written logographically and its characters have no direct association with their pronunciation. To exploit Pinyin, the symbol system that records the pronunciation of Chinese characters, we propose an ASR method that uses Pinyin as a constraint on decoding within a multi-task learning framework. Chinese character ASR is the primary task and Pinyin ASR is the auxiliary task; both Chinese character and Pinyin supervision signals are used to strengthen the shared encoder's representation of Chinese speech. Experiments show that the proposed model achieves better recognition results than the baseline, with a 2.24% reduction in word error rate (WER).
Keywords:end-to-end  Chinese speech recognition  joint learning  Pinyin  
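
The abstract describes training one shared encoder with two supervision signals, Chinese characters (primary task) and Pinyin (auxiliary task). The sketch below illustrates one common way such a joint objective could be formed: a weighted sum of the two sequence losses. The abstract does not state how the two losses are combined, so the weighted-sum form, the function names, and the `aux_weight` value are all illustrative assumptions, not the authors' implementation. The logits are assumed to come from two decoders that read the same encoder output; the encoder and decoders themselves are not modeled here.

```python
import math


def cross_entropy(logits, target):
    """Negative log-likelihood of class `target` under softmax(logits)."""
    m = max(logits)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[target]


def sequence_loss(step_logits, targets):
    """Mean cross-entropy over the output steps of one sequence."""
    total = sum(cross_entropy(l, t) for l, t in zip(step_logits, targets))
    return total / len(targets)


def joint_loss(char_logits, char_targets, pinyin_logits, pinyin_targets,
               aux_weight=0.3):
    """Hypothetical joint objective: weighted sum of the primary (character)
    loss and the auxiliary (Pinyin) loss. `aux_weight` is an assumed
    hyperparameter, not a value taken from the paper."""
    char_loss = sequence_loss(char_logits, char_targets)
    pinyin_loss = sequence_loss(pinyin_logits, pinyin_targets)
    return (1.0 - aux_weight) * char_loss + aux_weight * pinyin_loss


if __name__ == "__main__":
    # Toy example: 2 output steps, 4 character classes and 3 Pinyin classes.
    char_logits = [[2.0, 0.1, 0.1, 0.1], [0.1, 0.1, 1.5, 0.1]]
    pinyin_logits = [[1.0, 0.2, 0.2], [0.2, 0.2, 1.0]]
    print(joint_loss(char_logits, [0, 2], pinyin_logits, [0, 2]))
```

In a real training loop both loss terms would be backpropagated through the shared encoder, which is how the Pinyin signal can influence the representation used for character decoding. Setting `aux_weight` to 0 reduces the objective to the character-only loss.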