基于拼音约束联合学习的汉语语音识别 (Chinese Speech Recognition Based on Pinyin Constraint and Joint Learning)

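Cite this article:	梁仁凤, 余正涛, 高盛祥, 黄于欣, 郭军军, 许树理. 基于拼音约束联合学习的汉语语音识别 (Chinese Speech Recognition Based on Pinyin Constraint and Joint Learning) [J]. 中文信息学报 (Journal of Chinese Information Processing), 2022, 36(10): 167-172.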

Authors:	梁仁凤 (LIANG Renfeng), 余正涛 (YU Zhengtao), 高盛祥 (GAO Shengxiang), 黄于欣 (HUANG Yuxin), 郭军军 (GUO Junjun), 许树理 (XU Shuli)

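Affiliations:	1. 昆明理工大学信息工程与自动化学院 (Faculty of Information Engineering and Automation, Kunming University of Science and Technology), Kunming, Yunnan 650500; 2. 昆明理工大学云南省人工智能重点实验室 (Yunnan Key Laboratory of Artificial Intelligence, Kunming University of Science and Technology), Kunming, Yunnan 650500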

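Funding:	National Natural Science Foundation of China (61732005, U21B2027, 61972186); Yunnan High-Tech Industry Development Project (201606); Yunnan Provincial Major Science and Technology Special Program (202103AA080015, 202002AD080001-5); Yunnan Provincial Basic Research Program (202001AS070014); Yunnan Provincial Reserve Talents Program for Academic and Technical Leaders (202105AC160018)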

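Abstract (translated from Chinese):	Current speech recognition models already perform well on phonographic writing systems such as English and French. Chinese, however, is a typical logographic writing system: Chinese characters have no direct correspondence with their pronunciation. Pinyin, the notation used to annotate the pronunciation of Chinese characters, is intrinsically linked to the characters, and the two can be converted into each other. Using Pinyin as a constraint during decoding in Chinese speech recognition can therefore introduce an inductive bias that is closer to the speech signal. Building on a multi-task learning framework, this paper proposes a Chinese speech recognition method based on Pinyin-constrained joint learning. End-to-end Chinese character speech recognition serves as the primary task and Pinyin speech recognition as the auxiliary task; the two tasks share an encoder, and both the character and the Pinyin recognition results are used as supervision signals to strengthen the encoder's ability to represent Chinese speech. Experimental results show that the proposed method achieves better recognition than the baseline model, reducing the word error rate by 2.24%.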
Keywords (translated from Chinese):	end-to-end; Chinese speech recognition; joint learning; Pinyin
Received:	2021-02-21
Chinese Speech Recognition Based on Pinyin Constraint and Joint Learning

LIANG Renfeng, YU Zhengtao, GAO Shengxiang, HUANG Yuxin, GUO Junjun, XU Shuli. Chinese Speech Recognition Based on Pinyin Constraint and Joint Learning [J]. Journal of Chinese Information Processing, 2022, 36(10): 167-172.

Authors:	LIANG Renfeng, YU Zhengtao, GAO Shengxiang, HUANG Yuxin, GUO Junjun, XU Shuli

Affiliation:	1. Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, Yunnan 650500, China; 2. Yunnan Key Laboratory of Artificial Intelligence, Kunming University of Science and Technology, Kunming, Yunnan 650500, China

Abstract:	Automatic Speech Recognition (ASR) performs well for languages with phonographic writing systems, such as English and French. Chinese, in contrast, is written logographically, with no direct association between characters and their pronunciation. To exploit Pinyin, the symbol system for the pronunciation of Chinese characters, we propose an ASR method that uses Pinyin as a constraint on decoding within a multi-task learning framework. With Chinese character ASR as the primary task and Pinyin ASR as the auxiliary task, we use both Pinyin and Chinese character supervision signals to enhance the ability of the shared encoder to represent Chinese speech. Experiments show that the proposed model achieves better recognition results, reducing the word error rate (WER) by 2.24%.

Keywords:	end-to-end; Chinese speech recognition; joint learning; Pinyin
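
The joint-learning setup described in the abstract (one shared encoder, a Chinese character head as the primary task, a Pinyin head as the auxiliary task) can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not say how the two losses are combined, so the weighted sum and the weight `lam` below are assumptions, as are all function names, the single-frame input, and the tiny linear layers that stand in for a real end-to-end sequence model.

```python
import math
from typing import List, Sequence


def softmax_cross_entropy(logits: Sequence[float], target: int) -> float:
    """Negative log-probability of `target` under softmax(logits)."""
    m = max(logits)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[target]


def linear(weights: List[List[float]], x: Sequence[float]) -> List[float]:
    """Matrix-vector product: one output value per row of `weights`."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def joint_loss(
    features: Sequence[float],
    char_target: int,
    pinyin_target: int,
    enc_w: List[List[float]],
    char_w: List[List[float]],
    pinyin_w: List[List[float]],
    lam: float = 0.7,  # hypothetical task weight, not taken from the paper
) -> float:
    """Weighted sum of the character loss and the Pinyin loss.

    Both heads read the same hidden vector, so both supervision
    signals would shape the shared encoder during training.
    """
    hidden = [math.tanh(h) for h in linear(enc_w, features)]  # shared encoder
    char_loss = softmax_cross_entropy(linear(char_w, hidden), char_target)
    pinyin_loss = softmax_cross_entropy(linear(pinyin_w, hidden), pinyin_target)
    return lam * char_loss + (1.0 - lam) * pinyin_loss


if __name__ == "__main__":
    # Toy sizes: 3 acoustic features, 2 hidden units, 4 characters, 3 Pinyin syllables.
    enc_w = [[0.1, -0.2, 0.3], [0.0, 0.5, -0.1]]
    char_w = [[0.2, 0.1], [-0.3, 0.4], [0.0, 0.0], [0.5, -0.5]]
    pinyin_w = [[0.1, 0.1], [0.2, -0.2], [-0.1, 0.3]]
    print(joint_loss([1.0, 0.5, -1.0], 2, 1, enc_w, char_w, pinyin_w))
```

Setting `lam = 1.0` in this sketch recovers a character-only objective, so the Pinyin head then contributes nothing to the loss.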
