Tibetan Multi-Dialect Speech and Dialect Identity Recognition
Authors:Yue Zhao, Jianjian Yue, Wei Song, Xiaona Xu, Xiali Li, Licheng Wu, Qiang Ji
Affiliation:School of Information and Engineering, Minzu University of China, Beijing, 100081, China. Rensselaer Polytechnic Institute, JEC 7004, Troy NY 12180-3590, USA.
Abstract:Tibetan remains a low-resource language for conventional automatic speech recognition: some dialects lack sufficient data, sub-word units, lexicons, and word inventories. Most prior work has treated speech content recognition and dialect classification as two independent tasks and modeled them separately, even though the two tasks are highly correlated. In this paper, we present a multi-task WaveNet model that performs Tibetan multi-dialect speech recognition and dialect identification simultaneously. It avoids building a pronunciation dictionary and performing word segmentation for new dialects, and it allows speech recognition and dialect identification to be trained in a single model. The experimental results show that our method can simultaneously recognize speech content across different Tibetan dialects and identify the dialect with high accuracy using a unified model. Including dialect information in the training output improves multi-dialect speech recognition accuracy, and the low-resource dialects achieve higher speech content recognition rates and dialect classification accuracy with the multi-dialect, multi-task model than with task-specific models.
Keywords:Tibetan multi-dialect speech recognition; dialect identification; multi-task learning; WaveNet model
Journal:Computers, Materials & Continua