Language Models Based on Deep Learning: A Review
Cite this article as: WANG Nai-Yu, YE Yu-Xin, LIU Lu, FENG Li-Zhou, BAO Tie, PENG Tao. Language Models Based on Deep Learning: A Review[J]. Journal of Software, 2021, 32(4): 1082-1115
Authors: WANG Nai-Yu  YE Yu-Xin  LIU Lu  FENG Li-Zhou  BAO Tie  PENG Tao
Affiliations: College of Computer Science and Technology, Jilin University, Changchun 130012, China; College of Computer Science and Technology, Jilin University, Changchun 130012, China; Key Laboratory of Symbol Computation and Knowledge Engineering for Ministry of Education, Jilin University, Changchun 130012, China; College of Software, Jilin University, Changchun 130012, China; Key Laboratory of Symbol Computation and Knowledge Engineering for Ministry of Education, Jilin University, Changchun 130012, China; Department of Computer Science, University of Illinois at Chicago, Chicago 60607, USA
Funding: National Natural Science Foundation of China (61872163, 61806084); Project of the Education Department of Jilin Province (JJKH20190160KJ)
Abstract: Language models aim to represent the implicit knowledge of language and, as a fundamental problem in natural language processing, have long attracted wide attention. Language models based on deep learning are the current research hotspot in the field: through pre-training and fine-tuning, they exhibit strong inherent representational power and greatly improve the performance of downstream tasks. Centered on the basic principles of language models and their different application directions, this survey takes neural probabilistic language models and pre-trained language models as the entry point for combining deep learning with natural language processing. Building on the basic concepts and theory of language models, it introduces the applications and challenges of neural probabilistic and pre-trained language models, and compares and analyzes existing models and their methods. It then elaborates on training methods for pre-trained language models from two aspects, new training tasks and improved network structures, and summarizes and assesses current research directions for pre-trained models in model compression, knowledge fusion, multi-modality, and cross-lingual learning. Finally, it identifies the bottlenecks of language models in natural language processing applications and discusses possible future research priorities.

Keywords: language model | pre-training | deep learning | natural language processing | neural language model
Received: 2020-05-03
Revised: 2020-09-01
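
As context for the survey's entry point: a neural probabilistic language model estimates P(w_t | w_{t-n+1}, ..., w_{t-1}) by embedding the context words and scoring the whole vocabulary with a small network. The sketch below is a minimal, illustrative Bengio-style model in PyTorch; the class name, vocabulary size, and layer sizes are assumptions chosen for illustration, not a configuration from the paper.

    import torch
    import torch.nn as nn

    class NeuralProbLM(nn.Module):
        """Illustrative Bengio-style neural probabilistic language model:
        P(w_t | w_{t-n+1}, ..., w_{t-1}) from concatenated word embeddings,
        one tanh hidden layer, and a softmax over the vocabulary."""
        def __init__(self, vocab_size, context_size=3, embed_dim=64, hidden_dim=128):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, embed_dim)
            self.hidden = nn.Linear(context_size * embed_dim, hidden_dim)
            self.out = nn.Linear(hidden_dim, vocab_size)

        def forward(self, context_ids):            # (batch, context_size) word ids
            e = self.embed(context_ids)            # (batch, context_size, embed_dim)
            h = torch.tanh(self.hidden(e.flatten(1)))
            return self.out(h)                     # unnormalized logits over vocab

    model = NeuralProbLM(vocab_size=10000)
    contexts = torch.randint(0, 10000, (8, 3))     # 8 toy 3-word contexts
    targets = torch.randint(0, 10000, (8,))        # the words that follow them
    loss = nn.functional.cross_entropy(model(contexts), targets)

Training minimizes the cross-entropy between the predicted distribution and the actual next word; this next-word objective is the same signal that later, much larger pre-trained models scale up.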

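The pre-training/fine-tuning technique that the abstract credits for the large downstream gains follows a simple recipe: load an encoder whose weights were already pre-trained on unlabeled text, attach a small task head, and continue training on labeled task data. Below is a minimal sketch using the Hugging Face transformers API; the checkpoint name, learning rate, and two-example sentiment batch are assumptions for illustration, not the survey's experiments.

    import torch
    from transformers import AutoTokenizer, AutoModelForSequenceClassification

    # Load a pre-trained encoder plus a freshly initialized 2-class head.
    tok = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = AutoModelForSequenceClassification.from_pretrained(
        "bert-base-uncased", num_labels=2)

    # A toy labeled batch standing in for a downstream sentiment dataset.
    batch = tok(["a great movie", "a dull movie"],
                padding=True, return_tensors="pt")
    labels = torch.tensor([1, 0])

    # One fine-tuning step: the model computes cross-entropy internally.
    optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
    model.train()
    out = model(**batch, labels=labels)
    out.loss.backward()
    optimizer.step()

Only the classification head starts from scratch; the pre-trained weights are merely adapted, which is why modest labeled data and a few epochs typically suffice for strong downstream performance.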